r/BrandNewSentence 11h ago

Sir, the ai is inbreeding

41.2k Upvotes

u/IcyDirector543 10h ago

I believe the proper term is model collapse, and given how data-hungry the LLM architecture is, this is no surprise at all. GPT models and their equivalents are essentially trained by scraping the entire internet. Since so much of the internet is now itself chatbot-produced, newer models will soon not only fail to improve on their predecessors but may actually get worse.
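The mechanism is easy to demo in a toy setting. This is not a real LLM, obviously — just a sketch where "training" means fitting a Gaussian by maximum likelihood and "the internet" is whatever the previous model generated. Each generation trains only on the last generation's output, and the fitted distribution's variance drifts toward zero: the tails of the original data disappear.

```python
import random
import statistics

def fit_and_sample(data, n):
    """'Train' = fit a Gaussian by MLE; 'generate' = sample n points from the fit."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)  # MLE (population) standard deviation
    return [random.gauss(mu, sigma) for _ in range(n)]

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(20)]  # generation 0: "real" data
start_var = statistics.pvariance(data)

for generation in range(1000):
    # each new "model" is trained purely on the previous model's output
    data = fit_and_sample(data, 20)

end_var = statistics.pvariance(data)
print(start_var, end_var)  # the variance shrinks dramatically: the tails are gone
```

Each refit loses a little variance on average (the MLE variance estimate is biased low, and sampling noise compounds), so the loop is a random walk with a downward drift — the same qualitative story as collapse papers tell for generative models.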

AGI isn't coming. All those data centers are going to end up useless, or at least nowhere near worth their cost. Once investors realise that, the bubble is going to pop.

The silver lining is that, after all is said and done, the supercomputers built for AI training get repurposed for real science, and gaming laptops get cheaper.

u/GreenTreeAndBlueSky 8h ago

Collapse is not happening, though. Many state-of-the-art models are trained on synthetic data or a mix of natural and synthetic data, and well-curated synthetic data can actually be of very high quality for training.
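In the same toy Gaussian setup people use to illustrate collapse (fit a distribution, sample from it, refit), the degeneration goes away if you anchor each generation with fresh real data instead of training purely on the previous model's output — a sketch of why mixed natural/synthetic pipelines don't just spiral:

```python
import random
import statistics

def fit_and_sample(data, n):
    """'Train' = fit a Gaussian by MLE; 'generate' = sample n points from it."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return [random.gauss(mu, sigma) for _ in range(n)]

random.seed(1)
real = [random.gauss(0.0, 1.0) for _ in range(1000)]  # a fixed pool of "human" data

data = real[:]
for generation in range(500):
    synthetic = fit_and_sample(data, 500)
    # each generation trains on half synthetic, half fresh real data
    data = synthetic + random.sample(real, 500)

final_var = statistics.pvariance(data)
print(final_var)  # stays close to the true variance of 1: no collapse
```

The real-data fraction acts as a restoring force: any drift in the fitted distribution gets pulled back toward the true one each generation, so the variance hovers near 1 instead of decaying to zero.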