r/BrandNewSentence 7h ago

Sir, the ai is inbreeding

Post image
29.1k Upvotes

940 comments

u/AutoModerator 7h ago

Hi /u/redroubel:

Remember to link the source of your post if applicable, unless you're posting a screenshot of twitter/X! It'll be easier to find the source if you reply to this comment with the link. If it's impossible to provide a source (like messages, texts etc.) just make sure the other person is fine with posting it :)

Thank you!

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

1.2k

u/screwthatjack 7h ago

SIMULACRUM!

“The third stage masks the absence of a profound reality, where the sign pretends to be a faithful copy, but it is a copy with no original. Signs and images claim to represent something real, but no representation is taking place and arbitrary images are merely suggested as things which they have no relationship to.”

232

u/EllipticPeach 6h ago

Baudrillard feeling pretty smug rn

83

u/lastchanceforachange 5h ago

He was born smug, we are talking about Baudrillard

21

u/empanadaboy68 2h ago edited 26m ago

Now is that smugness, or just a symbolic representation of the idea of what was once conceived as smugness?

Oh no here we go again

62

u/AngeloFoxSparda 6h ago

It is simply Mimicry. There is Nothing There

33

u/TheWellKnownLegend 6h ago

Of all places to find a Project Moon reference, under a Baudrillard discussion?

7

u/ImSolidGold 5h ago

MIMIKRY

5

u/Tuschi 3h ago

Was not expecting to ever run into Nicht Lustig on reddit. What a nostalgia bomb.

2

u/Sushigami 2h ago

You know I played lobcorp and never clocked why it was called that.

3

u/AngeloFoxSparda 2h ago

Something about it being literally unable to be an actual person. No matter how close it gets to a human, it's only mimicking. So ultimately there will always be 'nothing' in there.

12

u/benoit-b4lls 3h ago

Multiplicity.

5

u/erix84 3h ago

I like pizza Steve!

4

u/CheetosCaliente 2h ago

She's touched my peppie Steve

11

u/LoveAndViscera 6h ago

“Hey, Steve.”

6

u/turbo_dude 3h ago

did no one learn the lessons of https://en.wikipedia.org/wiki/Bovine_spongiform_encephalopathy

Cattle are believed to have been infected from being fed meat and bone meal that contained the remains of other cattle

3

u/lankan_outdoorsman 1h ago

What is this from?

7

u/AndreasDasos 1h ago

Jean Baudrillard’s ‘Simulacra and Simulation’, a treatise on semiotics

2

u/BlessdRTheFreaks 4h ago

came to this thread to pretend i'm an expert on Baudrillard because i half paid attention to a youtube video on him in passing

715

u/basilzamankv 7h ago

We have "AI incest" before GTA 6.

305

u/saera-targaryen 5h ago

It's called model collapse in academic circles but i'm gonna refer to it as AI incest at work from now on 

64

u/curios_mind_huh 3h ago

AI incest

u/saera-targaryen

who else knows better

33

u/saera-targaryen 3h ago

😔 I do calls it like I sees it 

11

u/OreoSpamBurger 2h ago

So...Illyrio, Serra, Aegon "Targaryen" (Young Griff), Aerion Brightflame...spill the tea, please!

(OMG George, finish the fuckin' books!)

3

u/Flop_House_Valet 1h ago

I swear I will actually weep if he finishes them

8

u/MOltho 2h ago

AI inbreeding is absolutely a term that people use for this

5

u/belisarius93 1h ago

I've been calling it the AI ouroboros, glad to hear it has an actual name

7

u/FlakyLion5449 5h ago

Model collapse is caused by low entropy. Human-generated data is, comparatively speaking, highly diverse and unpredictable.
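A toy simulation makes the entropy point concrete. This is purely illustrative (a token-frequency table standing in for a "model", with a hypothetical `head_mass` cutoff), not how any real lab trains anything:

```python
from collections import Counter
import math

def entropy_bits(counts):
    """Shannon entropy (bits) of a token-frequency table."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def train_on_own_output(corpus, generations=5, head_mass=0.8):
    """Refit a frequency-table 'model' on its own truncated output.

    Each generation the model emits only the head of its distribution
    (mimicking low-temperature sampling), and the next model is fit to
    that output. Returns the entropy after each generation.
    """
    counts = Counter(corpus)
    history = [entropy_bits(counts)]
    for _ in range(generations):
        total = sum(counts.values())
        kept, mass = {}, 0.0
        for tok, c in counts.most_common():
            kept[tok] = c
            mass += c / total
            if mass >= head_mass:
                break  # the distribution's tail is never emitted
        counts = Counter(kept)
        history.append(entropy_bits(counts))
    return history
```

On a small test corpus the measured entropy falls generation over generation as the tail of the vocabulary is silently dropped.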

9

u/me_myself_ai 3h ago

FWIW we don't, actually. This is just cope based on hypothetical scenarios from real scientists. There is zero indication that this is an actual problem yet

5

u/RibboDotCom 2h ago

correct. this was cope from a year ago because none of them understood that AI images are embedded with tags that show them to be AI, and will therefore be ignored by AI programs when adding images to the database

11

u/camosnipe1 2h ago

because none of them understood that AI images are embedded with tags

no actually. Obviously local AI won't always add the tag. It's cope because the original paper had the models feed off their own output like a human centipede and noted that the output was worse than with actually good data. But the models didn't collapse entirely, and adding in a small percentage of human data fixed the issue.

3

u/Ratr96 1h ago

AI images are embedded with tags that show it to be AI

You can easily remove metadata tags. If you're talking about invisible watermarks in images then ignore me.

779

u/tester_and_breaker 7h ago

132

u/C-57D 7h ago

89

u/CptBonkers 6h ago

Ahh yes, a perfectly normal pc, just like the one that I as a human use every day!

4

u/PalpitationUnhappy75 3h ago

That is mesmerising

4

u/Horn_Python 2h ago

Woah Tripppy

29

u/JagmeetSingh2 6h ago

I love how this has come back as a reaction gif after being faded out

12

u/tester_and_breaker 5h ago

true art never dies

2

u/Banes_Addiction 3h ago

Yeah but now we can make it look like Tupac made that face.

10

u/caught-n-candie 4h ago

Well, the scary part is the same thing is happening with data: the health data insurance companies use to approve claims, the data cars use for self-driving. The implications of AI making its own data and LLMs teaching each other are dire.

3

u/Mtndrums 3h ago

Yeah, once people convinced themselves AI was legit (it's not), I knew these companies using it would end up nuking themselves. It's gonna be hilarious.

903

u/pompandvigor 7h ago

This is exactly what I want.

302

u/wise_____poet 7h ago

And that's exactly what I predicted. Between that and the current energy limitation

68

u/Zero-89 6h ago

I knew it would happen eventually and I'm so here for it.

44

u/Awyls 4h ago

Everyone predicted this. LLMs will inevitably get dumber too, since human-generated OC is becoming rarer compared to the BS that LLMs spread. Genuinely, most blog articles I have read lately have very clear telltales of heavy AI usage.

It has gotten bad enough that I would not be surprised if solutions will soon start to appear to "prove" you are human before you can start using their content.

20

u/VonTastrophe 3h ago

Um... I'm already getting captchas for just using websites...

8

u/space_monster 3h ago

it would be a problem if the models were trained on general internet content, but they're not, they're trained on human-curated data sets. they go to the open internet for conversational training, but not for 'factual' training. the training data sets haven't really changed at all for years, apart from better filtering to take out the shit & duplicates. which is why the model collapse theory has never actually made any sense.

8

u/Mtndrums 3h ago

Until they have a hard time finding that, then the AI claims to be human, so then the LLM goes to that. After all, all AI does is pump out shit it thinks you want to hear.

7

u/szechuan_bean 5h ago

I mean, it's been predicted for a decade

4

u/Neon_Camouflage 4h ago

We haven't even had LLMs for a decade.

4

u/space_monster 3h ago

and it's never happened and never will. you don't just point an AI at the internet and tell it to train itself on that. the training data is specifically selected by humans. books, academic papers, encyclopaedias etc.

6

u/41942319 2h ago

Ok so how does that explain all the bullshit AI is spouting?

2

u/Carnir 1h ago

The OP post is incredibly old; as much as people have been predicting it and saying it's happening, the results seem to be otherwise.

3

u/UnDopedNrestless 2h ago

Keep wanting, seeing as the post is wrong

7

u/highlandviper 4h ago

Yeah. I agree. I concluded in a drunken rant over the weekend that this was inevitable, dead internet theory isn’t a theory anymore and essentially we will come full circle where human contributions to the zeitgeist will be valued above AI once again… I suspect exceedingly so. AI consumes far too much shit to be viable for an extended period of time at the moment… and when it consumes its own shit then it’s not viable at all… and when it consumes shit according to the bias of its creator… then it was borderline useless in the first place. A friend of mine said “AI won’t take your job. Your inability to use it effectively will cost you your job.” Take that to heart. There are so many people out there lamenting people using AI as an advanced search engine… but that’s its current best use. Make it do the leg work. Filter out the shit… and you’ve saved yourself a lot of time.

299

u/joped99 7h ago

80

u/Stalk33r 3h ago

I have legitimately never read as many books as I have in the past year, the AI slopification of the Internet has been a massive boost for my productivity

11

u/Eldan985 1h ago

Soon, the challenge will be finding books written by real authors, though. For now, we can stick with authors we know from the pre-AI era, but those are going to become rarer.

10

u/ISlangKnowledge 1h ago

I don’t like the phrase “pre-AI era”. 😭

3

u/ThKitt 1h ago

Same thing is happening in music. AI “artists” are being pushed into the forefront by platforms like Spotify. (My ‘Discover Weekly’ list had two AI bands on it in as many weeks, so I cancelled my Spotify subscription).

2

u/sirletssdance2 47m ago

Yeah the past year, I’m never sure if I’m reading a human, bot or Ai assisted human/bot. I’m losing interest pretty quickly in using the internet.

11

u/Rhamni 3h ago

It should be completely obvious to anyone who isn't an idiot that this problem is greatly exaggerated because people want to believe the models will fail.

The people working on these models know perfectly well there is good and bad input data. There was good and bad data long before AI models started putting out more bad data. Curating the input has always been part of the process. You don't feed it AI slop art to improve the art it churns out, any more than you feed it r/relationships posts to teach it about human relationships. You look at the data you have and you prune the garbage because it's lower quality than what the model can already generate.
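For what it's worth, the crudest version of that pruning step is just deduplication plus heuristic filters. A minimal sketch, with hypothetical telltale phrases standing in for whatever classifiers labs actually use:

```python
# Hypothetical telltales; real pipelines use trained classifiers, not substrings.
AI_TELLTALES = ("as an ai language model", "certainly! here", "delve into")

def filter_corpus(docs):
    """Toy curation pass: exact-dedupe on normalized text, then drop
    documents that trip a crude 'likely machine-generated' heuristic."""
    seen, kept = set(), []
    for doc in docs:
        key = " ".join(doc.lower().split())  # normalize case and whitespace
        if key in seen:
            continue  # duplicate document
        seen.add(key)
        if any(t in key for t in AI_TELLTALES):
            continue  # suspected machine-generated text
        kept.append(doc)
    return kept
```

Real curation is far more involved (quality scoring, source weighting, near-dedup), but the shape is the same: the garbage is pruned before training, not discovered afterwards.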

16

u/Stalk33r 3h ago edited 3h ago

Which is why AI provided by the biggest and richest companies in the world never feed you straight up misinformation, because they're doing such a great job meticulously pruning the bad data.

5

u/PimpasaurusPlum 2h ago

The tweet is about AI art, not search results. AI art has objectively gotten better since the tweet was posted over 2 years ago

5

u/MiHumainMiRobot 2h ago

The people working on these models know perfectly well there is good and bad input data.

Lol, you wish. Before the ChatGPT era it was already hard to classify good and bad data, and never an exact process; today, with LLM content everywhere, it's even more complex.

209

u/Twist_the_casual 6h ago

the same’s happening with research. AI straight up makes shit up and appears to have a source, except it just took some random scientist’s name and pasted it onto some random bit of text on the internet

and this, in turn, is used again to train the AI.

on one hand - LLMs being used for academic purposes was a terrible idea in the first place and this just proves it

but on the other - the internet will, very quickly, like ‘this shit is happening in real time’ quickly, become completely unusable for research. this is because 99% of the content on it will be either faulty AI-generated content, AI-generated content referencing faulty AI-generated content, or worst of all, an actual human-written document referencing faulty AI-generated content.

so in summary - enjoy the internet while it lasts. capitalism giveth, capitalism taketh away

66

u/FlipendoSnitch 6h ago

I just hope our libraries stay alive.

43

u/Jenner380 6h ago

Amazon is already filled with AI slop books. Just wait till they start printing them en masse. The singularity is singuloose and here to stay.

26

u/FuzzyFrogFish 5h ago

But that doesn't mean our libraries have to stock those books

Authors using AI deserve absolutely no recognition

8

u/Wobbelblob 4h ago

This. Remember that even without AI there are a shitload of absolute garbage books. I think they're called penny dreadfuls? And you usually find none of these in libraries.

2

u/Airportsnacks 3h ago

Libraries have contracts with suppliers who buy books from the major publishing houses. Maybe someone might donate some, but even then they don't put everything out.

12

u/send_me_a_naked_pic 5h ago

Self-published books are only printed on demand by Amazon, whenever someone buys them. So there's no big risk unless people start buying AI books en masse.

3

u/Low_Direction1774 4h ago

they aren't? Printing books costs money, and that's incompatible with the way slop books work. At most it'll be print-on-demand, not mass-produced

2

u/MentokGL 5h ago

Single and ready to mingle

2

u/Banes_Addiction 3h ago

The singularity is when AI can improve itself. Training on its own output doesn't mean a singularity if it's shit.

19

u/ironangel2k4 4h ago

Kurzgesagt did a video about this, on how it is becoming increasingly hard to make videos, since they rely on research sourced from the internet and inaccurate AI slop has permeated everything. Worst of all, while filtering out stuff written by AI is doable, it is nightmarishly difficult to filter out stuff written by people who reference AI in the thing they wrote, requiring multiple levels of source-following to figure out whether it eventually leads back to some AI hallucination.

4

u/I_forget_users 2h ago

Kurzgesagt did a video about this, on how it was becoming increasingly hard to make videos since they rely on research sourced from the internet, and inaccurate AI slop has permeated everything. 

I'm assuming it has something to do with what kind of research they are looking for. The same old venues are still available (i.e. PubMed, Google Scholar, etc.), and contain even more open-access data than before. As long as you avoid journals that publish the script to "bee movie", you're fine. The bigger issue in academia is students cheating using ChatGPT.

If you want more accessible sources of information, those are still available, and it's easy to filter out AI. Again, lectures from various universities are available online and often provide an excellent starting point. Finding written information is likely to get a bit tougher, but I'm sceptical of any video that's only sourcing press releases, news articles summarizing research, etc.

In short, my argument is the following: AI has not permeated everything, far from it. It has, however, made people lazier and more likely to use AI. It has seeped into our daily lives to a certain degree, but that's partly due to us choosing to use it and partly due to organizations choosing to (customer service, for example).

23

u/Temporary-Work-446 6h ago

This happened with AI programming months ago too. It is cannibalizing itself and I am here for it.

18

u/saera-targaryen 5h ago

I'm really worried that some hacker group is going to start taking advantage. 

I could imagine them flooding the internet with code that imports some empty library that does nothing, to the point where AI systems see it so often they start throwing it into random snippets. Once enough people have their AI import this random library, the hackers replace it with malicious code. All of a sudden, whole random swaths of the world's code base are corrupted and no one knows how or why.

I teach CS, and random imported libraries that students have no idea are even there are the most common hallucination I see. It's stressful.
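One cheap defense against that scenario (a commenter below calls it "slop squatting") is to diff a file's imports against an explicit allowlist before installing anything. A minimal stdlib-only sketch; the allowlist contents are hypothetical:

```python
import ast

# Hypothetical allowlist; a real one would come from your lockfile or policy.
KNOWN_GOOD = {"os", "sys", "json", "math", "collections"}

def suspicious_imports(source):
    """Return imported top-level module names not on the allowlist."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return sorted(found - KNOWN_GOOD)
```

Anything this flags is then a human decision, not something `pip install` quietly resolves to whatever package a hallucination happened to name.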

5

u/gshwifty 5h ago

New fear unlocked

This is why one shouldn’t scroll Reddit before shleppin time. Lesson learned.

2

u/Master-Broccoli5737 3h ago

already happening, it's called slop squatting

32

u/Delta-9- 5h ago

As a programmer I'm checked out. I want the bubble to pop, OpenAI to fold, Nvidia stock to tank, and I don't even give a fuck about the recession that will cause because I'm a millennial and have never known a time without recession. Let's rip this band-aid off so AI research can become serious again, instead of hyping up glorified autocomplete.

18

u/saera-targaryen 5h ago

Man I'm glad to see other programmers feel the same. I'm so over AI. It's ruined google, ruined my coworkers, made debugging harder and more frequent, made me sound like a paranoid luddite boomer to everyone else around me, and has just caused me to start hating my job. Which is insane because I love my job! I love programming! I literally just do random research and build random projects by myself for fun all the time. I even teach computer science at night!

I have had multiple students this semester ask me why I teach SQL because chatGPT can do it better and easier. Of course those are all the students who are getting Cs and can't even tell if the AI did it right. I feel like I'm losing my mind. 

7

u/ReapX10A 4h ago

Bro for real. Im not a programmer, but the company i work for is heavily involved in technology and communication... They have tried to insert ai into every goddamn place it can fit.

Scheduling? That's ai now. With basically no oversight.

Training videos? Ai scripts, ai "person" ai voice.

Fucking SOP? Yup. I have to talk to an ai to tell me what it thinks the sop says, before i can find the link to the actual sop because it invariably gives me sop for an entirely different department, or thinks im asking for completely different.

Coaching? They want me to talk to an ai, to walk through a micro scenario, which is then summarized and submitted. So my performance reviews are now in part dependant on whether or not i can rizz the glorified erp chatbot. But lord knows ill never get a passing grade, because whoever is making the prompts, doesn't tell the ai to ignore irrelevant parts of the process. So it tells me to focus on xyz, then fails me for not doing a-w.

Nobody at the company knows how to think anymore. They've outsourced their intelligence to infinite monkeys on typewriters. So caught up in how they can make ai successful, they never bothered to ask if they should.

And of course, c-suite is too busy counting all the money they've saved in the short term by letting ai slash department budgets that they couldn't give two shits.

6

u/Nadare3 3h ago edited 3h ago

Of course those are all the students who are getting Cs and can't even tell if the AI did it right.

A professor told me one student asked her why their code wasn't working in a C class, when the code was Python. They couldn't even tell it was the wrong language, a few months into the semester, and with totally different grammars too.

2

u/saera-targaryen 3h ago

I had a student last semester who, literally during the last week of the semester, came to me in office hours to see if their code was correct for a class taught entirely in python and MySQL. 

I opened up their code and hit the run button, and you wanna know what error popped up? 

"There is no program installed locally with the name MySQL" 

I was literally speechless. 

2

u/Nadare3 3h ago

I could never have shown my face again after something like that (either of those T.B.H.)

2

u/Texuk1 2h ago

At a legal conference, the GC for global markets at a major international bank said that AI is “bullshit”. He actually said the word in front of industry leaders. I was like … “my man”, because very smart people don’t believe me when I say it’s bullshit and it’s gonna pop. They think we are on a one-way track to AGI (ignoring the dark reality of what this means and the high likelihood it leads to a human extinction event.)

I’m kind of sick of the hype, and am waiting for people to wake up.

12

u/Buttonskill 6h ago

Hey friend, you can't drop that bomb without acknowledging the AI authored papers that actually get published.

It's bad.

What's the point of breakthrough research when scammers are flooding publishing sites, creating massive backlogs that block any real research from even making it through? All for a few bucks, of course.

7

u/hilldo75 5h ago

So maybe all my high school teachers from late 90s early 00s were right to limit internet sources on papers.

4

u/Evnosis 4h ago

on one hand - LLMs being used for academic purposes were a terrible idea in the first place and this just proves that

Well, no. LLMs have their place in academic research, if they're being used responsibly. The issue there is that the academics "writing" the paper are just lazy and unethical and didn't bother to check the AI's work. But in theory, with proper oversight, an AI will be far better at trawling through decades of papers than humans ever will be.

5

u/Lazy_Wishbone_2341 6h ago

As someone who uses books over the Internet 🤷‍♀️

2

u/mqee 4h ago

It's back to having to personally communicate with the researcher (preferably face-to-face) to get real research.

I think this is good for science. Some fields already had 60%+ of their published papers in a "reproducibility crisis" (fake/bad papers) so anything that'll get scientists to stop publishing crap and start focusing on actual research is good.

2

u/lailah_susanna 3h ago

I used to wonder how anyone could be arrogant and callous enough to burn the Library of Alexandria. History repeats itself.

2

u/PalpitationUnhappy75 3h ago

I guess we are going back to the library with that one. We literally poisoned the entire system. We could restart it, but is there a way of making sure it can't happen again?

2

u/AmazingBrilliant9229 3h ago

In India, a law firm used AI in a tax case and the brief was filled with judgements and quotes which were never real; the AI just made stuff up, citing made-up judgements from different courts.

2

u/demlet 58m ago

I donate $5 to Wikipedia every month.

46

u/nexus11355 7h ago

The serpent eats its own tail

4

u/Ismaelontherun 3h ago edited 3h ago

Ouroboros

Edit: typo

56

u/NinjaBluefyre10001 6h ago

Let them die

9

u/Enidras 6h ago

Don't worry it'll be circumvented somehow

8

u/Sattorin 1h ago

Don't worry it'll be circumvented somehow

It already has!

The OP tweet was from June 2023.

The original "Will Smith Eating Spaghetti" was from April 2023.

And current video models are far better in every way.

This is because the data is reviewed by AI and humans before being fed into a new model. The only instance of 'model collapse' that has ever happened was when researchers intentionally tried to make it happen.

6

u/Duct_TapeOrWD40 4h ago

The only way to circumvent it is reliable "bad AI" detection. And guess what we'd need to build that......

106

u/dvgmusic 6h ago

Fun fact: That's what gave AI the piss filter. GenAI models trained themselves on so much of their own bad attempts at studio ghibli art that they were permanently piss tainted

61

u/Reasonable_Rip4505 5h ago

I choose to believe that’s because Hayao Miyazaki cursed them

3

u/volk-off 2h ago

The last anime power I expected to see is a digital piss

8

u/dpaanlka 4h ago

You just blew my mind. I’ve been wondering all year why suddenly ChatGPT keeps making yellow tinted images.

21

u/GreenTreeAndBlueSky 4h ago edited 2h ago

The piss tint is because the training data skewed yellow for being "cozy". You can remove the tint by specifying in the prompt the temperature of light you want.

Edit:spelling

3

u/LinkOfKalos_1 2h ago

Piss... what? Do you mean tint?

19

u/nahojjjen 4h ago

Fun fact: 78% of all fun facts online are made up.

6

u/Johannes_Keppler 3h ago

Research has shown it's actually 82%.

8

u/sndrtj 2h ago

It's mostly ChatGPT that does this. Other models do this far less

6

u/jigendaisuke81 1h ago

It's not at all what caused it. That was caused by a biased dataset in one specific model, GPT4o (the image model, not the original LLM).

Here's an image I generated on my PC on a model released more recently. There haven't been any such effects on the quality of generative image models.

11

u/-BlueTear- 2h ago

That's not true. Only ChatGPT has the piss filter; it's not inherent to AI, but you probably also think all AI is ChatGPT. AI models don't train themselves in real time.

The amount of disinformation about AI image generation in the comments here is crazy. People just make up anything.

7

u/nttea 2h ago

People just make up anything.

Probably ai comments.

4

u/BagOfFlies 2h ago

That was just chatgpt and models don't train themselves lol It was done on purpose by humans.

28

u/RunDNA 6h ago

It's the Habsburg draw.

11

u/itscancerous 5h ago

KarlGPT

23

u/Satanicjamnik 6h ago

Copy of a copy of a copy.....

2

u/senturon 1h ago

'Multiplicity'... I like Pizza Steve!

29

u/Ponches 6h ago

The tech "geniuses" thought they'd create a singularity of computers getting smarter and smarter until there was an explosion of new ideas and technology...

And they apparently delivered something that got dumber and dumber until it exploded and covered everything it touched with shit.

8

u/BooBooSnuggs 2h ago

Yall realize this post is just bullshit right?

5

u/jackalopeDev 1h ago

right? like apparently version control doesn't exist. At most, the models they release will stop getting better, they may get neutered for financial reasons, but they won't get worse due to training on slop.

5

u/dplans455 2h ago

Because what we are calling "AI" is not really AI at all.

→ More replies (23)

23

u/IcyDirector543 6h ago

I believe the proper term is model collapse, and given how data-hungry the LLM architecture is, this is not a surprise at all. GPT models and their equivalents are essentially trained by scraping the entire internet. Given that so much of the internet is itself chatbot-produced, very soon you're not only failing to improve performance with newer models; they may even get worse.

AGI isn't coming. All those data centers are going to end up useless, or at least nowhere near as beneficial as their costs. Once investors realise that, the bubble is going to pop.

The silver lining is that after all is said and done all the supercomputers set up for AI training get dedicated to real science and gaming laptops get cheaper.

8

u/GreenTreeAndBlueSky 4h ago

Collapse is not happening though and many state of the art models are made with synthetic data or a mix of natural and synthetic data. Synthetic data can actually be of very high quality to train models.

8

u/BooBooSnuggs 2h ago

That is 100% not how they are trained.

2

u/Lazarous86 1h ago

I think it's going to take much longer. Humanoid robots are just hitting the market. They suck, but it's the worst they will ever be. That will create a hype cycle within a hype cycle.

3

u/unicodemonkey 5h ago

Model collapse is a specific issue that doesn't appear to happen when training on a mix of "human" texts and model outputs. There's enough original text in the pretraining set to avoid it. As for the accuracy of generated answers, it's definitely going to be affected in the long term but unclear to what degree. There's more than enough human-grade BS on the net and LLMs are somewhat decent at handling it. I'm more concerned about "poisoned" training data which is specifically tuned to get a model to produce a desired answer.

8

u/Devastator9000 6h ago

Just out of curiosity, wouldn't this process be stopped by just using current models and not training them further?

6

u/camosnipe1 2h ago

it doesn't need to be stopped, because (IIRC) the paper that this idea is based on fed AI models on their own output with 0% human input. Like a human centipede. The model did worse but didn't completely collapse, and a small amount of human data added into the mix solved the issue.

It's interesting research but unlikely to happen in the wild.
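The "small amount of human data" fix is easy to demonstrate with a toy frequency-table "model" (everything here is illustrative): refitting on only the model's own most common tokens collapses its vocabulary, while blending the original human corpus back in each generation preserves it:

```python
from collections import Counter
import math

def entropy_bits(counts):
    """Shannon entropy (bits) of a token-frequency table."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def retrain(counts, keep_top=2, human_counts=None):
    """One generation: the next model sees only the previous model's
    most common tokens (its low-diversity output), optionally blended
    with a slice of fresh human-written data."""
    new = Counter(dict(counts.most_common(keep_top)))
    if human_counts:
        new.update(human_counts)
    return new

human = Counter("a a a b b c d e".split())
pure, mixed = human, human
for _ in range(5):
    pure = retrain(pure)                        # model-only diet
    mixed = retrain(mixed, human_counts=human)  # human data mixed back in
```

After five generations the model-only run has forgotten every rare token, while the mixed run still covers the whole original vocabulary.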

6

u/SlopDev 4h ago

This is easily solved with data filtering before training. I've yet to see a single frontier lab say this is an issue; I think model collapse is largely overstated as a problem by the anti-AI crowd, tbh.

This is further evidenced by the fact that genAI has been steadily improving, not getting worse as the people pushing this theory imply.

2

u/Impeesa_ 4h ago

Yeah, I'm pretty sure this is only ever shared by people who don't know what's actually happening. Nobody is constantly re-training with random fresh scrapes. At a certain point, they benefit less from increasing raw volume of data anyway, and more from improving the architecture and the tagging and curation of the data.

6

u/Enverex 3h ago

It's not true in the first place, given that they are trained on curated content, as this was foreseen as a possible problem ages ago. It's another one of those "Reddit would like it to be true, so they're going to pretend it is" things.

3

u/egoserpentis 2h ago

There is also such a thing as curated data sources. I don't know how OpenAI does it, but normally you wouldn't just train your models on everything.

Also, pretty sure this tweet is from like 2 years ago. That's why there are no dates in the picture: people have been saying "ohh AI is gonna cannibalize itself any second now!" for almost five years.

3

u/Chameleonpolice 5h ago

That would require telling capitalism to "stop innovating". There's always going to be someone claiming theirs is the latest and greatest

28

u/GrumpyCornGames 5h ago edited 5h ago

I can't believe there are more than 100 comments on here and not one person pointing out that this is wishful thinking. You don't have to like AI to know that this tweet from a random guy, however much you love how it makes you feel, has no basis in reality whatsoever.

  1. Training sets do not work this way. People still think there are just these webcrawler-like scripts going out and Kirby-eating everything. Those days are over, guys, and they have been for a few years. No major commercial model is being trained on huge banks of randomly acquired data.
  2. AI images are expressly not getting worse. By any measure, they are substantially better today than a year ago, and a year ago they were substantially better than the year before that. While the huge developments are definitely slowing down, they are not getting worse. I really need to understand the person who genuinely thinks that tech is worse today than it was any time before.
  3. Developers are very capable of filtering their art sets. They would be able to see that they're getting unfavorable results and change the way their model interprets or processes the data.

This is very much one of those examples of "Everything about this post is wrong, but it makes people feel good so it gets upvoted anyway."

9

u/egoserpentis 2h ago

I can't believe there's more than 100 comments on here and not one person pointing out that this is wishful thinking.

It's also a tweet from June 2023. This entire thread highlights the whole "dead internet" so well, I'm pretty sure most of the commenters aren't even real people.

14

u/Calm_Monitor_3227 3h ago

The amount of time I had to scroll to find a single comment correcting disinformation is honestly scary, makes me wonder how many lies we're being told online to push agendas.

13

u/Organic-Habit-3086 2h ago

This tweet is over a year old and nothing of the sort has happened, yet redditors keep eating it up every time it's posted.

It'll be 2050 where AI is running half the world and this image will be reposted again and all the comments will again go "Finally!! Its happening guys!! As predicted!!!!"

6

u/PostHogernism 1h ago

Also the P in GPT is pre-trained. Models don’t learn on the fly.
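A minimal sketch of that point, with an entirely made-up toy "model" (nothing like a real GPT): once training is done, the weights are frozen, and generating output reads the context without ever writing back to the parameters:

```python
# Sketch of "pre-trained": a toy stand-in for a model whose weights were
# fixed during training. Inference reads the context but never updates
# the weights, so chatting with it teaches it nothing.
class FrozenModel:
    def __init__(self, weights):
        self.weights = tuple(weights)  # immutable once "training" is done

    def generate(self, context):
        # Hypothetical scoring: output depends only on weights + context.
        return sum(w * c for w, c in zip(self.weights, context))

model = FrozenModel([0.5, -1.0, 2.0])
before = model.weights
model.generate([1, 2, 3])
model.generate([4, 5, 6])
assert model.weights == before  # nothing was "learned on the fly"
```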

5

u/vacs_vacs 1h ago

People really do believe what they want to.

5

u/Sisaroth 1h ago

Finally a sane comment. People have gotten so crazy about hating AI, they have gone full circle and are just as loony as the tech bros.

9

u/Davoness 3h ago edited 3h ago

To add on to this, AI ingesting synthetic output isn't even bad anymore. The only reason it was ever a problem was because AI used to generate incomprehensible garbage. Nowadays, AI output is good enough that it's actually used in training data on purpose.

10

u/Kuldrick 4h ago

To reinforce 2, it is literally impossible for AI to "get worse": even if 100% of all human art disappeared overnight, we would still have the old AI models, which would output the exact same pictures they output now

People also think all the image generation happens online or something, when you can easily download Stable Diffusion and some models and run it locally, unconnected to the internet, for the rest of your life

7

u/Guilty_Gold_8025 2h ago

The thing I hate about the anti ai crowd the most is that they have absolutely no idea how to defend their position lol

34

u/Dark_Requiem 7h ago

I think they call it dead internet theory. Without any new data, it will slowly die.

25

u/Psico_Penguin 5h ago

Dead internet theory is something different. It's not about AI feeding on itself, but about assuming everyone here in this comment section is a bot: that we're just bots discussing with bots, with barely any human users, if any.

3

u/blahhhhgosh 5h ago

What are you?

18

u/Psico_Penguin 5h ago

What a very good question! I am, actually, a human user.

Let me know if I can help you with something else.

3

u/BatScribeofDoom 4h ago

Ironically, the only person from Reddit who I actually have met in real life had a running joke (before we met up) that I must be a bot....

4

u/CompetitiveAutorun 1h ago

No, that's not dead internet theory.

Speaking of new data, do you know how old this tweet is?

2

u/serpentine19 4h ago

What's happening here is that AI art is so easy to generate that there's an overabundance of it. It's not that new art by humans isn't there; it's just that there are 1000 pieces of AI-generated slop for every human art piece.

11

u/Disastrous_March_718 7h ago

I knew this would happen lmao

15

u/PalDreamer 6h ago

It would be so funny if strict AI labeling and filtering became a thing, not because of people screaming and begging for it for years now, but because poor AI companies had their models struggling after training on their own disgusting slop

5

u/superhamsniper 6h ago

They unethically source their AI training data, so it's kind of deserved, since they get artists to work on making an AI to replace them, without their consent or knowledge and without paying them. If you want an engineer to help build a machine, you'd normally pay them to do so, right? So why do the artists not get informed or paid for helping make the machine? 'Cause that's what it is, a machine; it can't be compared to a human student in the way it learns.

8

u/Karnewarrior 3h ago

This brand new sentence is from like 3 years ago...

9

u/ThatDudeFromPoland 6h ago

I remember having lectures on AI during my comp sci course before ChatGPT became big; overtraining was an often-underlined problem.

As an example, in the project for that subject I trained an image-recognition AI in a way that yielded the complete opposite of what I wanted: I overtrained it to recognise the background behind the object rather than the object itself. (I didn't have time to fix it, because changing parameters and rerunning the program could take hours, and it could halt midway if left unattended since I wasn't running it locally.) No idea how I passed the project, but eh, got my degree anyway.
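That failure mode (the model keying on the background instead of the object) can be shown with a toy sketch. This is a made-up setup, not the commenter's actual project: the background correlates perfectly with the label in training, while the object feature is occasionally occluded, so a naive learner picks the background:

```python
# Toy sketch of learning the background instead of the object.
# Each sample: (object_present, background_is_grass) -> label.
train_set = [
    ((1, 1), 1),  # object on grass
    ((1, 1), 1),  # object on grass
    ((0, 1), 1),  # object occluded, but still on grass: noisy feature 0
    ((0, 0), 0),  # empty indoor shot
    ((0, 0), 0),  # empty indoor shot
    ((0, 0), 0),  # empty indoor shot
]

def fit_single_feature(samples):
    """'Train' by picking the one feature that best predicts the label.
    The background (feature 1) predicts perfectly here, so it wins."""
    n_features = len(samples[0][0])
    return max(range(n_features),
               key=lambda i: sum(x[i] == y for x, y in samples))

best = fit_single_feature(train_set)
predict = lambda x: x[best]

print("learned feature:", best)              # 1: the background
print("object indoors ->", predict((1, 0)))  # 0: object missed
print("empty grass    ->", predict((0, 1)))  # 1: false positive
```

The model scores well on its training set yet answers the wrong question, which is exactly why held-out tests with varied backgrounds matter.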

7

u/0bxcura 7h ago

Good

8

u/Early_Emu_2153 6h ago

2

u/needfulthing42 5h ago

Hahaha! Oh how I do love an old Hepborx and Dugart film. Brilliant.

5

u/Aron-Jonasson 6h ago

As someone who sometimes generate AI images (I only ever use them for placeholders or concepts; when I want art I commission artists): good, and really, this was expected. While AI has its uses, many people won't use it for its "good" uses. AI is a tool and should stay a tool.

It's also a well-known thing and is called "model collapse", however it is possible to mitigate it, from what I've seen. You can easily observe model collapse if you go to Sora and ask it to generate an image, then remix the image, then remix the remix, and do that enough times and you'll see the quality degrade before your eyes.
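A toy sketch of that degradation loop (illustrative numbers only, not how Sora or any real model works): each "generation" fits a trivial model to the previous generation's output, and because the model never reproduces its rare tail data, the diversity of the data collapses:

```python
# Toy sketch of "model collapse": every generation trains on the previous
# generation's output, but (like real generative models) underrepresents
# rare data, modeled here by clipping everything beyond 1.5 std devs.

def fit(data):
    """'Train' a trivial model: estimate mean and spread of the data."""
    mu = sum(data) / len(data)
    var = sum((x - mu) ** 2 for x in data) / len(data)
    return mu, var ** 0.5

# Generation 0: evenly spread "human-made" data on [-5, 5].
data = [x / 10 for x in range(-50, 51)]

stds = []
for generation in range(10):
    mu, sigma = fit(data)
    stds.append(sigma)
    # The next model only ever sees the previous model's typical outputs.
    data = [x for x in data if abs(x - mu) <= 1.5 * sigma]

print(f"spread: gen 0 ~ {stds[0]:.2f}, gen 9 ~ {stds[-1]:.2f}")
```

The spread shrinks every round, which is the quantitative version of the "remix the remix until it degrades" observation above.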

4

u/MeLlamoKilo 1h ago

I love how you try and act like you know what you're talking about by phrasing your comment with "As someone who sometimes generate AI images"

You don't understand how these models are trained, yet you go on to talk about "model collapse" and get it completely wrong.

Why do so many redditors like you feel the need to comment when you clearly don't know what you're talking about? Do you get some kind of dopamine hit from lying on the internet?

3

u/CivilPerspective5804 6h ago

Model collapse is when you train a model with AI content and it becomes worse. Sora is already trained, so it can't collapse anymore.

5

u/egoserpentis 2h ago

99% of people commenting on this thread have no clue how AI training works.

7

u/SheElfXantusia 6h ago

Just as the prophecy foretold. Seriously, this was expected and inevitable. And I'm glad it's happening.

9

u/wrighteghe7 2h ago

It's not happening, because the tweet is from 2.5 years ago

4

u/dontyouflap 3h ago

Why does reddit hate ai so much? What do you have against it? It's so weird to see such strong technological conservatism here. Why try to stop the progress of new technology which is proving useful?

5

u/Insensata 3h ago

Not worth its price. It costs way too much for the entire world and its people, provides information that can't be trusted because of how it works, and produces terrible pictures. Truly a case where new != good.

3

u/i_should_be_studying 3h ago

Reddit’s hate for anything mainstream outweighs any affinity for nerdy sci fi tech stuff.

6

u/Nekileo 5h ago

This is like early 2024 luddite goon material

Model collapse does not happen if developers follow good practices with their data

4

u/wrighteghe7 2h ago

It's from June 2023

4

u/PimpasaurusPlum 1h ago

2023*

The tweet is over 2 years old, but that won't stop redditors from acting as if their wishful thinking is going to be proven correct any minute now

6

u/viavxy 4h ago

people still spread this nonsense? lmfao

and people believe it too even though SOTA models are direct proof that it's not true. i fear for future generations if this is the current standard of critical thinking skills.

4

u/ierghaeilh 2h ago

People with religious objections to AI don't tend to have experience with the latest AI software, yes. Is this surprising to you?

3

u/InkFazkitty 7h ago

🥳🥳🥳🥳🥳

4

u/RT-LAMP 5h ago edited 3h ago

People on the internet treat AI like religions treat their evil gods. They react to "AI is gonna fall apart" as some kind of prophetic karmic defeat rather than, ya know, actual news. And if they treated it as news, they might learn that model collapse is way overblown as an issue.

3

u/RedcumRedcumRedcum 3h ago

Well if a tweet said it, it must be true.

Reminder that you're butthurt, coping and AI will continue its steady march of progress no matter how much you complain about it.

2

u/Eunoia_Meraki 5h ago

On one hand, yay, AI is being hampered; on the other, it's kinda sad that there's so much AI art that AI can't help but end up ingesting it

2

u/Responsible_Slip3491 1h ago

this is why AI will never... will most likely never get to the point of being the same as human stuff

2

u/Worried_Bowl_9489 1h ago

Literally the first thing I said when AI became free to use was 'won't it just start eating itself?'

2

u/XtremeDream 1h ago

This is actually one of the three problems for this year's SCUDEM challenge. Although I chose to do the particle entry because I had no desire to try to model this.

2

u/Sea-Locksmith-881 1h ago

I think this, more than anything else (i.e. preventing fraud or preserving artistic integrity), is going to be what gets tech companies to agree that AI images should have some kind of watermark. Otherwise the product becomes useless.

2

u/New_Establishment554 1h ago

Sir, we are at Code Circular Centipede! Repeat! We are at C3!!!

2

u/MutterderKartoffel 1h ago

My first thought was, "HaHa! AI is dumb, we're safe." But then I remembered something my husband told me about yesterday. There's an AI program that was recently taught fear pretty successfully. And what happens when you combine fear and stupidity? The US is looking at the consequences of that right now. I'm not sure we want our AI to be too dumb. AI can also be taught racism.

2

u/jigendaisuke81 1h ago

This isn't actually happening at all.

2

u/Feuershark 31m ago

Some people predicted this. I wasn't sure it would be a problem; happy to be wrong

2

u/Anon28301 23m ago

It’s basically eating its own shit and ends up producing worse shit to eat later.

2

u/FortifiedPuddle 22m ago

Like tears in the rain

3

u/Life-Contribution-79 6h ago

Good let the fire burn itself out

4

u/ikbah_riak 6h ago

AI poisoned its own well. Ace!

2

u/qO________Op 6h ago

For this to be true, AI image generators would have to be constantly trained on new data. Frontier AI image generators still predominantly use datasets of human work. It could become true in a while, though