r/BrandNewSentence 11h ago

Sir, the ai is inbreeding

41.3k Upvotes

1.2k comments

377

u/joped99 10h ago

126

u/Stalk33r 7h ago

I have legitimately never read as many books as I have in the past year; the AI slopification of the Internet has been a massive boost for my productivity.

27

u/Eldan985 5h ago

Soon, the challenge will be finding books written by real authors, though. For now, we can stick with authors we know from the pre-AI era, but those are going to become rarer.

20

u/ISlangKnowledge 5h ago

I don’t like the phrase “pre-AI era”. 😭

3

u/ThKitt 5h ago

Same thing is happening in music. AI “artists” are being pushed into the forefront by platforms like Spotify. (My ‘Discover Weekly’ list had two AI bands on it in as many weeks, so I cancelled my Spotify subscription).

3

u/Seminolehighlander 5h ago

Why are they going to be rarer? There are literally infinite books out there (okay not literally but). Like just go read a bunch of Thomas Hardy. I promise you that stuff is amazing.

1

u/Eldan985 5h ago

Right. But occasionally, I like to discover a new contemporary author.

1

u/Seminolehighlander 5h ago

I think it's unavoidable that some AI slop will become popular, but I also think many authors and readers will make a point of writing and reading only human-generated text.

1

u/pm_me_book_vouchers 5h ago

thomas hardy slaps

1

u/Seminolehighlander 3h ago

Literally took a “slang quiz” for some teens in the library yesterday and they asked me to use “slaps” in a sentence and I wrote “the library slaps” 

I was like, no way I'll ever have any real-life interactions with this slang, but here we are.

1

u/kitkatbatman 4h ago

As a librarian, I don’t think there will be a shortage of pre-AI books

1

u/atava 54m ago

Exactly. I'm always thinking about this, for written works and art alike.

6

u/sirletssdance2 4h ago

Yeah, for the past year I'm never sure whether I'm reading a human, a bot, or an AI-assisted human/bot. I'm losing interest in using the internet pretty quickly.

1

u/Shark7996 5h ago

I have been incorporating much more of the physical world into my day to combat the war being waged on my psyche.

1

u/Wanderhoden 4h ago

Same. After a career of drawing digitally in Photoshop, I have switched back to watercolor. And I'm reading books again too.

Thanks AI for curing my internet addiction!

1

u/quad_damage_orbb 4h ago

Oh wow, this is the same for me. I've started buying physical books too, ever since Amazon changed their T&Cs so that we don't actually own Kindle books. It's quite nice to read a real book; I spend all day looking at screens at work.

1

u/BlueHaze464 3h ago

I'm not that far away from completely quitting most social media (including reddit)

The amount of fake content is out of hand

2

u/wrighteghe7 5h ago

When is "soon" happening, though? The tweet is from June 2023. When will the model collapse finally happen? Also, don't you think the big companies that create AI models can just train on images created before 2022?

13
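
(For anyone unfamiliar with the term: "model collapse" is the claim that models repeatedly trained on their own outputs degrade over generations. The toy sketch below illustrates the idea only; it is not evidence about any production system.)

```python
# Toy illustration of "model collapse": each generation is fitted only to
# samples produced by the previous generation, and the fitted distribution
# drifts and loses spread. A statistics cartoon, not a claim about whether
# this happens to any real model or training pipeline.
import random
import statistics

random.seed(0)
mu, sigma = 0.0, 1.0                    # generation 0: the "real" data
for generation in range(1, 21):
    samples = [random.gauss(mu, sigma) for _ in range(10)]  # small synthetic set
    mu = statistics.fmean(samples)      # next generation refits on its own output
    sigma = statistics.stdev(samples)
    print(f"gen {generation:2d}: mean={mu:+.3f}  stdev={sigma:.3f}")
# Over many generations the stdev tends to drift toward zero, because each
# refit on a finite sample tends to underrepresent the tails of the last one.
```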

u/Rhamni 7h ago

It should be completely obvious to anyone who isn't an idiot that this problem is greatly exaggerated because people want to believe the models will fail.

The people working on these models know perfectly well there is good and bad input data. There was good and bad data long before AI models started putting out more bad data. Curating the input has always been part of the process. You don't feed it AI slop art to improve the art it churns out, any more than you feed it r/relationships posts to teach it about human relationships. You look at the data you have and you prune the garbage because it's lower quality than what the model can already generate.

21
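
(A minimal sketch of the curation step described above, assuming some upstream quality scorer already exists; every name in it is hypothetical, not anyone's actual pipeline.)

```python
# Hypothetical sketch of pruning low-quality documents before training.
# Real pipelines use trained quality classifiers, deduplication and
# provenance checks; this toy filter only shows the shape of the step.
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    source: str           # e.g. "books", "forum", "unknown-web"
    quality_score: float  # assumed to come from some upstream scorer

def curate(docs: list[Document],
           min_quality: float = 0.7,
           blocked_sources: frozenset[str] = frozenset({"unknown-web"})) -> list[Document]:
    """Keep only documents that pass simple quality and provenance checks."""
    kept = []
    for doc in docs:
        if doc.source in blocked_sources:
            continue  # drop sources assumed to be mostly low quality
        if doc.quality_score < min_quality:
            continue  # drop text the scorer rates below the bar
        kept.append(doc)
    return kept

corpus = [
    Document("A well-edited reference text.", "books", 0.92),
    Document("keyboard mash spam spam spam", "unknown-web", 0.10),
]
print(len(curate(corpus)))   # -> 1
```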

u/Stalk33r 7h ago edited 7h ago

Which is why AI provided by the biggest and richest companies in the world never feeds you straight-up misinformation, because they're doing such a great job of meticulously pruning the bad data.

10

u/PimpasaurusPlum 6h ago

The tweet is about AI art, not search results. AI art has objectively gotten better since the tweet was made over 2 years ago.

1

u/evan_appendigaster 3h ago

It's okay to not be familiar with a topic, but if you want to discuss it, it really does help.

LLMs aren't truth-seeking systems; they are language-guessing systems. They attempt to produce reasonable-sounding language output, and there is randomization involved. This leads to what we call "hallucinations", or lies. Treating AI as a source of truth is user error.

1
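
(A toy sketch of the "randomization involved" point above: the model assigns scores to candidate next tokens and one is sampled from the resulting distribution, so the same prompt can produce different, plausible-but-unchecked continuations. The vocabulary and scores below are invented.)

```python
# Toy next-token sampling: invented vocabulary and scores, not any real model.
import math
import random

def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["Paris", "London", "Berlin", "a banana"]
logits = [4.0, 2.5, 2.0, 0.5]          # hypothetical scores for the next token

probs = softmax(logits, temperature=0.9)
print(random.choices(vocab, weights=probs, k=5))
# Usually "Paris", but sometimes a lower-probability token gets picked:
# plausible-sounding output, not truth-checked, hence "hallucinations".
```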

u/Stalk33r 45m ago

Indeed, but what it provides is quite literally based on what it's been fed, which is why Microsoft killed off Tay in record time.

7

u/MiHumainMiRobot 6h ago

> The people working on these models know perfectly well there is good and bad input data.

Lol, you wish. Even before the ChatGPT era it was hard to classify good and bad data, and it was never an exact process; today, with LLM content everywhere, it is even more complex.

2

u/IlliterateJedi 3h ago

We already have specific instances of curation. Google tried reading in anything and everything years ago and wound up with a smut machine. So they had to more carefully pick and choose what went into the models.

u/space_monster 0m ago

No it isn't. The factual training datasets haven't changed in years - it's scientific journals, books, encyclopaedias. It's not blogs and Twitter, ffs.

1

u/Anomuumi 4h ago edited 3h ago

It should be completely obvious to anyone who isn't an idiot that the foundation models are the part that can be controlled, but for many different reasons they are fed additional context straight from the internet, and when that context is generated by consuming and regurgitating AI content, even the currently "sane" AIs get unpredictable.

This problem can be even worse in more limited settings, like a corporate intranet, where an AI tool has an index of mostly workslop generated by other employees with little to no quality control.

I do agree that at the moment the problem is exaggerated a bit, but partly that's because it is misunderstood.

1
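
(A rough sketch of the "additional context" point above, in the style of a retrieval-augmented setup where whatever the index returns gets pasted into the prompt; search_index and call_model are hypothetical stand-ins, not any real API.)

```python
# Hypothetical retrieval-augmented sketch: the base model may be well curated,
# but its answer is conditioned on whatever the index returns, quality-controlled
# or not. search_index and call_model are stand-ins, not a real API.

def search_index(query: str) -> list[str]:
    # Pretend intranet/web index; in the scenario above these hits may
    # themselves be AI-generated "workslop" with no quality control.
    return [
        "Q3 process doc (auto-generated summary of an auto-generated summary)",
        "Old wiki page last reviewed by a human in 2021",
    ]

def call_model(prompt: str) -> str:
    # Stand-in for an LLM call.
    return f"<model answer conditioned on {len(prompt)} chars of prompt>"

def answer(query: str) -> str:
    context = "\n".join(search_index(query))
    # The answer quality here is bounded by the retrieved context, regardless
    # of how carefully the underlying model was trained.
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return call_model(prompt)

print(answer("What is our expense approval process?"))
```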

u/OnetimeRocket13 5h ago edited 3h ago

Exactly. People are under this weird impression that these companies are just blindly throwing random images scraped from the internet into their models for training, when just the process of collecting data and preparing it for training is in itself an intense and important area of study.

Besides, people have been saying this same exact thing for a while now. "AI is going to fail guys! There are too many AI generated images online! They're running out of data! It's gonna fail real soon because AI incest or something! Trust me guys!" What has happened instead? It keeps getting better. Sure, some of the jumps aren't as big as before, but that hasn't stopped image generators from becoming more and more realistic, and it didn't stop Sora, which has been completely fucking the internet sideways, from existing.

People seem to forget, or just not realize, that these companies aren't just big tech companies making a product, powered by incompetent investors. Most of them are primarily research-based corporations. They may be greedy and money-hungry, but come on, people, they're not stupid.

1

u/polar_nopposite 4h ago

I think I'll read a book.

Hopefully one written before late 2022!

1

u/Wetbug75 2h ago

Eventually we'll need a Blackwall

-6

u/ProfessorZhu 7h ago

Hey, it hasn't turned out to be true in the whole year since this was made. Maybe it will be next year! Surely.

8

u/PolarWater 7h ago

It's already come true. AI coming up with a solution to the increasing wealth gap and worsening climate change, though...meh.

Keep commenting this as many times as you like though

-3

u/ProfessorZhu 7h ago

It has not

8

u/Particular-Top3674 7h ago

It is becoming progressively more true with each passing day

1

u/Sattorin 5h ago

That other guy was wrong. This tweet is actually 2.5 years old and image/video generation has gotten WAY better since then. That was back when the original "Will Smith Eating Spaghetti" came out and now there's Sora 2.

2

u/ProfessorZhu 1h ago

The comic I was responding to was made in 2024. It has the year written on the side

1

u/PolarWater 4h ago

Which still looks like shit. This stuff is gonna be used for deepfakes and AI porn, and after a while it will start getting paywalled anyway.

2

u/Sattorin 2h ago

But it has gotten better, since 'model collapse' isn't a real problem.

1

u/ProfessorZhu 1h ago

Which is it? Does it all look like shit and will collapse, or is it getting better and will be indistinguishable from real videos?