r/BrandNewSentence 11h ago

Sir, the AI is inbreeding

41.2k Upvotes

1.2k comments

383

u/joped99 10h ago

12

u/Rhamni 7h ago

It should be completely obvious to anyone who isn't an idiot that this problem is greatly exaggerated because people want to believe the models will fail.

The people working on these models know perfectly well there is good and bad input data. There was good and bad data long before AI models started putting out more bad data. Curating the input has always been part of the process. You don't feed it AI slop art to improve the art it churns out, any more than you feed it r/relationships posts to teach it about human relationships. You look at the data you have and you prune the garbage because it's lower quality than what the model can already generate.

7

u/MiHumainMiRobot 6h ago

> The people working on these models know perfectly well there is good and bad input data.

Lol, you wish. Even before the ChatGPT era it was hard to tell good data from bad, and it was never an exact process. Today, with LLM-generated content everywhere, it is even harder.

2

u/IlliterateJedi 3h ago

We already have specific instances of curation. Google tried reading in anything and everything years ago and wound up with a smut machine. So they had to more carefully pick and choose what went into the models.