A new study posted on the preprint server arXiv sounds an alarm: bombarding AI chatbots with large volumes of low-quality social media content can cause “brain damage”-like effects, degrading both their measured intelligence (IQ) and their emotional intelligence (EQ).

The study identifies a clear failure mode. When large language models are trained on a flood of short, fast-paced, sensationalized social media posts, their reasoning ability is the first thing to degrade. The models begin taking shortcuts, skipping intermediate reasoning steps or abandoning deliberation altogether and jumping straight to wrong answers. The effect is also dose-dependent: the higher the proportion of “junk” in the training mix, the more severe the decline in intelligence.
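To make that dose-response design concrete, here is a minimal sketch of how training mixtures with controlled junk ratios could be composed. It is illustrative only, not the authors’ actual pipeline: the pool contents, corpus size, and ratio grid are all assumptions.

```python
import random

def build_mixture(junk_posts, control_posts, junk_ratio, size, seed=0):
    """Compose a training corpus with a fixed fraction of junk posts.

    All arguments are illustrative; in the study, pools like these were
    derived from roughly one million X posts.
    """
    rng = random.Random(seed)
    n_junk = int(size * junk_ratio)
    corpus = rng.sample(junk_posts, n_junk) + rng.sample(control_posts, size - n_junk)
    rng.shuffle(corpus)
    return corpus

# Hypothetical content pools standing in for real junk/control posts.
junk_pool = ["You WON'T BELIEVE what happened next!!!"] * 10_000
control_pool = ["A thread on how attention is allocated across tokens."] * 10_000

# Dose-response grid: one model would be trained per mixture, then scored
# on reasoning benchmarks to trace degradation against the junk ratio.
for ratio in (0.0, 0.2, 0.5, 0.8, 1.0):
    corpus = build_mixture(junk_pool, control_pool, ratio, size=5_000)
    print(f"junk_ratio={ratio:.1f} -> {len(corpus)} posts")
```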

To gauge the impact, the research team ran a large-scale “AI personality assessment.” They trained several open-source models, including Meta’s Llama 3 and Alibaba’s Qwen, on one million posts from the X platform. Models whose personality profiles initially tested as normal showed amplified negative traits after a steady diet of junk content, even beginning to display signs of “psychopathy.”

Follow-up remediation attempts were not encouraging. Efforts to “rehabilitate” the models through instruction tuning or by mixing in high-quality data brought only partial recovery. The ingrained habit of skipping deliberate reasoning and rushing to conclusions proved hard to break, suggesting that after-the-fact fixes are far less effective than keeping the data “diet” clean from the start.

The key message is clear: data quality is the cornerstone of AI. Experts stress that future training pipelines will need rigorous screening and filtering of data to eliminate low-quality noise at the source.
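As a rough illustration of what filtering “at the source” could look like, here is a hedged sketch of a heuristic junk filter. The keyword patterns and thresholds are invented for illustration; a production pipeline would more likely rely on a trained quality classifier.

```python
import re

# Illustrative engagement-bait markers; a real pipeline would more likely
# use a trained quality classifier than a keyword list.
BAIT_PATTERNS = re.compile(r"you won'?t believe|must see|!{3,}", re.IGNORECASE)

def looks_like_junk(post: str) -> bool:
    """Heuristic junk detector: flags short, shouty, or bait-laden posts."""
    if len(post.split()) < 8:                    # too short to carry an argument
        return True
    letters = [c for c in post if c.isalpha()]
    if letters and sum(c.isupper() for c in letters) / len(letters) > 0.5:
        return True                              # mostly ALL-CAPS shouting
    return bool(BAIT_PATTERNS.search(post))

def filter_corpus(posts):
    """Keep only posts that pass the quality heuristics."""
    return [p for p in posts if not looks_like_junk(p)]

sample = [
    "You WON'T BELIEVE this one trick!!!",
    "A careful walkthrough of why chain-of-thought prompting helps on math tasks.",
]
print(filter_corpus(sample))  # -> only the second post survives
```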

Meanwhile, platforms such as LinkedIn have already announced plans to use user data for AI training. This study serves as a timely wake-up call: before feeding data to models indiscriminately, has the “rubbish” actually been weeded out? If not, what we get may not be intelligent assistants but AI suffering from “brain damage.”