AI model failures are not what we expected
Signs of a potential collapse in AI models are starting to emerge as a growing number of AI systems produce subpar results. AI has been touted as a game-changer across many fields, yet recent observations suggest that AI-enabled search engines, such as Perplexity, are failing to deliver accurate and reliable information.
In particular, when users search for precise data such as market-share statistics or financial figures, they are often presented with information from questionable sources. Instead of pulling numbers from highly reliable sources like 10-Ks, the annual financial reports that companies must file with the US Securities and Exchange Commission, the results frequently draw figures from sites that merely claim to summarize business reports. These summaries may look plausible at first glance, but they often fail to reflect the authentic data. The pattern is not exclusive to Perplexity; it extends to other major AI search engines, pointing to a worrying trend in the accuracy of AI-generated information.
This phenomenon is commonly referred to as AI model collapse, a term used in AI circles to describe the gradual decline in the accuracy, diversity, and reliability of AI systems over successive generations of training on AI-generated output. As errors compound and distort data distributions, models become poisoned by their own projections of reality, leading to irreversible defects in performance. A Nature paper published in 2024 highlighted this issue, emphasizing how model collapse undermines the overall reliability of AI systems.
AI model collapse typically stems from three factors. Error accumulation amplifies flaws inherited from previous models, so outputs drift away from the original data patterns. Loss of tail data erases rare events from the training data, blurring entire concepts. And feedback loops reinforce narrow patterns, producing repetitive or biased recommendations.
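To see why these factors matter, consider a toy simulation, a minimal sketch rather than the setup used in the Nature study: each generation fits a simple Gaussian model to data sampled from the previous generation's model instead of fresh human data. Over many generations, the estimated spread tends to shrink and rare, extreme values disappear, a stripped-down version of error accumulation and loss of tail data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "human" data drawn from a standard normal distribution.
data = rng.normal(loc=0.0, scale=1.0, size=50)

for generation in range(101):
    mu, sigma = data.mean(), data.std()
    # "Rare events": samples more than 2 units from the original mean of 0.
    tail_fraction = np.mean(np.abs(data) > 2.0)
    if generation % 20 == 0:
        print(f"gen {generation:3d}: mean={mu:+.2f}  std={sigma:.2f}  "
              f"tail fraction={tail_fraction:.2f}")
    # Each new generation is "trained" only on samples from the fitted model,
    # i.e. on synthetic output rather than fresh human data.
    data = rng.normal(loc=mu, scale=sigma, size=50)
```

In this simplified setting, small estimation errors compound across generations, and the distribution's tails (the rare events) are the first thing to vanish.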
Aquant sums up the phenomenon neatly: when AI is trained on its own outputs, the results can drift further and further from reality. Recent studies, such as Bloomberg Research's analysis of Retrieval-Augmented Generation (RAG), have further underscored the declining quality of AI-generated content. That study examined 11 leading large language models (LLMs) and found they produced problematic results when fed harmful prompts, highlighting how susceptible AI systems are to bad inputs.
While RAG can improve AI-generated content by grounding it in external knowledge sources, such as databases and documents, it also carries significant risks. Even though it reduces hallucinations, RAG-enabled LLMs can still leak private client data, generate misleading market analyses, and produce biased investment advice. As Amanda Stent, Bloomberg's head of AI strategy and research, put it, AI practitioners must exercise caution when building RAG-based gen AI applications to ensure responsible and ethical implementation.
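For readers unfamiliar with the pattern, here is a minimal, generic sketch of RAG, not Bloomberg's system or any particular product. It assumes a toy in-memory corpus and bag-of-words similarity in place of a real vector database and embedding model, and a hypothetical downstream LLM call that is left out.

```python
import math
from collections import Counter

# A toy corpus standing in for an external knowledge source (e.g. filings).
# A real system would index trusted documents in a vector database.
DOCUMENTS = [
    "Acme Corp reported revenue of $4.2B in its 2024 10-K filing.",
    "Acme Corp's widget market share was 23% according to its annual report.",
    "Globex Inc. filed its 10-K with the SEC, reporting revenue of $1.1B.",
]

def tokenize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    # Bag-of-words cosine similarity; real systems use learned embeddings.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    q = tokenize(query)
    ranked = sorted(DOCUMENTS,
                    key=lambda d: cosine_similarity(q, tokenize(d)),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    # Ground the answer in retrieved sources instead of letting the model
    # rely purely on whatever it absorbed (or hallucinated) during training.
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    return (f"Answer using only the sources below, and cite them.\n"
            f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:")

print(build_prompt("What revenue did Acme Corp report in its 10-K?"))
# The assembled prompt would then be passed to an LLM of your choice.
```

The grounding step is where both the promise and the risk live: the model's answer is only as trustworthy as the documents the retriever surfaces.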
The proliferation of fake, AI-generated content raises concerns about the future reliability and utility of the technology. As people rely on AI systems for more and more tasks, the prevalence of inaccurate information and fabricated results threatens to undermine those systems' credibility and effectiveness. Researchers suggest that blending synthetic data with fresh human-generated content may help mitigate the risk of model collapse, but the challenge lies in sourcing enough authentic human-generated content to offset the shortcomings of AI-generated data. As the debate over responsible AI use continues, robust safeguards and reliable content-generation mechanisms become ever more essential to preserving the integrity and accuracy of AI systems.