Generative AI has made incredible strides in recent years, but its reliance on massive datasets still presents challenges. Large language models (LLMs) such as OpenAI's GPT-3 were trained on enormous corpora; GPT-3's training mix included a filtered version of the Common Crawl dataset, roughly 570 gigabytes of text and about 400 billion tokens. These datasets are extensive but also static, meaning they can't accommodate real-time information or adapt to new events. As a result, AI responses can become outdated or, worse, include hallucinations: plausible-sounding information that is, in reality, inaccurate. Even the best-performing LLMs, like OpenAI's, still grapple with hallucination rates of around 1.5 to 1.9 percent, according to Vectara's Hallucination Leaderboard.
The challenge with using LLMs on their own is twofold: the data can be stale, and the responses can be factually incorrect. However, companies have found ways to mitigate both issues by using data streaming to continually update their datasets and by deploying retrieval-augmented generation (RAG). RAG combines the power of generative AI with real-time, relevant data: it retrieves the pieces of a company's business data that relate to a query and supplies them to the model as context. By leveraging RAG, companies can ensure that AI models respond with more timely and accurate information.
RAG works by encoding a company's documents as vector embeddings and storing them in a dataset that can be searched for semantic matches to a user's query. When a request arrives, the retrieval layer finds the closest matches and adds them to the model's prompt, so the answer is generated from that context. The beauty of RAG lies in its ability to evolve over time: new data can be added to the vector dataset, ensuring that AI responses stay up to date with the latest information. This process allows companies to harness the power of their own business data while reducing the risk of outdated or incorrect AI responses.
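The retrieval step above can be sketched in a few lines. This is a toy illustration, not a production pipeline: the sample documents are invented, and the bag-of-words "embedding" and cosine search stand in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words vector; a real system would use a neural
    # embedding model, which captures meaning rather than word overlap.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical business documents to index.
documents = [
    "Acme's return policy allows refunds within 30 days of purchase.",
    "Acme ships internationally to over 40 countries.",
    "Acme support is available by chat from 9am to 5pm EST.",
]

# "Index" each document alongside its vector; new documents can be
# appended here at any time, which is how the dataset stays current.
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query, k=1):
    # Rank documents by semantic similarity to the query.
    qv = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query):
    # The retrieved matches are added to the prompt, not the response:
    # the model then generates its answer from this supplied context.
    context = "\n".join(retrieve(query))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("What is the return policy?")
```

Here `build_prompt` would hand the assembled prompt to the LLM; the model never searches the dataset itself, it only sees whatever context retrieval placed in front of it.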
However, implementing RAG isn't without its challenges. One of the main obstacles arises with large volumes of documents that contain similar or identical information: when multiple documents share overlapping data, retrieving the single most relevant one becomes harder. RAG can also face difficulties when the answer to a query spans multiple documents that cross-reference each other. This is where traditional RAG falls short, as it has no understanding of the relationships between documents. To overcome these limitations, Microsoft Research has proposed a solution called GraphRAG, which combines the strengths of knowledge graphs and RAG. By using knowledge graphs, which map relationships between different pieces of information, GraphRAG can ensure that the AI system pulls the most accurate, context-aware data, even in complex or cross-referenced scenarios. This hybrid approach promises to enhance the accuracy and reliability of AI-generated responses in real-world applications.
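GraphRAG itself is Microsoft Research's system; the snippet below is only a hand-rolled illustration of the underlying idea, using a tiny hypothetical graph whose edges mark cross-references between documents. Following those edges lets retrieval pull in related documents that plain semantic search would miss, since the answer here is split across two documents that reference each other.

```python
# Hypothetical documents: the refund answer spans two cross-referencing docs.
docs = {
    "policy": "Refunds follow the schedule in the shipping terms.",
    "shipping": "Shipping terms: international orders have a 45-day refund window.",
    "support": "Contact support for escalations.",
}

# Knowledge-graph edges: "policy" cross-references "shipping".
edges = {"policy": ["shipping"], "shipping": [], "support": []}

def expand(seed_ids, hops=1):
    # Start from the documents matched by ordinary retrieval, then walk
    # cross-reference edges so related documents travel together.
    found = set(seed_ids)
    frontier = set(seed_ids)
    for _ in range(hops):
        frontier = {n for f in frontier for n in edges.get(f, [])} - found
        found |= frontier
    return [docs[i] for i in sorted(found)]

# A query about refunds matches "policy"; graph expansion also brings in
# "shipping", which holds the actual refund window.
context = expand(["policy"])
```

A real GraphRAG deployment builds the graph automatically from entities and relationships extracted by an LLM, but the retrieval-time benefit is the same: cross-referenced documents arrive as a connected unit rather than as isolated near-matches.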