As the momentous first year of ChatGPT comes to a close, it’s clear that generative AI (genAI) and large language models (LLMs) are exciting technologies. But are they ready for prime-time enterprise use?
ChatGPT has well-understood limitations, chief among them the poor accuracy of some of its responses. Despite being built on sophisticated models such as GPT-4, ChatGPT rarely admits ignorance and instead produces confident but fabricated answers, a phenomenon referred to as AI hallucination, and it often struggles with logical reasoning. Of course, this is because ChatGPT doesn’t reason; it operates like an advanced text auto-complete system.
This can be hard for users to accept. After all, GPT-4 is an impressive system: It can take a simulated bar exam and pass with a score in the top 10% of entrants. The prospect of employing such an intelligent system to interrogate corporate knowledge bases is undoubtedly appealing. But we need to guard against both its overconfidence and its stupidity.
To combat these, three powerful new approaches have emerged that offer a way to enhance reliability. While these approaches may differ in their emphasis, they share a fundamental concept: treating the LLM as a “closed box.” In other words, the focus is not necessarily on perfecting the LLM itself (though AI engineers continue to improve their models considerably) but on developing a fact-checking layer to support it. This layer aims to filter out inaccurate responses and give the system a measure of “common sense.”
Let’s look at each in turn and see how.
A wider search capability
One of these approaches involves the widespread adoption of vector search. This is now a common feature of many databases, including some that are specialized solely for vectors.
A vector database is designed to index unstructured data like text or images, placing each item in a high-dimensional space where it can be searched, retrieved, and compared for closeness to other items. For example, searching for the term “apple” might find information about a fruit, but nearby in the “vector space” there might be results about a technology company or a record label.
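To make the “apple” example concrete, here is a minimal sketch of closeness search in Python. It assumes a tiny in-memory index and hand-written four-dimensional vectors; the document IDs, the vectors, and the `search` helper are all hypothetical. A real system would obtain vectors from an embedding model and store them in a database with a vector index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for illustration only.
# The axes loosely stand for: fruit, technology, music, finance.
INDEX = {
    "fruit": ("Apples are a fruit grown in orchards.", [0.9, 0.1, 0.0, 0.0]),
    "tech_company": ("Apple Inc. designs phones and laptops.", [0.2, 0.9, 0.1, 0.1]),
    "record_label": ("Apple Records released albums by the Beatles.", [0.1, 0.2, 0.9, 0.0]),
    "tax": ("Quarterly tax filing deadlines.", [0.0, 0.1, 0.0, 0.9]),
}

def search(query_vector, k=3):
    """Return the k document IDs closest to the query, best match first."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, (_text, vector) in INDEX.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# A made-up vector standing in for the embedded query "apple".
APPLE_QUERY = [0.8, 0.5, 0.3, 0.0]

for doc_id, score in search(APPLE_QUERY):
    print(f"{score:.2f}  {INDEX[doc_id][0]}")
```

With these invented numbers, the fruit entry ranks first, with the technology company and the record label close behind, while the unrelated tax entry falls outside the top three.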
Vectors are useful glue for AI because we can use them to correlate data points across components like databases and LLMs, rather than using them only as keys into a database for training machine learning models.
From RAGs to riches
Retrieval-augmented generation, or RAG, is a common method for adding context to an interaction with an LLM. Under the bonnet, RAG retrieves supplementary content from a database system to contextualize a response from an LLM. The contextual data can include metadata, such as timestamp, geolocation, reference, and product ID, but could in theory be the results of arbitrarily sophisticated database queries.
This contextual information serves to help the overall system generate relevant and accurate responses. The essence of this approach lies in obtaining the most accurate and up-to-date information available on a given topic in a database, thereby refining the model’s responses. A useful by-product of this approach is that, unlike the opaque inner workings of GPT-4, if RAG forms the foundation for the business LLM, the business user gains more transparent insight into how the system arrived at the presented answer.
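The flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the `Record` type, the sample records, the keyword-overlap `retrieve` function, and the `llm` callable are all hypothetical stand-ins for a real database query and a real model call.

```python
import re
from dataclasses import dataclass

@dataclass
class Record:
    product_id: str   # metadata of the kind mentioned above
    timestamp: str
    text: str

# Hypothetical knowledge base, invented for illustration.
KNOWLEDGE_BASE = [
    Record("X100", "2023-11-01", "The X100 laptop has a warranty period of 24 months."),
    Record("X200", "2023-10-15", "The X200 tablet ships with a 12 month warranty."),
    Record("HR-7", "2023-09-30", "Employees accrue vacation days monthly."),
]

def words(text):
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, records, k=2):
    """Toy retrieval: rank records by word overlap with the question.
    A real system would run a database or vector query here."""
    q = words(question)
    ranked = sorted(records, key=lambda r: len(q & words(r.text)), reverse=True)
    return ranked[:k]

def build_prompt(question, context):
    """Place the retrieved records, with their metadata, ahead of the question."""
    lines = [
        "Answer using only the context below. If the answer is not there, say so.",
        "",
        "Context:",
    ]
    lines += [f"[{r.product_id} @ {r.timestamp}] {r.text}" for r in context]
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)

def answer(question, records, llm):
    """Return the model's reply plus the IDs of the records it was shown."""
    context = retrieve(question, records)
    reply = llm(build_prompt(question, context))
    return reply, [r.product_id for r in context]

# Usage with a stand-in "model" that simply echoes its prompt.
reply, sources = answer(
    "What is the warranty period for the X100 laptop?",
    KNOWLEDGE_BASE,
    llm=lambda prompt: prompt,
)
print(sources)
```

Returning the source IDs alongside the reply is what gives the business user the transparency described above: every answer can be traced to the records that were placed in the prompt.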
If the underlying database has vector capabilities, then the response from the LLM, which includes embedded vectors, can be used to find pertinent data from the database to improve the accuracy of the response.
The power of a knowledge graph
However, even the most advanced vector-powered, RAG-boosted search function would be insufficient to ensure mission-critical reliability of ChatGPT for the business. Vectors, for example, are merely one way of cataloging data, and certainly not the richest of data models.
Instead, knowledge graphs have gained significant traction as the database of choice for RAG. A knowledge graph is a semantically rich web of interconnected information, pulling together information from many dimensions into a single data structure (much like the web has done for humans). Because a knowledge graph holds transparent, curated content, its quality can be assured.
We can tie the LLM and the knowledge graph together using vectors too. But in this case, once the vector is resolved to a node in the knowledge graph, the topology of the graph can be used to perform fact-checking, closeness searches, and general pattern matching to ensure that what’s being returned to the user is accurate.
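As a rough sketch of the fact-checking step, the Python below represents a knowledge graph as a set of (subject, relation, object) triples and checks a claim against the edges around an already-resolved node. The triples, relation names, and the `check_claim` helper are hypothetical; a production system would issue a graph database query instead.

```python
# A tiny illustrative knowledge graph stored as (subject, relation, object) triples.
TRIPLES = {
    ("Apple Inc.", "FOUNDED_BY", "Steve Jobs"),
    ("Apple Inc.", "FOUNDED_BY", "Steve Wozniak"),
    ("Apple Inc.", "HEADQUARTERED_IN", "Cupertino"),
    ("Apple Records", "FOUNDED_BY", "The Beatles"),
    ("Apple Records", "HEADQUARTERED_IN", "London"),
}

def neighbors(node):
    """All outgoing edges of a node, as (relation, object) pairs."""
    return {(rel, obj) for subj, rel, obj in TRIPLES if subj == node}

def check_claim(subject, relation, obj):
    """Compare a claim extracted from an LLM response with the graph.

    Assumption: if the graph holds any value for this subject and relation,
    its list of values is treated as complete, so a different value conflicts.
    """
    known = {o for s, r, o in TRIPLES if s == subject and r == relation}
    if not known:
        return "unknown"      # the graph has nothing to say about this claim
    if obj in known:
        return "supported"    # the claim matches an edge in the graph
    return "conflicts"        # the graph records a different value

# Suppose the LLM's answer was resolved to the node "Apple Records".
print(check_claim("Apple Records", "HEADQUARTERED_IN", "London"))
print(check_claim("Apple Records", "HEADQUARTERED_IN", "Cupertino"))
print(check_claim("Apple Inc.", "CEO", "Tim Cook"))
```

The three calls return "supported", "conflicts", and "unknown" respectively. A claim that conflicts with the graph can be filtered out or corrected before it reaches the user, while an unknown claim can be flagged as unverified.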
Getting the context back in
We have seen that knowledge graphs enhance GPT systems by providing more context and structure through RAG. Evidence is also mounting that by combining vector-based semantic search with graph-based semantic search (that is, knowledge graphs), organizations can achieve consistently high-accuracy results.