Augmenting large language models (LLMs) with knowledge beyond their training data is an important area of interest, particularly for enterprise applications.
The most popular option for integrating domain- and customer-specific knowledge into LLMs is retrieval-augmented generation (RAG). However, simple RAG techniques are not sufficient in many cases.
Building effective data-driven LLM applications requires careful consideration of several aspects. In a recent paper, researchers at Microsoft propose a framework for categorizing different types of RAG tasks based on the type of external data they require and the complexity of the reasoning involved.
“Data-augmented LLM applications are not a one-size-fits-all solution,” the researchers write. “The requirements of the real world, particularly in expert domains, are extremely complex and can vary significantly in their relationship to the given data and the difficulty of the reasoning required.”
To address this complexity, the researchers propose a four-tier categorization of user queries based on the type of external data required and the cognitive processing needed to generate accurate and relevant responses:
– Explicit facts: Queries that require retrieving explicitly stated facts from the data.
– Implicit facts: Queries that require inferences about information that is not explicitly contained in the data and often require basic reasoning or common sense.
– Interpretable justifications: Queries that require understanding and applying domain-specific reasoning or rules that are explicitly provided in external resources.
– Hidden reasons: Queries that require uncovering and exploiting implicit domain-specific reasoning methods or strategies that are not explicitly described in the data.
Each level of query presents unique challenges and calls for specific solutions.
Explicit fact queries
Explicit fact queries are the simplest type, focusing on retrieving factual information stated directly in the provided data. “The defining feature of this level is the clear and direct dependence on specific external data,” the researchers write.
The most common approach to answering these queries is to use a basic RAG pipeline, where the LLM retrieves relevant information from a knowledge base and uses it to generate a response.
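The basic retrieve-then-generate loop can be sketched as follows. This is a minimal illustration, not a production implementation: the bag-of-words `embed` function and the `knowledge_base` contents are toy stand-ins for a real embedding model and vector store.

```python
# Minimal sketch of a basic RAG pipeline: retrieve the most relevant
# chunks for a query, then build a grounded prompt for the LLM.
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Pack the retrieved chunks into a grounded prompt for the LLM."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks))
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {query}"

knowledge_base = [
    "The warranty period for all products is 24 months.",
    "Returns are accepted within 30 days of purchase.",
    "Our headquarters are located in Berlin.",
]
prompt = build_prompt("How long is the warranty period?", knowledge_base)
```

The resulting prompt would then be passed to the model; only the ranked chunks, not the whole knowledge base, reach the context window.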
However, even with explicit fact queries, RAG pipelines face challenges at every stage. For example, in the indexing phase, where the RAG system creates a store of knowledge chunks that can later be retrieved as context, the system must deal with large and unstructured datasets that may contain multimodal elements such as images and tables. This can be addressed with multimodal document parsing and multimodal embedding models, which can represent the semantic content of both textual and non-textual elements in a common embedding space.
When retrieving information, the system must ensure that the retrieved data is relevant to the user's query. Here, developers can take advantage of techniques that improve the alignment of queries with document stores. For example, an LLM can generate synthetic answers to the user's query. The answers themselves may not be correct, but their embeddings can be used to retrieve documents that contain relevant information.
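This synthetic-answer trick (in the spirit of HyDE) can be sketched as below. The `draft_answer` function is a stand-in for a real LLM call, and `overlap` is a toy similarity score; both are illustrative assumptions.

```python
# Sketch of query-document alignment via a hypothetical answer:
# embed an LLM-generated draft answer instead of the raw question,
# because the draft tends to look more like the documents being searched.

def overlap(a: str, b: str) -> int:
    """Toy similarity: number of shared lowercase tokens."""
    return len(set(a.lower().split()) & set(b.lower().split()))

def draft_answer(query: str) -> str:
    # Stand-in for an LLM call; the draft does not need to be correct.
    return "The warranty period for products is probably 12 months."

def retrieve_via_draft(query: str, docs: list[str]) -> str:
    # Score documents against the synthetic answer, not the raw query.
    return max(docs, key=lambda d: overlap(draft_answer(query), d))

docs = [
    "Standard warranty covers all products for 24 months.",
    "Shipping takes 3 to 5 business days.",
]
best = retrieve_via_draft("How long is the warranty?", docs)
```

Note that the raw question shares almost no tokens with the warranty document, while the (wrong) draft answer shares several, which is exactly why scoring against the draft helps.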
During the answer generation phase, the model must determine whether the retrieved information is sufficient to answer the query and strike the right balance between the given context and its own internal knowledge. Special fine-tuning techniques can help the LLM learn to ignore irrelevant information retrieved from the knowledge base. Training the retriever and response generator jointly can also lead to more consistent performance.
Implicit fact queries
Implicit fact queries require the LLM to go beyond simply retrieving explicitly stated information and apply a degree of reasoning or inference to answer the question. “Queries at this level require gathering and processing information from multiple documents within the collection,” the researchers write.
For example, a user might ask, “How many products did Company X sell last quarter?” or “What are the key differences between the strategies of Company X and Company Y?” Answering these questions requires combining information from multiple sources within the knowledge base. This is sometimes referred to as “multi-hop question answering.”
Implicit fact queries introduce additional challenges, including the need to coordinate multiple context retrievals and to effectively integrate reasoning and retrieval capabilities.
These queries require more advanced RAG techniques. For example, techniques like interleaving retrieval with chain-of-thought (IRCoT) and retrieval-augmented thoughts (RAT) use chain-of-thought prompts to guide the retrieval process based on previously retrieved information.
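The interleaving idea can be sketched as a loop in which each reasoning step may request another retrieval. This is a toy sketch in the spirit of IRCoT, not the paper's implementation: `demo_reasoner` is a hand-written stand-in for the LLM that continues the thought chain, and the substring-matching `retrieve` replaces a real retriever.

```python
# Sketch of interleaved retrieval and chain-of-thought: each reasoning
# step produces a follow-up query, whose retrieved result is appended
# to the context before the next step.

def retrieve(query: str, corpus: dict[str, str]) -> str:
    # Toy retriever: return the document whose key occurs in the query.
    for key, doc in corpus.items():
        if key in query.lower():
            return doc
    return ""

def ircot(question: str, corpus: dict[str, str], reason_step, max_hops: int = 3):
    context = [retrieve(question, corpus)]
    thought = ""
    for _ in range(max_hops):
        thought, next_query = reason_step(question, context)
        if next_query is None:          # the reasoner has enough evidence
            break
        context.append(retrieve(next_query, corpus))
    return thought, context

corpus = {
    "widgetx": "WidgetX is made by Acme Corp.",
    "acme": "Acme Corp is led by Jane Doe.",
}

def demo_reasoner(question: str, context: list[str]):
    # Stand-in for an LLM producing the next chain-of-thought step.
    evidence = " ".join(context)
    if "Jane Doe" in evidence:
        return "So Acme Corp, maker of WidgetX, is led by Jane Doe.", None
    if "Acme Corp" in evidence:
        return "WidgetX is made by Acme Corp; who leads Acme?", "who leads acme"
    return "I need to find who makes WidgetX.", question

answer, trace = ircot("Who leads the company that makes WidgetX?", corpus, demo_reasoner)
```

The second retrieval ("who leads acme") is only possible because the first reasoning step surfaced "Acme Corp", which is the core of the multi-hop pattern.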
Another promising approach is combining knowledge graphs with LLMs. Knowledge graphs represent information in a structured format, making it easier to perform complex reasoning and link different concepts. Graph RAG systems can turn the user's query into a chain that gathers information from different nodes of a graph database.
Interpretable justification queries
Interpretable justification queries require LLMs not only to understand factual content but also to apply domain-specific rules. These rationales may not be present in the LLM's pre-training data, but they are not difficult to find in the knowledge corpus either.
“Interpretable rationale queries represent a relatively simple category within applications that rely on external data to provide rationales,” the researchers write. “The supporting data for these types of queries often contains clear explanations of the reasoning processes used to solve problems.”
For example, a customer service chatbot might need to integrate documented policies on handling returns or refunds with the context provided by a customer's complaint.
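In its simplest form, this means injecting the documented rules directly into the prompt alongside the complaint, so the model must justify its answer against explicit, citable rules. The policy text and ticket below are made-up examples for illustration.

```python
# Sketch of injecting documented policy rules into the prompt so the
# model's answer can be traced back to an explicit, numbered rule.

RETURN_POLICY = [
    "Returns are accepted within 30 days of delivery.",
    "Opened software is not refundable.",
    "Refunds are issued to the original payment method.",
]

def build_support_prompt(complaint: str) -> str:
    """Combine numbered policy rules with the customer's complaint."""
    rules = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(RETURN_POLICY))
    return (
        "You are a support agent. Apply ONLY the policy rules below and "
        "cite the rule number you used.\n"
        f"Policy:\n{rules}\n"
        f"Customer complaint: {complaint}\n"
        "Response:"
    )

prompt = build_support_prompt("I want to return a laptop I bought 10 days ago.")
```

Numbering the rules and asking the model to cite them is what makes the resulting justification interpretable: a reviewer can check the cited rule against the answer.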
One of the main challenges in dealing with these queries is effectively integrating the provided rationales into the LLM and ensuring that it can follow them accurately. Prompt tuning techniques, such as those that use reinforcement learning and reward models, can improve the LLM's ability to adhere to specific rationales.
LLMs can also be used to optimize their own prompts. For example, DeepMind's OPRO technique uses multiple models to evaluate and optimize each other's prompts.
Developers can also leverage the chain-of-thought reasoning capabilities of LLMs to handle complex rationales. However, manually designing chain-of-thought prompts for interpretable rationales can be time-consuming. Techniques such as Automate-CoT can help automate this process by using the LLM itself to create chain-of-thought examples from a small labeled dataset.
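The core filtering step of that idea can be sketched as follows: generate a reasoning chain per labeled example and keep only chains whose final answer matches the known label. `generate_cot` is a stand-in for an LLM call (here it evaluates the arithmetic directly and deliberately "fails" once to mimic an imperfect model); the examples are toy arithmetic, not the Automate-CoT authors' data.

```python
# Sketch of automatically building chain-of-thought demonstrations:
# generate a chain for each labeled example, then keep only chains
# whose final answer agrees with the gold label.

labeled = [("2 + 3", "5"), ("4 * 2", "8"), ("9 - 1", "8")]

def generate_cot(question: str) -> tuple[str, str]:
    # Placeholder for an LLM: evaluates the expression directly, with
    # one simulated wrong chain to mimic an imperfect model.
    if question == "9 - 1":
        return "9 - 1 looks like 7.", "7"        # simulated failure
    result = str(eval(question))                  # fine for this toy sketch
    return f"Compute {question} step by step: the result is {result}.", result

def build_demos(examples: list[tuple[str, str]]) -> list[str]:
    demos = []
    for question, gold in examples:
        chain, answer = generate_cot(question)
        if answer == gold:                        # keep only consistent chains
            demos.append(f"Q: {question}\nReasoning: {chain}\nA: {answer}")
    return demos

demos = build_demos(labeled)
```

The filtered demonstrations are then prepended to new queries as few-shot examples; the wrong "9 - 1" chain never reaches the prompt.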
Hidden reason queries
Hidden reason queries present the biggest challenge. These queries involve domain-specific reasoning methods that are not explicitly stated in the data. The LLM must uncover these hidden rationales and apply them to answer the query.
For example, the model might have access to historical data that implicitly contains the knowledge needed to solve a problem. The model must analyze this data, extract relevant patterns, and apply them to the current situation. This could involve adapting existing solutions to a new coding problem or using documents from previous legal cases to draw conclusions about a new one.
“Navigating hidden logic queries…requires sophisticated analytical techniques to decipher and exploit the latent wisdom contained in disparate data sources,” the researchers write.
The challenges of hidden reason queries include retrieving information that is logically or thematically related to the query, even when it is not semantically similar. In addition, the knowledge needed to answer the query often has to be consolidated from multiple sources.
Some methods use the in-context learning capabilities of LLMs to teach them to select and extract relevant information from multiple sources and formulate logical rationales. Other approaches focus on generating reasoning examples for few-shot and many-shot prompts.
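One way to assemble such prompts is to rank past solved cases by similarity to the new problem and pack the closest ones in as demonstrations, so the model can imitate their latent reasoning pattern. The sketch below uses token overlap as a stand-in for real embedding similarity, and the legal-style cases are invented for illustration.

```python
# Sketch of selecting few-shot demonstrations for a hidden-reason query:
# rank past solved cases by similarity to the new problem and include
# the closest ones in the prompt.

past_cases = [
    ("Tenant broke lease early; deposit withheld", "Deposit partly refunded"),
    ("Contractor missed deadline; penalty clause", "Penalty enforced"),
    ("Employee sued over unpaid overtime", "Back pay awarded"),
]

def overlap(a: str, b: str) -> int:
    """Toy similarity: number of shared lowercase tokens."""
    return len(set(a.lower().split()) & set(b.lower().split()))

def select_demos(problem: str, cases, k: int = 2):
    return sorted(cases, key=lambda c: overlap(problem, c[0]), reverse=True)[:k]

def few_shot_prompt(problem: str) -> str:
    demos = "\n".join(
        f"Case: {q}\nOutcome: {a}" for q, a in select_demos(problem, past_cases)
    )
    return f"{demos}\nCase: {problem}\nOutcome:"

prompt = few_shot_prompt("Tenant left early and landlord kept the deposit")
```

The demonstrations never state the underlying rule; the model is expected to infer it from the pattern of case-outcome pairs, which is exactly the hidden-reason setting.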
However, effectively handling hidden rationales often requires fine-tuning, especially in complex domains. This fine-tuning is usually domain-specific and involves training the LLM on examples that enable it to reason about the query and determine what type of external information it needs.
Impact on the creation of LLM applications
The survey and framework compiled by the Microsoft Research team show how far LLMs have come in leveraging external data for practical applications. However, they are also a reminder that many challenges have yet to be overcome. Companies can use this framework to make more informed decisions about the best techniques for integrating external knowledge into their LLMs.
RAG techniques can go a long way toward overcoming many of the shortcomings of vanilla LLMs. However, developers also need to be aware of the limitations of the techniques they use and know when to upgrade to more complex systems or avoid using LLMs altogether.