
Large Language Models (LLMs) can generate human-comprehensible content (text or figures) based on user queries.
Building an accurate LLM (such as GPT-4, LLaMA, or BLOOM) requires vast amounts of data and computational resources. To overcome these cost and infrastructure burdens, LLM construction is often outsourced to a third party.
Several companies now offer customized LLM-building services, and the prices for such services are decreasing, making LLMs
an affordable tool for automating business processes even for small organizations.
A business owner can leverage a third-party-built LLM by contextualizing it with local data generated from day-to-day business operations. Methods such as fine-tuning [1] and Retrieval Augmented Generation (RAG) [2] can be used for this purpose. In RAG, the local data is embedded and stored in a vector database, and each query to the LLM is embedded into the same vector space. The semantic distance between the query and the stored local data determines which documents are retrieved and supplied to the LLM, which then generates a contextualized response. These contextualization methods allow the LLM to efficiently digest the business owner's local data and answer queries over it.
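The retrieval step at the heart of RAG can be illustrated with a short sketch. The example below is a minimal illustration under stated assumptions, not a production implementation: the `embed` function is a hashed bag-of-words stand-in for a real embedding model (in practice one provided by the LLM vendor), `documents` is hypothetical local business data, and the assembled prompt stands in for what would be sent to the LLM. Only the vector arithmetic, ranking local data by cosine similarity to the query embedding, is meant literally.

```python
import numpy as np

# Stand-in for a real embedding model; a hashed bag-of-words vector
# keeps this sketch self-contained and runnable.
def embed(text: str, dim: int = 256) -> np.ndarray:
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# "Vector database": hypothetical local business documents and their
# precomputed embeddings.
documents = [
    "Invoice 1042 was paid by Acme Corp on 2024-03-01.",
    "Our return policy allows refunds within 30 days of purchase.",
    "The quarterly sales meeting is on the first Monday of the quarter.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents closest to the query in embedding space.

    Embeddings are unit-normalized, so the dot product is the cosine
    similarity -- the semantic distance used to rank the local data.
    """
    scores = doc_vectors @ embed(query)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

def build_prompt(query: str) -> str:
    """Prepend the retrieved local context to the user query; the
    resulting prompt is what would be passed to the LLM."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is the refund window?"))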
Figure 1: Overview of a RAG Agent
We will use the term RAG agent to refer to software that is equipped with (or has access to) an LLM and has the ability
to digest local contextual data to generate a human-comprehensible response. An organization can use RAG agents to automate its business processes. A business process can be modeled using a business....