Deconstructing the RAG flow
Let us now deconstruct the building blocks of a RAG model and see how it functions.
First, we will take a look at the regular LLM application flow. Figure 2.2 illustrates this basic flow.

Figure 2.2 — The basic flow of information in a chat application with an LLM
Here is what happens when a user prompts an LLM:
- User sends a prompt: The process begins with a user sending a prompt to an LLM chat API. This prompt could be a question, an instruction, or any other request for information or content generation.
- LLM API processes the prompt: The LLM chat API receives the user’s prompt and transmits it to an LLM. LLMs are AI models trained on massive amounts of text data, enabling them to generate human-like text in response to a wide range of prompts and questions.
- LLM generates a response: The LLM then processes the prompt and formulates a response, which the chat API sends back to the user, as shown in the sketch after this list.
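
To make this flow concrete, here is a minimal sketch of the three steps in code. It assumes the OpenAI Python SDK and an `OPENAI_API_KEY` environment variable; the model name is illustrative, and any chat-capable LLM API would follow the same pattern.

```python
# A minimal sketch of the prompt -> LLM API -> response flow described above.
# Assumes the OpenAI Python SDK (pip install openai) and an OPENAI_API_KEY
# environment variable; the model name below is an assumption for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Step 1: the user sends a prompt to the chat API
prompt = "Explain retrieval-augmented generation in one sentence."

# Step 2: the chat API transmits the prompt to the LLM
response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model; substitute any chat-capable model
    messages=[{"role": "user", "content": prompt}],
)

# Step 3: the LLM's generated response is sent back to the user
print(response.choices[0].message.content)
```

Notice that the LLM answers using only what it learned during training; nothing in this loop supplies it with external or up-to-date information, which is precisely the gap that RAG addresses.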