Spring AI RAG and OpenAI: ChatBot for Uploaded Files

Learn to build a fully functional chatbot application (with UI) with RAG (Retrieval-Augmented Generation) capabilities using Spring AI and Spring Web. The demo application uses the following tech stack: Spring Boot, Spring Web, Thymeleaf, Spring Security, the Spring AI OpenAI starter, the Tika document reader, and the in-memory SimpleVectorStore.

1. Prerequisites

Before starting the demo, you must have an OpenAI API key, which authenticates your requests to the OpenAI models: text-embedding-ada-002 (for vector embeddings) and gpt-3.5-turbo (for chat support).

Add the API key and model names in the application properties.

spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.chat.options.model=gpt-3.5-turbo

spring.ai.openai.embedding.enabled=true
spring.ai.openai.embedding.options.model=text-embedding-ada-002

2. Maven

After creating a new Spring Boot project, add the following dependencies. For a complete reference on setting up the dependencies, refer to getting started with Spring AI.

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency>

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-security</artifactId>
</dependency>
<dependency>
  <groupId>org.thymeleaf.extras</groupId>
  <artifactId>thymeleaf-extras-springsecurity6</artifactId>
</dependency>

<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-tika-document-reader</artifactId>
</dependency>

3. Vector Store

After detecting the relevant classes on the classpath, Spring AI automatically creates a bean of type EmbeddingModel with default values. We can customize the embedding options using vendor-specific properties. For example, for the OpenAI embedding model, we can override the following properties.

spring.ai.openai.embedding.options.model=text-embedding-ada-002
spring.ai.openai.embedding.metadata-mode=EMBED
spring.ai.openai.embedding.embeddings-path=/v1/embeddings
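
Optionally, you can sanity-check the auto-configured EmbeddingModel at startup by generating an embedding for a sample text. The following sketch is not part of the demo; the class and bean names are illustrative, and it assumes the embed() and dimensions() methods exposed by EmbeddingModel in the Spring AI milestone used here.

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.boot.CommandLineRunner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class EmbeddingSanityCheckConfig {

  @Bean
  CommandLineRunner verifyEmbeddingModel(EmbeddingModel embeddingModel) {
    return args -> {
      // Generates one embedding; this fails fast if the API key or model name is misconfigured
      embeddingModel.embed("Hello, Spring AI");
      System.out.println("Embedding dimensions: " + embeddingModel.dimensions());
    };
  }
}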

After the EmbeddingModel is configured, we can use a VectorStore. Spring AI supports several vector stores. For demo purposes, we are using the Spring-provided in-memory vector store, SimpleVectorStore.

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.vectorstore.SimpleVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class VectorStoreConfig {

  @Bean
  VectorStore simpleVectorStore(EmbeddingModel embeddingModel) {
    return new SimpleVectorStore(embeddingModel);
  }
}
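
Before wiring the vector store into the upload and chat flows, it helps to see how it is used directly. The following sketch is not part of the demo (the class name is illustrative) and assumes the Document and SearchRequest APIs of the Spring AI milestone used in this article; the QuestionAnswerAdvisor configured later in the chat controller runs a similar similarity search internally.

import java.util.List;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;

public class VectorStoreUsageSketch {

  public static void printMatches(VectorStore vectorStore) {
    // add() generates embeddings through the EmbeddingModel and keeps them in memory
    vectorStore.add(List.of(new Document("Spring AI supports RAG with vector stores.")));

    // Retrieve the documents most similar to the query text
    List<Document> results = vectorStore.similaritySearch(
        SearchRequest.query("What does Spring AI support?").withTopK(2));

    results.forEach(doc -> System.out.println(doc.getContent()));
  }
}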

4. Uploading Files and Generating Embeddings

In the context of LLMs, RAG is essentially the process of ingesting and processing external information before generating a response, which makes the response more informative and relevant. The most common approach to data ingestion in UI-based applications is uploading files. In the backend, we read the file content, generate the embeddings using the configured EmbeddingModel, and store them in the vector database using the vectorStore.add() method.

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import org.slf4j.Logger;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.tika.TikaDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController
public class UploadController {

  private static final Logger LOG = org.slf4j.LoggerFactory.getLogger(UploadController.class);

  private final VectorStore vectorStore;

  public UploadController(VectorStore vectorStore) {
    this.vectorStore = vectorStore;
  }

  @PostMapping("/upload")
  public UploadResponse upload(@RequestParam("file") MultipartFile file) throws IOException {

    // Copy the uploaded file to a temporary location
    Path destinationFile = Paths.get("/temp").resolve(
      Paths.get(file.getOriginalFilename())).normalize().toAbsolutePath();
    Files.createDirectories(destinationFile.getParent());  // ensure the temp directory exists

    try (InputStream inputStream = file.getInputStream()) {
      Files.copy(inputStream, destinationFile, StandardCopyOption.REPLACE_EXISTING);
    }

    //Read and split the document contents
    TikaDocumentReader documentReader = new TikaDocumentReader(destinationFile.toUri().toString());
    List<Document> documents = documentReader.get();
    List<Document> splitDocuments = new TokenTextSplitter().apply(documents);

    vectorStore.add(splitDocuments);  // generate and store embeddings in vector store
    return new UploadResponse(file.getOriginalFilename(), file.getContentType(), file.getSize());
  }

  private static record UploadResponse(String fileName, String fileType, long fileSize) {
  }
}
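
The temporary copy of the file is needed only while Tika reads it. Once the split documents have been added to the vector store, you can optionally delete it at the end of the upload() method (a small optional addition, not shown in the listing above):

Files.deleteIfExists(destinationFile);  // optional cleanup of the temporary copy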

We can control the size of uploaded files using the spring.servlet.multipart.* properties:

spring.servlet.multipart.max-file-size=50MB
spring.servlet.multipart.max-request-size=50MB
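
If an upload exceeds these limits, Spring throws a MaxUploadSizeExceededException. You can optionally return a friendlier error with a standard exception handler; the following is a minimal sketch (class name and message are illustrative, not part of the demo):

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;
import org.springframework.web.multipart.MaxUploadSizeExceededException;

@RestControllerAdvice
public class UploadExceptionHandler {

  // Triggered when the uploaded file exceeds the configured multipart limits
  @ExceptionHandler(MaxUploadSizeExceededException.class)
  public ResponseEntity<String> handleMaxSize(MaxUploadSizeExceededException ex) {
    return ResponseEntity.status(HttpStatus.PAYLOAD_TOO_LARGE)
        .body("File is too large. The maximum allowed size is 50MB.");
  }
}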

5. Chat Controller

The chat application invokes REST APIs to send user queries and display the AI response. The following REST API accepts the user prompt in the Question object and sends back the response in the Answer object.

The API uses org.springframework.ai.chat.client.ChatClient to send the user question to the LLM (GPT model). Additionally, it uses QuestionAnswerAdvisor to enable the RAG pattern by retrieving context related to the user text from the vector store and appending it to the prompt. The default advise text appended by the advisor is:

private static final String DEFAULT_USER_TEXT_ADVISE = """
		Context information is below.
		---------------------
		{question_answer_context}
		---------------------
		Given the context and provided history information and not prior knowledge,
		reply to the user comment. If the answer is not in the context, inform
		the user that you can't answer the question.
		""";

Additionally, InMemoryChatMemory stores the chat conversation history in memory; this history is used to further refine the context of user prompts throughout the conversation.

import com.howtodoinjava.ai.demo.web.model.Answer;
import com.howtodoinjava.ai.demo.web.model.Question;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.PromptChatMemoryAdvisor;
import org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;
import org.springframework.ai.chat.memory.InMemoryChatMemory;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.security.core.Authentication;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import static org.springframework.ai.chat.client.advisor.AbstractChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY;

@RestController
@RequestMapping("/chat")
public class ChatController {

  private final ChatClient chatClient;

  public ChatController(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {

    this.chatClient = chatClientBuilder
      .defaultAdvisors(
        new SimpleLoggerAdvisor(),
        new QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()),
        new PromptChatMemoryAdvisor(new InMemoryChatMemory()))
      .build();
  }

  @PostMapping
  public Answer chat(@RequestBody Question question, Authentication user) {
    return chatClient.prompt()
      .user(question.question())
      .advisors(
        advisorSpec -> advisorSpec.param(CHAT_MEMORY_CONVERSATION_ID_KEY, user.getPrincipal()))
      .call()
      .entity(Answer.class);
  }
}

public record Question(String question) {
}

public record Answer(String answer) {
}
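
The call to entity(Answer.class) asks Spring AI to convert the model output into the Answer record using its structured output support, so the controller does not parse JSON itself. If you only need the raw text, you could return the content directly; a simplified, hypothetical variation (it omits the chat-memory advisor parameter used in the demo):

return new Answer(chatClient.prompt()
    .user(question.question())
    .call()
    .content());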

6. Security Configuration

An important part of RAG applications is keeping track of user activity so that system resources are not abused. To know which user is uploading files and chatting, we need login-based security that identifies the currently logged-in user.

Spring Security integrates seamlessly with Spring Web and can provide this basic login functionality. We are using a minimal configuration with in-memory authentication and a single user with the username 'howtodoinjava' and the password 'password'.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.core.authority.SimpleGrantedAuthority;
import org.springframework.security.core.userdetails.User;
import org.springframework.security.core.userdetails.UserDetails;
import org.springframework.security.core.userdetails.UserDetailsService;
import org.springframework.security.crypto.bcrypt.BCryptPasswordEncoder;
import org.springframework.security.crypto.password.PasswordEncoder;
import org.springframework.security.provisioning.InMemoryUserDetailsManager;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class SecurityConfig {

  @Bean
  SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
    return http
      .authorizeHttpRequests(authorize -> authorize
        .requestMatchers("/login").permitAll()
        .anyRequest().authenticated()
      )
      .formLogin(formLogin -> formLogin
        .loginPage("/login")
        .defaultSuccessUrl("/", true)
      )
      .csrf(csrf -> csrf.disable())
      .headers(headers -> headers.frameOptions(frameOptions -> frameOptions.sameOrigin()))
      .build();
  }

  @Bean
  UserDetailsService userDetailsService(PasswordEncoder passwordEncoder) {
    List<UserDetails> usersList = new ArrayList<>();
    usersList.add(new User("howtodoinjava", passwordEncoder.encode("password"),   // Username and password
      Arrays.asList(new SimpleGrantedAuthority("ROLE_USER"))));
    return new InMemoryUserDetailsManager(usersList);
  }

  @Bean
  PasswordEncoder passwordEncoder() {
    return new BCryptPasswordEncoder();
  }
}

We have created a simple HTML login form with two text fields and a submit button for providing the username and password. Nothing fancy here.

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:th="http://www.thymeleaf.org">
<head>
  <title>AI Chat Demo</title>
</head>

<body>
<h1>Login</h1>
<form method="POST" th:action="@{/login}">
  <table>
    <tr>
      <td>User:</td>
      <td><input type="text" name="username"/></td>
    </tr>
    <tr>
      <td>Password:</td>
      <td><input type="password" name="password"/></td>
    </tr>
    <tr>
      <td colspan="2"><input name="submit" type="submit" value="Submit"/></td>
    </tr>
  </table>
</form>
</body>
</html>

7. Chat Application UI

After logging in, the user is directed to the chat application, where they can upload files and ask questions, just like in other chat-based applications such as ChatGPT.

<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org"
      xmlns:sec="http://www.thymeleaf.org/extras/spring-security">
<head>
  <title>AI Chat Demo</title>
  <link rel="stylesheet" th:href="@{/style.css}"/>
  <script th:src="@{/script.js}"></script>
  <script>
    var username = "[[${#authentication.principal.username}]]";
  </script>
</head>

<body>
<iframe name="hiddenUploadFrame" id="hiddenUploadFrame" style="display:none;"></iframe>
<div id="uploadModal" class="modal">
  <div class="modal-content">
    <span class="closeModalSpan">&times;</span>
    <h2>Upload a file</h2>
    <form id="uploadForm" method="post" th:action="@{/upload}" enctype="multipart/form-data" target="hiddenUploadFrame">
      <input type="file" name="file" id="file"/>
      <input type="submit" value="Upload" name="submit" id="submit"/>
    </form>
    <div class="loader" id="loader">
      <svg class="circular">
        <circle class="path" cx="50" cy="50" r="20" fill="none" stroke-width="5" stroke-miterlimit="10"></circle>
      </svg>
    </div>
  </div>
</div>
<div id="chatArea">
  <div id="header">
    <h2>AI Chat Demo</h2>
  </div>

  <div id="transcript">
  </div>

  <div id="controls">
    <button id="uploadFile">Upload File</button>
  </div>

  <textarea id="userInput" type="text" placeholder="How can I help?" rows="1"></textarea>
  <button id="typedTextSubmit" name="typedTextSubmit">Submit</button>
</div>
</body>
</html>

We are using simple JavaScript code to append the user questions, AI responses, and file upload status to the chat transcript, and to manage the file upload form. Every prompt/message is first converted into a new 'div' element and then appended to the main chatbox.

const addToTranscript = (who, text) => {
    let b = document.querySelector('#transcript');
    let name = (who === "User") ? username : who;
    b.innerHTML += createTranscriptEntry(who, name, text);
    b.scrollTop = b.scrollHeight;
    console.log(text);
};

const createTranscriptEntry = (who, name, text) => {
    return `
    <div class="${who}Entry">
      <div><img src="/${who}.png" width="18" height="18" style="vertical-align: middle;"/> <span style="vertical-align: middle;"><b>${name}:</b> ${text}</span></div>
     </div>`
};

const handleResponse = (response) => {
    addToTranscript("AI", response.answer);
};

const postQuestion = (question) => {
    let data = {
        question: question
    };

    fetch("/chat", {
        method: "POST",
        headers: {
            "Content-Type": "application/json"
        },
        body: JSON.stringify(data)
    })
    .then(res => res.json())
    .then(handleResponse);
};

const submitTypedText = (event) => {
    let typedTextInput = document.querySelector('#userInput');
    let typedText = typedTextInput.value;

    // don't submit if empty
    if (typedText.trim().length === 0) {
        return false;
    }
    // submit it here
    addToTranscript("User", typedText);
    postQuestion(typedText);
    typedTextInput.value = '';
    return false;
};

const initUIEvents = () => {
    let t = document.querySelector('#typedTextSubmit');
    t.addEventListener('click', submitTypedText);
    var textarea = document.querySelector('textarea#userInput');
    textarea.addEventListener('keydown', e => {
        if (e.key === "Enter" && !e.shiftKey) {
            e.preventDefault();
            submitTypedText(e);
        }
    });
    var modal = document.getElementById("uploadModal");
    var openModalBtn = document.getElementById("uploadFile");
    openModalBtn.addEventListener('click', () => {
        modal.style.display = "block";
    });
    var closeModalSpan = document.getElementsByClassName("closeModalSpan")[0];
    closeModalSpan.addEventListener('click', () => {
        modal.style.display = "none";
    });
    var uploadForm = document.getElementById("uploadForm");
    uploadForm.addEventListener('submit', () => {
        var filename = uploadForm.elements[0].value;
        if (filename && filename.length > 0) {
            var loader = document.getElementById("loader");
            loader.style.visibility = "visible";
        }
    });
    var hiddenUploadFrame = document.getElementById("hiddenUploadFrame");
    hiddenUploadFrame.addEventListener('load', () => {
        var hiddenUploadFrame = document.getElementById("hiddenUploadFrame");
        var json = JSON.parse(hiddenUploadFrame.contentDocument.body.innerText);
        var fileName = json.fileName;
        var loader = document.getElementById("loader");
        loader.style.visibility = "hidden";
        modal.style.display = "none";
        addToTranscript("File", "Uploaded file : " + fileName + " ("+ json.fileSize + " bytes)");
    });
};

window.addEventListener('load', initUIEvents);

Feel free to play with the code and customize it for your requirements.

8. Demo

After all the files have been created, we build the project and run the application as a Spring Boot application.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class ChatUiWithRagApplication {

  public static void main(String[] args) {
    SpringApplication.run(ChatUiWithRagApplication.class, args);
  }
}

Open the URL: http://localhost:8080/login in a new browser window. It will show the login screen.

Enter the username and password: howtodoinjava/password. The Chat application will be loaded.

Click the ‘Upload File’ button and select a file from the file system.

Now, you can ask any question related to the information in the uploaded document. Based on the context information, the LLM will respond with the answer, which will be displayed in the chat window.

You can check the request and response in the logs:

2024-07-30T12:42:28.198+05:30 DEBUG 15892 --- [nio-8080-exec-6] o.s.a.c.c.advisor.SimpleLoggerAdvisor    : request: AdvisedRequest[chatModel=OpenAiChatModel [defaultOptions=OpenAiChatOptions: {"streamUsage":false,"model":"gpt-4o","temperature":0.7}], userText=List all the programming languages discussed the document., systemText=, chatOptions=OpenAiChatOptions: {"streamUsage":false,"model":"gpt-4o","temperature":0.7}, media=[], functionNames=[], functionCallbacks=[], messages=[], userParams={}, systemParams={}, advisors=[SimpleLoggerAdvisor, org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor@2e1df6eb, org.springframework.ai.chat.client.advisor.PromptChatMemoryAdvisor@44f66513], advisorParams={chat_memory_conversation_id=org.springframework.security.core.userdetails.User [Username=howtodoinjava, Password=[PROTECTED], Enabled=true, AccountNonExpired=true, CredentialsNonExpired=true, AccountNonLocked=true, Granted Authorities=[ROLE_USER]]}]

2024-07-30T12:42:30.575+05:30 DEBUG 15892 --- [nio-8080-exec-6] o.s.a.c.c.advisor.SimpleLoggerAdvisor    : response: {"result":{"metadata":{"contentFilterMetadata":null,"finishReason":"STOP"},"output":{"messageType":"ASSISTANT","metadata":{"finishReason":"STOP","id":"chatcmpl-9qbK1hORNUoUNEQb9J5RdNPYCZNAO","role":"ASSISTANT","messageType":"ASSISTANT"},"toolCalls":[],"content":"{\n  \"answer\": \"Machine Language, Assembly language, Third generation language, Fourth Generation language (4GL), Fifth Generation language (5GL)\"\n}"}},"metadata":{"id":"chatcmpl-9qbK1hORNUoUNEQb9J5RdNPYCZNAO","model":"gpt-4o-2024-05-13","rateLimit":{"requestsLimit":500,"requestsRemaining":499,"tokensLimit":30000,"tokensRemaining":29100,"requestsReset":0.120000000,"tokensReset":8.000000000},"usage":{"generationTokens":31,"promptTokens":775,"totalTokens":806},"promptMetadata":[],"empty":false},"results":[{"metadata":{"contentFilterMetadata":null,"finishReason":"STOP"},"output":{"messageType":"ASSISTANT","metadata":{"finishReason":"STOP","id":"chatcmpl-9qbK1hORNUoUNEQb9J5RdNPYCZNAO","role":"ASSISTANT","messageType":"ASSISTANT"},"toolCalls":[],"content":"{\n  \"answer\": \"Machine Language, Assembly language, Third generation language, Fourth Generation language (4GL), Fifth Generation language (5GL)\"\n}"}}]}

9. Summary

In this Spring AI RAG example, we built an end-to-end chat application capable of answering user questions from the information in an uploaded file. The demo uses a bare-minimum technology stack and in-memory processing for most parts. You are advised to replace each component individually to learn the concepts in more detail and to make the application production-ready.
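
For example, as a first small step away from purely in-memory storage, SimpleVectorStore can persist its contents to a JSON file and reload them on startup. The sketch below assumes the save()/load() methods available on SimpleVectorStore in the Spring AI milestone used here; the file location is illustrative, and you would call vectorStore.save(file) yourself after ingesting documents.

import java.io.File;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.vectorstore.SimpleVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class PersistentVectorStoreConfig {

  @Bean
  VectorStore simpleVectorStore(EmbeddingModel embeddingModel) {
    SimpleVectorStore vectorStore = new SimpleVectorStore(embeddingModel);
    File vectorStoreFile = new File("vectorstore.json");  // illustrative location
    if (vectorStoreFile.exists()) {
      vectorStore.load(vectorStoreFile);  // reload previously saved embeddings on startup
    }
    return vectorStore;
  }
}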

Happy Learning !!

Source Code on GitHub
