Learn to build a fully functional chatbot application with RAG (Retrieval-Augmented Generation) capabilities using Spring AI. The demo application has the following tech stack:
- Framework: Spring AI and Spring Web (for file upload and REST APIs)
- Security: Spring Security
- File reader: Apache Tika
- Vector Database: In-memory SimpleVectorStore (use an enterprise solution in production such as Redis or PgVector)
- ChatBot UI: Thymeleaf and JavaScript (use an enterprise UI framework such as Angular or React in production)

1. Prerequisites
Before starting the demo, you must have an OpenAI API key that will authenticate your requests to the OpenAI models: text-embedding-ada-002 (for vector embeddings) and gpt-3.5-turbo (for chat completions).
Add the API key and model names in the application properties.
spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.chat.options.model=gpt-3.5-turbo
spring.ai.openai.embedding.enabled=true
spring.ai.openai.embedding.options.model=text-embedding-ada-002
2. Maven
After creating a new Spring Boot project, add the following dependencies. For a complete reference on setting up the dependencies, refer to Getting Started with Spring AI.
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-security</artifactId>
</dependency>
<dependency>
    <groupId>org.thymeleaf.extras</groupId>
    <artifactId>thymeleaf-extras-springsecurity6</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-tika-document-reader</artifactId>
</dependency>
3. Vector Store
Spring AI, after detecting the available classes, automatically creates a bean of type EmbeddingModel with default values. We can customize the embedding options using vendor-specific properties. For example, for the OpenAI embedding model, we can override these customization-related properties.
spring.ai.openai.embedding.options.model=text-embedding-ada-002
spring.ai.openai.embedding.metadata-mode=EMBED
spring.ai.openai.embedding.embeddings-path=/v1/embeddings
After the EmbeddingModel is configured, we can set up a VectorStore. Spring AI supports several vector stores; for this demo, we use the Spring-provided in-memory SimpleVectorStore.
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.vectorstore.SimpleVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@Configuration
public class VectorStoreConfig {

    @Bean
    VectorStore simpleVectorStore(EmbeddingModel embeddingModel) {
        return new SimpleVectorStore(embeddingModel);
    }
}
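Under the hood, an in-memory vector store keeps (embedding, document) pairs and, at query time, returns the documents whose embeddings score highest against the query embedding. A minimal, dependency-free sketch of the cosine-similarity metric typically used for this ranking (illustration only, not Spring AI's actual implementation):

```java
// Illustration: how an in-memory vector store ranks stored chunks.
// The chunk whose embedding points in the most similar direction to
// the query embedding gets the highest cosine score.
public class CosineSimilarityDemo {

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] query = {0.9, 0.1, 0.0};
        double[] docA  = {0.8, 0.2, 0.1};  // similar direction -> high score
        double[] docB  = {0.0, 0.1, 0.9};  // different direction -> low score
        System.out.printf("docA: %.3f, docB: %.3f%n",
                cosine(query, docA), cosine(query, docB));
    }
}
```

In production stores such as Redis or PgVector, this nearest-neighbor search is executed inside the database engine rather than in application memory.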
4. Uploading Files and Generating Embeddings
In the context of LLMs, RAG is essentially the process of ingesting and retrieving external information that the model consults before generating a response, making the response more informative and relevant. The most common approach to data ingestion in UI-based applications is file upload. In the backend, we read the file content, generate the embeddings using the configured EmbeddingModel, and store them in the vector database using the vectorStore.add() method.
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import org.slf4j.Logger;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.tika.TikaDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;
@RestController
public class UploadController {

    private static final Logger LOG = org.slf4j.LoggerFactory.getLogger(UploadController.class);

    private final VectorStore vectorStore;

    public UploadController(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    @PostMapping("/upload")
    public UploadResponse upload(@RequestParam("file") MultipartFile file) throws IOException {

        // Copy the uploaded file to a temporary location
        Path destinationFile = Paths.get("/temp").resolve(
                Paths.get(file.getOriginalFilename())).normalize().toAbsolutePath();
        Files.createDirectories(destinationFile.getParent());  // ensure the temp directory exists
        try (InputStream inputStream = file.getInputStream()) {
            Files.copy(inputStream, destinationFile, StandardCopyOption.REPLACE_EXISTING);
        }

        // Read and split the document contents
        TikaDocumentReader documentReader = new TikaDocumentReader(destinationFile.toUri().toString());
        List<Document> documents = documentReader.get();
        List<Document> splitDocuments = new TokenTextSplitter().apply(documents);

        // Generate and store embeddings in the vector store
        vectorStore.add(splitDocuments);
        LOG.info("Stored {} chunks for file {}", splitDocuments.size(), file.getOriginalFilename());

        return new UploadResponse(file.getOriginalFilename(), file.getContentType(), file.getSize());
    }

    private record UploadResponse(String fileName, String fileType, long fileSize) {
    }
}
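The TokenTextSplitter used above breaks each document into smaller chunks so that only the most relevant pieces are retrieved at query time. As a rough illustration of the idea (the real splitter works on token counts, not characters), a naive fixed-size splitter might look like this:

```java
import java.util.ArrayList;
import java.util.List;

// Illustration: a naive fixed-size character splitter. Spring AI's
// TokenTextSplitter splits on tokens instead, but the principle --
// small, individually embeddable chunks -- is the same.
public class NaiveSplitter {

    static List<String> split(String text, int chunkSize) {
        List<String> chunks = new ArrayList<>();
        for (int i = 0; i < text.length(); i += chunkSize) {
            chunks.add(text.substring(i, Math.min(text.length(), i + chunkSize)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // A 2500-character document split into 1000-character chunks -> 3 chunks
        List<String> chunks = split("a".repeat(2500), 1000);
        System.out.println(chunks.size() + " chunks");
    }
}
```

Smaller chunks give more precise retrieval but less surrounding context per hit; tuning chunk size is one of the main levers in a RAG pipeline.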
We can control the size of uploaded files using the 'spring.servlet.multipart.*' properties:
spring.servlet.multipart.max-file-size=50MB
spring.servlet.multipart.max-request-size=50MB
5. Chat Controller
The chat application invokes REST APIs to send user queries and display the AI responses. The following REST API accepts the user prompt in a Question object and sends back the response in an Answer object.
The API uses org.springframework.ai.chat.client.ChatClient to send the user question to the LLM (GPT model). Additionally, it uses QuestionAnswerAdvisor to enable the RAG pattern by appending context information related to the user text to the prompt.
private static final String DEFAULT_USER_TEXT_ADVISE = """
        Context information is below.
        ---------------------
        {question_answer_context}
        ---------------------
        Given the context and provided history information and not prior knowledge,
        reply to the user comment. If the answer is not in the context, inform
        the user that you can't answer the question.
        """;
Additionally, InMemoryChatMemory stores the conversation history in memory, which is used to refine the context of user prompts throughout the conversation.
import static org.springframework.ai.chat.client.advisor.AbstractChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY;

import com.howtodoinjava.ai.demo.web.model.Answer;
import com.howtodoinjava.ai.demo.web.model.Question;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.PromptChatMemoryAdvisor;
import org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;
import org.springframework.ai.chat.memory.InMemoryChatMemory;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.security.core.Authentication;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/chat")
public class ChatController {

    private final ChatClient chatClient;

    public ChatController(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(
                        new SimpleLoggerAdvisor(),
                        new QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()),
                        new PromptChatMemoryAdvisor(new InMemoryChatMemory()))
                .build();
    }

    @PostMapping
    public Answer chat(@RequestBody Question question, Authentication user) {
        return chatClient.prompt()
                .user(question.question())
                .advisors(advisorSpec ->
                        advisorSpec.param(CHAT_MEMORY_CONVERSATION_ID_KEY, user.getPrincipal()))
                .call()
                .entity(Answer.class);
    }
}
public record Question(String question) {
}
public record Answer(String answer) {
}
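For reference, the JSON exchanged over the /chat endpoint maps onto these records as follows (illustrative payloads, not actual model output):

```
// Request
POST /chat
Content-Type: application/json

{"question": "What is RAG?"}

// Response
{"answer": "Retrieval-Augmented Generation combines document retrieval with text generation."}
```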
6. Security Configuration
An important part of RAG applications is keeping track of user activity so that system resources are not abused. To know which user is uploading files and chatting, we add login-based security to the application.
Spring Security integrates seamlessly with Spring Web and provides the basic login functionality. We configure a minimal setup with in-memory authentication and a single user (username 'howtodoinjava', password 'password').
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.core.authority.SimpleGrantedAuthority;
import org.springframework.security.core.userdetails.User;
import org.springframework.security.core.userdetails.UserDetails;
import org.springframework.security.core.userdetails.UserDetailsService;
import org.springframework.security.crypto.bcrypt.BCryptPasswordEncoder;
import org.springframework.security.crypto.password.PasswordEncoder;
import org.springframework.security.provisioning.InMemoryUserDetailsManager;
import org.springframework.security.web.SecurityFilterChain;
@Configuration
public class SecurityConfig {

    @Bean
    SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
        return http
                .authorizeHttpRequests(authorize -> authorize
                        .requestMatchers("/login").permitAll()
                        .anyRequest().authenticated()
                )
                .formLogin(formLogin -> formLogin
                        .loginPage("/login")
                        .defaultSuccessUrl("/", true)
                )
                .csrf(csrf -> csrf.disable())  // disabled for demo only; keep CSRF enabled in production
                .headers(headers -> headers.frameOptions(frameOptions -> frameOptions.sameOrigin()))
                .build();
    }

    @Bean
    UserDetailsService userDetailsService(PasswordEncoder passwordEncoder) {
        List<UserDetails> usersList = new ArrayList<>();
        usersList.add(new User("howtodoinjava", passwordEncoder.encode("password"),  // username and password
                Arrays.asList(new SimpleGrantedAuthority("ROLE_USER"))));
        return new InMemoryUserDetailsManager(usersList);
    }

    @Bean
    PasswordEncoder passwordEncoder() {
        return new BCryptPasswordEncoder();
    }
}
We have created a simple HTML form with two text fields and one submit button for providing the username/password. Nothing fancy here.
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:th="http://www.thymeleaf.org">
<head>
    <title>AI Chat Demo</title>
</head>
<body>
<h1>Login</h1>
<form method="POST" th:action="@{/login}">
    <table>
        <tr>
            <td>User:</td>
            <td><input type="text" name="username"/></td>
        </tr>
        <tr>
            <td>Password:</td>
            <td><input type="password" name="password"/></td>
        </tr>
        <tr>
            <td colspan="2"><input name="submit" type="submit" value="Submit"/></td>
        </tr>
    </table>
</form>
</body>
</html>
7. Chat Application UI
After logging in, the user is directed to the chat application, where they can upload files and ask questions, as in other chat-based applications such as ChatGPT.
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org"
      xmlns:sec="http://www.thymeleaf.org/extras/spring-security">
<head>
    <title>AI Chat Demo</title>
    <link rel="stylesheet" th:href="@{/style.css}"/>
    <script th:src="@{/script.js}"></script>
    <script>
        var username = "[[${#authentication.principal.username}]]";
    </script>
</head>
<body>
<iframe name="hiddenUploadFrame" id="hiddenUploadFrame" style="display:none;"></iframe>
<div id="uploadModal" class="modal">
    <div class="modal-content">
        <span class="closeModalSpan">×</span>
        <h2>Upload a file</h2>
        <form id="uploadForm" method="post" th:action="@{/upload}" enctype="multipart/form-data" target="hiddenUploadFrame">
            <input type="file" name="file" id="file"/>
            <input type="submit" value="Upload" name="submit" id="submit"/>
        </form>
        <div class="loader" id="loader">
            <svg class="circular">
                <circle class="path" cx="50" cy="50" r="20" fill="none" stroke-width="5" stroke-miterlimit="10"></circle>
            </svg>
        </div>
    </div>
</div>
<div id="chatArea">
    <div id="header">
        <h2>AI Chat Demo</h2>
    </div>
    <div id="transcript">
    </div>
    <div id="controls">
        <button id="uploadFile">Upload File</button>
    </div>
    <textarea id="userInput" placeholder="How can I help?" rows="1"></textarea>
    <button id="typedTextSubmit" name="typedTextSubmit">Submit</button>
</div>
</body>
</html>
We are using plain JavaScript to append the user questions, AI responses, file upload form, and status messages. Every prompt/message is first converted into a new 'div' element and then appended to the main chatbox.
const addToTranscript = (who, text) => {
    let b = document.querySelector('#transcript');
    let name = (who === "User") ? username : who;
    b.innerHTML += createTranscriptEntry(who, name, text);
    b.scrollTop = b.scrollHeight;
    console.log(text);
};

const createTranscriptEntry = (who, name, text) => {
    return `
    <div class="${who}Entry">
        <div><img src="/${who}.png" width="18" height="18" style="vertical-align: middle;"/> <span style="vertical-align: middle;"><b>${name}:</b> ${text}</span></div>
    </div>`;
};

const handleResponse = (response) => {
    addToTranscript("AI", response.answer);
};

const postQuestion = (question) => {
    let data = {
        question: question
    };
    fetch("/chat", {
        method: "POST",
        headers: {
            "Content-Type": "application/json"
        },
        body: JSON.stringify(data)
    })
    .then(res => res.json())
    .then(handleResponse);
};

const submitTypedText = (event) => {
    let typedTextInput = document.querySelector('#userInput');
    let typedText = typedTextInput.value;
    // don't submit if empty
    if (typedText.trim().length === 0) {
        return false;
    }
    // submit it here
    addToTranscript("User", typedText);
    postQuestion(typedText);
    typedTextInput.value = '';
    return false;
};

const initUIEvents = () => {
    let t = document.querySelector('#typedTextSubmit');
    t.addEventListener('click', submitTypedText);

    var textarea = document.querySelector('textarea#userInput');
    textarea.addEventListener('keydown', e => {
        if (e.key === "Enter" && !e.shiftKey) {
            e.preventDefault();
            submitTypedText(e);
        }
    });

    var modal = document.getElementById("uploadModal");
    var openModalBtn = document.getElementById("uploadFile");
    openModalBtn.addEventListener('click', () => {
        modal.style.display = "block";
    });

    var closeModalSpan = document.getElementsByClassName("closeModalSpan")[0];
    closeModalSpan.addEventListener('click', () => {
        modal.style.display = "none";
    });

    var uploadForm = document.getElementById("uploadForm");
    uploadForm.addEventListener('submit', () => {
        var filename = uploadForm.elements[0].value;
        if (filename && filename.length > 0) {
            document.getElementById("loader").style.visibility = "visible";
        }
    });

    var hiddenUploadFrame = document.getElementById("hiddenUploadFrame");
    hiddenUploadFrame.addEventListener('load', () => {
        var json = JSON.parse(hiddenUploadFrame.contentDocument.body.innerText);
        document.getElementById("loader").style.visibility = "hidden";
        modal.style.display = "none";
        addToTranscript("File", "Uploaded file : " + json.fileName + " (" + json.fileSize + " bytes)");
    });
};

window.addEventListener('load', initUIEvents);
Feel free to play with the code and customize it for your requirements.
8. Demo
After all the files have been created, we build the project and run the application as a Spring Boot application.
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class ChatUiWithRagApplication {

    public static void main(String[] args) {
        SpringApplication.run(ChatUiWithRagApplication.class, args);
    }
}
Open the URL: http://localhost:8080/login in a new browser window. It will show the login screen.

Enter the username and password: howtodoinjava/password. The Chat application will be loaded.

Click the 'Upload File' button and select a file from the file system.


Now, you can ask any question related to the information in the uploaded document. Based on the context information, the LLM will respond with the answer, which is displayed in the chat window.

You can check the request and response in the logs:
2024-07-30T12:42:28.198+05:30 DEBUG 15892 --- [nio-8080-exec-6] o.s.a.c.c.advisor.SimpleLoggerAdvisor : request: AdvisedRequest[chatModel=OpenAiChatModel [defaultOptions=OpenAiChatOptions: {"streamUsage":false,"model":"gpt-4o","temperature":0.7}], userText=List all the programming languages discussed the document., systemText=, chatOptions=OpenAiChatOptions: {"streamUsage":false,"model":"gpt-4o","temperature":0.7}, media=[], functionNames=[], functionCallbacks=[], messages=[], userParams={}, systemParams={}, advisors=[SimpleLoggerAdvisor, org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor@2e1df6eb, org.springframework.ai.chat.client.advisor.PromptChatMemoryAdvisor@44f66513], advisorParams={chat_memory_conversation_id=org.springframework.security.core.userdetails.User [Username=howtodoinjava, Password=[PROTECTED], Enabled=true, AccountNonExpired=true, CredentialsNonExpired=true, AccountNonLocked=true, Granted Authorities=[ROLE_USER]]}]
2024-07-30T12:42:30.575+05:30 DEBUG 15892 --- [nio-8080-exec-6] o.s.a.c.c.advisor.SimpleLoggerAdvisor : response: {"result":{"metadata":{"contentFilterMetadata":null,"finishReason":"STOP"},"output":{"messageType":"ASSISTANT","metadata":{"finishReason":"STOP","id":"chatcmpl-9qbK1hORNUoUNEQb9J5RdNPYCZNAO","role":"ASSISTANT","messageType":"ASSISTANT"},"toolCalls":[],"content":"{\n \"answer\": \"Machine Language, Assembly language, Third generation language, Fourth Generation language (4GL), Fifth Generation language (5GL)\"\n}"}},"metadata":{"id":"chatcmpl-9qbK1hORNUoUNEQb9J5RdNPYCZNAO","model":"gpt-4o-2024-05-13","rateLimit":{"requestsLimit":500,"requestsRemaining":499,"tokensLimit":30000,"tokensRemaining":29100,"requestsReset":0.120000000,"tokensReset":8.000000000},"usage":{"generationTokens":31,"promptTokens":775,"totalTokens":806},"promptMetadata":[],"empty":false},"results":[{"metadata":{"contentFilterMetadata":null,"finishReason":"STOP"},"output":{"messageType":"ASSISTANT","metadata":{"finishReason":"STOP","id":"chatcmpl-9qbK1hORNUoUNEQb9J5RdNPYCZNAO","role":"ASSISTANT","messageType":"ASSISTANT"},"toolCalls":[],"content":"{\n \"answer\": \"Machine Language, Assembly language, Third generation language, Fourth Generation language (4GL), Fifth Generation language (5GL)\"\n}"}}]}
9. Summary
In this Spring AI RAG example, we built an end-to-end chat application capable of answering user questions against the information in an uploaded file. The demo uses a bare minimum technology stack and in-memory processing for most parts. You are advised to replace each component individually to learn the concepts in more detail and make it production-ready.
Happy Learning !!