
Java Langchain MongoDB Example

This article explores how to build an AI-powered chatbot using LangChain4j, MongoDB Atlas, and Ollama running locally. The solution leverages the Retrieval-Augmented Generation (RAG) pattern to fetch relevant information from your data and use it to generate accurate, contextual responses.

This guide walks through the setup, from data ingestion to chat response generation, while integrating embeddings, vector search, and a local language model.

1. What is RAG?

Retrieval-Augmented Generation (RAG) is a hybrid approach in natural language processing (NLP) that enhances the output of large language models (LLMs) by combining them with an external knowledge retrieval component. By pairing retrieval-based methods with generative models, RAG improves the factual accuracy, relevance, and contextual richness of the responses LLMs produce.

Traditional LLM Limitations

While LLMs like GPT, LLaMA, or Mistral are trained on massive corpora and can generate fluent responses, they come with key limitations:

  • Static Knowledge: LLMs cannot learn or access data after training unless fine-tuned or retrained.
  • Context Limitations: They operate solely on the input prompt and their internal weights.
  • Hallucinations: They may generate plausible-sounding but incorrect or fabricated information.

1.1 Benefits of RAG

RAG brings together the best of two worlds: the reasoning and language capabilities of LLMs with the factual grounding of a live knowledge base. Key advantages include:

  • Accuracy & Reliability: Retrieved documents provide grounded context, reducing hallucinations and the risk of fabricated answers.
  • Overcomes LLM limitations: Helps models respond accurately even when the training data is outdated or lacks specific context.
  • Cost Efficiency: No need to retrain or fine-tune models every time the data changes; update the database instead.
  • Explainability: Responses can be traced back to specific documents, aiding transparency and trust.
  • Domain-specific support: Easily integrates your own knowledge base (support docs, policies, product manuals).

2. MongoDB for RAG

MongoDB is a document-based NoSQL database that offers a flexible and scalable foundation for building RAG applications. Its native JSON-like document structure and advanced features such as full-text search, Atlas Vector Search, and seamless integration with cloud services make it particularly well-suited for handling unstructured or semi-structured knowledge sources needed in RAG systems.

2.1 Why Use MongoDB in a RAG Architecture?

  • Document-Oriented Storage: MongoDB stores data as BSON documents, allowing rich text, metadata, and structured fields to coexist. This structure aligns perfectly with the “chunks” of information retrieved in RAG pipelines.
  • Full-Text and Vector Search: MongoDB Atlas supports full-text search via built-in Lucene integration, and more importantly for RAG, Vector Search for retrieving similar documents based on embeddings. This is essential for semantic search in retrieval pipelines.
  • Flexible Schema: RAG systems often evolve and require flexibility in the way documents are structured. MongoDB’s schema-less design enables you to update or extend documents without painful migrations.
  • Horizontal Scalability: As your document collection grows, whether it contains product manuals, legal documents, or research articles, MongoDB scales effortlessly through sharding and distributed clusters.

2.2 Example Use in RAG

  • Storage: Store each paragraph, answer, or snippet as a MongoDB document with metadata and embedding vectors (a sketch of such a document follows this list).
  • Retrieval: Use Atlas Vector Search to find top-k relevant documents based on semantic similarity to a user query.
  • Augmentation: Feed retrieved documents to the LLM to ground the response in real data.
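
To make the Storage step concrete, here is a minimal sketch, using the MongoDB Java driver's org.bson.Document, of what one stored chunk could look like. The field names (text, metadata, embedding) are illustrative only; the actual layout is determined by the vector store implementation:

// Hypothetical shape of a single RAG chunk stored in MongoDB.
Document chunk = new Document("text", "Inspect solar panels for debris or damage.")
        .append("metadata", new Document("source", "maintenance-guide.txt"))
        .append("embedding", List.of(0.0123, -0.0456, 0.0789)); // truncated; real vectors have hundreds of dimensions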

3. Prerequisites

For this article, you will need:

  • Java 21 or higher.
  • Maven for dependency management.
  • A MongoDB Atlas account with a live cluster.
  • Ollama installed locally with the orca-mini chat model and the nomic-embed-text embedding model.

Set Up Ollama Locally

To get started with Ollama, install it by following the official instructions for your operating system. Once installed, use the command line to pull the models this example uses: orca-mini for chat and nomic-embed-text for embeddings.

ollama pull orca-mini
ollama pull nomic-embed-text
ollama run orca-mini

Set Up MongoDB Atlas

Sign in to MongoDB Atlas, create a free cluster, then set up a database named rag_app with a collection called documents.

4. Project Setup

Create a new Maven project and add the following dependencies to your pom.xml:

        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j</artifactId>
            <version>1.0.0-alpha1</version>
        </dependency>

        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-ollama</artifactId>
            <version>1.0.0-alpha1</version>
        </dependency>
        
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-mongodb-atlas</artifactId>
            <version>1.0.0-alpha1</version>
        </dependency>

  • langchain4j
    • This is the foundational library for working with LangChain4j in Java.
    • It includes features for building chains, managing memory, and handling input/output.
  • langchain4j-ollama
    • This module integrates LangChain4j with Ollama, a local runtime for LLMs such as LLaMA and Mistral.
    • It allows your application to interact with models running directly on your machine without needing external APIs.
  • langchain4j-mongodb-atlas
    • Adds support for using MongoDB Atlas as the vector store and retrieval database.
    • Enables storing and querying documents using metadata and embeddings through Atlas Vector Search.

5. Implementing the RAG Application

Application Overview

The chatbot leverages Retrieval-Augmented Generation (RAG) using the following architecture:

  • Embedding Model: nomic-embed-text via Ollama creates vector representations of documents.
  • Storage: MongoDB Atlas stores the vector embeddings.
  • Chat Model: orca-mini via Ollama generates responses based on user input and retrieved documents.
  • LangChain4j ties everything together through document ingestion, vector search, and model interaction.

Sample Dataset (RAG Knowledge Base)

This use case involves an internal assistant designed for a Smart Home Energy System, capable of answering questions related to system components, installation steps, diagnostics, and routine maintenance. It serves as a helpful tool for customer support engineers, field technicians, and internal operations staff who require accurate and quick access to technical information.

Let’s simulate the contents of five documentation files located in the application’s resources folder:

doc1.txt – System Components Overview

The SmartHome Energy Hub consists of four core components:
1. Energy Meter – Tracks real-time energy usage.
2. Solar Inverter – Converts solar energy for home use.
3. Battery Storage Unit – Stores excess solar power.
4. Control Panel – Provides centralized system monitoring.

All components are connected via Zigbee protocol to the SmartHome central controller.

doc2.txt – Setup Instructions

To set up the system:
1. Mount the solar panels and connect them to the inverter.
2. Connect the inverter output to the Battery Storage Unit.
3. Link the Energy Meter to the Control Panel via Zigbee.
4. Configure the SmartHome app with the system ID to start monitoring.

doc3.txt – Faults and Errors

Common Errors:
- ERR001: Inverter overheating – Check ventilation.
- ERR002: Battery not charging – Inspect cable connections.
- ERR003: Zigbee connection lost – Re-pair the device using the Control Panel.

To reset errors, hold the Control Panel reset button for 10 seconds.

doc4.txt – Maintenance Guide

Monthly Maintenance Checklist:
- Inspect solar panels for debris or damage.
- Verify battery charge levels.
- Ensure firmware is up to date via the SmartHome app.
- Test Zigbee connections for all devices.

Report any physical faults using the issue tracker form in the dashboard.

doc5.txt – Operational Workflow

System Workflow:
1. Solar panels generate electricity.
2. The inverter converts it to AC power.
3. Power is supplied to the home or stored in the battery.
4. The Energy Meter records usage statistics.
5. The Control Panel displays system status and alerts.

Battery discharge occurs automatically during low solar input.

Setting Up MongoDB Atlas

The following snippet initializes the MongoDB client and embedding store:

// Set your MongoDB configuration
String mongodbUrl = "mongodb+srv://<user>:<password>@cluster.mongodb.net/?retryWrites=true&w=majority";

// Create MongoDB client
MongoClient mongoClient = MongoClients.create(mongodbUrl);

// Embedding Store
EmbeddingStore<TextSegment> embeddingStore = createEmbeddingStore(mongoClient);

This block sets up the MongoDB connection using the Atlas connection string. MongoClient is the main interface to interact with the database and will later store our text embeddings for similarity search. The createEmbeddingStore() method configures the MongoDB vector store, collection, and index used by LangChain4j for semantic search.
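
Hard-coding credentials is acceptable for a demo, but in practice you would usually read the connection string from the environment. A minimal sketch, assuming a hypothetical MONGODB_URI environment variable:

// Prefer an environment variable over an inline connection string.
String mongodbUrl = System.getenv().getOrDefault(
        "MONGODB_URI",
        "mongodb://localhost:27017"); // fallback for local testing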

Defining the Embedding Store

The embedding store is backed by MongoDB’s vector capabilities using the MongoDbEmbeddingStore class:

    private static EmbeddingStore<TextSegment> createEmbeddingStore(MongoClient mongoClient) {
        String databaseName = "rag_app";
        String collectionName = "documents";
        String indexName = "document";
        Long maxResultRatio = 10L;
        CreateCollectionOptions createCollectionOptions = new CreateCollectionOptions();
        Bson filter = null;
        Set<String> metadataFields = new HashSet<>();
        IndexMapping indexMapping = new IndexMapping(768, metadataFields);
        Boolean createIndex = true;

        return new MongoDbEmbeddingStore(
                mongoClient,
                databaseName,
                collectionName,
                indexName,
                maxResultRatio,
                createCollectionOptions,
                filter,
                indexMapping,
                createIndex
        );
    }

This method configures a MongoDbEmbeddingStore with:

  • Database: rag_app
  • Collection: documents
  • Index: document
  • Embedding vector size: 768

It ensures proper indexing and schema for vector similarity search. Note that the vector size of 768 must match the output dimension of the embedding model; nomic-embed-text, used below, produces 768-dimensional vectors.
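
For reference, the search index backing this store maps the embedding field as a 768-dimensional vector. The definition below is an approximation of what MongoDbEmbeddingStore provisions when createIndex is true; the exact JSON it generates may differ:

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "embedding": {
        "type": "knnVector",
        "dimensions": 768,
        "similarity": "cosine"
      }
    }
  }
}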

Configuring the Embedding Model

The app uses Ollama to run the embedding and chat models locally:

        // Create the embedding model with Ollama
        EmbeddingModel embeddingModel = OllamaEmbeddingModel.builder()
                .baseUrl("http://localhost:11434")
                .modelName("nomic-embed-text")
                .timeout(Duration.ofMinutes(10))
                .build();

We define an embedding model using OllamaEmbeddingModel. This model converts text into vectors. The nomic-embed-text model is a lightweight, open-source embedding model. Ollama runs it locally and serves it over HTTP.
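
Before ingesting anything, you can sanity-check the embedding endpoint and confirm that the vector dimension matches the 768 configured in the index mapping. A quick sketch (Response and Embedding are LangChain4j types from dev.langchain4j.model.output and dev.langchain4j.data.embedding):

        // Embed a sample string and verify the vector size produced by nomic-embed-text.
        Response<Embedding> sample = embeddingModel.embed("battery not charging");
        System.out.println("Dimensions: " + sample.content().dimension()); // expected: 768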

Defining the Chat Language Model

        ChatLanguageModel chatModel = OllamaChatModel.builder()
                .baseUrl("http://localhost:11434")
                .timeout(Duration.ofMinutes(10))
                .modelName("orca-mini")
                .build();

Here, we initialize a chat model using Ollama again, but this time specifying a conversational LLM – orca-mini. It processes user queries and generates context-aware answers. Ensure you have pulled the required models.
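
Before wiring up the full RAG pipeline, a quick smoke test confirms the chat endpoint is reachable; this assumes the generate(String) convenience method provided by ChatLanguageModel in this version of LangChain4j:

        // Optional smoke test: the model should return a short reply.
        String reply = chatModel.generate("Reply with one word: ready");
        System.out.println(reply);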

Loading and Parsing Documents

We load multiple text documents from the src/main/resources directory for ingestion:

       // Load documents
        Document doc1 = loadDocumentFromResource("doc1.txt", new TextDocumentParser());
        Document doc2 = loadDocumentFromResource("doc2.txt", new TextDocumentParser());
        Document doc3 = loadDocumentFromResource("doc3.txt", new TextDocumentParser());
        Document doc4 = loadDocumentFromResource("doc4.txt", new TextDocumentParser());
        Document doc5 = loadDocumentFromResource("doc5.txt", new TextDocumentParser());

        List<Document> documents = List.of(doc1, doc2, doc3, doc4, doc5);

This section loads documents (help guides, internal documentation) from our application’s resources folder. These documents serve as the knowledge base for the assistant. The loadDocumentFromResource helper method reads the resource stream and parses it using LangChain4j’s TextDocumentParser.

Helper Method to Load Resources

private static Document loadDocumentFromResource(String resourceName, DocumentParser parser) throws IOException {
    try (InputStream inputStream = getResourceAsStream(resourceName)) {
        Objects.requireNonNull(inputStream, "Resource not found: " + resourceName);
        return parser.parse(inputStream);
    }
}

This method reads a resource file and parses it into a Document. It is used to load doc1.txt to doc5.txt from the resources folder.
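
As an aside, if your knowledge base lives on disk rather than on the classpath, LangChain4j's FileSystemDocumentLoader can load a whole directory in one call. A sketch, assuming the files sit in a hypothetical local docs/ folder:

        // Alternative loading strategy: read every file in a directory.
        List<Document> documents = FileSystemDocumentLoader.loadDocuments(
                Paths.get("docs"), new TextDocumentParser());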

Splitting and Ingesting Documents

To improve retrieval accuracy, documents are split into smaller chunks using a recursive strategy:

        DocumentSplitter splitter = DocumentSplitters.recursive(300, 30);

        EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
                .documentSplitter(splitter)
                .embeddingModel(embeddingModel)
                .embeddingStore(embeddingStore)
                .build();

        ingestor.ingest(documents);

We split the documents into manageable text segments using DocumentSplitters.recursive, here with a maximum segment size of 300 characters and an overlap of 30 characters between consecutive segments. The EmbeddingStoreIngestor then generates a vector embedding for each segment and stores it in MongoDB.
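
To verify that ingestion worked, you can count the persisted segments directly with the MongoDB driver; with five short documents split at 300 characters, expect a modest number of entries:

        // Optional check: count the stored segments in the collection.
        long stored = mongoClient.getDatabase("rag_app")
                .getCollection("documents")
                .countDocuments();
        System.out.println("Stored segments: " + stored);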

Setting Up the Retriever

This retriever performs semantic search to fetch the top 3 relevant chunks based on the input query:

        // Content Retriever
        ContentRetriever contentRetriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)
                .maxResults(3)
                .minScore(0.6)
                .build();

This retriever finds the most relevant document chunks for a user’s query using vector similarity. It returns at most three matches and discards any result whose semantic similarity score falls below 0.6.
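
You can also exercise the retriever on its own to inspect which chunks it would hand to the model. A sketch, using Query from dev.langchain4j.rag.query and Content from dev.langchain4j.rag.content:

        // Inspect retrieval results for a sample query.
        List<Content> matches = contentRetriever.retrieve(Query.from("battery not charging"));
        matches.forEach(match -> System.out.println(match.textSegment().text()));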

Defining the Assistant Interface

public interface Assistant {

    @SystemMessage("""
        You are an AI assistant trained exclusively on a set of internal documents containing specialized knowledge. 
        Answer user queries accurately and clearly using only the information contained in these documents. 
        Do not use external data, make assumptions, or speculate. If the information is not covered in the documents, 
        respond with: "The documents do not contain enough information to answer this question."
        Keep responses factual, concise, and helpful.
    """)
    String answer(String question);
}

This interface defines the assistant’s contract. The @SystemMessage provides instructions to the LLM, restricting its responses strictly to the document set and promoting factual, grounded answers.

Creating the Assistant Service

Using LangChain4j’s AiServices, we bind the Assistant interface to the actual models:

       // Assistant
        Assistant assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(chatModel)
                .contentRetriever(contentRetriever)
                .build();

LangChain4j wires everything together into a single assistant object. It uses the chat model for answering and the content retriever for grounding the answers in internal documents.
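
The builder accepts further optional components. For example, to let the assistant remember earlier turns of the conversation, you could add a message-window chat memory (an optional extension, not part of the base example):

        Assistant assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(chatModel)
                .contentRetriever(contentRetriever)
                .chatMemory(MessageWindowChatMemory.withMaxMessages(10)) // remember the last 10 messages
                .build();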

Enabling Interactive Chat

The chatbot runs in a simple loop that reads user input and sends it to the assistant:

        Scanner scanner = new Scanner(System.in);

        System.out.println("AI Assistant is ready. Ask your questions (type 'exit' to quit):");

        while (true) {
            System.out.print("You: ");
            String input = scanner.nextLine();
            if ("exit".equalsIgnoreCase(input)) {
                break;
            }

            String response = assistant.answer(input);
            System.out.println("Assistant: " + response);
        }

This is a simple console interface. The user types a question, which is passed to the assistant. The assistant replies using only the embedded documents.

Full Main Class

package com.jcg.langchain4j.chatbot;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.model.CreateCollectionOptions;
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.DocumentParser;
import dev.langchain4j.data.document.DocumentSplitter;
import dev.langchain4j.data.document.parser.TextDocumentParser;
import dev.langchain4j.data.document.splitter.DocumentSplitters;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.model.ollama.OllamaEmbeddingModel;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.mongodb.IndexMapping;
import dev.langchain4j.store.embedding.mongodb.MongoDbEmbeddingStore;
import org.bson.conversions.Bson;

import java.io.IOException;
import java.io.InputStream;
import java.time.Duration;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Scanner;
import java.util.Set;

public class ChatBotApp {

    public static void main(String[] args) throws IOException {
        // Set your MongoDB configuration
        String mongodbUrl = "mongodb+srv://<user>:<password>@cluster.mongodb.net/?retryWrites=true&w=majority";       // Replace with your MongoDB Atlas connection string

        // Create MongoDB client
        MongoClient mongoClient = MongoClients.create(mongodbUrl);

        // Embedding Store
        EmbeddingStore<TextSegment> embeddingStore = createEmbeddingStore(mongoClient);

        // Create the embedding model with Ollama
        EmbeddingModel embeddingModel = OllamaEmbeddingModel.builder()
                .baseUrl("http://localhost:11434")
                .modelName("nomic-embed-text")
                .timeout(Duration.ofMinutes(10))
                .build();

        ChatLanguageModel chatModel = OllamaChatModel.builder()
                .baseUrl("http://localhost:11434")
                .timeout(Duration.ofMinutes(10))
                .modelName("orca-mini")
                .build();

        System.out.println("LLaMA3 (Ollama) embedding model and MongoDB store initialized.");

        // Load documents
        Document doc1 = loadDocumentFromResource("doc1.txt", new TextDocumentParser());
        Document doc2 = loadDocumentFromResource("doc2.txt", new TextDocumentParser());
        Document doc3 = loadDocumentFromResource("doc3.txt", new TextDocumentParser());
        Document doc4 = loadDocumentFromResource("doc4.txt", new TextDocumentParser());
        Document doc5 = loadDocumentFromResource("doc5.txt", new TextDocumentParser());

        List<Document> documents = List.of(doc1, doc2, doc3, doc4, doc5);

        System.out.println("Loaded " + documents.size() + " documents");

        DocumentSplitter splitter = DocumentSplitters.recursive(300, 30);

        EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
                .documentSplitter(splitter)
                .embeddingModel(embeddingModel)
                .embeddingStore(embeddingStore)
                .build();

        ingestor.ingest(documents);

        // Content Retriever
        ContentRetriever contentRetriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)
                .maxResults(3)
                .minScore(0.6)
                .build();

        // Assistant
        Assistant assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(chatModel)
                .contentRetriever(contentRetriever)
                .build();

        Scanner scanner = new Scanner(System.in);

        System.out.println("AI Assistant is ready. Ask your questions (type 'exit' to quit):");

        while (true) {
            System.out.print("You: ");
            String input = scanner.nextLine();
            if ("exit".equalsIgnoreCase(input)) {
                break;
            }

            String response = assistant.answer(input);
            System.out.println("Assistant: " + response);
        }

    }

    private static Document loadDocumentFromResource(String resourceName, DocumentParser parser) throws IOException {
        try (InputStream inputStream = getResourceAsStream(resourceName)) {
            Objects.requireNonNull(inputStream, "Resource not found: " + resourceName);
            return parser.parse(inputStream);
        }
    }

    protected static InputStream getResourceAsStream(String resourceName) {
        return ChatBotApp.class.getClassLoader().getResourceAsStream(resourceName);
    }

    private static EmbeddingStore<TextSegment> createEmbeddingStore(MongoClient mongoClient) {
        String databaseName = "rag_app";
        String collectionName = "documents";
        String indexName = "document";
        Long maxResultRatio = 10L;
        CreateCollectionOptions createCollectionOptions = new CreateCollectionOptions();
        Bson filter = null;
        Set<String> metadataFields = new HashSet<>();
        IndexMapping indexMapping = new IndexMapping(768, metadataFields);
        Boolean createIndex = true;

        return new MongoDbEmbeddingStore(
                mongoClient,
                databaseName,
                collectionName,
                indexName,
                maxResultRatio,
                createCollectionOptions,
                filter,
                indexMapping,
                createIndex
        );
    }
}

Compile and Run

Use Maven to compile and run the application:

mvn clean compile
mvn exec:java -Dexec.mainClass="com.jcg.langchain4j.chatbot.ChatBotApp"

You should now be able to interact with the chatbot, which retrieves context from MongoDB and generates responses using Ollama’s orca-mini model. When the chatbot is up and running, it will display:

AI Assistant is ready. Ask your questions (type 'exit' to quit):
You:

You can try asking questions based on the internal knowledge base, such as: “What should I do if the battery isn’t charging?”

Assistant Response:

[Screenshot: sample response from the assistant (Java + LangChain4j + MongoDB)]

This response is generated based on vector similarity matching from the ingested document content, which includes one or more of the five text files. It is a well-informed answer that directly references the information available within the internal knowledge base.

By inspecting the MongoDB data using MongoDB Compass, you can view the full document content along with the corresponding generated embeddings.

6. Conclusion

In this article, we explored how to build an AI-powered chatbot in Java using LangChain4j and MongoDB Atlas. We walked through setting up the embedding model with Ollama, ingesting documents into a vector store backed by MongoDB, and configuring a conversational assistant that retrieves relevant context from internal documents. By integrating vector-based search with a local LLM, the chatbot can provide accurate, context-aware responses strictly based on your own knowledge base.

7. Download the Source Code

This article explored building a RAG application with Java, LangChain4j, and MongoDB.

You can download the full source code of this example here: java langchain mongodb

Omozegie Aziegbe

Omos Aziegbe is a technical writer and web/application developer with a BSc in Computer Science and Software Engineering from the University of Bedfordshire. Specializing in Java enterprise applications with the Jakarta EE framework, Omos also works with HTML5, CSS, and JavaScript for web development. As a freelance web developer, Omos combines technical expertise with research and writing on topics such as software engineering, programming, web application development, computer science, and technology.