Building AI Applications using frameworks with MariaDB Vector Store

Vector databases have become essential for building intelligent applications. MariaDB embraced this trend by introducing native vector store capabilities in MariaDB Community Server 11.7, now also part of the MariaDB Community Server 11.8 LTS release. While you can work with vectors using pure SQL, dedicated frameworks make the process far simpler, allowing you to focus on building your AI applications rather than dealing with complex vector operations.
The Power of Dedicated Frameworks
While MariaDB Foundation’s blog entry and this video show how to work with vectors using pure SQL, modern frameworks improve the development experience. These frameworks handle all the complexity of:
- Vector operations and similarity calculations
- Document chunking and processing
- Embedding generation and management
- Database connection handling
- Query optimization
This means you can focus on building your AI applications without worrying about the underlying vector operations. Let’s see how these frameworks make it incredibly easy to work with MariaDB vector stores.
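First, for contrast, here is a minimal sketch of what the pure-SQL approach looks like with MariaDB Connector/Python (MariaDB 11.7+). This is an illustration only: the table name, connection details, and the toy 4-dimensional vectors are placeholders, and in practice the embeddings would come from an embedding model.
import mariadb

conn = mariadb.connect(user="myuser", password="mypassword",
                       host="localhost", port=3306, database="vectordb")
cur = conn.cursor()
# A VECTOR column plus a vector index; VECTOR(4) keeps the example small,
# while real embeddings are typically 1536-dimensional or larger
cur.execute("""
    CREATE TABLE IF NOT EXISTS docs (
        id INT PRIMARY KEY AUTO_INCREMENT,
        content TEXT,
        embedding VECTOR(4) NOT NULL,
        VECTOR INDEX (embedding)
    )
""")
# You chunk documents, call an embedding model, and serialize vectors yourself
cur.execute(
    "INSERT INTO docs (content, embedding) VALUES (?, VEC_FromText(?))",
    ("some chunk of text", "[0.10, 0.72, 0.05, 0.33]"),
)
conn.commit()
# Similarity search is an ORDER BY on a distance function
cur.execute("""
    SELECT content, VEC_DISTANCE_COSINE(embedding, VEC_FromText(?)) AS distance
    FROM docs ORDER BY distance LIMIT 3
""", ("[0.09, 0.70, 0.06, 0.30]",))
for content, distance in cur:
    print(distance, content)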
Processing PDFs and Creating Embeddings
Thanks to dedicated frameworks, processing documents and creating embeddings is now a straightforward process. Here’s how simple it is in different languages:
Python Implementation
After installing dependencies:
pip install langchain-mariadb langchain_community pypdf pymysql langchain-text-splitters langchain-openai mariadb
export OPENAI_API_KEY=...
With LangChain, you can process documents and create embeddings in just a few lines of code:
from langchain_mariadb import MariaDBStore
from langchain_openai import OpenAIEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
# 1. Load and split the PDF - just two lines!
loader = PyPDFLoader("mariadb-documentation.pdf")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
texts = text_splitter.split_documents(documents)
# 2. Initialize embeddings and vector store - one line each!
vector_store = MariaDBStore(
    embeddings=OpenAIEmbeddings(),
    embedding_length=1536,
    datasource="mysql+mariadbconnector://myuser:mypassword@localhost:3306/vectordb",
    collection_name="mariadb_docs"
)
# 3. Add documents - just one line!
vector_store.add_documents(texts)
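Once the documents are added, a quick similarity query confirms the store is working. This uses the standard LangChain vector store API; the query text is just an example:
# Retrieve the three chunks closest to the query
results = vector_store.similarity_search("How do I create a table in MariaDB?", k=3)
for doc in results:
    print(doc.page_content[:100])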
Node.js Implementation
After installing dependencies:
npm install @langchain/community @langchain/core @langchain/openai @langchain/textsplitters mariadb pdf-parse
export OPENAI_API_KEY=...
LangChain.js makes it equally simple with a modern, async-first approach:
import { MariaDBStore } from "@langchain/community/vectorstores/mariadb";
import { OpenAIEmbeddings } from "@langchain/openai";
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import mariadb from "mariadb";
async function createData() {
  // 1. Load and split the PDF - just a few lines!
  const loader = new PDFLoader("C:/temp/mariadb_tutorial.pdf");
  const docs = await loader.load();
  const textSplitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000,
    chunkOverlap: 200
  });
  const splitDocs = await textSplitter.splitDocuments(docs);
  // 2. Initialize the vector store with a MariaDB connection pool
  const vectorStore = await MariaDBStore.initialize(
    new OpenAIEmbeddings(),
    {
      pool: mariadb.createPool("mariadb://myuser:mypassword@localhost:3306/mydb"),
      distanceStrategy: 'COSINE',
    }
  );
  // 3. Add documents - just one line!
  await vectorStore.addDocuments(splitDocs);
}
createData();
Java Implementation
Spring AI brings the power of vector stores to Java with familiar Spring patterns. Here's an example application.yml configuration file:
spring:
  application:
    name: mariadb-test
  datasource:
    url: jdbc:mariadb://localhost/spring-ai
    username: myuser
    password: mypassword
  ai:
    vectorstore:
      mariadb:
        initialize-schema: true
        distance-type: COSINE
        dimensions: 1536
    openai:
      api-key: ${OPEN_AI_KEY}
@Service
public class DocumentProcessingService {

    @Autowired
    private VectorStore vectorStore;

    // Just one method to handle everything!
    public void processPDF(String pdfPath) {
        var reader = new PagePdfDocumentReader(pdfPath);
        var splitter = new TokenTextSplitter();
        List<Document> documents = splitter.apply(reader.get());
        vectorStore.add(documents);
    }
}
Performing Similarity Search
The frameworks make similarity search just as simple. Here’s how easy it is to query your vector store:
Python Search Example
from langchain_mariadb import MariaDBStore
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain import hub
llm = ChatOpenAI(temperature=0)
vector_store = MariaDBStore(
    embeddings=OpenAIEmbeddings(),
    embedding_length=1536,
    datasource="mysql+mariadbconnector://myuser:mypassword@localhost:3306/vectordb",
    collection_name="mariadb_docs"
)
prompt = hub.pull("rlm/rag-prompt")
rag_chain = (
    {"context": vector_store.as_retriever(), "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
response = rag_chain.invoke("How do I create a new table in MariaDB?")
print(f"Response: {response}")
Node.js Search Example
import { MariaDBStore } from "@langchain/community/vectorstores/mariadb";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import { pull } from "langchain/hub";
import mariadb from "mariadb";

async function search() {
  const vectorStore = await MariaDBStore.initialize(
    new OpenAIEmbeddings(),
    {
      pool: mariadb.createPool("mariadb://myuser:mypassword@localhost:3306/mydb"),
      distanceStrategy: 'COSINE',
    }
  );
  const question = "How do I create a new table in MariaDB?";
  const llm = new ChatOpenAI({ temperature: 0 });
  // Retrieve the most similar chunks and join them into one context string
  const retrievedDocs = await vectorStore.similaritySearch(question);
  const docsContent = retrievedDocs.map((doc) => doc.pageContent).join("\n");
  // Build the RAG prompt and ask the model
  const promptTemplate = await pull("rlm/rag-prompt");
  const messages = await promptTemplate.invoke({
    question: question,
    context: docsContent,
  });
  const answer = await llm.invoke(messages);
  console.log(answer.content);
}
search();
Java Search Example
@Service
public class SearchService {

    @Autowired
    private VectorStore vectorStore;

    @Autowired
    private ChatModel chatModel;

    public String search(String query) {
        var chatClient = ChatClient.builder(chatModel)
                .defaultAdvisors(new QuestionAnswerAdvisor(vectorStore))
                .build();
        return chatClient
                .prompt()
                .user(query)
                .call()
                .content();
    }
}
Conclusion
These frameworks really make MariaDB's vector capabilities shine. They take care of the complicated vector operations behind the scenes, and here's the best part: there are no plugins to install, no extra connectors to set up, no additional APIs to learn, and no separate licensing headaches or documentation to wade through. By leveraging MariaDB's native vector storage and search functionality, developers can seamlessly integrate AI frameworks without the overhead of managing a separate vector database, all while keeping familiar SQL operations and an existing MariaDB setup.
Whether you’re working in Python, Node.js, or Java, these frameworks give you a straightforward way to work with vectors right alongside your regular data. You can spend your time actually building cool AI features instead of juggling multiple database systems or getting bogged down in vector operation details.
Resources
- Watch an Introduction to AI Powered Applications: LLMs, Vector Search and MariaDB
- Download MariaDB Community Server 11.8 LTS release for access to MariaDB’s vector search capabilities
- Vector search is also available to customers in MariaDB Enterprise Platform 2025 (MariaDB Enterprise Server 11.4)
- MariaDB Vector Documentation
- LangChain MariaDB integration (langchain-mariadb)
- Spring AI Documentation