Retrieval-Augmented Generation (RAG) with MongoDB and Spring AI: Bringing AI to Your Java Applications
Welcome to the RAG App GitHub repository! This project demonstrates how to build a Retrieval-Augmented Generation (RAG) system using Spring Boot, MongoDB Atlas, and OpenAI. With RAG, you can use your own data to supplement the responses generated by a large language model (LLM), ensuring more accurate, relevant, and up-to-date answers.
Retrieval-Augmented Generation (RAG) is a technique that combines vector search and large language models (LLMs) to generate context-aware answers based on proprietary or external data that was not part of the model’s initial training. RAG consists of three main components:
- Pre-trained LLM: We use OpenAI's GPT model to generate responses.
- Vector search: We retrieve relevant documents from a MongoDB database.
- Vector embeddings: Numerical representations of your data that capture semantic meaning.
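To make the retrieval piece concrete: vector search compares the embedding of a question against the embeddings of the stored documents and keeps the closest matches. The toy snippet below (plain Java with hard-coded vectors, not code from this repository) shows the cosine-similarity comparison this relies on:

```java
public class CosineSimilarityDemo {

    // Cosine similarity: 1.0 means same direction (very similar), 0 means unrelated.
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Tiny made-up "embeddings"; real ones have hundreds of dimensions.
        float[] question = {0.9f, 0.1f, 0.0f};
        float[] docAboutMongoDB = {0.8f, 0.2f, 0.1f};
        float[] docAboutCooking = {0.0f, 0.1f, 0.9f};

        System.out.printf("MongoDB doc score: %.3f%n", cosine(question, docAboutMongoDB));
        System.out.printf("Cooking doc score: %.3f%n", cosine(question, docAboutCooking));
        // The higher-scoring document is the one handed to the LLM as context.
    }
}
```

In the real application, the embeddings come from OpenAI and the similarity comparison happens inside MongoDB Atlas Vector Search rather than in application code.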
Key features of this project:

- Integrates with MongoDB Atlas for vector search capabilities.
- Uses OpenAI for embeddings and generating smart, context-driven answers.
- Implements a simple vector store and document embedding service.
- Includes REST endpoints for document loading and AI-powered question-answering.
To run this project, you need:
- Java 21 or higher.
- Maven (for dependency management).
- MongoDB Atlas (M10+ cluster required for vector search).
- OpenAI API key.
Make sure you have these tools installed and accounts configured before proceeding.
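For reference, the Spring AI functionality comes from two starters. The artifact names below are an assumption based on Spring AI 1.0.x naming (earlier milestones used `spring-ai-openai-spring-boot-starter` and `spring-ai-mongodb-atlas-store-spring-boot-starter`), so check the project's `pom.xml` for the exact names and version it uses:

```xml
<!-- OpenAI chat + embedding models (artifact name assumes Spring AI 1.0.x) -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
</dependency>
<!-- MongoDB Atlas vector store integration -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-mongodb-atlas</artifactId>
</dependency>
```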
Clone the repository and move into the project directory:

```bash
git clone https://github.com/timotheekelly/RagApp.git
cd RagApp
```
Edit the `application.properties` file with your MongoDB URI and OpenAI API key:
```properties
spring.application.name=RagApp
spring.ai.openai.api-key=<Your-API-Key>
spring.ai.openai.chat.options.model=gpt-4o
spring.ai.vectorstore.mongodb.initialize-schema=true
spring.data.mongodb.uri=<Your-Connection-URI>
spring.data.mongodb.database=rag
```
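The connection URI is the standard Atlas SRV string (the placeholders below are illustrative):

```properties
spring.data.mongodb.uri=mongodb+srv://<username>:<password>@<cluster-name>.mongodb.net/?retryWrites=true&w=majority
```

With `spring.ai.vectorstore.mongodb.initialize-schema=true`, Spring AI creates the required Atlas Vector Search index on startup, so you don't have to define it by hand in the Atlas UI.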
Use Maven to build the project:

```bash
mvn clean install
```
Start the Spring Boot application:

```bash
mvn spring-boot:run
```
Navigate to the following endpoint to load documents into the vector store:
```
http://localhost:8080/api/docs/load
```
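You can open this URL in a browser or call it from the command line, for example:

```bash
curl http://localhost:8080/api/docs/load
```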
You can ask questions by sending a request to this endpoint:
```
http://localhost:8080/question?message=Your question here
```
For example:
```
http://localhost:8080/question?message=How to analyze time-series data with Python and MongoDB? Explain the steps
```
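Outside a browser, the question needs to be URL-encoded; with curl, `--data-urlencode` takes care of that:

```bash
curl -G "http://localhost:8080/question" \
  --data-urlencode "message=How to analyze time-series data with Python and MongoDB? Explain the steps"
```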
The main components of the application are listed below, followed by a simplified sketch of how they fit together:

- `EmbeddingModel`: Configured to use OpenAI for generating document embeddings.
- `VectorStore`: Uses MongoDB Atlas to store and retrieve vectors for similarity search.
- `DocsLoaderService`: Reads and embeds documents, then stores them in the MongoDB vector store.
- `RagController`: Handles question-answering requests by performing vector searches and passing the retrieved context to the LLM.
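As a rough illustration of the question-answering flow (a simplified sketch, not the repository's exact code; the `ChatClient` fluent API and the `Document` accessor naming differ between Spring AI releases):

```java
import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class RagController {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public RagController(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        this.chatClient = chatClientBuilder.build();
        this.vectorStore = vectorStore;
    }

    @GetMapping("/question")
    public String question(@RequestParam String message) {
        // 1. Retrieve the documents most similar to the question from MongoDB Atlas.
        List<Document> docs = vectorStore.similaritySearch(message);

        // 2. Join the retrieved text into a single context string.
        //    (getText() is the Spring AI 1.x accessor; older milestones used getContent().)
        String context = docs.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n"));

        // 3. Ask the LLM, injecting the retrieved context into the prompt.
        return chatClient.prompt()
                .user(u -> u.text("""
                        Answer the question using only the following context.

                        Context:
                        {context}

                        Question:
                        {question}
                        """)
                        .param("context", context)
                        .param("question", message))
                .call()
                .content();
    }
}
```

Spring Boot auto-configuration supplies the `ChatClient.Builder` and the MongoDB-backed `VectorStore` beans based on the starters and the properties shown earlier.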
This project demonstrates how to integrate a retrieval-augmented generation system with MongoDB Atlas and OpenAI to enhance Java applications. By combining vector search and generative AI, you can create smart, context-aware applications tailored to your own data.