The Multi-Files QueryBot is a Python-based tool that lets users query multiple document types, including PDF, .docx, and .json files, in natural language. Users ask questions about the content of these documents, and the app returns context-aware answers grounded in that content.
It is designed to help users navigate large sets of documents and extract insights from them efficiently, and to encourage more precise, effective questions.
The application follows these steps to respond to your questions:
- File Loading: The app reads multiple documents and extracts their text content.
- Text Chunking: The extracted text is divided into smaller, manageable chunks for efficient processing.
- Language Model: The application employs a language model to create vector representations (embeddings) of the text chunks.
- Similarity Matching: When a question is asked, the app compares it with the embedded text chunks and selects those with the highest semantic similarity.
- Response Generation: The selected chunks are passed to the language model, which generates a response based on the relevant content from the documents.
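The chunking, matching, and prompt-building steps above can be sketched in plain Python. This is a minimal illustration, not the app's actual code: the function names (`chunk_text`, `embed`, `top_chunks`, `build_prompt`) and the chunk sizes are hypothetical, and the bag-of-words `embed` is a deliberately simple stand-in for the embeddings a real language model would produce.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (sizes are assumptions)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real app would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, selected: list[str]) -> str:
    """Assemble the text that would be sent to the language model."""
    context = "\n---\n".join(selected)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In use, the text extracted from each file would go through `chunk_text`, the question and chunks through `top_chunks`, and the result of `build_prompt` to whichever language model the app is configured with. Overlapping chunks are a common choice because they reduce the chance that a relevant sentence is split across a chunk boundary.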
To install and run the Multi-Files QueryBot, follow these steps:
| git clone https://github.com/Bhavik-Jikadara/multiple-pdfs-querybot.git
| cd multiple-pdfs-querybot/
| pip install virtualenv
| virtualenv venv
| source venv/Scripts/activate  # Windows (Git Bash); on Linux/macOS use: source venv/bin/activate
| pip install -r requirements.txt
| streamlit run app.py
The Multi-Files QueryBot is released under the Apache License 2.0.