PDF Analyzer

Overview

PDF Analyzer is a Streamlit web application that lets users upload PDF documents, extract their text, and ask questions about the content interactively. It answers each question using context retrieved from the uploaded PDF files.

Features

PDF Upload: Upload multiple PDF files for analysis.
Text Extraction: Automatically extracts text from PDF files using the PyPDF2 library.
Natural Language Processing: Uses Google Generative AI to answer questions about the extracted text.
Vector Store: Utilizes FAISS for efficient similarity searches against the extracted text chunks.
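
The extraction step that feeds the vector store can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the function names `join_pages`, `extract_text`, and `chunk_text`, and the `chunk_size` and `overlap` defaults, are assumptions made for this example. Only `PdfReader`, `reader.pages`, and `page.extract_text()` come from PyPDF2 itself.

```python
from typing import Iterable, List, Optional


def join_pages(page_texts: Iterable[Optional[str]]) -> str:
    """Join per-page text, treating pages with no extractable text as empty."""
    return "\n".join(text or "" for text in page_texts)


def extract_text(pdf_paths: Iterable[str]) -> str:
    """Concatenate the text of every page of every PDF in pdf_paths."""
    # Imported lazily so the pure helpers in this sketch work without PyPDF2.
    from PyPDF2 import PdfReader

    documents = []
    for path in pdf_paths:
        reader = PdfReader(path)
        documents.append(join_pages(page.extract_text() for page in reader.pages))
    return "\n".join(documents)


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character chunks that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]
```

Overlapping chunks are a common choice because a sentence that straddles a chunk boundary still appears whole in at least one chunk. The application itself may split text differently.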

Technologies Used

Streamlit: For building the web application.
PyPDF2: For reading and extracting text from PDF files.
LangChain: For creating embeddings and handling question-answering chains.
Google Generative AI: For generating answers to user questions.
FAISS: For efficient similarity search and retrieval of text embeddings.
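
To show what similarity search over text chunks means in practice, here is a toy sketch using only the Python standard library. It is deliberately not FAISS and not a real embedding model: `embed` is a hypothetical word-count stand-in for the embeddings that LangChain would produce, and `top_k` is a brute-force scan where FAISS would use an index. All names here are assumptions for illustration.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the question, best first."""
    q = embed(question)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


if __name__ == "__main__":
    chunks = [
        "the invoice total is 42 dollars",
        "shipping takes five days",
        "contact support by email",
    ]
    for score, chunk in top_k("what is the invoice total", chunks, k=1):
        print(f"{score:.2f}  {chunk}")
```

In the real pipeline, the retrieved chunks would be passed to the language model as context for the answer.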

Installation

To run the application locally, follow these steps:

Clone the repository:

```
git clone https://github.com/keshav-kh/PDF-Analyzer.git
cd PDF-Analyzer
```

Create a virtual environment (optional but recommended):

```
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

Install the required packages:
```
pip install -r requirements.txt
```
Set up your Google API key. Create a .env file in the project root and add your API key:
```
GOOGLE_API_KEY=your_api_key_here
```
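
A quick way to confirm the key is visible before starting the app is a small check like the one below. This is a sketch under assumptions: `parse_env_file` and `get_google_api_key` are hypothetical helpers written for this example, and the parser handles only simple `KEY=value` lines. A project like this would more typically load the file with a package such as python-dotenv.

```python
import os
from pathlib import Path
from typing import Dict


def parse_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_google_api_key(path: str = ".env") -> str:
    """Return GOOGLE_API_KEY from the environment, falling back to the .env file."""
    key = os.environ.get("GOOGLE_API_KEY") or parse_env_file(path).get("GOOGLE_API_KEY")
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to .env or export it")
    return key
```

Failing early with a clear message is easier to debug than an authentication error raised later, in the middle of a question.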

Usage

Run the application:
```
streamlit run app.py
```
Open your web browser and navigate to http://localhost:8501.
Upload one or more PDF files and use the input field to ask questions about the content.

Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue for any improvements or bugs you find.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Acknowledgments

Thanks to the developers of Streamlit, LangChain, and Google Generative AI for providing the tools that made this project possible.
Inspiration for the project came from the need to analyze and extract information from PDF documents easily.
