This repository contains three Streamlit applications that use Google's Gemini AI models for document and image processing: PDF chat, invoice extraction, and general image analysis.
chatWithPdf.py is a Streamlit application that lets users ask questions about PDF documents using Gemini AI.
Features:
- Upload multiple PDF documents
- Extract and process text from PDFs
- Create vector embeddings for efficient text search
- Ask questions about the PDF content
- Get detailed AI-powered responses
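The retrieval step behind the features above can be sketched as follows. This is a minimal illustration and not the code in chatWithPdf.py: the names chunk_text, toy_embed, and top_chunks are hypothetical, and a simple word-count vector stands in for real model embeddings so the sketch runs without an API key.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model (assumption): lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]
```

In a real pipeline the selected chunks would be passed to the model together with the question; swapping toy_embed for an actual embedding call leaves the ranking logic unchanged.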
InvoiceExtractor.py is a multi-language invoice processing application that uses Gemini's vision capabilities.
Features:
- Upload invoice images (JPG, JPEG, PNG)
- Extract invoice information using AI
- Support for multiple languages
- Interactive Q&A about invoice contents
vision.py is a general-purpose image analysis application powered by Gemini AI.
Features:
- Upload images (JPG, JPEG, PNG)
- Add custom prompts for specific analysis
- Get AI-generated descriptions and insights
- Interactive user interface
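Both image applications accept the same file types and an optional prompt. The sketch below shows one way the upload could be validated and paired with a prompt before it is sent to the model. It is an assumption-laden illustration: build_image_request, ALLOWED_TYPES, DEFAULT_PROMPT, and the shape of the returned dictionary are hypothetical and do not reflect the repository's code or the Gemini API's request format.

```python
from pathlib import Path

# Accepted upload types, taken from the feature lists above.
ALLOWED_TYPES = {".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".png": "image/png"}

# Hypothetical fallback used when the user leaves the prompt empty.
DEFAULT_PROMPT = "Describe this image."


def build_image_request(filename: str, data: bytes, prompt: str = "") -> dict:
    """Validate the file extension and bundle the image bytes with a prompt."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_TYPES:
        raise ValueError(f"Unsupported file type: {suffix or 'none'}")
    return {
        "prompt": prompt.strip() or DEFAULT_PROMPT,
        "image": {"mime_type": ALLOWED_TYPES[suffix], "data": data},
    }
```

Rejecting unsupported extensions before any API call keeps failures cheap and gives the user a clear message.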
- Clone the repository
git clone https://github.com/aqib0770/Gemini-Projects.git
cd Gemini-Projects
- Install required dependencies
pip install -r requirements.txt
- Obtain API Key
  - Sign up for a Gemini API key at Google AI Studio
  - Enable the Gemini API in your project
- Environment Setup
  - Create a .env file in the root directory
  - Add your Gemini API key:
GEMINI_API_KEY=your_api_key_here
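For reference, here is a minimal sketch of how an application could read this key at startup using only the standard library. The helper names load_env_file and get_gemini_api_key are hypothetical; the applications in this repository may load the key differently (for example through a dotenv library).

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_gemini_api_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("GEMINI_API_KEY") or load_env_file(path).get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to .env or the environment")
    return key
```

Failing early with a clear error avoids a confusing authentication failure later, when the first request is made.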
streamlit run chatWithPdf.py
- Upload PDF files using the sidebar
- Click "Submit and process" to analyze the documents
- Ask questions in the text area
- Get AI-powered responses based on the PDF content
streamlit run InvoiceExtractor.py
- Upload an invoice image
- (Optional) Add specific prompts
- Click "Tell me about the invoice" to get analysis
streamlit run vision.py
- Upload any image
- (Optional) Add custom prompts
- Click "Tell me about the image" for AI analysis
- Never commit your .env file
- Keep your API keys secure
- Use appropriate file permissions, so that only your own user account can read the .env file
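The points above can be checked with a short script. The following is a sketch under stated assumptions: lock_down_env and is_ignored are hypothetical helpers, the permission change is meaningful on POSIX systems only, and the .gitignore check is a naive exact-line match rather than full gitignore pattern matching.

```python
import os
import stat
from pathlib import Path


def lock_down_env(path: str = ".env") -> None:
    """Restrict the file to owner read/write (mode 0o600) on POSIX systems."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def is_ignored(gitignore: str = ".gitignore", entry: str = ".env") -> bool:
    """Naive check: does .gitignore contain the entry as an exact line?"""
    p = Path(gitignore)
    if not p.exists():
        return False
    lines = {line.strip() for line in p.read_text(encoding="utf-8").splitlines()}
    return entry in lines
```

Running lock_down_env() once after creating the file, and confirming is_ignored() returns True before the first commit, covers the two most common ways a key leaks from a local checkout.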