Click the logo to download the latest APK of the app:
Description: Voice Genie is a Flutter-based mobile application that acts as a voice-powered AI assistant, designed for quick, intuitive, and hands-free interactions. Leveraging the power of Gemini API and Imagine API, this single-screen app can respond to both text-based and art/image-based prompts, making it a unique blend of conversational AI and visual creativity. Through its streamlined interface, users can quickly send queries and receive spoken or visual responses, enhanced by intuitive speech-to-text and text-to-speech capabilities.
Text Queries: Using Gemini API with the Gemini 1.5 Flash model, Voice Genie answers user questions or prompts with insightful responses related to organization, creativity, or general information.
Art/Image Prompts: For creative visual queries, Imagine API generates relevant images or artwork, offering a unique AI art experience.
Speech-to-Text: The app converts spoken queries into text, allowing users to interact hands-free.
Text-to-Speech: Users can listen to AI responses, making information accessible and providing a conversational experience.
Animated Text Display: Responses are displayed in rounded containers with animated text, enhancing readability and engagement.
Response Options: After receiving an AI-generated response, users can ask another question, listen to the AI response from start to finish, or clear previous interactions and reset the screen for new prompts.
Permission and Connectivity Messages: The app provides clear feedback for any permission issues (e.g., audio recording) or connectivity errors, helping users troubleshoot effortlessly.
To run this project locally:
- Clone the repository: git clone https://github.com/ArpitAswal/Voice-Genie.git
- Navigate to the project directory: cd Voice-Genie
- Install dependencies: flutter pub get
- Set up API keys (optional, depending on the external services used): Obtain API keys from Google AI Studio (for the Gemini API) and from Imagine API.
  Create a new file named lib/config.dart in the project directory.
  Add the following code, replacing 'YOUR_API_KEY' with your own API key:
  class Config { static const String apiKey = 'YOUR_API_KEY'; }
  Alternatively, use the key directly in the API networking calls.
- Run the app: flutter run
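As an alternative to hard-coding keys in lib/config.dart, the keys can be supplied at build time. The following is a minimal sketch, not the project's actual configuration: it assumes one key per service, and the variable names GEMINI_API_KEY and IMAGINE_API_KEY are hypothetical names chosen for this example.

```dart
// lib/config.dart (sketch): read keys from compile-time definitions
// instead of committing them to source control.
// GEMINI_API_KEY and IMAGINE_API_KEY are hypothetical names chosen for
// this example; they are not defined by the project.
class Config {
  static const String geminiApiKey =
      String.fromEnvironment('GEMINI_API_KEY', defaultValue: '');
  static const String imagineApiKey =
      String.fromEnvironment('IMAGINE_API_KEY', defaultValue: '');

  /// Names of the keys that were not supplied at build time.
  static List<String> get missingKeys => [
        if (geminiApiKey.isEmpty) 'GEMINI_API_KEY',
        if (imagineApiKey.isEmpty) 'IMAGINE_API_KEY',
      ];
}
```

With this layout, the keys would be passed on the command line, for example: flutter run --dart-define=GEMINI_API_KEY=<your key> --dart-define=IMAGINE_API_KEY=<your key>. Checking Config.missingKeys at startup lets the app report a missing key before making any network call.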
- Users must grant audio recording permission for the app to function, because it uses speech recognition to record the user's prompt.
- The Imagine API service is sometimes unavailable. For differently styled image responses, try a different style_id parameter; for more details, visit the Imagine API site.
Flutter: The primary framework for building the mobile application.
Dart: The programming language used with Flutter.
Imagine API: Generates AI-driven image responses based on user prompts.
Gemini API: Processes text-based queries and provides informative answers.
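To illustrate what a text query to the Gemini API involves, here is a minimal sketch that builds the REST request for the Gemini 1.5 Flash model and reads the text out of a decoded response. It does not send anything over the network. The helper names are illustrative assumptions; the app may instead use a client package or structure its calls differently.

```dart
import 'dart:convert';

// Sketch: build the REST request for a text prompt without sending it.
// The helper names below are illustrative and not taken from the app.

/// Endpoint for the generateContent method of the Gemini 1.5 Flash model.
Uri geminiEndpoint(String apiKey) => Uri.https(
      'generativelanguage.googleapis.com',
      '/v1beta/models/gemini-1.5-flash:generateContent',
      {'key': apiKey},
    );

/// JSON body carrying a single user prompt as one text part.
String geminiRequestBody(String prompt) => jsonEncode({
      'contents': [
        {
          'parts': [
            {'text': prompt},
          ],
        },
      ],
    });

/// Extracts the first candidate's text from a decoded response, or null
/// when the response carries no text (for example, a blocked prompt).
String? firstCandidateText(Map<String, dynamic> response) {
  final candidates = response['candidates'];
  if (candidates is! List || candidates.isEmpty) return null;
  final parts = candidates.first['content']?['parts'];
  if (parts is! List || parts.isEmpty) return null;
  return parts.first['text'] as String?;
}
```

The URI and body would be sent as a POST request with a Content-Type of application/json by whichever HTTP client the app uses. Returning null rather than throwing lets the caller show a friendly message when no answer comes back.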
Speech-To-Text: Ensuring that the app captures every word and sentence the user speaks and recognizes it reliably.
Text-To-Speech: Managing playback so that the app can read out either a single response or the full set of message responses.
Image-Based Query Analysis: A new feature will allow users to upload images to the app. Gemini AI will then analyze the uploaded image, describing its contents to provide deeper insights or contextual explanations about the image.
Enhanced AI Art Capabilities: Future updates will improve the app’s art-based prompts with more creative or style-based responses to user queries.
Contributions are always welcome!
Please follow these steps:
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Make your changes and commit them (git commit -m 'Add new feature').
- Push the changes to your fork (git push origin feature-branch).
- Create a pull request.
Starting the App: Voice Genie opens on a single main screen where users can immediately interact with the AI by pressing the microphone button.
Providing a Query: Users can speak their questions or prompts. The app detects the type of request:
Text-Based: The app processes queries with Gemini AI to provide textual answers.
Art/Image-Based: Imagine API is used to generate visual answers.
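The README does not say how the request type is detected. One simple way to route a transcribed prompt is a keyword check, sketched below. The keyword list and the names PromptKind and classifyPrompt are hypothetical and for illustration only; the app's actual detection logic may differ.

```dart
// Sketch: route a transcribed prompt to the text or image service.
// The keyword list is a hypothetical heuristic for illustration only;
// the app's actual detection logic may differ.
enum PromptKind { text, image }

const List<String> _imageKeywords = [
  'draw', 'paint', 'sketch', 'image', 'picture', 'artwork', 'illustration',
];

PromptKind classifyPrompt(String prompt) {
  // Split on anything that is not a letter so that "Draw, please" still
  // yields the whole word "draw", while "drawer" does not match it.
  final words = prompt.toLowerCase().split(RegExp(r'[^a-z]+')).toSet();
  return _imageKeywords.any(words.contains)
      ? PromptKind.image
      : PromptKind.text;
}
```

Matching whole words rather than substrings avoids sending a question such as "drawer organization tips" to the image service. A keyword heuristic will still misroute some prompts, so a production version might ask the language model itself to classify the request.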
Displaying Results: The response, either text or image, appears in an animated container. Users can then ask a new question, listen to the response through text-to-speech, or refresh the app to reset it for new queries.
Handling Errors: If permissions are missing or connectivity fails, the app displays clear, specific messages to guide users in troubleshooting.
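One way to keep such messages specific is to map each failure category to its own text in a single place. This is a rough sketch only: the enum values and the message wording are hypothetical and are not taken from the app.

```dart
// Sketch: map failure categories to user-facing messages.
// The enum values and message wording are hypothetical, not the app's own.
enum AppFailure { microphoneDenied, noConnection, imageServiceUnavailable }

String failureMessage(AppFailure failure) {
  switch (failure) {
    case AppFailure.microphoneDenied:
      return 'Microphone permission is required. Enable it in the system settings.';
    case AppFailure.noConnection:
      return 'No internet connection. Check your network and try again.';
    case AppFailure.imageServiceUnavailable:
      return 'The image service is unavailable right now. Please try again later.';
  }
}
```

Because the switch covers every enum value, adding a new failure category later produces a compile-time error until a message is written for it.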
If you have any feedback, please reach out to me at arpitaswal995@gmail.com.
If you run into a problem, please open an issue in the GitHub repository.
Voice Genie is a sophisticated AI assistant designed to be an intuitive, accessible way for users to explore information and art with minimal effort. With future updates, it aims to become an even more interactive and personalized companion.