Merge pull request #2 from waveupHQ/feat-workers

Implement SAAsWorkers for parallel task processing
waveupHQ · Jul 5, 2024 · 51f626b
2 parents 54590f1 + 2947dc3
commit 51f626b
Showing 14 changed files with 553 additions and 228 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -5,7 +5,27 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [Unreleased]
+## [0.1.2] - 2024-07-05
+
+### Added
+
+- Implemented SAAsWorkers class for parallel task processing
+- Added asynchronous execution capabilities
+- Integrated SAAsWorkers with the existing Orchestrator class
+- Implemented structured output for task planning using Pydantic models
+- Added new tests for SAAsWorkers and updated existing tests
+- Improved error handling and logging throughout the project
+
+### Changed
+
+- Refactored Orchestrator to use SAAsWorkers for task planning and execution
+- Updated main assistant to generate JSON responses for task planning
+- Modified configuration to include settings for SAAsWorkers
+
+### Fixed
+
+- Resolved issues with JSON parsing in task planning
+- Improved error handling in worker task execution
 
 ## [0.1.1] - 2024-06-30
 

diff --git a/README.md b/README.md
@@ -8,10 +8,6 @@
   - [Features](#features)
   - [Project Structure](#project-structure)
   - [Module Descriptions](#module-descriptions)
-    - [1. src/assistants.py](#1-srcassistantspy)
-    - [2. src/config.py](#2-srcconfigpy)
-    - [3. src/main.py](#3-srcmainpy)
-    - [4. src/orchestrator.py](#4-srcorchestratorpy)
   - [Dependencies](#dependencies)
   - [Setup and Installation](#setup-and-installation)
   - [Configuration](#configuration)
@@ -25,12 +21,13 @@
 
 ## Introduction
 
-The Smart Autonomous Assistants (SAAs) project is a sophisticated AI-driven system designed to orchestrate multiple AI assistants to accomplish complex tasks. By leveraging the power of large language models and a modular architecture, this system can break down objectives, execute sub-tasks, and refine results to produce coherent outputs.
+The Smart Autonomous Assistants (SAAs) project is a sophisticated AI-driven system designed to orchestrate multiple AI assistants to accomplish complex tasks. By leveraging the power of large language models and a modular architecture, this system can break down objectives, execute sub-tasks in parallel, and refine results to produce coherent outputs.
 
 ## Features
 
 - Multi-assistant orchestration for complex task completion
 - Support for multiple LLM providers (Claude, GPT, Gemini)
+- Parallel processing of subtasks using SAAsWorkers
 - Modular architecture for easy extension and customization
 - Automated task breakdown and execution
 - Integration with external tools (e.g., TavilyTools)
@@ -47,124 +44,72 @@ smart-autonomous-assistants/
 │   ├── assistants.py
 │   ├── config.py
 │   ├── main.py
-│   └── orchestrator.py
+│   ├── orchestrator.py
+│   ├── workers.py
+│   └── utils/
+│       ├── exceptions.py
+│       └── logging.py
 ├── tests/
 │   ├── __init__.py
-│   └── test_orchestrator.py
+│   ├── test_orchestrator.py
+│   └── test_workers.py
 ├── .github/
 │   └── workflows/
 │       └── ci.yml
 ├── output/
 ├── README.md
+├── CHANGELOG.md
 ├── setup.py
 ├── pyproject.toml
 ├── requirements.txt
+├── requirements-dev.txt
 ├── .gitignore
 └── .env
 ```
 
 ## Module Descriptions
 
-### 1. src/assistants.py
-
-- Implements dynamic assistant creation supporting multiple LLM providers
-- Manages file operations and tool integration
-
-### 2. src/config.py
-
-- Handles configuration settings and environment variables
-- Implements API key management and validation
-
-### 3. src/main.py
-
-- Provides the command-line interface using Typer
-- Handles workflow initialization and error reporting
-
-### 4. src/orchestrator.py
-
-- Implements the core workflow management logic
-- Coordinates interactions between assistants and manages the overall process
+1. **src/assistants.py**: Implements dynamic assistant creation supporting multiple LLM providers and manages file operations and tool integration.
+2. **src/config.py**: Handles configuration settings, environment variables, and API key management.
+3. **src/main.py**: Provides the command-line interface using Typer for running workflows.
+4. **src/orchestrator.py**: Implements the core workflow management logic and coordinates interactions between assistants.
+5. **src/workers.py**: Implements the SAAsWorkers class for parallel task processing and planning.
 
 ## Dependencies
 
-| Dependency    | Version | Purpose                                     |
-| ------------- | ------- | ------------------------------------------- |
-| phidata       | 2.4.22  | Provides the base Assistant class and tools |
-| pydantic      | 2.7.4   | Data validation and settings management     |
-| python-dotenv | 1.0.1   | Loads environment variables from .env file  |
-| typer         | 0.12.3  | Creates CLI interfaces                      |
-| rich          | 13.7.1  | Enhanced terminal output                    |
-
-## Setup and Installation
+Main dependencies include:
 
-1. Clone the repository:
+- phidata==2.4.22
+- pydantic==2.7.4
+- python-dotenv==1.0.1
+- typer==0.12.3
+- rich==13.7.1
 
-   ```
-   git clone https://github.com/waveuphq/smart-autonomous-assistants.git
-   cd smart-autonomous-assistants
-   ```
+For a full list of dependencies, see `requirements.txt` and `requirements-dev.txt`.
 
-2. Create and activate a virtual environment:
-
-   ```
-   python -m venv venv
-   source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
-   ```
-
-3. Install dependencies:
-
-   ```
-   pip install -r requirements.txt
-   ```
+## Setup and Installation
 
-4. Create a `.env` file in the project root and add your API keys and VertexAI settings:
-   ```
-   ANTHROPIC_API_KEY=your_anthropic_api_key
-   OPENAI_API_KEY=your_openai_api_key
-   GOOGLE_API_KEY=your_google_api_key
-   TAVILY_API_KEY=your_tavily_api_key
-   VertexAI_Project_Name=your_vertexai_project_id
-   VertexAI_Location=your_vertexai_location
-   ```
+1. Clone the repository
+2. Create and activate a virtual environment
+3. Install dependencies: `pip install -r requirements.txt`
+4. Create a `.env` file with your API keys and VertexAI settings
 
 ## Configuration
 
-The project supports multiple LLM providers. Update the `MAIN_ASSISTANT`, `SUB_ASSISTANT`, and `REFINER_ASSISTANT` settings in `src/config.py` to use the desired models.
+Update the `settings` in `src/config.py` to configure LLM models and other parameters.
 
 ## Usage
 
-To run a workflow, use the following command:
+Run a workflow using:
 
 ```
 python -m src.main run-workflow "Your objective here"
 ```
 
-Example:
-
-```
-python -m src.main run-workflow "Create a python script to copy all .py files content and exclude files and folder excluded in the .gitignore uses Typer commands"
-```
-
 ## Development Setup
 
-To set up the development environment:
-
-1. Clone the repository and navigate to the project directory.
-2. Create and activate a virtual environment:
-   ```
-   python -m venv venv
-   source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
-   ```
-3. Install development dependencies:
-   ```
-   pip install -r requirements-dev.txt
-   ```
-4. Install pre-commit hooks:
-   ```
-   pre-commit install
-   ```
-
-This will set up your environment with all necessary development tools, including testing, linting, and formatting utilities.
+1. Install development dependencies: `pip install -r requirements-dev.txt`
+2. Install pre-commit hooks: `pre-commit install`
 
 ## Testing
 
@@ -176,7 +121,7 @@ pytest
 
 ## Continuous Integration
 
-The project uses GitHub Actions for CI/CD. The pipeline runs tests, checks code formatting, and verifies import sorting on each push and pull request to the main branch.
+The project uses GitHub Actions for CI/CD, running tests and checks on each push and pull request.
 
 ## System Architecture
 
@@ -202,8 +147,8 @@ This architecture allows for a flexible and extensible system that can handle co
 
 ## Contributing
 
-Contributions are welcome! Please feel free to submit a Pull Request.
+Contributions are welcome! Please read the CONTRIBUTING.md file for guidelines.
 
 ## License
 
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+This project is licensed under the MIT License - see the LICENSE file for details.
diff --git a/ROADMAP.md b/ROADMAP.md
@@ -0,0 +1,55 @@
+# Updated 7-Day SAA Implementation Roadmap
+
+## Day 1: SAAs Workers Integration
+
+- [x] Implement SAAsWorkers class in a new file `src/workers.py`
+- [x] Add asynchronous execution capabilities for parallel processing
+- [x] Integrate SAAsWorkers with the existing Orchestrator class
+
+## Day 2: Enhance Main Assistant and Prompts
+
+- [ ] Improve the `_generate_main_prompt` method in `src/orchestrator.py` for better task breakdown
+- [ ] Implement logic for the MAIN_ASSISTANT to handle tasks without subtask decomposition
+- [ ] Enhance prompt writing capabilities for SUB_ASSISTANT tasks
+
+## Day 3: Implement Parallel Research Use Case
+
+- [ ] Develop a parallel research system in `src/use_cases/research.py`
+- [ ] Integrate the research use case with SAAsWorkers
+- [ ] Add necessary tools for web scraping and data processing
+
+## Day 4: CLI Enhancements and Basic UI
+
+- [ ] Extend the CLI in `src/main.py` to support new SAAsWorkers functionality
+- [ ] Implement a basic Streamlit UI for interaction (instead of SvelteKit, due to time constraints)
+- [ ] Create commands for the research use case
+
+## Day 5: Testing and Error Handling
+
+- [ ] Update existing tests in `tests/test_orchestrator.py` for new functionality
+- [ ] Add new tests for SAAsWorkers and the research use case
+- [ ] Enhance error handling and logging throughout the project
+
+## Day 6: Documentation and Examples
+
+- [ ] Update the README.md with new features and usage instructions
+- [ ] Create example scripts for the research use case
+- [ ] Document the SAAsWorkers implementation and integration
+
+## Day 7: Optimization and Final Testing
+
+- [ ] Optimize parallel execution in SAAsWorkers
+- [ ] Conduct end-to-end testing of the entire system
+- [ ] Address any remaining bugs or issues
+- [ ] Prepare for deployment (if applicable)
+
+# Backlog (Future Development)
+
+1. Implement additional use cases (e.g., content creation, autocomplete)
+2. Develop a plugin system for easy integration of new tools
+3. Enhance the configuration management system
+4. Implement a full-featured FastAPI-based API
+5. Develop strategies for handling larger workloads and scaling
+6. Integrate additional Phidata tools and features
+7. Implement long-term memory and knowledge base systems
+8. Create a more advanced UI with data visualization capabilities
diff --git a/requirements-dev.txt b/requirements-dev.txt
@@ -4,12 +4,13 @@
 # Testing
 pytest==8.2.2
 pytest-mock==3.14.0
+pytest-asyncio
 
 # Type checking
-mypy==1.7.1  
+mypy==1.7.1
 
 # Debugging
-ipdb==0.13.13  
+ipdb==0.13.13
 
 # Security
 bandit==1.7.5
@@ -27,4 +28,4 @@ python-dotenv==1.0.1  # Already in requirements.txt, but included here for compl
 typer[all]==0.12.3  # Already in requirements.txt, but included here for completeness
 # Code Formatting
 black==24.4.2
-isort==5.12.0
+isort==5.12.0
diff --git a/requirements.txt b/requirements.txt
@@ -25,7 +25,7 @@ numpy==2.0.0
 python-multipart==0.0.9
 
 # Async support
-anyio==4.4.0
+asyncio
 
 # CLI enhancements
 click==8.1.7
@@ -34,8 +34,8 @@ shellingham==1.5.4
 # Time handling
 python-dateutil==2.9.0.post0
 
-# pgvector 
-# pypdf 
-# psycopg2-binary 
-# sqlalchemy 
-# fastapi
+# pgvector
+# pypdf
+# psycopg2-binary
+# sqlalchemy
+# fastapi
diff --git a/src/__init__.py b/src/__init__.py
@@ -1,14 +1,16 @@
-from .assistants import get_full_response, main_assistant, refiner_assistant, sub_assistant
+from .assistants import create_assistant, get_full_response
 from .config import settings
 from .orchestrator import Orchestrator, Task, TaskExchange
+from .workers import PlanResponse, SAAsWorkers, WorkerTask
 
 __all__ = [
     "get_full_response",
-    "main_assistant",
-    "refiner_assistant",
-    "sub_assistant",
+    "create_assistant",
     "settings",
     "Orchestrator",
     "Task",
     "TaskExchange",
+    "SAAsWorkers",
+    "WorkerTask",
+    "PlanResponse",
 ]
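
The new exports `WorkerTask` and `PlanResponse` are not shown in this part of the diff. The changelog says task planning uses structured output with Pydantic models and that JSON parsing issues were fixed, so the following is only a rough sketch of what validating a JSON plan could look like. It uses standard-library dataclasses as a stand-in for Pydantic, and the field names (`task_id`, `prompt`, `objective_completion`) and the `parse_plan` helper are assumptions for illustration, not the repository's actual definitions:

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorkerTask:
    # Hypothetical fields; the real model in src/workers.py may differ.
    task_id: str
    prompt: str


@dataclass
class PlanResponse:
    # Hypothetical fields; the real model in src/workers.py may differ.
    objective_completion: bool
    tasks: List[WorkerTask] = field(default_factory=list)


def parse_plan(raw: str) -> PlanResponse:
    """Parse a JSON plan string, raising ValueError on malformed input."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Plan is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Plan must be a JSON object")
    try:
        tasks = [
            WorkerTask(task_id=str(t["task_id"]), prompt=str(t["prompt"]))
            for t in data.get("tasks", [])
        ]
    except (KeyError, TypeError) as exc:
        raise ValueError(f"Malformed task entry: {exc!r}") from exc
    return PlanResponse(bool(data.get("objective_completion", False)), tasks)
```

Raising a single exception type for every malformed-plan case keeps the caller's error handling simple; a Pydantic model would give the same guarantee through its own validation error.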
diff --git a/src/assistants.py b/src/assistants.py
@@ -74,6 +74,7 @@ def create_assistant(name: str, model: str):
                 read_file,
                 list_files,
             ],
+            debug_mode=True,
         )
     except Exception as e:
         logger.error(f"Error creating assistant {name} with model {model}: {str(e)}")

diff --git a/src/config.py b/src/config.py
@@ -38,17 +38,20 @@ def tavily_api_key(self) -> str:
     OPENAI_API_KEY: Optional[str] = os.getenv("OPENAI_API_KEY")
 
     # Assistant settings
-    MAIN_ASSISTANT: str = "claude-3-5-sonnet-20240620"
-    SUB_ASSISTANT: str = "gpt-3.5-turbo"
+    MAIN_ASSISTANT: str = "claude-3-sonnet-20240229"
+    SUB_ASSISTANT: str = "claude-3-haiku-20240307"
     REFINER_ASSISTANT: str = "gemini-1.5-pro-preview-0409"
 
     # Fallback models
-    FALLBACK_MODEL_1: str = "claude-3-sonnet-20240229"
+    FALLBACK_MODEL_1: str = "gpt-3.5-turbo"
     FALLBACK_MODEL_2: str = "gpt-3.5-turbo"
 
     # Tools
     TAVILY_API_KEY: Optional[str] = os.getenv("TAVILY_API_KEY")
 
+    # New setting for SAAsWorkers
+    NUM_WORKERS: int = 3
+
     class Config:
         env_file = ".env"
         extra = "ignore"  # This will ignore any extra fields in the environment
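
The diff above adds `NUM_WORKERS` but does not show how `SAAsWorkers` consumes it. One common way to honor such a setting is to cap the number of in-flight coroutines with `asyncio.Semaphore`. The sketch below assumes that approach; `run_bounded` and `fake_worker` are hypothetical names, and the actual implementation in `src/workers.py` may work differently:

```python
import asyncio
from typing import Awaitable, Callable, List

NUM_WORKERS = 3  # mirrors the new setting in src/config.py


async def run_bounded(
    prompts: List[str],
    handler: Callable[[str], Awaitable[str]],
    num_workers: int = NUM_WORKERS,
) -> List[str]:
    """Run handler over prompts with at most num_workers calls in flight."""
    semaphore = asyncio.Semaphore(num_workers)

    async def guarded(prompt: str) -> str:
        # Each call waits here until one of the num_workers slots is free.
        async with semaphore:
            return await handler(prompt)

    # gather returns results in the same order as the inputs.
    return list(await asyncio.gather(*(guarded(p) for p in prompts)))


async def fake_worker(prompt: str) -> str:
    # Stand-in for a real assistant call.
    await asyncio.sleep(0.01)
    return prompt.upper()


if __name__ == "__main__":
    print(asyncio.run(run_bounded(["a", "b", "c", "d"], fake_worker)))
```

With four prompts and three workers, the fourth call starts only after one of the first three finishes, while the result order still matches the input order.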