Agent-panel is a control panel and observability platform for optimizing and monitoring LLM/AI agents. The current version focuses on the AI gateway, providing effortless access to 100+ LLMs across 20+ AI platforms. The project is written in Rust.
- Background
- Supported LLMs
- Installation, Configure and Run 3.1 Build 3.2 Configure 3.3 Run 3.4 Develop
- Example Configuration
- Multi-Platform Configuration
- Next steps
- Thank you
- Get involved
At its core, Agent Panel addresses the complex challenges faced in managing multi-agent systems, particularly those involving numerous function-calling steps and multiple invocations.The project was born out of the difficulties faced while building and optimizing the Incredible AI agent. It highlighted the need for detailed observability of the agent's interactions, including sequential and parallel engagements with sub-agents like QnA, code search, code understanding, code generation, and evaluation agents.
The initial version of Agent Panel introduces the AI Gateway, This feature alone significantly reduces the complexity and time required to connect and use different LLMs for various tasks.
Here are the two significant differentiators of the agent panel.
-
Unlike most observability tools, the agent panel takes an API-first approach rather than the wrapper approach for the OpenAI clients. It makes it easy for Vanilla implementations to integrate. In the future, I'll be building a wrapper for the OpenAI client; until then, it's more like a low-level driver and needs to be integrated using its API.
-
Most observability tools track the LLM calls and their metrics but aren't designed to optimize multi-agent systems. At its core, Agent Panel addresses the complex challenges faced in managing multi-agent systems, particularly those involving numerous function-calling steps and multiple invocations.
- OpenAI GPT-3.5/GPT-4 (paid, vision, function-calling)
- Gemini: Gemini-1.0/Gemini-1.5 (free, paid, vision, function-calling)
- Claude: Claude-3.5/Claude-3 (vision, paid, function-calling)
- Mistral (paid, function-calling)
- Cohere: Command-R/Command-R+ (paid, function-calling)
- Perplexity: Llama-3/Mixtral (paid)
- Groq: Llama-3/Mixtral/Gemma (free)
- Ollama (free, local, embedding)
- Azure OpenAI (paid, vision, function-calling)
- VertexAI: Gemini-1.0/Gemini-1.5 (paid, vision, function-calling)
- VertexAI-Claude: Claude-3 (paid, vision)
- Bedrock: Llama-3/Claude-3/Mistral (paid, vision)
- Cloudflare (free, paid, vision)
- Replicate (paid)
- Ernie (paid)
- Qianwen (paid, vision, embedding)
- Moonshot (paid)
- ZhipuAI: GLM-3.5/GLM-4 (paid, vision)
- Deepseek (paid)
- Other openAI-compatible platforms
To install and run the agent-panel project, follow these steps:
- Clone the repository from GitHub:
$ git clone https://github.com/IncredibleDevHQ/agent-panel
$ cd agent-panel
- Build the project using Cargo (ensure you have Rust installed on your system):
cargo build --release
- Create an empty config file called
config.yaml
amd Set theCONFIG_DIR
environment variable to the directory containing yourconfig.yaml
file:
export CONFIG_DIR=/path/to/directory/containing/config.yaml
To set up the AI gateway, follow these steps:
-
Open
config.yaml
in a text editor. -
Start with an empty
clients:
section:clients:
-
Decide which LLM platforms you want to support. You can include multiple platforms.
-
For each platform you want to support, copy the relevant client configuration from the
example.config.yaml
(provided in the repo's root directory) and paste it under theclients:
section in yourconfig.yaml
.
Here's an example config.yaml
that supports both Claude and Gemini:
clients:
- type: claude
api_key: sk-ant-xxx # Replace with your actual Claude API key
- type: gemini
api_key: xxx # Replace with your actual Gemini API key
patches:
'.*':
chat_completions_body:
safetySettings:
- category: HARM_CATEGORY_HARASSMENT
threshold: BLOCK_NONE
- category: HARM_CATEGORY_HATE_SPEECH
threshold: BLOCK_NONE
- category: HARM_CATEGORY_SEXUALLY_EXPLICIT
threshold: BLOCK_NONE
- category: HARM_CATEGORY_DANGEROUS_CONTENT
threshold: BLOCK_NONE
Remember to replace the placeholder API keys with your actual credentials.
Run the binary:
./target/release/agent-panel
By default, the server will run on port 8000. If you need to change the port, use the --port
argument:
./target/release/agent-panel --port 8080
This will start the server on port 8080 instead.
Remember to ensure that your config.yaml
file is properly set up with the LLM platforms you want to use before running the server.
Once the server is running, you can start making requests to the AI gateway as described in the previous sections.
If you're developing or want to run the project without building a release version, you can use cargo run
.
cargo run -- [arguments]
For example, to run in development mode with a custom port:
cargo run -- --port 8080
This will compile and run the project in one step, which is useful during development.
After configuring your config.yaml
, you can start making requests to the AI gateway. Here's how to use it:
-
Endpoint URL: Send POST requests to
http://127.0.0.1:8000/v1/chat/completions
(adjust if your gateway is configured differently). -
Headers: Set the Content-Type header to application/json:
Content-Type: application/json
-
Request Body: The body should be a JSON object containing:
model
: Specifies which LLM to use (format:<platform>:<model_name>
)messages
: An array of message objects representing the conversationstream
: A boolean indicating whether to stream the response (optional)
curl -X POST \
http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "claude:claude-3-5-sonnet-20240620",
"messages": [{"role": "user", "content": "How does JVM work?"}],
"stream": false
}'
Here's how you construct value for the "model" field in the API request.
When making API requests, you can only use models from platforms that you've added to your config.yaml
file. Use the format <platform>:<model_name>
in the model
field of your HTTP request.
Here's a table of supported platforms and their corresponding model names:
Platform | Supported Model names |
---|---|
openai | gpt-3.5-turbo, gpt-3.5-turbo-1106, gpt-4o, gpt-4-turbo, gpt-4-turbo-preview, gpt-4-1106-preview, gpt-4-vision-preview, gpt-4, gpt-4-32k, text-embedding-3-large, text-embedding-3-small |
gemini | gemini-1.5-pro-latest, gemini-1.0-pro-latest, gemini-1.0-pro-vision-latest, gemini-1.5-flash-latest, gemini-1.5-pro-latest, text-embedding-004 |
claude | claude-3-5-sonnet-20240620, claude-3-opus-20240229, claude-3-sonnet-20240229, claude-3-haiku-20240307 |
mistral | open-mistral-7b, open-mixtral-8x7b, open-mixtral-8x22b, mistral-small-latest, mistral-large-latest, mistral-embed |
cohere | command-r, command-r-plus, embed-english-v3.0, embed-multilingual-v3.0 |
perplexity | llama-3-sonar-small-32k-chat, llama-3-sonar-large-32k-chat, llama-3-8b-instruct, llama-3-70b-instruct, mixtral-8x7b-instruct |
groq | llama3-8b-8192, llama3-70b-8192, mixtral-8x7b-32768, gemma-7b-it |
vertexai | gemini-1.0-pro-002, gemini-1.0-pro-vision-001, gemini-1.5-flash-preview-0514, gemini-1.5-pro-preview-0514, text-embedding-004, text-multilingual-embedding-002 |
vertexai-claude | claude-3-opus@20240229, claude-3-sonnet@20240229, claude-3-haiku@20240307 |
bedrock | anthropic.claude-3-opus-20240229-v1:0, anthropic.claude-3-sonnet-20240229-v1:0, anthropic.claude-3-haiku-20240307-v1:0, meta.llama3-8b-instruct-v1:0, meta.llama3-70b-instruct-v1:0, mistral.mistral-7b-instruct-v0:2, mistral.mixtral-8x7b-instruct-v0:1, mistral.mistral-large-2402-v1:0 |
cloudflare | @cf/meta/llama-3-8b-instruct, @cf/mistral/mistral-7b-instruct-v0.2-lora, @cf/google/gemma-7b-it-lora, @cf/qwen/qwen1.5-14b-chat-awq, @hf/thebloke/deepseek-coder-6.7b-instruct-awq |
replicate | meta/meta-llama-3-70b-instruct, meta/meta-llama-3-8b-instruct, mistralai/mistral-7b-instruct-v0.2, mistralai/mixtral-8x7b-instruct-v0.1 |
ernie | ernie-4.0-8k-preview, ernie-3.5-8k-preview, ernie-speed-128k, ernie-lite-8k, ernie-tiny-8k |
qianwen | qwen-long, qwen-turbo, qwen-plus, qwen-max, qwen-max-longcontext, qwen-vl-plus, qwen-vl-max, text-embedding-v2 |
moonshot | moonshot-v1-8k, moonshot-v1-32k, moonshot-v1-128k |
deepseek | deepseek-chat, deepseek-coder |
zhipuai | glm-4, glm-4v, glm-3-turbo |
anyscale | meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3-70B-Instruct, codellama/CodeLlama-70b-Instruct-hf, mistralai/Mistral-7B-Instruct-v0.1, mistralai/Mixtral-8x7B-Instruct-v0.1, mistralai/Mixtral-8x22B-Instruct-v0.1, google/gemma-7b-it |
deepinfra | meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3-70B-Instruct, mistralai/Mistral-7B-Instruct-v0.2, mistralai/Mixtral-8x7B-Instruct-v0.1, mistralai/Mixtral-8x22B-Instruct-v0.1, google/gemma-1.1-7b-it, databricks/dbrx-instruct, 01-ai/Yi-34B-Chat |
fireworks | accounts/fireworks/models/llama-v3-8b-instruct, accounts/fireworks/models/llama-v3-70b-instruct, accounts/fireworks/models/mistral-7b-instruct-v0p2, accounts/fireworks/models/mixtral-8x7b-instruct, accounts/fireworks/models/mixtral-8x22b-instruct, accounts/fireworks/models/qwen-72b-chat, accounts/fireworks/models/gemma-7b-it, accounts/fireworks/models/dbrx-instruct |
openrouter | meta-llama/llama-3-8b-instruct, meta-llama/llama-3-8b-instruct:nitro, meta-llama/llama-3-8b-instruct:extended, meta-llama/llama-3-70b-instruct, meta-llama/llama-3-70b-instruct:nitro, mistralai/mistral-7b-instruct:free, codellama/codellama-70b-instruct, google/gemma-7b-it:free, 01-ai/yi-34b-chat, openai/gpt-3.5-turbo, openai/gpt-4o, openai/gpt-4-turbo, openai/gpt-4-turbo-preview, openai/gpt-4-vision-preview, openai/gpt-4, openai/gpt-4-32k, google/gemini-pro, google/gemini-pro-vision, google/gemini-pro-1.5, anthropic/claude-3-opus, anthropic/claude-3-sonnet, anthropic/claude-3-haiku, mistralai/mixtral-8x7b-instruct, mistralai/mixtral-8x22b-instruct, mistralai/mistral-small, mistralai/mistral-large, databricks/dbrx-instruct, cohere/command-r, cohere/command-r-plus |
octoai | meta-llama-3-8b-instruct, meta-llama-3-70b-instruct, mistral-7b-instruct, mixtral-8x7b-instruct, mixtral-8x22b-instruct |
together | meta-llama/Llama-3-8b-chat-hf, meta-llama/Llama-3-70b-chat-hf, mistralai/Mistral-7B-Instruct-v0.2, mistralai/Mixtral-8x7B-Instruct-v0.1, mistralai/Mixtral-8x22B-Instruct-v0.1, google/gemma-7b-it, Qwen/Qwen1.5-72B-Chat, databricks/dbrx-instruct, zero-one-ai/Yi-34B-Chat, deepseek-ai/deepseek-llm-67b-chat, deepseek-ai/deepseek-coder-33b-instruct |
- If you've configured
openai
in your config.yaml:
"model": "openai:gpt-4-turbo"
- If you've configured
gemini
:
"model": "gemini:gemini-1.5-pro-latest"
- If you've configured
claude
:
"model": "claude:claude-3-opus-20240229"
- If you've configured Mistral:
"model": "mistral:mistral-large-latest"
- If you've configured Cohere:
"model": "cohere:command-r-plus"
Here's a sample config.yaml
that supports OpenAI, Gemini, and Claude:
clients:
- type: openai
api_key: sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx # Replace with your OpenAI API key
api_base: https://api.openai.com/v1
organization_id: org-xxxxxxxxxxxxxxxx # Optional, replace if needed
- type: gemini
api_key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx # Replace with your Gemini API key
patches:
'.*':
chat_completions_body:
safetySettings:
- category: HARM_CATEGORY_HARASSMENT
threshold: BLOCK_NONE
- category: HARM_CATEGORY_HATE_SPEECH
threshold: BLOCK_NONE
- category: HARM_CATEGORY_SEXUALLY_EXPLICIT
threshold: BLOCK_NONE
- category: HARM_CATEGORY_DANGEROUS_CONTENT
threshold: BLOCK_NONE
- type: claude
api_key: sk-ant-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx # Replace with your Claude API key
Remember to replace the placeholder API keys with your actual credentials.
Here are sample curl requests demonstrating how to use different models from each configured platform:
- Using OpenAI's GPT-4 model:
curl -X POST \
http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai:gpt-4o",
"messages": [{"role": "user", "content": "Explain quantum computing in simple terms."}],
"stream": false
}'
- Using Gemini's latest pro model:
curl -X POST \
http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gemini:gemini-1.5-pro-latest",
"messages": [{"role": "user", "content": "What are the main differences between machine learning and deep learning?"}],
"stream": false
}'
- Using Claude's latest model:
curl -X POST \
http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "claude:claude-3-5-sonnet-20240620",
"messages": [{"role": "user", "content": "Describe the process of photosynthesis."}],
"stream": false
}'
These examples demonstrate how you can easily switch between different AI models from various platforms by simply changing the model
field in your API requests. The AI gateway handles the routing to the appropriate platform based on the model specified.
Remember that for each of these requests to work:
- The corresponding platform (OpenAI, Gemini, or Claude) must be properly configured in your
config.yaml
file. - You must have valid API keys for each platform.
- The model names used in the requests must match those supported by each platform (as listed in the model table provided earlier).
By setting up your config.yaml
with multiple platforms and using the appropriate model names in your requests, you can leverage the power of various AI models through a single, unified API gateway.
The journey of Agent Panel is just beginning. The roadmap includes several exciting features designed further to enhance the capability and efficiency of AI agents:
- Long-Term Centralized Memory: Manage and optimize the memory usage across all agents, ensuring information retention and accessibility.
- Resuming Distributed Agent State: This feature will allow the system to resume operations from the last state before a failure, preventing the need to restart processes from the beginning.
- Time Travel Debugging and Graph Visualization: Developers can step through the agent calls to debug and visualize the decision-making process, enhancing understanding and troubleshooting capabilities.
- API Function Calling Decision-Making: Rather than executing functions directly, the system will identify and suggest the best function to call based on the current context, optimizing performance and accuracy.
- Prompt Optimization Mode: This will enable developers to iterate on prompts, compare responses, and refine their strategies to achieve the best outcomes.
I want to thank @sigoden, author of Aichat, for his great work, which I could bank on building out the initial version.
If any issues are running or using the project, please feel free to open an issue here.
If you'd like to chat, please join the discord channel: https://discord.gg/DETRMEVqQ6 or stay tuned on X https://x.com/IncredibleDevHQ .
If you like the project, remember to increment the star count!