Releases: BerriAI/litellm
v1.52.15
What's Changed
- (feat) use `@google-cloud/vertexai` js sdk with litellm by @ishaan-jaff in #6873
- (chore) fix new .js tests running for vertex.js by @ishaan-jaff in #6875
- Bump cross-spawn from 7.0.3 to 7.0.6 in /ui/litellm-dashboard by @dependabot in #6865
- (Perf / latency improvement) improve pass through endpoint latency to ~50ms (before PR was 400ms) by @ishaan-jaff in #6874
- LiteLLM Minor Fixes & Improvements (11/23/2024) by @krrishdholakia in #6870
- Litellm dev 11 23 2024 by @krrishdholakia in #6881
- docs - have 1 section for routing + load balancing by @ishaan-jaff in #6884
- (QOL improvement) Provider budget routing - allow using 1s, 1d, 1mo, 2mo etc by @ishaan-jaff in #6885
- (feat) - provider budget improvements - ensure provider budgets work with multiple proxy instances + improve latency to ~90ms by @ishaan-jaff in #6886 (a config sketch follows this list)
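For the provider-budget items above, budgets are defined in the proxy config. A minimal sketch, assuming a `router_settings.provider_budget_config` block with `budget_limit` and `time_period` fields; the key names and values are illustrative and should be checked against the LiteLLM docs:

```shell
# Sketch only: provider_budget_config / budget_limit / time_period are assumed key names;
# the time_period units (1s, 1d, 1mo, 2mo, ...) are the new values mentioned in #6885.
cat > litellm_config.yaml <<'EOF'
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

router_settings:
  provider_budget_config:
    openai:
      budget_limit: 100   # max spend (USD) for this provider in the window
      time_period: 1d     # window length; 1s / 1d / 1mo / 2mo now accepted
EOF

docker run \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.52.15 \
  --config /app/config.yaml
```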
Full Changelog: v1.52.14...v1.52.15
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.52.15
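Once the container is running, the proxy serves an OpenAI-compatible API on port 4000. A minimal smoke test with curl, assuming a model named `gpt-4o` is configured on the proxy and `sk-1234` is a valid virtual key (both are placeholders):

```shell
# Placeholder key and model name: swap in values configured on your proxy.
curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello from the LiteLLM proxy"}]
  }'
```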
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 280.0 | 454.59782761891177 | 5.830264051408934 | 0.010023376592680114 | 1745 | 3 | 139.27931299997454 | 5766.263976999994 |
Aggregated | Failed ❌ | 280.0 | 454.59782761891177 | 5.830264051408934 | 0.010023376592680114 | 1745 | 3 | 139.27931299997454 | 5766.263976999994 |
v1.52.14
What's Changed
- (fix) passthrough - allow internal users to access /anthropic by @ishaan-jaff in #6843
- LiteLLM Minor Fixes & Improvements (11/21/2024) by @krrishdholakia in #6837
- fix latency issues on google ai studio by @ishaan-jaff in #6852
- (fix) add linting check to ban creating `AsyncHTTPHandler` during LLM calling by @ishaan-jaff in #6855
- (feat) Add usage tracking for streaming `/anthropic` passthrough routes by @ishaan-jaff in #6842 (see the example request after this list)
- (Feat) Allow passing `litellm_metadata` to pass through endpoints + Add e2e tests for /anthropic/ usage tracking by @ishaan-jaff in #6864
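As context for the `/anthropic` passthrough items above, a passthrough request mirrors Anthropic's native Messages API behind the proxy; per #6864 a `litellm_metadata` field can reportedly also be sent, though its exact placement isn't shown here. The key and model below are placeholders:

```shell
# Placeholder key/model; the /anthropic/* route is the passthrough whose streaming
# usage is now tracked per #6842. Auth is shown Anthropic-style via x-api-key;
# your deployment may expect the standard Authorization header instead.
curl http://localhost:4000/anthropic/v1/messages \
  -H "x-api-key: sk-1234" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "stream": true,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```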
Full Changelog: v1.52.12...v1.52.14
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.52.14
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 260.0 | 292.32742033908687 | 6.002121672811824 | 0.0 | 1796 | 0 | 222.04342999998516 | 2700.951708000048 |
Aggregated | Passed ✅ | 260.0 | 292.32742033908687 | 6.002121672811824 | 0.0 | 1796 | 0 | 222.04342999998516 | 2700.951708000048 |
v1.52.10.staging.2
Full Changelog: v1.52.10.staging.1...v1.52.10.staging.2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.52.10.staging.2
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 260.85031392210647 | 6.091362356611515 | 0.0 | 1823 | 0 | 196.95026900001267 | 3095.300408000014 |
Aggregated | Passed ✅ | 230.0 | 260.85031392210647 | 6.091362356611515 | 0.0 | 1823 | 0 | 196.95026900001267 | 3095.300408000014 |
v1.52.10.staging.1
Full Changelog: v1.52.10...v1.52.10.staging.1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.52.10.staging.1
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 276.3844152464711 | 6.156861360191656 | 0.0 | 1842 | 0 | 213.47366499998088 | 2957.452922000016 |
Aggregated | Passed ✅ | 250.0 | 276.3844152464711 | 6.156861360191656 | 0.0 | 1842 | 0 | 213.47366499998088 | 2957.452922000016 |
v1.52.10-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_nov_21-stable
Full Changelog: v1.52.10.staging.2...v1.52.10-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.52.10-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 210.34423537712797 | 6.281907830899626 | 0.0 | 1880 | 0 | 174.39816099999916 | 1348.7341720000074 |
Aggregated | Passed ✅ | 190.0 | 210.34423537712797 | 6.281907830899626 | 0.0 | 1880 | 0 | 174.39816099999916 | 1348.7341720000074 |
v1.52.12
What's Changed
- LiteLLM Minor Fixes & Improvements (11/19/2024) by @krrishdholakia in #6820
- Add gpt-4o-2024-11-20 by @Manouchehri in #6832
- LiteLLM Minor Fixes & Improvements (11/20/2024) by @krrishdholakia in #6831
- Litellm dev 11 20 2024 by @krrishdholakia in #6838
- (refactor) anthropic - move _process_response in transformation.py by @ishaan-jaff in #6834
- (feat) add usage / cost tracking for Anthropic passthrough routes by @ishaan-jaff in #6835
- (testing) - add e2e tests for anthropic pass through endpoints by @ishaan-jaff in #6840
- (fix) don't block proxy startup if license check fails & using prometheus by @ishaan-jaff in #6839
Full Changelog: v1.52.11...v1.52.12
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.52.12
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 260.0 | 288.3101155320099 | 6.056613494123171 | 0.0 | 1812 | 0 | 231.241644000022 | 2338.7360799999897 |
Aggregated | Passed ✅ | 260.0 | 288.3101155320099 | 6.056613494123171 | 0.0 | 1812 | 0 | 231.241644000022 | 2338.7360799999897 |
v1.52.11
What's Changed
- (docs improvement) remove emojis, use `guides` section, categorize uncategorized docs by @ishaan-jaff in #6796
- (docs) simplify left nav names + use a section for `making llm requests` by @ishaan-jaff in #6799
- Bump cross-spawn from 7.0.3 to 7.0.5 in /ui by @dependabot in #6779
- Docs - use 1 page for all logging integrations on proxy + add logging features at top level by @ishaan-jaff in #6805
- (docs) add docstrings for all /key, /user, /team, /customer endpoints by @ishaan-jaff in #6804
- LiteLLM Minor Fixes & Improvements (11/15/2024) by @krrishdholakia in #6746
- (Proxy) add support for DOCS_URL and REDOC_URL by @ishaan-jaff in #6806 (see the example after this list)
- feat - add `fireworks_ai/qwen2p5-coder-32b-instruct` by @ishaan-jaff in #6818
- Litellm stable pr 10 30 2024 by @krrishdholakia in #6821
- (Feat) Add provider specific budget routing by @ishaan-jaff in #6817
- (feat) provider budget routing improvements by @ishaan-jaff in #6827
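For the `DOCS_URL` / `REDOC_URL` item above, the variables are set on the proxy container; the variable names come from #6806, while the paths shown here are made up for illustration:

```shell
# DOCS_URL / REDOC_URL names come from PR #6806; the paths are placeholders.
docker run \
  -e STORE_MODEL_IN_DB=True \
  -e DOCS_URL=/internal/docs \
  -e REDOC_URL=/internal/redoc \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.52.11
```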
Full Changelog: v1.52.10...v1.52.11
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.52.11
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 270.0 | 300.82403009385007 | 6.198177352347725 | 0.0 | 1854 | 0 | 229.45128300000306 | 3106.586268000001 |
Aggregated | Failed ❌ | 270.0 | 300.82403009385007 | 6.198177352347725 | 0.0 | 1854 | 0 | 229.45128300000306 | 3106.586268000001 |
v1.52.10
What's Changed
- add openrouter/qwen/qwen-2.5-coder-32b-instruct by @paul-gauthier in #6731
- Update routing references by @emmanuel-ferdman in #6758
- (Doc) Add section on what is stored in the DB + Add clear section on key/team based logging by @ishaan-jaff in #6769
- (Admin UI) - Remain on Current Tab when user clicks refresh by @ishaan-jaff in #6777
- (UI) fix - allow editing key alias on Admin UI by @ishaan-jaff in #6776
- (docs) add doc string for /key/update by @ishaan-jaff in #6778
- (patch) using image_urls with `vertex/anthropic` models by @ishaan-jaff in #6775
- (fix) Azure AI Studio - using `image_url` in content with both text and image_url by @ishaan-jaff in #6774
- build: add gemini-exp-1114 by @krrishdholakia in #6786
- (fix) httpx handler - bind to ipv4 for httpx handler by @ishaan-jaff in #6785
Full Changelog: v1.52.9...v1.52.10
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.52.10
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 271.7799367877801 | 6.1248828197277065 | 0.0 | 1833 | 0 | 213.09577699997817 | 2144.701510999994 |
Aggregated | Passed ✅ | 240.0 | 271.7799367877801 | 6.1248828197277065 | 0.0 | 1833 | 0 | 213.09577699997817 | 2144.701510999994 |
v1.52.9.dev1
What's Changed
- add openrouter/qwen/qwen-2.5-coder-32b-instruct by @paul-gauthier in #6731
- Update routing references by @emmanuel-ferdman in #6758
- (Doc) Add section on what is stored in the DB + Add clear section on key/team based logging by @ishaan-jaff in #6769
- (Admin UI) - Remain on Current Tab when user clicks refresh by @ishaan-jaff in #6777
- (UI) fix - allow editing key alias on Admin UI by @ishaan-jaff in #6776
- (docs) add doc string for /key/update by @ishaan-jaff in #6778
- (patch) using image_urls with `vertex/anthropic` models by @ishaan-jaff in #6775
- (fix) Azure AI Studio - using `image_url` in content with both text and image_url by @ishaan-jaff in #6774
Full Changelog: v1.52.9...v1.52.9.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.52.9.dev1
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 222.33484302855678 | 6.204541843497746 | 0.0033429643553328373 | 1856 | 1 | 62.294459999975516 | 2005.856768000001 |
Aggregated | Passed ✅ | 200.0 | 222.33484302855678 | 6.204541843497746 | 0.0033429643553328373 | 1856 | 1 | 62.294459999975516 | 2005.856768000001 |
v1.52.9
What's Changed
- (feat) add bedrock/stability.stable-image-ultra-v1:0 by @ishaan-jaff in #6723
- [Feature]: Stop swallowing up AzureOpenAi exception responses in litellm's implementation for a BadRequestError by @ishaan-jaff in #6745
- [Feature]: json_schema in response support for Anthropic by @ishaan-jaff in #6748 (see the example request after this list)
- fix: import audio check by @IamRash-7 in #6740
- (fix) Cost tracking for `vertex_ai/imagen3` by @ishaan-jaff in #6752
- (feat) Vertex AI - add support for fine tuned embedding models by @ishaan-jaff in #6749
- LiteLLM Minor Fixes & Improvements (11/13/2024) by @krrishdholakia in #6729
- feat - add us.llama 3.1 models by @ishaan-jaff in #6760
- (Feat) Add Vertex Model Garden llama 3.1 models by @ishaan-jaff in #6763
- (fix) Fix - don't allow `viewer` roles to create virtual keys by @ishaan-jaff in #6764
- (feat) Use `litellm/` prefix when storing virtual keys in AWS secret manager by @ishaan-jaff in #6765
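For the Anthropic `json_schema` item above, the proxy accepts the OpenAI-style `response_format` parameter; the model name, key, and schema below are illustrative:

```shell
# Illustrative key, model, and schema; response_format follows the OpenAI-compatible
# json_schema shape and is routed to the Anthropic model behind the proxy.
curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [{"role": "user", "content": "Extract the city and country from: Paris, France"}],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "location",
        "schema": {
          "type": "object",
          "properties": {
            "city": {"type": "string"},
            "country": {"type": "string"}
          },
          "required": ["city", "country"]
        }
      }
    }
  }'
```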
New Contributors
- @IamRash-7 made their first contribution in #6740
Full Changelog: v1.52.8...v1.52.9
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.52.9
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 280.0 | 314.28547197285235 | 6.039371468840217 | 0.0 | 1805 | 0 | 226.56484299994872 | 2776.9337409999935 |
Aggregated | Failed ❌ | 280.0 | 314.28547197285235 | 6.039371468840217 | 0.0 | 1805 | 0 | 226.56484299994872 | 2776.9337409999935 |