Python API for Chat With RTX

Usage

Start the server:

.\start_server.bat

Then send a message from Python:

import rtx_api_july_2024 as rtx_api

response = rtx_api.send_message("write fire emoji")
print(response)

Speed

Chat With RTX builds INT4 (W4A16 AWQ) TensorRT engines for LLMs.

| Model    | Speed on RTX 4090 (char/sec) |
|----------|------------------------------|
| Mistral  | 457                          |
| Llama2   | 315                          |
| ChatGLM3 | 385                          |
| Gemma    | 407                          |
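
The README does not say how the char/sec figures were measured. As a rough sketch, one plausible way to get a comparable number is to divide the length of a response by the wall-clock time of the call. The helper name `chars_per_second` and its `send` and `clock` parameters are hypothetical and not part of this API. Timing this way includes request overhead, so results may differ from the figures above.

```python
import time
from typing import Callable


def chars_per_second(
    send: Callable[[str], str],
    prompt: str,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return len(response) / elapsed seconds for one call to `send`.

    `send` is any function that takes a prompt and returns the reply text,
    for example rtx_api.send_message. `clock` is injectable so the helper
    can be checked without a running server (hypothetical design choice).
    """
    start = clock()
    response = send(prompt)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return len(response) / elapsed


# Stand-in for rtx_api.send_message so this block runs without the server.
def fake_send(prompt: str) -> str:
    time.sleep(0.01)  # simulate generation time
    return "x" * 100


print(f"{chars_per_second(fake_send, 'write fire emoji'):.0f} char/sec")
```

With the server running, pass `rtx_api.send_message` in place of `fake_send`. Short replies give noisy numbers, so a prompt that produces a long answer is likely a better probe.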



Update History of Chat With RTX
3.2024  Removed YouTube video transcript fetching
4.2024  Added Whisper speech-to-text model
7.2024  Electron app UI

LICENSE: CC0