Multi-Modality

Exa

Boost your LLM's performance by up to 300% on everyday GPU hardware, with just five minutes of setup and no additional hardware costs.


Principles

  • Radical Simplicity (use super-powerful LLMs with as few lines of code as possible)
  • Ultra-Optimized Performance (high-performance code that extracts all the power from these LLMs)
  • Fluidity & Shapelessness (plug and play, and re-architect as you please)

📦 Install 📦

$ pip3 install exxa

Usage
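The package's own API isn't documented in this snippet, so rather than guess exxa's interface, here is a hedged, self-contained sketch of the core idea behind the toolkit's quantization feature: symmetric int8 quantization of a weight tensor. All names are illustrative, not exxa's actual API.

```python
# Illustrative sketch of symmetric int8 quantization (per-tensor scale).
# Pure Python; function names are hypothetical, not exxa's API.

def quantize_int8(weights):
    """Map float weights to int8 range [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [x * scale for x in q]

weights = [0.12, -0.5, 0.33, 1.27, -1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Each restored weight is within half a quantization step (scale / 2)
# of the original, at a quarter of the storage cost of float32.
```

Real quantization toolkits apply this per-channel or per-group and fuse the dequantize step into the matmul kernel, but the accuracy/size trade-off is the same.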

🎉 Features 🎉

  • World-Class Quantization: Get the most out of your models with top-tier performance and preserved accuracy! 🏋️‍♂️

  • Automated PEFT: Simplify your workflow! Let our toolkit handle the optimizations. 🛠️

  • LoRA Configuration: Dive into the potential of flexible LoRA configurations, a game-changer for performance! 🌌

  • Seamless Integration: Designed to work seamlessly with popular models like LLAMA, Falcon, and more! 🤖

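To make the LoRA bullet concrete: instead of training a full d×d weight update, LoRA trains two small factors B (d×r) and A (r×d) and adds their product to the frozen base weight, cutting trainable parameters from d² to 2dr. A minimal pure-Python sketch, with illustrative values rather than exxa's API:

```python
# Sketch of the LoRA idea: a rank-r update W' = W + B @ A.
# Pure Python for clarity; all values are illustrative.

def matmul(X, Y):
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

d, r = 4, 1  # model dimension vs. LoRA rank (r << d in practice)

# Frozen base weight W (d x d) -- identity here for readability.
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]

# Trainable low-rank factors: B (d x r) and A (r x d).
B = [[0.1] for _ in range(d)]
A = [[1.0, 2.0, 3.0, 4.0]]

delta = matmul(B, A)  # rank-1 update built from only d*r + r*d numbers
W_adapted = [[W[i][j] + delta[i][j] for j in range(d)] for i in range(d)]

full_params = d * d          # 16 parameters for a full update
lora_params = d * r + r * d  # 8 parameters for the rank-1 update
```

With realistic sizes (d = 4096, r = 8) the same arithmetic gives ~65K trainable parameters per layer instead of ~16.7M, which is why flexible LoRA configuration matters for fine-tuning on everyday hardware.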

💌 Feedback & Contributions 💌

We're excited about the journey ahead and would love to have you with us! For feedback, suggestions, or contributions, feel free to open an issue or a pull request. Let's shape the future of fine-tuning together! 🌱

Check out our project board for our current backlog and the features we're implementing.

License

MIT

Todo

  • Set up utils logger classes for metric logging with useful metadata such as tokens inferred per second, latency, and memory consumption
  • Add CUDA C++ extensions with radically optimized classes for high-performance quantization + inference on the edge
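The metrics-logger todo above can be sketched in a few lines: wrap each inference call, time it, and derive tokens-per-second from the output length. This is a hedged illustration of the intended utility, not existing exxa code; class and key names are hypothetical.

```python
# Hypothetical sketch of the planned metrics logger: records latency and
# tokens/sec per inference call. Not exxa's actual implementation.
import time

class InferenceMetrics:
    def __init__(self):
        self.records = []

    def record(self, fn, *args, **kwargs):
        """Run an inference callable and log its timing metadata."""
        start = time.perf_counter()
        tokens = fn(*args, **kwargs)  # assumes fn returns a token sequence
        latency = time.perf_counter() - start
        self.records.append({
            "latency_s": latency,
            "tokens": len(tokens),
            "tokens_per_s": len(tokens) / latency if latency else float("inf"),
        })
        return tokens

metrics = InferenceMetrics()
out = metrics.record(lambda: ["hello", "world"])
```

Memory consumption would need a backend-specific probe (e.g. querying the CUDA allocator), so it is left out of this sketch.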