Stars
A highly optimized LLM inference acceleration engine for Llama and its variants.
FlashInfer: Kernel Library for LLM Serving
Visualizer for neural network, deep learning and machine learning models
A high-throughput and memory-efficient inference and serving engine for LLMs
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Adlik: Toolkit for Accelerating Deep Learning Inference
A C++ library for interacting with JSON.
Library to read/write .npy and .npz files in C/C++
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
ERNIE Bot Agent is a Large Language Model (LLM) Agent Framework, powered by the advanced capabilities of ERNIE Bot and the platform resources of Baidu AI Studio.
A personal Chinese translation of "C++17 the complete guide", for study and exchange only; it will be removed on request in case of infringement.
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search…
Accessible large language models via k-bit quantization for PyTorch.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
How to optimize some algorithms in CUDA.
Development repository for the Triton language and compiler
`std::execution`, the proposed C++ framework for asynchronous and parallel programming.
Platform to experiment with the AI Software Engineer. Terminal based. NOTE: Very different from https://gptengineer.app
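[A commonly used C++ DAG framework] A general-purpose, cross-platform parallel computing framework based on flow graphs, with no third-party dependencies, listed in awesome-cpp. Stars, forks, and discussion are welcome.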
SmartMatrix Library for Teensy 3, Teensy 4, and ESP32
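PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 『飞桨』 (PaddlePaddle): high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)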