    Repositories list

    • ScaleLLM

      Public
      A high-performance inference system for large language models, designed for production environments.
      C++
      Apache License 2.0
      30397327 Updated Nov 21, 2024
    • whl

      Public
      Repository to host Python whl packages.
      HTML
      0000 Updated Oct 26, 2024
    • FlashInfer: Kernel Library for LLM Serving
      Cuda
      Apache License 2.0
      160001 Updated Oct 18, 2024
    • vcpkg

      Public
      C++ Library Manager for Windows, Linux, and MacOS
      CMake
      MIT License
      6.5k000 Updated Jun 22, 2024
    • LLMBench

      Public
      A library for validating and benchmarking LLM inference.
      Python
      Apache License 2.0
      1410 Updated Jun 18, 2024
    • 0000 Updated Jun 5, 2024
    • An open source ChatGPT UI.
      TypeScript
      MIT License
      8.1k000 Updated May 20, 2024
    • Fast and memory-efficient exact attention
      Python
      BSD 3-Clause "New" or "Revised" License
      1.4k000 Updated Oct 15, 2023
    • 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
      Rust
      Apache License 2.0
      816000 Updated Aug 4, 2023
    • xformers

      Public
      Hackable and optimized Transformers building blocks, supporting a composable construction.
      Python
      Other
      632000 Updated Aug 1, 2023
    • Transformer-related optimization, including BERT and GPT
      C++
      Apache License 2.0
      896000 Updated Jul 28, 2023
    • Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
      C++
      Apache License 2.0
      37000 Updated Jul 24, 2023