Commit

add OLMo-Eval (#583)
zhimin-z authored Sep 4, 2024
1 parent f59045e commit 96db9bc
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -353,6 +353,7 @@ Please review our [CONTRIBUTING.md](https://github.com/EthicalML/awesome-product
* [LLM AutoEval](https://github.com/mlabonne/llm-autoeval) ![](https://img.shields.io/github/stars/mlabonne/llm-autoeval.svg?style=social) - LLM AutoEval simplifies the process of evaluating LLMs using a convenient Colab notebook. You just need to specify the name of your model, a benchmark, a GPU, and press run!
* [MLPerf Inference](https://github.com/mlcommons/inference) ![](https://img.shields.io/github/stars/mlcommons/inference.svg?style=social) - MLPerf Inference is a benchmark suite for measuring how fast systems can run models in a variety of deployment scenarios.
* [MTEB](https://github.com/embeddings-benchmark/mteb) ![](https://img.shields.io/github/stars/embeddings-benchmark/mteb.svg?style=social) - Massive Text Embedding Benchmark (MTEB) is a comprehensive benchmark of text embeddings.
* [OLMo-Eval](https://github.com/allenai/OLMo-Eval) ![](https://img.shields.io/github/stars/allenai/OLMo-Eval.svg?style=social) - OLMo-Eval is an evaluation suite for evaluating open language models.
* [OpenCompass](https://github.com/open-compass/OpenCompass) ![](https://img.shields.io/github/stars/open-compass/OpenCompass.svg?style=social) - OpenCompass is an LLM evaluation platform, supporting a wide range of models (LLaMA, LLaMa2, ChatGLM2, ChatGPT, Claude, etc) over 50+ datasets.
* [Optimum-Benchmark](https://github.com/huggingface/optimum-benchmark) ![](https://img.shields.io/github/stars/huggingface/optimum-benchmark.svg?style=social) - A unified multi-backend utility for benchmarking Transformers and Diffusers with support for Optimum's arsenal of hardware optimizations/quantization schemes.
* [PhaseLLM](https://github.com/wgryc/phasellm) ![](https://img.shields.io/github/stars/wgryc/phasellm.svg?style=social) - PhaseLLM is a large language model evaluation and workflow framework.

