diff --git a/README.md b/README.md
index b593578..a760cc3 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,7 @@
#### Main Content
-- 🚀 Open-sourced the Llama-3-Chinese base model and the Llama-3-Chinese-Instruct instruction model
+- 🚀 Open-sourced the Llama-3-Chinese base model and the Llama-3-Chinese-Instruct instruction model (v1, v2, v3)
- 🚀 Open-sourced pre-training and instruction fine-tuning scripts, so users can further train or fine-tune the models as needed
- 🚀 Open-sourced the alpaca_zh_51k, stem_zh_instruction, and ruozhiba_gpt4 (4o/4T) instruction fine-tuning datasets
- 🚀 Provides tutorials for quickly quantizing and deploying large models locally on a personal computer's CPU/GPU
@@ -29,7 +29,9 @@
## News
-**[2024/05/08] Released the Llama-3-Chinese-8B-Instruct-v2 instruction model, fine-tuned directly on [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) with 5M instruction samples. For details, see: [📚v2.0 release notes](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v2.0)**
+**[2024/05/30] Released the Llama-3-Chinese-8B-Instruct-v3 instruction model, which significantly outperforms v1/v2 on downstream tasks. For details, see: [📚v3.0 release notes](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v3.0)**
+
+[2024/05/08] Released the Llama-3-Chinese-8B-Instruct-v2 instruction model, fine-tuned directly on [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) with 5M instruction samples. For details, see: [📚v2.0 release notes](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v2.0)
[2024/05/07] Added pre-training and instruction fine-tuning scripts. For details, see: [📚v1.1 release notes](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v1.1)
@@ -86,18 +88,35 @@
| Model Size | 8B | 8B |
| Training Type | Causal-LM (CLM) | Instruction fine-tuning |
| Training Method | LoRA + full emb/lm-head | LoRA + full emb/lm-head |
-| Initial Model | [Original Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | v1: Llama-3-Chinese-8B<br/>v2: [Original Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) |
+| Initial Model | [Original Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | v1: Llama-3-Chinese-8B<br/>v2: [Original Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)<br/>v3: mix of inst/inst-v2/inst-meta |
| Training Corpus | Unlabeled general corpus (approx. 120GB) | Labeled instruction data (approx. 5M samples) |
| Vocabulary Size | Original vocabulary (128,256) | Original vocabulary (128,256) |
| Supported Context Length | 8K | 8K |
| Input Template | Not required | Requires the Llama-3-Instruct template |
| Applicable Scenarios | Text continuation: given the preceding text, the model generates a continuation | Instruction understanding: Q&A, writing, chat, interaction, etc. |
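The Instruct models require the Llama-3-Instruct input template mentioned above. As a minimal sketch of that wire format (in practice, prefer the tokenizer's built-in `apply_chat_template`; the system prompt below is just an illustrative placeholder):

```python
# Minimal sketch of the Llama-3-Instruct prompt template expected by the
# Instruct models. Prefer tokenizer.apply_chat_template in real code; this
# only illustrates the raw format with its special tokens.
def build_llama3_prompt(user_message: str,
                        system_prompt: str = "You are a helpful assistant.") -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(build_llama3_prompt("What is the capital of France?"))
```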
+The following compares the Instruct versions. **If you have no specific preference, use Instruct-v3.**
+
+| Item | Instruct-v1 | Instruct-v2 | Instruct-v3 |
+| :--- | :---: | :---: | :---: |
+| Release Date | 2024/4/30 | 2024/5/8 | 2024/5/30 |
+| Base Model | [Original Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | [Original Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | (see training method) |
+| Training Method | Stage 1: pre-training on 120GB of Chinese corpus<br/>Stage 2: fine-tuning on 5M instruction samples | Direct fine-tuning on 5M instruction samples | Model merging of inst-v1, inst-v2, and inst-meta, followed by fine-tuning on a small set of instruction data (~5K samples) |
+| Chinese Ability[1] | 49.3 / 51.5 | 51.6 / 51.6 | **55.2 / 54.8** 👍🏻 |
+| English Ability[1] | 63.21 | 66.68 | **66.81** 👍🏻 |
+| Long-Text Ability[1] | 29.6 | **46.4** 👍🏻 | 40.5 |
+| LLM Arena Win Rate / Elo[2] | 49.4% / 1430 | 66.1% / 1559 | **83.6% / 1627** 👍🏻 |
+
+> [!NOTE]
+> [1] Chinese ability is from C-Eval (valid); English ability is from Open LLM Leaderboard (avg); long-text ability is from LongBench (avg). For detailed results, see the [💯Model Performance](#模型效果) section.
+> [2] LLM arena results were retrieved on 2024/5/30 and are for reference only.
+
### Download Links
| Model Name | Full Version | LoRA Version | GGUF Version |
| :------------------------ | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
+| **Llama-3-Chinese-8B-Instruct-v3**<br/>(instruct model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v3)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v3)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v3) | N/A | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v3-gguf)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v3-gguf) |
| **Llama-3-Chinese-8B-Instruct-v2**<br/>(instruct model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2-lora)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-lora)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2-gguf)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-gguf) |
| **Llama-3-Chinese-8B-Instruct**<br/>(instruct model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-lora)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-lora)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-gguf)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-gguf) |
| **Llama-3-Chinese-8B**<br/>(base model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-lora)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-gguf)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-gguf) |
@@ -110,7 +129,7 @@
- v2 base model: original [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
- **GGUF models**: the quantized format introduced by [llama.cpp](https://github.com/ggerganov/llama.cpp), compatible with common inference tools such as ollama; recommended for users who only need inference and deployment. Models with the `-im` suffix were quantized with an importance matrix and usually have lower PPL, so they are recommended (usage is identical to the regular versions).
> [!NOTE]
-> If you cannot access HF, consider using mirror sites (e.g., [hf-mirror.com](hf-mirror.com)); please look up the specific instructions yourself.
+> If you cannot access HF, consider using mirror sites (e.g., hf-mirror.com); please look up the specific instructions yourself.
## Inference and Deployment
@@ -145,6 +164,7 @@
| Models | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | Test (5-shot) |
| ------------------------ | :-----------: | :-----------: | :-----------: | :-----------: |
+| **Llama-3-Chinese-8B-Instruct-v3** | 55.2 | 54.8 | 52.1 | 52.4 |
| **Llama-3-Chinese-8B-Instruct-v2** | 51.6 | 51.6 | 49.7 | 49.8 |
| **Llama-3-Chinese-8B-Instruct** | 49.3 | 51.5 | 48.3 | 49.4 |
| **Llama-3-Chinese-8B** | 47.0 | 50.5 | 46.1 | 49.0 |
@@ -161,6 +181,7 @@
| Models | Test (0-shot) | Test (5-shot) |
| ------------------------ | :-----------: | :-----------: |
+| **Llama-3-Chinese-8B-Instruct-v3** | 54.4 | 54.8 |
| **Llama-3-Chinese-8B-Instruct-v2** | 51.8 | 52.4 |
| **Llama-3-Chinese-8B-Instruct** | 49.7 | 51.5 |
| **Llama-3-Chinese-8B** | 48.0 | 50.9 |
@@ -177,6 +198,7 @@
| Models | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | Test (5-shot) |
| ------------------------ | :-----------: | :-----------: | :-----------: | :-----------: |
+| **Llama-3-Chinese-8B-Instruct-v3** | 64.7 | 65.0 | 64.8 | 65.9 |
| **Llama-3-Chinese-8B-Instruct-v2** | 62.1 | 63.9 | 62.6 | 63.7 |
| **Llama-3-Chinese-8B-Instruct** | 60.1 | 61.3 | 59.8 | 61.8 |
| **Llama-3-Chinese-8B** | 55.5 | 58.5 | 57.3 | 61.1 |
@@ -193,6 +215,7 @@
| Models | Single-doc QA | Multi-doc QA | Summarization | Few-Shot Learning | Code | Synthesis | Average |
| ------------------------------------------------------------ | :------: | :------: | :--: | :----: | :--: | :--: | :--: |
+| **Llama-3-Chinese-8B-Instruct-v3** | 20.3 | 28.8 | 24.5 | 28.1 | 59.4 | 91.9 | 40.5 |
| **Llama-3-Chinese-8B-Instruct-v2** | 57.3 | 27.1 | 13.9 | 30.3 | 60.6 | 89.5 | 46.4 |
| **Llama-3-Chinese-8B-Instruct** | 44.1 | 24.0 | 12.4 | 33.5 | 51.8 | 11.5 | 29.6 |
| **Llama-3-Chinese-8B** | 16.4 | 19.3 | 4.3 | 28.7 | 14.3 | 4.6 | 14.6 |
@@ -211,6 +234,7 @@
| Models | ARC | HellaS | MMLU | TQA | WinoG | GSM8K | Average |
| ------------------------------------------------------------ | :---: | :----: | :---: | :---: | :---: | :---: | :---: |
+| **Llama-3-Chinese-8B-Instruct-v3** | 63.40 | 80.51 | 67.90 | 53.57 | 76.24 | 59.21 | 66.81 |
| **Llama-3-Chinese-8B-Instruct-v2** | 62.63 | 79.72 | 66.48 | 53.93 | 76.72 | 60.58 | 66.68 |
| **Llama-3-Chinese-8B-Instruct** | 61.26 | 80.24 | 63.10 | 55.15 | 75.06 | 44.43 | 63.21 |
| **Llama-3-Chinese-8B** | 55.88 | 79.53 | 63.70 | 41.14 | 77.03 | 37.98 | 59.21 |
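The "Average" column in the Open LLM Leaderboard table above is consistent with the plain unweighted mean of the six benchmark scores, e.g. for the Instruct-v3 row:

```python
# Sanity check: the Open LLM Leaderboard "Average" column is the unweighted
# mean of the six benchmark scores (Instruct-v3 row from the table above).
scores_v3 = [63.40, 80.51, 67.90, 53.57, 76.24, 59.21]  # ARC, HellaS, MMLU, TQA, WinoG, GSM8K
average = sum(scores_v3) / len(scores_v3)
assert abs(average - 66.81) < 0.01  # matches the reported average of 66.81
print(average)
```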
@@ -281,7 +305,7 @@
Question 5: Why use LoRA instead of full-parameter pre-training?
Question 6: Why doesn't Llama-3-Chinese perform well in conversation?
Question 7: Why does the instruct model reply that it is ChatGPT?
-Question 8: What are the differences between v1 (original) and v2 of the Instrcut model?
+Question 8: What are the differences between v1 (original) and v2 of the Instruct model?
```
## Disclaimer
diff --git a/README_EN.md b/README_EN.md
index 60a9242..e8a518d 100644
--- a/README_EN.md
+++ b/README_EN.md
@@ -16,7 +16,7 @@ This project is developed based on Meta's newly released next-generation open-so
#### Main Content
-- 🚀 Open-source Llama-3-Chinese base model and Llama-3-Chinese-Instruct instruction model
+- 🚀 Open-source Llama-3-Chinese base model and Llama-3-Chinese-Instruct instruction model (v1, v2, v3)
- 🚀 Released pre-training scripts and instruction fine-tuning scripts, allowing users to further train or fine-tune the model as needed
- 🚀 Released alpaca_zh_51k, stem_zh_instruction, ruozhiba_gpt4 (4o/4T) instruction data
- 🚀 Provides a tutorial for quickly quantizing and deploying large models locally using a personal computer's CPU/GPU
@@ -29,7 +29,9 @@ This project is developed based on Meta's newly released next-generation open-so
## News
-**[2024/05/08] Release Llama-3-Chinese-8B-Instruct-v2, which is directly tuned on [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) with 5M instructions. For details, see: [📚Version 2.0 Release Log](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v2.0)**
+**[2024/05/30] Release Llama-3-Chinese-8B-Instruct-v3, which has better performance on downstream tasks than v1/v2. For details, see: [📚Version 3.0 Release Log](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v3.0)**
+
+[2024/05/08] Release Llama-3-Chinese-8B-Instruct-v2, which is directly tuned on [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) with 5M instructions. For details, see: [📚Version 2.0 Release Log](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v2.0)
[2024/05/07] Add pre-training and SFT scripts. For details, see: [📚Version 1.1 Release Log](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v1.1)
@@ -86,18 +88,33 @@ Here's a comparison of the models in this project and recommended usage scenario
| Model Size | 8B | 8B |
| Training Type | Causal-LM (CLM) | Instruction Fine-Tuning |
| Training Method | LoRA + Full emb/lm-head | LoRA + Full emb/lm-head |
-| Initial Model | [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | v1: Llama-3-Chinese-8B<br/>v2: [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) |
+| Initial Model | [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | v1: Llama-3-Chinese-8B<br/>v2: [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)<br/>v3: mix of inst/inst-v2/inst-meta |
| Training Corpus | Unlabeled general corpus (approx. 120GB) | Labeled instruction data (approx. 5 million entries) |
| Vocabulary Size | Original vocabulary (128,256) | Original vocabulary (128,256) |
| Supported Context Length | 8K | 8K |
| Input Template | Not required | Requires Llama-3-Instruct template |
| Applicable Scenarios | Text continuation: Given a context, let the model generate the following text | Instruction understanding: Q&A, writing, chatting, interaction, etc. |
+The following compares the Instruct versions. **If you have no specific preference, use Instruct-v3.**
+
+| Comparison Item | Instruct-v1 | Instruct-v2 | Instruct-v3 |
+| :----------------------- | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
+| Release Date | 2024/4/30 | 2024/5/8 | 2024/5/30 |
+| Base Model | [Original Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | [Original Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | (See Training Method) |
+| Training Method | Stage 1: pre-training on 120GB of Chinese corpus<br/>Stage 2: fine-tuning on 5M instruction samples | Direct fine-tuning on 5M instruction samples | Model merging of inst-v1, inst-v2, and inst-meta, followed by fine-tuning on a small set of instruction data (~5K samples) |
+| Chinese Proficiency | 49.3 / 51.5 | 51.6 / 51.6 | **55.2 / 54.8** 👍🏻 |
+| English Proficiency | 63.21 | 66.68 | **66.81** 👍🏻 |
+| Long Text Capability | 29.6 | **46.4** 👍🏻 | 40.5 |
+| LLM Arena Win Rate / Elo | 49.4% / 1430 | 66.1% / 1559 | **83.6% / 1627** 👍🏻 |
+
+> [!NOTE]
+> Chinese proficiency results are from C-Eval (valid); English proficiency results are from Open LLM Leaderboard (avg); long text capability results are from LongBench (avg). For detailed performance, please refer to the [💯 Model Performance](#模型效果) section. LLM arena results were retrieved on 2024/5/30 and are for reference only.
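The "model merging" step in the v3 training method can be illustrated with uniform parameter averaging. The exact merge recipe for Instruct-v3 is not documented here, so this is only an assumed sketch (plain dicts stand in for model state dicts):

```python
# Illustrative sketch of model merging by uniformly averaging parameters.
# The actual merge algorithm and weights used for Instruct-v3 are not
# specified by the project; this is an assumption for illustration only.
def merge_state_dicts(state_dicts):
    """Average parameters (here: plain floats) key by key across models."""
    keys = state_dicts[0].keys()
    return {k: sum(sd[k] for sd in state_dicts) / len(state_dicts) for k in keys}

# Toy "state dicts" standing in for inst-v1, inst-v2, and inst-meta.
inst_v1 = {"w": 1.0, "b": 0.0}
inst_v2 = {"w": 3.0, "b": 3.0}
inst_meta = {"w": 2.0, "b": 3.0}
merged = merge_state_dicts([inst_v1, inst_v2, inst_meta])
print(merged)  # {'w': 2.0, 'b': 2.0}
```

In practice a merged model is usually further fine-tuned, as the table notes for v3.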
### Download Links
| Model Name | Full Version | LoRA Version | GGUF Version |
| --------------------------------------------------- | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
+| **Llama-3-Chinese-8B-Instruct-v3**<br/>(chat model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v3)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v3)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v3) | N/A | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v3-gguf)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v3-gguf) |
| **Llama-3-Chinese-8B-Instruct-v2**<br/>(chat model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2-lora)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-lora)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2-gguf)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-gguf) |
| **Llama-3-Chinese-8B-Instruct**<br/>(chat model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-lora)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-lora)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-gguf)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-gguf) |
| **Llama-3-Chinese-8B**<br/>(base model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-lora)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora)<br/>[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-gguf)<br/>[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-gguf) |
@@ -146,6 +163,7 @@ To evaluate the effectiveness of the related models, this project conducted both
| Models | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | Test (5-shot) |
| ------------------------ | :-----------: | :-----------: | :-----------: | :-----------: |
+| **Llama-3-Chinese-8B-Instruct-v3** | 55.2 | 54.8 | 52.1 | 52.4 |
| **Llama-3-Chinese-8B-Instruct-v2** | 51.6 | 51.6 | 49.7 | 49.8 |
| **Llama-3-Chinese-8B-Instruct** | 49.3 | 51.5 | 48.3 | 49.4 |
| **Llama-3-Chinese-8B** | 47.0 | 50.5 | 46.1 | 49.0 |
@@ -162,6 +180,7 @@ To evaluate the effectiveness of the related models, this project conducted both
| Models | Test (0-shot) | Test (5-shot) |
| ------------------------ | :-----------: | :-----------: |
+| **Llama-3-Chinese-8B-Instruct-v3** | 54.4 | 54.8 |
| **Llama-3-Chinese-8B-Instruct-v2** | 51.8 | 52.4 |
| **Llama-3-Chinese-8B-Instruct** | 49.7 | 51.5 |
| **Llama-3-Chinese-8B** | 48.0 | 50.9 |
@@ -178,6 +197,7 @@ To evaluate the effectiveness of the related models, this project conducted both
| Models | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | Test (5-shot) |
| ------------------------ | :-----------: | :-----------: | :-----------: | :-----------: |
+| **Llama-3-Chinese-8B-Instruct-v3** | 64.7 | 65.0 | 64.8 | 65.9 |
| **Llama-3-Chinese-8B-Instruct-v2** | 62.1 | 63.9 | 62.6 | 63.7 |
| **Llama-3-Chinese-8B-Instruct** | 60.1 | 61.3 | 59.8 | 61.8 |
| **Llama-3-Chinese-8B** | 55.5 | 58.5 | 57.3 | 61.1 |
@@ -194,6 +214,7 @@ To evaluate the effectiveness of the related models, this project conducted both
| Models | Single-doc QA | Multi-doc QA | Summarization | Few-Shot Learning | Code | Synthesis | Average |
| --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+| **Llama-3-Chinese-8B-Instruct-v3** | 20.3 | 28.8 | 24.5 | 28.1 | 59.4 | 91.9 | 40.5 |
| **Llama-3-Chinese-8B-Instruct-v2** | 57.3 | 27.1 | 13.9 | 30.3 | 60.6 | 89.5 | 46.4 |
| **Llama-3-Chinese-8B-Instruct** | 44.1 | 24.0 | 12.4 | 33.5 | 51.8 | 11.5 | 29.6 |
| **Llama-3-Chinese-8B** | 16.4 | 19.3 | 4.3 | 28.7 | 14.3 | 4.6 | 14.6 |
@@ -212,6 +233,7 @@ To evaluate the effectiveness of the related models, this project conducted both
| Models | ARC | HellaS | MMLU | TQA | WinoG | GSM8K | Average |
| ------------------------------------------------------------ | :---: | :----: | :---: | :---: | :---: | :---: | :-----: |
+| **Llama-3-Chinese-8B-Instruct-v3** | 63.40 | 80.51 | 67.90 | 53.57 | 76.24 | 59.21 | 66.81 |
| **Llama-3-Chinese-8B-Instruct-v2** | 62.63 | 79.72 | 66.48 | 53.93 | 76.72 | 60.58 | 66.68 |
| **Llama-3-Chinese-8B-Instruct** | 61.26 | 80.24 | 63.10 | 55.15 | 75.06 | 44.43 | 63.21 |
| **Llama-3-Chinese-8B** | 55.88 | 79.53 | 63.70 | 41.14 | 77.03 | 37.98 | 59.21 |