A question about testing on the C-Eval dataset #124

niexufei · 2023-07-28T06:44:20Z


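1. When testing on the C-Eval dataset, the evaluation seems to run five-shot by default. Below is the output predictions file:
ceval-accountant.json.txt
Every question is preceded by several examples. How can I set this to 0-shot?
2. Are these few-shot examples all taken from the dev set? They seem to be fixed for every question, though I have not checked each one. Is that actually the case? If so, the dev set only needs to keep 5 rows of data.
3. How do I set up a chain-of-thought evaluation, e.g. prompting the model to think step by step and making use of the explanation column in dev?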

Leymore · 2023-07-31T02:24:33Z


1. Modify the following lines

https://github.com/InternLM/opencompass/blob/262ab794fb52084e4494c281210e652706ce1280/configs/datasets/ceval/ceval_ppl_578f8d.py#L166-L167

into

            retriever=dict(type=ZeroRetriever),
            inferencer=dict(type=PPLInferencer),

and that is 0-shot.
2. Yes, all the in-context examples come from the dev set. Every test case in the same subset (e.g. 计算机网络 (Computer Network) / 初中化学 (Middle School Chemistry)) uses the same in-context examples, while test cases from different subsets use different ones. Each subset in dev contains exactly 5 rows of data.
3. Please take this as an example: https://github.com/Leymore/opencompass/blob/8087b91e81ba1fb361d7153db60e6883e5e81210/configs/datasets/ceval/ceval_gen_286279.py
We have also just updated the prompt documentation, which explains how prompts are used. I believe it is worth reading.
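
To make the three points above concrete, here is a minimal sketch in plain Python of how a fixed few-shot prompt, a 0-shot prompt, and a chain-of-thought prompt differ. It is not OpenCompass code: `format_example` and `build_prompt` are hypothetical helpers, the prompt wording is made up for illustration, and the row layout (`question`, `A` to `D`, `answer`, `explanation`) is assumed to match the dev set columns.

```python
# Illustrative only: these helpers are hypothetical and are not OpenCompass APIs.
from typing import Dict, List

Row = Dict[str, str]


def format_example(row: Row, with_answer: bool, chain_of_thought: bool = False) -> str:
    """Render one multiple-choice item as plain text."""
    lines = [f"Question: {row['question']}"]
    for option in ("A", "B", "C", "D"):
        lines.append(f"{option}. {row[option]}")
    if not with_answer:
        # The test item ends with an open cue for the model to continue.
        lines.append("Let's think step by step." if chain_of_thought else "Answer:")
    elif chain_of_thought:
        # In-context example: show the explanation before the answer.
        lines.append(f"Let's think step by step. {row['explanation']}")
        lines.append(f"So the answer is {row['answer']}.")
    else:
        lines.append(f"Answer: {row['answer']}")
    return "\n".join(lines)


def build_prompt(dev_rows: List[Row], test_row: Row, k: int = 5,
                 chain_of_thought: bool = False) -> str:
    """Prepend the first k dev rows (the same ones for every test item) to the test item."""
    shots = [format_example(r, True, chain_of_thought) for r in dev_rows[:k]]
    query = format_example(test_row, False, chain_of_thought)
    return "\n\n".join(shots + [query])
```

With `k=0` no dev rows are prepended, which corresponds to the 0-shot setting. With `k=5` every test item in a subset sees the same five dev rows, and `chain_of_thought=True` is one way the `explanation` column could be put to use.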
