1. When testing on the C-Eval data, it seems that five-shot testing is run by default. Below is the output prediction file:
Replies: 1 comment
1. Change the lines at
https://github.com/InternLM/opencompass/blob/262ab794fb52084e4494c281210e652706ce1280/configs/datasets/ceval/ceval_ppl_578f8d.py#L166-L167
into `retriever=dict(type=ZeroRetriever), inferencer=dict(type=PPLInferencer),` and that is 0-shot.
2. Yes, all the in-context examples come from the dev set. Yes, every test case in the same subset (e.g. 计算机网络 (computer networks) / 初中化学 (middle school chemistry)) uses the same in-context examples, and test cases from different subsets use different in-context examples. Yes, each subset in dev has exactly 5 lines of data.
3. Please take this as an example: https://github.com/Leymore/opencompass/blob/8087b91e81ba1fb361d7153db60e6883e5e81210/conf…
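
To make the difference between 5-shot and 0-shot concrete, here is a minimal, library-free sketch of how a fixed set of dev rows could be prepended to every test question in a subset. It is not OpenCompass's implementation: the `DEV` dictionary, the `build_prompt` helper, the subset key, and the prompt wording are all made up for illustration.

```python
from typing import Dict, List

# Hypothetical dev-set rows for one subset (5 rows per subset, as described above).
# Real C-Eval rows carry more fields; these are placeholders.
DEV: Dict[str, List[Dict[str, str]]] = {
    "computer_network": [
        {"question": f"dev question {i}", "answer": "A"} for i in range(5)
    ],
}


def build_prompt(subset: str, test_question: str, shots: int = 5) -> str:
    """Prepend the first `shots` dev rows of `subset` to the test question."""
    examples = DEV[subset][:shots]  # shots=0 yields an empty list, i.e. 0-shot
    parts = [
        f"Question: {ex['question']}\nAnswer: {ex['answer']}" for ex in examples
    ]
    parts.append(f"Question: {test_question}\nAnswer:")
    return "\n\n".join(parts)


# Two test cases from the same subset share the same five in-context examples.
five_shot_a = build_prompt("computer_network", "test question 1")
five_shot_b = build_prompt("computer_network", "test question 2")

# With shots=0 only the test question itself remains in the prompt.
zero_shot = build_prompt("computer_network", "test question 1", shots=0)
```

In this sketch, switching from 5-shot to 0-shot only changes how many dev rows are retrieved; the test question and answer slot stay the same.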