Some questions about CodeArena #209

Open
buaali opened this issue Dec 18, 2024 · 2 comments

Comments

@buaali

buaali commented Dec 18, 2024

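Hello, I deployed the CodeArena evaluation benchmark from this repository and ran into a problem when running the eval_arena.sh script.
I noticed that the script has three steps:

  1. python infer_vllm.py generates the output.jsonl file
  2. python judge_models.py generates the output.jsonl.judge file
  3. python judge_models.py generates the output.jsonl.judge.metric file

The first two steps ran normally and produced the corresponding files, but the third step failed with this error: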
Traceback (most recent call last):
  File "/home/largemodel/others/CodeArena/judge_models.py", line 268, in <module>
    main()
  File "/home/largemodel/others/CodeArena/judge_models.py", line 253, in main
    score = get_scores(objs, tasktype_to_levels)
  File "/home/largemodel/others/CodeArena/judge_models.py", line 227, in get_scores
    main_classified_scores, sub_classified_scores = calculate_classified_score(objs, tasktype_to_levels)
  File "/home/largemodel/others/CodeArena/judge_models.py", line 152, in calculate_classified_score
    task_type = wash_tag(obj["meta"]["parsed"]["task_type"][0])
KeyError: 'meta'

While debugging, I found that obj is a dictionary. Printing its keys gives:
dict_keys(['messages', 'id', 'gpt-4o-2024-05-13_response', 'gpt-4o-2024-08-06_response', 'gpt-4-turbo-2024-04-09_response', 'difficulty', 'level', 'programming_language', 'gpt-4-turbo-2024-04-09_response_len', 'input', 'model', 'response', 'question', 'games', 'if_win', 'if_tie'])
The meta key indeed does not exist. Which part of the pipeline is supposed to generate it? It appears to store the type information corresponding to the code, is that right?

@CSJianYang
Collaborator

We have updated the CodeArena test file, and the updated version includes the "meta" key. You can re-download the test file from https://huggingface.co/datasets/CSJianYang/CodeArena
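
For anyone hitting the same KeyError, a quick check before rerunning the metric step can confirm whether a file actually carries the nested key that the traceback shows being read (obj["meta"]["parsed"]["task_type"]). The following is only a minimal sketch, not part of the CodeArena repository: it assumes the file is JSON Lines (one JSON object per line), and the function name find_records_missing_key is hypothetical.

```python
import json
from pathlib import Path


def find_records_missing_key(jsonl_path, key_path=("meta", "parsed", "task_type")):
    """Return (line_number, record_id) for every record lacking the nested key path.

    Hypothetical helper: assumes one JSON object per line (JSON Lines).
    """
    missing = []
    with Path(jsonl_path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            obj = json.loads(line)
            record_id = obj.get("id") if isinstance(obj, dict) else None
            node = obj
            # Walk the key path one level at a time; stop at the first gap.
            for key in key_path:
                if not isinstance(node, dict) or key not in node:
                    missing.append((line_no, record_id))
                    break
                node = node[key]
    return missing
```

An empty list means every record has the key path. A non-empty list points at the records that would trigger the KeyError, which suggests the file was produced from the older test set and that inference and judging need to be rerun on the re-downloaded one.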

@buaali
Author

buaali commented Dec 18, 2024

Resolved, thank you!
