How can I get the answers in the dataset? #3

Open
Guozhenyuan opened this issue Oct 19, 2024 · 2 comments
Labels: question (Further information is requested)

Comments

@Guozhenyuan

CodeMMLU is a great piece of work!

I noticed that the dataset provides task_id, question, and choices columns, but is there an answer column?

How should I handle this dataset if I want to fine-tune an LLM?

@sinsauzero

+1

@nmd2k
Collaborator

nmd2k commented Oct 25, 2024

Thank you for your interest in CodeMMLU! @Guozhenyuan @sinsauzero

  1. To evaluate a model with CodeMMLU, you can send me your LLM's responses by email (following the format described HERE).

  2. CodeMMLU is intended as a benchmark, not a training dataset. Its purpose is to evaluate the performance of pre-trained LLMs, so it may not be suitable for fine-tuning.
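
For anyone preparing responses to send in, here is a minimal sketch of how the `task_id`, `question`, and `choices` columns mentioned above could be turned into prompts and paired with model outputs. Everything beyond those three column names is an assumption for illustration: that `choices` is a list of strings, the prompt layout, the helper names (`format_prompt`, `collect_responses`, `to_jsonl`), the `response` key, and the JSON Lines output. This is not the maintainers' submission format, so check the format they link to before sending anything.

```python
import json
import string


def format_prompt(row):
    """Render one row as a lettered multiple-choice prompt.

    Assumes `row["choices"]` is a list of strings (an assumption, not confirmed
    by the thread).
    """
    lines = [row["question"], ""]
    for letter, choice in zip(string.ascii_uppercase, row["choices"]):
        lines.append(f"({letter}) {choice}")
    lines.append("")
    lines.append("Answer:")
    return "\n".join(lines)


def collect_responses(rows, generate):
    """Call `generate(prompt)` per row and pair each output with its task_id."""
    return [
        {"task_id": row["task_id"], "response": generate(format_prompt(row))}
        for row in rows
    ]


def to_jsonl(records):
    """Serialize records as JSON Lines (hypothetical output format)."""
    return "\n".join(json.dumps(record) for record in records)


# Made-up row for demonstration only; it is not taken from CodeMMLU.
rows = [
    {
        "task_id": "demo-0",
        "question": "Which keyword defines a function in Python?",
        "choices": ["func", "def", "lambda", "fn"],
    }
]


def dummy_generate(prompt):
    """Stand-in for a real model call; always answers 'B'."""
    return "B"


records = collect_responses(rows, dummy_generate)
print(to_jsonl(records))
```

Replace `dummy_generate` with a call to your own model. Keeping `task_id` next to each raw response means the outputs can be matched to the hidden answers without the answers ever being distributed.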

@nmd2k nmd2k self-assigned this Oct 25, 2024
@nmd2k nmd2k added the question Further information is requested label Oct 25, 2024