To evaluate with CodeMMLU, you can send us the LLM's responses via email (following the format mentioned in HERE).
Since CodeMMLU is intended as a benchmark rather than a training dataset, its purpose is to evaluate the performance of pre-trained LLMs, so it is not well suited to fine-tuning.
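For evaluation, you can still build prompts locally from the released columns. A minimal sketch, assuming records with the `task_id`, `question`, and `choices` fields described in this thread (the record below is a made-up example, since the public split withholds answers):

```python
# Hypothetical helper: turn a CodeMMLU-style record (task_id, question,
# choices) into a letter-labeled multiple-choice prompt. Without an
# `answer` column this only produces evaluation prompts, not training labels.
from string import ascii_uppercase

def build_prompt(record: dict) -> str:
    lines = [record["question"].strip()]
    # Label each choice A., B., C., ... in order.
    for letter, choice in zip(ascii_uppercase, record["choices"]):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)

# Dummy record for illustration only (not from the real dataset).
example = {
    "task_id": "demo/0",
    "question": "Which keyword defines a function in Python?",
    "choices": ["func", "def", "lambda", "fn"],
}
print(build_prompt(example))
```

The model's completion after `Answer:` can then be matched against the choice letters when you collect responses for scoring.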
The CodeMMLU is a great piece of work!
I noticed that the dataset provides task_id, question, and choices columns, but is there an answer column?
How should I handle this dataset if I want to fine-tune an LLM?