- 2025-02-01: We have released the UGPhysics dataset and our paper on arXiv.
UGPhysics is a large-scale, comprehensive benchmark for evaluating the physics problem-solving abilities of LLMs across multiple UnderGraduate-level Physics (UGPhysics) disciplines. It comprises 5,520 distinct problems spanning 3 main domains, 13 core subjects, and 59 key topics.
coming soon!
```bash
git clone https://huggingface.co/datasets/UGPhysics/ugphysics
mv ugphysics/* ./data/
```
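To sanity-check the download, you can list what ended up under `data/`. This is a minimal sketch; the exact file layout (e.g., JSONL files per subject) is an assumption, so verify it against the dataset repo:

```bash
# List the downloaded files (directory layout is an assumption; verify after cloning).
ls -R data/
# Peek at the first record of each JSONL file, if the splits are stored as JSONL.
head -n 1 data/*.jsonl 2>/dev/null | head -c 500
```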
```bash
export OPENAI_BASE_URL=your_base_url
export OPENAI_API_KEY=your_key
```
For closed-source LLMs, please replace `${MODEL}` and `${SUBJECT}` with the LLM and subject you want to test.

```bash
python codes/infer_close.py --model ${MODEL} --subject ${SUBJECT}
```
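For example, to test GPT-4o on a single subject (both names here are illustrative; use a subject identifier actually defined by the dataset):

```bash
# Hypothetical invocation; substitute a valid subject name from the dataset.
python codes/infer_close.py --model gpt-4o --subject Electrodynamics
```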
For open-source LLMs, we use vLLM to accelerate inference.

```bash
python codes/generate_open.py --model ${MODEL} --system ${SYS_PROMPT} --subject ${SUBJECT} --tensor_parallel_size 8
```
where `${SYS_PROMPT}` is the system prompt, e.g., "Please reason step by step, and put your final answer within \boxed{}.".
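A concrete run might look like the following (the model name and subject are illustrative, and `--tensor_parallel_size` should match your GPU count):

```bash
# Hypothetical invocation; substitute a model you have locally and a valid subject name.
python codes/generate_open.py \
    --model Qwen/Qwen2.5-7B-Instruct \
    --system "Please reason step by step, and put your final answer within \boxed{}." \
    --subject Electrodynamics \
    --tensor_parallel_size 8
```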
Remember to export your keys as mentioned above.

```bash
python codes/eval.py --model_path ${MODEL} --subject ${SUBJECT}
```
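For example (again with illustrative names), after generating responses you would score them with:

```bash
# Hypothetical invocation; match --model_path and --subject to the generation step above.
python codes/eval.py --model_path Qwen/Qwen2.5-7B-Instruct --subject Electrodynamics
```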
We thank the authors of OlympiadBench and OlympicArena for open-sourcing their code. Our `README.md` is adapted from OmniMath.
If you find our work interesting and meaningful, please consider giving our repo a 🌟 and citing our paper.
```bibtex
@article{xu2025ugphysics,
  title={UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models},
  author={Xu, Xin and Xu, Qiyun and Xiao, Tong and Chen, Tianhao and Yan, Yuchen and Zhang, Jiaxin and Diao, Shizhe and Yang, Can and Wang, Yang},
  journal={arXiv preprint arXiv:2502.00334},
  year={2025}
}
```