C-Eval


C-Eval is a Chinese foundational model evaluation suite jointly developed by Shanghai Jiao Tong University, Tsinghua University, and the University of Edinburgh. It compr...

Location: China
Language: CN
Collection time: 2025-05-30

As large language models advance rapidly, evaluating their capabilities has become crucial. C-Eval, jointly developed by research teams from Shanghai Jiao Tong University, Tsinghua University, and the University of Edinburgh and released in May 2023, is designed for this purpose.

Website Introduction

C-Eval is a Chinese-language evaluation suite designed specifically for large language models. It aims to comprehensively assess a model's Chinese comprehension and reasoning abilities through tests spanning multiple difficulty levels and disciplines.

Key Features

- Extensive Coverage: Comprises 13,948 multiple-choice questions across 52 disciplines.
- Multiple Difficulty Levels: Questions are categorized into four difficulty levels (middle school, high school, college, and professional), catering to various evaluation needs.
- Standardized Assessment: Provides a unified evaluation standard, facilitating horizontal comparison of model performance.
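Because every question is multiple-choice, comparing models comes down to accuracy, which can be broken out by difficulty level. The following is a minimal sketch of that calculation, not C-Eval's official scoring code. The record keys `level`, `answer`, and `prediction` are hypothetical names chosen for illustration and are not taken from the C-Eval data format.

```python
from collections import defaultdict


def accuracy_by_level(records):
    """Compute per-level and overall accuracy for multiple-choice answers.

    Each record is a dict with hypothetical keys (assumed for this sketch):
      'level'      - difficulty level, e.g. "middle school" or "college"
      'answer'     - the correct option letter, e.g. "A"
      'prediction' - the option letter the model chose
    """
    correct = defaultdict(int)
    total = defaultdict(int)

    for record in records:
        level = record["level"]
        total[level] += 1
        # Compare option letters case-insensitively, ignoring stray whitespace.
        if record["prediction"].strip().upper() == record["answer"].strip().upper():
            correct[level] += 1

    per_level = {level: correct[level] / total[level] for level in total}
    n_total = sum(total.values())
    overall = sum(correct.values()) / n_total if n_total else 0.0
    return per_level, overall


if __name__ == "__main__":
    sample = [
        {"level": "middle school", "answer": "A", "prediction": "a"},
        {"level": "middle school", "answer": "B", "prediction": "C"},
        {"level": "college", "answer": "D", "prediction": "D "},
    ]
    per_level, overall = accuracy_by_level(sample)
    print(per_level, overall)
```

Reporting the per-level figures alongside the overall number shows whether a model's score is driven by the easier or the harder questions.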

Related Projects

C-Eval is comparable to English-language benchmarks such as MMLU, and aims to offer an authoritative evaluation standard for Chinese large language models.

Advantages

- Comprehensiveness: Covers a wide range of disciplines and difficulty levels, ensuring thorough evaluation.
- Authority: Jointly developed by renowned universities, ensuring scientific and impartial assessment.
- Practicality: Offers researchers and developers a reliable tool to aid in model optimization and improvement.

Pricing

C-Eval is currently open and free to access; users can visit its official website to obtain relevant resources.

Summary

The launch of C-Eval provides a standardized and authoritative tool for evaluating Chinese large language models. Through its comprehensive question bank and multi-level evaluation system, researchers and developers can gain a more accurate understanding of model performance, enabling targeted optimization and enhancement.
