![feature image](/images/service-cn_hu443715ef1f3c4e3c4958d17685de675a_166663_1110x0_resize_lanczos_3.png)
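Five highlights of the math evaluation benchmark
- The first one-stop evaluation benchmark for LLMs in mathematics
- Flexible extension: new math evaluation datasets can be added to the benchmark with little effort
- Multi-dimensional partitioning of evaluation datasets, so LLM math ability can be assessed from several angles
- Broad model support: HF models, API models, and custom open-source models can all be connected for evaluation
- Multiple evaluation modes: both zero-shot and few-shot evaluation are supported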
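To make the zero-shot versus few-shot distinction concrete, here is a minimal sketch of how the two prompt styles differ. It is not the benchmark's own code: the function name `build_prompt`, the `Question:`/`Answer:` template, and the sample arithmetic questions are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def build_prompt(question: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build an evaluation prompt (hypothetical template, for illustration only).

    With no examples the prompt is zero-shot; with one or more solved
    (question, answer) pairs prepended it becomes few-shot.
    """
    # Each solved example is rendered in the same layout as the target question.
    parts = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    # The target question comes last, with the answer left open for the model.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)


# Zero-shot: the model sees only the target question.
zero_shot = build_prompt("What is 7 * 8?")

# Few-shot: the same question, preceded by two solved examples.
few_shot = build_prompt(
    "What is 7 * 8?",
    [("What is 2 + 3?", "5"), ("What is 10 - 4?", "6")],
)
```

The only difference between the two modes in this sketch is the `examples` argument, so the same dataset can be scored under both settings by changing a single parameter.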
Models already on the evaluation leaderboard
GPT-4
GPT-3.5
LLaMA2-7B
LLaMA2-7B-chat
LLaMA2-13B
LLaMA2-13B-chat
LLaMA2-70B
LLaMA2-70B-chat
ChatGLM2-6B
Baichuan2-13B-Base
InternLM-20B
InternLM-chat-20B
InternLM2-base-20B
InternLM2-chat-20B
InternLM2-math-20B
MathGPT
Qwen
WizardMath-13B-V1.0
WizardMath-70B-V1.0
MOSS-003-base-16B
ERNIE Bot (文心一言)
iFLYTEK Spark (讯飞星火)
MAmmoTH-70B
GAIRMath-Abel-70B
Mistral-7B-Instruct-v0.1
Mistral-7B-v0.1
Llemma-7B
Llemma-34B
MetaMath-70B
GLM4
Apply to join the evaluation