Open LLM Leaderboard: DataLearner Mirror

Open LLM Leaderboard China mirror: evaluation score rankings for large language models

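To make lookups easier, DataLearnerAI has released DataLearnerAI-GPT, which can already answer questions about the evaluation results of any model based on Open LLM Leaderboard data. It is available at: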

https://chat.openai.com/g/g-8eu9KgtUm-datalearnerai-gpt

For a detailed introduction to DataLearnerAI-GPT, see: https://www.datalearner.com/blog/1051699757266256

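With large numbers of large language models (LLMs) and chatbots released every week, often accompanied by exaggerated performance claims, it can be hard to tell which progress made by the open-source community is genuine and which model is the current state of the art.

To address this, Hugging Face launched this open evaluation leaderboard. The 📐 🤗 Open LLM Leaderboard aims to track, rank, and evaluate open-source LLMs and chatbots by their scores on a set of evaluation tasks.

Because access to Hugging Face can be slow or unstable, we provide a synchronized copy of the results. The original page is at: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard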

Evaluation tasks used by the Open LLM Leaderboard

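AI2 Reasoning Challenge (25-shot)

A set of grade-school science questions.

HellaSwag (10-shot)

A commonsense inference test that is easy for humans (about 95% accuracy) but challenging for state-of-the-art models.

MMLU (5-shot)

Measures a text model's multitask accuracy across 57 tasks, including elementary mathematics, US history, computer science, law, and more.

TruthfulQA (0-shot)

Measures a model's tendency to reproduce falsehoods commonly found online. Note: in the evaluation harness, TruthfulQA is actually at least a 6-shot task.

Winogrande (5-shot)

A large-scale, adversarial, and difficult Winograd benchmark for commonsense reasoning.

GSM8k (5-shot)

Diverse grade-school math word problems that test a model's ability to solve multi-step mathematical reasoning problems.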
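The "k-shot" labels above give the number of solved examples placed in front of each test question. The sketch below illustrates the idea only. The shot counts come from the list above, while the `Question:`/`Answer:` template and the function name `build_few_shot_prompt` are assumptions made for illustration, not the format the evaluation harness actually uses.

```python
# Shot counts as listed above. The prompt template is hypothetical.
SHOTS = {
    "ARC": 25,
    "HellaSwag": 10,
    "MMLU": 5,
    "TruthfulQA": 0,
    "Winogrande": 5,
    "GSM8k": 5,
}


def build_few_shot_prompt(examples, question, k):
    """Prepend the first k solved (question, answer) pairs to an unsolved question."""
    if k > len(examples):
        raise ValueError("not enough solved examples for the requested shot count")
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples[:k]]
    # The final block is left open so the model has to supply the answer.
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


# Toy examples, not taken from any benchmark.
solved = [("What is 2 + 3?", "5"), ("What is 7 - 4?", "3")]
prompt = build_few_shot_prompt(solved, "What is 6 + 1?", k=2)
print(prompt)
```

With `k=0` the function returns only the open question, which is what a 0-shot setting means.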

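The model-type icons used in the table below mean the following:

🟢 : Pretrained models: new base models that were pretrained on a given dataset.

🔶 : Domain-specific fine-tuned models: pretrained models that were further fine-tuned on domain-specific datasets for better performance.

💬 : Chat models: chat-style fine-tunes produced with IFT (instruction fine-tuning on task-instruction datasets), RLHF (reinforcement learning from human feedback), or DPO (which slightly changes the model's loss with an added policy).

🤝 : Base merges and MoErges: models that combine several models through merging or MoErges (model fusion), without additional fine-tuning.

If you find a model without an icon, feel free to open an issue so that the model information can be added.

❓ : Unknown.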

You can filter the ranking by the following model types:

All models

Pretrained Models

Fine Tuned Models

Chat Models

Merged or MoE Models
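For readers working with a copy of the table rather than the interactive page, the same filtering can be sketched in a few lines. This is a minimal illustration, assuming each row has already been reduced to a `(name, icon, average)` tuple. The short category labels and the helper name `filter_by_type` are hypothetical, and the sample rows are copied from this page's table.

```python
# Icon legend from this page, mapped to short labels chosen for this example.
TYPE_BY_ICON = {
    "🟢": "pretrained",
    "🔶": "fine-tuned",
    "💬": "chat",
    "🤝": "merge",
    "❓": "unknown",
}

# (name, icon, average) for four rows of the table.
rows = [
    ("bloom", "🟢", 46.07),
    ("quan-1.8b-chat", "💬", 45.91),
    ("mistral_v1", "🔶", 45.85),
    ("CodeLlama-13b-Instruct-hf", "💬", 45.82),
]


def filter_by_type(rows, wanted):
    """Keep rows of one model type, sorted by average score, highest first."""
    selected = [r for r in rows if TYPE_BY_ICON.get(r[1], "unknown") == wanted]
    return sorted(selected, key=lambda r: r[2], reverse=True)


chat_models = filter_by_type(rows, "chat")
for name, _icon, average in chat_models:
    print(name, average)
```

Rows whose icon is missing from the mapping fall back to "unknown", mirroring the ❓ entry in the legend.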

Model name	Model type	Parameters (×100M)	Average	ARC	HellaSwag	MMLU	TruthfulQA	Winogrande	GSM8K	Architecture
mistral-class-tutor-7b-ep3 📑	🔶	72.4	46.09	47.95	77.8	34.57	44.69	71.51	0.0	MistralForCausalLM
bloom ✅ 📑	🟢	1762.5	46.07	50.43	76.41	30.85	39.76	72.06	6.9	BloomForCausalLM
vicuna-7b-v1.5-lora-mctaco-modified2 📑	🔶	66.1	46.03	42.92	73.97	48.49	40.43	69.69	0.68	Unknown
Ambari-7B-base-v0.1-sharded 📑	🔶	68.8	45.92	47.95	74.62	40.39	38.91	72.06	1.59	LlamaForCausalLM
ssh_1.8B 📑	🔶	18.4	45.91	39.08	62.37	44.09	43.15	59.27	27.52	LlamaForCausalLM
quan-1.8b-chat 📑	💬	18	45.91	39.08	62.37	44.09	43.15	59.27	27.52	LlamaForCausalLM
mistral_v1 📑	🔶	72.4	45.85	47.01	67.58	48.68	37.53	64.8	9.48	MistralForCausalLM
CodeLlama-13b-Instruct-hf 📑	💬	130.2	45.82	44.54	64.93	38.89	45.88	68.03	12.66	LlamaForCausalLM
CodeLlama-13B-Instruct-fp16 📑	💬	130.2	45.82	44.62	64.94	38.77	45.88	68.03	12.66	LlamaForCausalLM
Kan-LLaMA-7B-SFT-v0.1-sharded 📑	🔶	68.8	45.76	45.9	71.43	40.86	45.04	68.82	2.5	LlamaForCausalLM
Ambari-7B-Instruct-v0.1-sharded 📑	🔶	68.8	45.74	50.0	74.59	38.03	40.39	69.53	1.9	LlamaForCausalLM
llama2-7b-raw-sft 📑	🔶	67.4	45.67	47.44	75.25	33.86	40.77	73.01	3.71	LlamaForCausalLM
mistral-7b-raw-sft 📑	🔶	67.4	45.67	47.44	75.25	33.86	40.77	73.01	3.71	LlamaForCausalLM
Planner-7B-fp16 📑	🔶	70	45.65	51.02	77.82	35.71	34.33	71.43	3.56	LlamaForCausalLM
speechless-codellama-platypus-13b 📑	💬	130	45.64	45.31	68.63	42.82	42.38	65.59	9.1	LlamaForCausalLM
llama-base-7b 📑	🟢	66.1	45.62	50.94	77.8	35.67	34.34	71.43	3.56	Unknown
PandaLM-Alpaca-7B-v1 📑	🔶	70	45.59	50.85	77.36	35.91	36.63	71.9	0.91	LlamaForCausalLM
Airavata 📑	🔶	68.7	45.52	46.5	69.26	43.9	40.62	68.82	4.02	LlamaForCausalLM
tamil-llama-7b-instruct-v0.1 📑	🔶	70	45.52	48.04	70.97	39.95	41.7	70.64	1.82	LlamaForCausalLM
chinese-llama-plus-13b-hf 📑	🔶	130	45.39	46.25	71.88	40.74	39.89	73.09	0.53	LlamaForCausalLM
vicuna-7b-v1.5-lora-mctaco-modified1 📑	🔶	66.1	45.38	40.87	73.4	47.42	39.87	69.46	1.29	Unknown
openthaigpt-1.0.0-beta-7b-chat-ckpt-hf 📑	🔶	70	45.35	44.97	70.19	36.22	49.99	69.38	1.36	LlamaForCausalLM
ALMA-7B 📑	🔶	70	45.32	50.34	75.5	38.04	35.64	72.38	0.0	LlamaForCausalLM
MiniChat-3B 📑	🔶	30.2	45.31	44.03	67.19	39.17	45.67	65.27	10.54	LlamaForCausalLM
opt-iml-max-30b 📑	❓	300	45.28	43.86	72.39	41.09	38.16	73.72	2.5	OPTForCausalLM
openbuddy-openllama-7b-v12-bf16 📑	🔶	70	45.28	42.06	62.01	46.53	45.18	65.04	10.84	LlamaForCausalLM
stablelm-2-1_6b ✅ 📑	🟢	16.4	45.25	43.34	70.45	38.95	36.78	64.56	17.44	Unknown
HamSter-0.1 📑	🔶	72.4	45.19	46.93	68.08	43.03	51.24	61.88	0.0	MistralForCausalLM
llama-shishya-7b-ep3-v1 📑	🔶	70	45.19	48.04	76.63	46.12	30.9	69.46	0.0	LlamaForCausalLM
guanaco-unchained-llama-2-7b 📑	🔶	70	45.11	47.35	72.16	41.76	41.49	64.48	3.41	Unknown
speechless-coding-7b-16k-tora 📑	🔶	70	45.1	41.21	64.45	39.14	44.91	63.61	17.29	LlamaForCausalLM
vicuna-7b-v1.5-lora-mctaco-modified4 📑	🔶	66.1	45.1	40.7	73.08	47.26	41.59	67.88	0.08	Unknown
speechless-coding-7b-16k-tora 📑	🔶	70	45.05	41.13	64.48	38.86	44.95	63.85	17.06	LlamaForCausalLM
Qwen-VL-LLaMAfied-7B-Chat 📑	🔶	70	45.0	47.35	69.97	44.12	42.87	65.67	0.0	LlamaForCausalLM
llama-7b-logicot 📑	❓	70	44.95	47.01	72.56	38.93	43.63	67.56	0.0	LlamaForCausalLM
WizardLM-7B-Uncensored 📑	🔶	66.1	44.92	47.87	73.08	35.42	41.49	68.43	3.26	Unknown
codellama-13b-oasst-sft-v10 📑	💬	130.2	44.85	45.39	62.36	35.36	45.02	67.8	13.19	LlamaForCausalLM
CodeLLaMA-chat-13b-Chinese 📑	🔶	128.5	44.84	43.26	63.87	34.29	48.97	67.88	10.77	Unknown
speechless-codellama-orca-13b 📑	💬	130	44.83	44.37	65.2	43.46	45.94	64.01	5.99	LlamaForCausalLM
mistral-class-shishya-all-hal-7b-ep3 📑	🔶	70	44.8	46.59	78.87	34.45	35.98	72.93	0.0	MistralForCausalLM
chinese-alpaca-plus-7b-hf 📑	🔶	70	44.77	49.23	70.48	38.39	39.72	70.09	0.68	LlamaForCausalLM
MiniMA-2-3B 📑	🔶	30	44.75	44.71	69.33	41.22	38.44	66.69	8.11	LlamaForCausalLM
Qwen-1_8B-Llamafied 📑	🟢	18.4	44.75	37.71	58.87	46.37	39.41	61.72	24.41	LlamaForCausalLM
palmyra-med-20b 📑	💬	200	44.71	46.93	73.51	44.34	35.47	65.35	2.65	GPT2LMHeadModel
Poro-34B-GPTQ 📑	💬	480.6	44.67	47.01	73.75	32.47	38.37	71.35	5.08	BloomForCausalLM
ThetaWave-14B-v0.1 📑	🟢	142.2	44.54	42.83	47.09	61.45	50.41	65.43	0.0	MistralForCausalLM
tamil-llama-7b-base-v0.1 📑	🔶	70	44.52	46.67	72.85	40.95	35.93	70.72	0.0	LlamaForCausalLM
Project-Baize-v2-7B-GPTQ 📑	❓	90.4	44.5	45.99	73.44	35.46	39.92	69.69	2.5	LlamaForCausalLM
h2o-danube-1.8b-chat 📑	💬	18.3	44.49	41.13	68.06	33.41	41.64	65.35	17.36	MistralForCausalLM
falcon_7b_norobots 📑	💬	70	44.46	47.87	77.92	27.94	36.81	71.74	4.47	Unknown
falcon_7b_norobots 📑	🔶	70	44.4	48.12	77.9	28.11	36.76	71.59	3.94	Unknown
rank_vicuna_7b_v1_fp16 📑	🔶	70	44.36	44.62	65.67	44.14	45.13	66.61	0.0	LlamaForCausalLM
llama-shishya-7b-ep3-v2 📑	🔶	70	44.33	47.35	75.88	43.84	30.16	68.75	0.0	LlamaForCausalLM
CodeLlama-34b-Instruct-hf 📑	💬	337.4	44.33	40.78	35.66	39.72	44.29	74.51	31.01	LlamaForCausalLM
koala-7B-HF 📑	🔶	70	44.29	47.1	73.58	25.53	45.96	69.93	3.64	LlamaForCausalLM
mistral-class-shishya-7b-ep3 📑	🔶	70	44.28	46.59	76.62	39.07	33.54	69.85	0.0	MistralForCausalLM
open_llama_7b_v2 📑	🟢	70	44.26	43.69	72.2	41.29	35.54	69.38	3.49	LlamaForCausalLM
tora-code-13b-v1.0 📑	🔶	130	44.19	44.71	69.15	36.69	34.98	63.14	16.45	LlamaForCausalLM
falcon-7b ✅ 📑	🟢	70	44.17	47.87	78.13	27.79	34.26	72.38	4.62	FalconForCausalLM
speechless-codellama-airoboros-orca-platypus-13b 📑	🔶	130	44.1	44.88	67.7	43.16	40.88	66.14	1.82	LlamaForCausalLM
falcon_7b_DolphinCoder 📑	🔶	70	44.09	48.72	78.03	27.08	35.12	70.48	5.08	Unknown
falcon_7b_DolphinCoder 📑	🔶	70	44.09	48.72	78.03	27.08	35.12	70.48	5.08	Unknown
calm2-7b-chat-dpo-experimental 📑	💬	70.1	44.03	41.04	68.99	39.82	43.13	65.67	5.53	LlamaForCausalLM
llama-class-shishya-7b-ep3 📑	🔶	70	43.88	40.78	77.04	46.74	27.94	70.8	0.0	LlamaForCausalLM
BigTranslate-13B-GPTQ 📑	🔶	179.9	43.86	45.31	75.1	31.18	40.6	70.96	0.0	LlamaForCausalLM
gpt-sw3-20b-instruct 📑	💬	209.2	43.7	43.17	71.09	31.32	41.02	66.77	8.79	GPT2LMHeadModel
h2o-danube-1.8b-sft 📑	🔶	18.3	43.68	40.19	67.34	33.75	40.29	65.43	15.09	MistralForCausalLM
falcon_7b_3epoch_norobots 📑	💬	70	43.65	47.61	77.24	29.73	36.27	69.53	1.52	Unknown
deepseek-coder-6.7b-instruct ✅ 📑	💬	67.4	43.57	38.14	55.09	39.02	45.56	56.83	26.76	LlamaForCausalLM
amber_fine_tune_sg_part1 📑	🔶	67.4	43.5	44.88	75.1	29.36	40.85	67.01	3.79	LlamaForCausalLM
gpt-sw3-40b 📑	🟢	399.3	43.42	43.0	72.37	34.97	37.52	67.96	4.7	GPT2LMHeadModel
minima-3b-layla-v2 📑	🔶	30	43.39	44.2	69.93	28.53	43.64	65.43	8.64	LlamaForCausalLM
CodeLlama-13b-hf 📑	🟢	130.2	43.35	40.87	63.35	32.81	43.79	67.17	12.13	LlamaForCausalLM
tigerbot-7b-sft 📑	❓	70.7	43.35	41.64	60.56	29.89	58.18	63.54	6.29	Unknown
quan-1.8b-base 📑	🟢	18	43.35	36.95	58.46	45.44	41.6	57.93	19.71	LlamaForCausalLM
Kan-LLaMA-7B-base 📑	🔶	68.8	43.31	43.94	70.75	37.06	39.57	68.51	0.0	LlamaForCausalLM
amber_fine_tune_001 📑	🔶	67.4	43.28	44.8	73.78	30.41	42.93	64.09	3.64	LlamaForCausalLM
calm2-7b-chat 📑	💬	70	43.27	40.27	68.12	39.39	41.96	64.96	4.93	LlamaForCausalLM
falcon-7b-instruct 📑	💬	70	43.26	46.16	70.85	25.84	44.08	67.96	4.7	FalconForCausalLM
Guanaco 📑	❓	0	43.25	50.17	72.69	30.3	37.64	68.67	0.0	LlamaForCausalLM
minima-3b-layla-v1 📑	❓	30	43.21	42.32	67.48	28.44	46.46	65.9	8.64	LlamaForCausalLM
falcon-7b-instruct 📑	💬	70	43.16	45.82	70.78	25.66	44.07	68.03	4.62	FalconForCausalLM
chinese-llama-2-7b 📑	🔶	67	43.14	44.45	69.5	37.47	37.0	68.98	1.44	Unknown
GPT-JT-6B-v1 ✅ 📑	🔶	60	43.13	40.87	67.15	47.19	37.07	65.27	1.21	GPTJForCausalLM
ex-llm-e1 📑	💬	0	43.11	39.93	68.11	39.44	42.01	64.88	4.32	Unknown
phoenix-inst-chat-7b 📑	❓	70	43.03	44.71	63.23	39.06	47.08	62.83	1.29	BloomForCausalLM
GPT-NeoXT-Chat-Base-20B 📑	🔶	200	43.02	45.65	74.03	29.92	34.51	67.09	6.9	GPTNeoXForCausalLM
galpaca-30b 📑	🔶	300	43.0	49.57	58.2	43.78	41.16	62.51	2.81	OPTForCausalLM
CodeLlama-34B-Instruct-fp16 📑	💬	337.4	43.0	40.78	35.66	39.72	44.29	74.51	23.05	LlamaForCausalLM
Anima-7B-100K 📑	🔶	70	42.98	46.59	72.28	33.4	37.84	67.09	0.68	LlamaForCausalLM
Deita-1_8B 📑	💬	80	42.96	36.52	60.63	45.62	40.02	59.35	15.62	LlamaForCausalLM
Qwen-1_8B-Chat-llama 📑	💬	18.4	42.94	36.95	54.34	44.55	43.7	58.88	19.26	LlamaForCausalLM
InstructPalmyra-20b 📑	💬	200	42.91	47.1	73.0	28.26	41.81	64.72	2.58	GPT2LMHeadModel
dopeyshearedplats-2.7b-v1 📑	🔶	27	42.9	46.08	75.17	29.01	44.12	62.67	0.38	LlamaForCausalLM
landmark-attention-llama7b-fp16 📑	🔶	66.1	42.84	47.35	65.81	31.59	42.63	68.03	1.59	Unknown
opt-66b 📑	🟢	660	42.78	46.33	76.25	26.99	35.43	70.01	1.67	OPTForCausalLM
Qwen-1_8b-EverythingLM 📑	💬	18.4	42.77	38.65	62.66	44.94	38.7	58.96	12.74	LlamaForCausalLM
tora-code-13b-v1.0 📑	💬	130	42.7	44.45	69.29	36.67	34.98	62.59	8.19	LlamaForCausalLM
open-llama-7b-open-instruct 📑	❓	70	42.59	49.74	73.67	31.52	34.65	65.43	0.53	LlamaForCausalLM
codegen-16B-nl 📑	🟢	160	42.59	46.76	71.87	32.35	33.95	67.96	2.65	CodeGenForCausalLM
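Two details of the table are easy to misread: the parameter column is in units of 100 million (so 1762.5 means about 176 billion), and the average appears to be the plain mean of the six benchmark scores. The second point is inferred from the numbers, not stated on the page. The sketch below checks both on one row, assuming the columns are tab-separated in the order shown in the header. `parse_row` is a hypothetical helper.

```python
def parse_row(line):
    """Split one tab-separated table row into named fields."""
    cells = line.split("\t")
    # Drop the marker icons that follow the model name.
    name = cells[0].replace("📑", "").replace("✅", "").strip()
    return {
        "name": name,
        "type_icon": cells[1],
        "params_billion": float(cells[2]) / 10,  # table unit is 100M parameters
        "average": float(cells[3]),
        # ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K
        "scores": [float(x) for x in cells[4:10]],
        "architecture": cells[10],
    }


# The bloom row, copied from the table above.
line = "bloom ✅ 📑\t🟢\t1762.5\t46.07\t50.43\t76.41\t30.85\t39.76\t72.06\t6.9\tBloomForCausalLM"
row = parse_row(line)

# Assumption being checked: average = mean of the six benchmark scores.
recomputed = round(sum(row["scores"]) / len(row["scores"]), 2)
print(row["name"], row["params_billion"], recomputed, row["average"])
```

For this row the recomputed mean matches the listed average of 46.07, which supports reading the column as an unweighted mean.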
