RedPajama INCITE 3B details | Name, overview, usage, open-source status, and commercial licensing information

Model details and parameters

RedPajama INCITE 3B

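Full model name: RedPajama INCITE 3B
Short model name: RedPajama INCITE 3B
Model type: Foundation model
Release date: 2023-05-05
Pretrained file size: 5.69GB
Chinese support (optimized for Chinese): No
Maximum supported context length: 2K
Number of parameters (billions): 2.8
Model code open-source license
Open-source and commercial-use status of pretrained weights: -
Model GitHub link: https://github.com/togethercomputer
Model HuggingFace link: https://huggingface.co/togethercomputer/RedPajama-INCITE-Instruct-3B-v1
Online demo: Not available
DataLearnerAI model introduction
Official blog post/paper: Releasing 3B and 7B RedPajama-INCITE family of models including base, instruction-tuned & chat models
Base model: None
Publisher: TOGETHER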

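RedPajama INCITE 3B overview

RedPajama is a project from TOGETHER that aims to reproduce LLaMA. RedPajama INCITE 3B belongs to the first batch of officially released models and is the 3-billion-parameter version.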

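Overview of the RedPajama INCITE 3B model family

RedPajama INCITE 3B is one family of language models within the RedPajama series. It is trained on the RedPajama dataset and comes in three variants, all of them open source.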

Model name	Model type	Parameters (billions)
RedPajama-INCITE-Base-3B-v1	Language model	2.8
RedPajama-INCITE-Chat-3B-v1	Chat-tuned	2.8
RedPajama-INCITE-Instruct-3B-v1	Instruction-tuned	2.8

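The Base model is the underlying language model; according to the official description, it outperforms other models of comparable parameter count. The Chat model is the result of fine-tuning on the Dolly 2.0 and Open Assistant datasets. The Instruct model adds prompt-oriented tuning, following the GPT-JT approach ( https://www.datalearner.com/ai-models/pretrained-models/GPT-JT ) to instruction tuning.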

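All models in this release belong to the INCITE series and are the product of a collaboration. The work builds on:

The RedPajama dataset of 1.2 trillion tokens collected by RedPajama
The Pythia training code from EleutherAI
FlashAttention from Stanford and Together, and the HELM benchmarks from Stanford CRFM
Compute time on the Summit supercomputer, supported by MILA, EleutherAI, and LAION under the INCITE program award "Scalable Foundation Models for Transferable Generalist AI" (INCITE is explained further below)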

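Performance of the RedPajama INCITE 3B model family

RedPajama INCITE 3B was trained on 800 billion tokens, and both its few-shot and zero-shot results are better than those of models of comparable size. Its results on the HELM core scenarios are as follows: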

Few-shot scores

Model name	Type	HELM (average score over 16 core scenarios)
GPT-Neo	Base model	0.357
Pythia-2.8B	Base model	0.377
RedPajama-INCITE-Base-3B-v1	Base model	0.406
RedPajama-INCITE-Instruct-3B-v1	Instruction-tuned	0.453
Llama-7B	Base model	0.465

As the table shows, the instruction-tuned 3B model (0.453) scores close to Meta AI's LLaMA-7B (0.465).
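
To make "close" concrete, the few-shot HELM averages can be compared directly. The following is a minimal sketch that only reuses the scores from the table; the dictionary name and the `relative_gap` helper are illustrative and not part of any official tooling.

```python
# Few-shot HELM averages copied from the table above.
helm_scores = {
    "GPT-Neo": 0.357,
    "Pythia-2.8B": 0.377,
    "RedPajama-INCITE-Base-3B-v1": 0.406,
    "RedPajama-INCITE-Instruct-3B-v1": 0.453,
    "Llama-7B": 0.465,
}


def relative_gap(model: str, reference: str = "Llama-7B") -> float:
    """Return how far `model` trails `reference`, as a fraction of the reference score."""
    ref = helm_scores[reference]
    return (ref - helm_scores[model]) / ref


if __name__ == "__main__":
    for name in helm_scores:
        print(f"{name}: {relative_gap(name):.1%} below Llama-7B")
```

With these numbers the Instruct model trails Llama-7B by roughly 2.6%, while the Base model trails it by roughly 12.7%.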

Zero-shot scores

Model name	Lambada_openai (acc)	Hellaswag (acc_norm)	Winogrande (acc)	Piqa (acc)	Average
GPT-Neo	0.6223	0.5579	0.5769	0.7219	0.6197
Pythia-2.8B	0.6466	0.5933	0.6006	0.7399	0.6451
Pythia-2.8B-dedup	0.6524	0.5941	0.5848	0.7404	0.6429
RedPajama-INCITE-Base-3B-v1	0.6541	0.6317	0.6322	0.7470	0.6662
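
The average column can be checked by recomputing the mean of the four task scores in each row. This is a small sketch using only the numbers in the table; the names `zero_shot` and `recomputed_means` are illustrative.

```python
# Zero-shot scores copied from the table above, in the order
# (Lambada_openai, Hellaswag, Winogrande, Piqa), plus the reported average.
zero_shot = {
    "GPT-Neo": ((0.6223, 0.5579, 0.5769, 0.7219), 0.6197),
    "Pythia-2.8B": ((0.6466, 0.5933, 0.6006, 0.7399), 0.6451),
    "Pythia-2.8B-dedup": ((0.6524, 0.5941, 0.5848, 0.7404), 0.6429),
    "RedPajama-INCITE-Base-3B-v1": ((0.6541, 0.6317, 0.6322, 0.7470), 0.6662),
}


def recomputed_means() -> dict:
    """Return the arithmetic mean of the four task scores for each model."""
    return {name: sum(scores) / len(scores) for name, (scores, _) in zero_shot.items()}


if __name__ == "__main__":
    means = recomputed_means()
    for name, (_, reported) in zero_shot.items():
        print(f"{name}: recomputed {means[name]:.5f}, reported {reported:.4f}")
    print("Highest average:", max(means, key=means.get))
```

The recomputed means agree with the reported column to within 0.0001, the differences being rounding in the last digit, and RedPajama-INCITE-Base-3B-v1 has the highest average of the four models.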

RedPajama-INCITE-Base-3B-v1 on HuggingFace: https://huggingface.co/togethercomputer/RedPajama-INCITE-Base-3B-v1

RedPajama-INCITE-Chat-3B-v1 on HuggingFace: https://huggingface.co/togethercomputer/RedPajama-INCITE-Chat-3B-v1

RedPajama-INCITE-Instruct-3B-v1 on HuggingFace: https://huggingface.co/togethercomputer/RedPajama-INCITE-Instruct-3B-v1
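
As a rough illustration of how these checkpoints might be used, the sketch below loads one of the three repositories with the Hugging Face `transformers` library and generates a short continuation. It assumes that `transformers` and PyTorch are installed and that the checkpoints load through `AutoModelForCausalLM`; the `REPO_IDS` mapping and the `generate` helper are hypothetical names introduced here, and the prompt is arbitrary.

```python
# Repository IDs taken from the HuggingFace links listed above.
REPO_IDS = {
    "base": "togethercomputer/RedPajama-INCITE-Base-3B-v1",
    "chat": "togethercomputer/RedPajama-INCITE-Chat-3B-v1",
    "instruct": "togethercomputer/RedPajama-INCITE-Instruct-3B-v1",
}


def generate(variant: str, prompt: str, max_new_tokens: int = 32) -> str:
    """Load the chosen variant and return the prompt plus its continuation."""
    # Imported lazily so the mapping above is usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = REPO_IDS[variant]
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    # Tokenize the prompt and let the model extend it.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # The first call downloads several gigabytes of weights.
    print(generate("base", "RedPajama is"))
```

Each variant may expect its own prompt format, so the individual model pages are the place to check before using the Chat or Instruct checkpoints.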

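Meaning and origin of the INCITE name

RedPajama is a collaborative project. INCITE stands for the "Innovative and Novel Computational Impact on Theory and Experiment" program, which is the primary means by which the scientific community gains access to the U.S. Department of Energy's leadership-class supercomputers (at ALCF and OLCF). Based on the description above, MILA, EleutherAI, and LAION presumably held an allocation of time on this supercomputer and contributed it to the RedPajama team to train RedPajama INCITE 3B, which is why the model name includes INCITE.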