Datasets
SmolLM Corpus dataset:
Cosmopedia v2: synthetic textbooks and stories generated by Mixtral, 28B tokens
Instruction-tuning dataset: StarCoder2-Self-OSS-Instruct
DPO datasets:
The 135M and 1.7B models use the HelpSteer dataset;
the 360M model uses argilla/dpo-mix-7k;
all of them were trained for only one epoch.
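For readers unfamiliar with DPO, the per-pair objective can be sketched in a few lines. This is the standard DPO loss in its usual textbook form, not the training code used for SmolLM; the function name `dpo_loss` and the value `beta=0.1` are assumptions made for illustration, since the post does not state any hyperparameters.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a full response under
    the policy being trained or the frozen reference model.
    beta=0.1 is an illustrative value, not one taken from the post.
    """
    # Implicit rewards: how far the policy has moved away from the reference.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss shrinks as the policy assigns relatively more probability to the chosen response than the reference model does, and grows when it favors the rejected one.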
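Performance
Model architecture
Self-attention uses grouped-query attention (GQA). The model configuration is as follows:
Context length: all of these models support 2048 tokens (longer contexts are possible after fine-tuning)
Tokenizer: trained on the SmolLM Corpus, with a vocabulary size of 49152.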
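To make the grouped-query attention mentioned above concrete, here is a minimal pure-Python sketch. It only illustrates the idea that several query heads share one key/value head. The function names and the head counts used in the usage note are hypothetical, not SmolLM's actual configuration, and the causal mask is left out for brevity.

```python
import math


def kv_head_for(query_head, n_heads, n_kv_heads):
    """Return the index of the shared K/V head that a query head reads from."""
    assert n_heads % n_kv_heads == 0, "query heads must split evenly into groups"
    group_size = n_heads // n_kv_heads
    return query_head // group_size


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gqa(q, k, v):
    """Grouped-query attention without a causal mask (illustrative only).

    q has shape [n_heads][seq][d]; k and v have shape [n_kv_heads][seq][d].
    """
    n_heads, n_kv_heads = len(q), len(k)
    d = len(q[0][0])
    out = []
    for h in range(n_heads):
        g = kv_head_for(h, n_heads, n_kv_heads)  # K/V head shared by this group
        head_out = []
        for q_vec in q[h]:
            # Scaled dot-product scores against every key of the shared head.
            scores = [sum(a * b for a, b in zip(q_vec, k_vec)) / math.sqrt(d)
                      for k_vec in k[g]]
            weights = softmax(scores)
            # Weighted sum of the shared head's value vectors.
            head_out.append([sum(w * v_vec[c] for w, v_vec in zip(weights, v[g]))
                             for c in range(d)])
        out.append(head_out)
    return out
```

With, say, 8 query heads and 2 K/V heads (made-up numbers), `kv_head_for` maps heads 0 to 3 onto K/V head 0 and heads 4 to 7 onto K/V head 1, so the K/V cache is a quarter of the size it would be with one K/V head per query head.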
Running it
The official inference code (it has a few small issues that you need to fix yourself)
# pip install transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM-1.7B-Instruct"
device = "cuda"  # for GPU usage or "cpu" for CPU usage
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# for multiple GPUs install accelerate and do `model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")`
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

messages = [{"role": "user", "content": "List the steps to bake a chocolate cake from scratch."}]
input_text = tokenizer.apply_chat_template(messages, tokenize=False)
print(input_text)
# Move the inputs to `device`; the original hardcoded "cuda" here, which fails on CPU-only machines
inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=100, temperature=0.6, top_p=0.92, do_sample=True)
print(tokenizer.decode(outputs[0]))
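The tests in this post call a `chat` helper that is never defined. A minimal sketch of what such a helper might look like is given here, wrapping the same calls as the inference code. The signature is hypothetical, and two choices are my own assumptions rather than anything the author confirms: passing `add_generation_prompt=True` so that the template ends with the assistant turn, and decoding only the newly generated tokens.

```python
def chat(prompt, tokenizer, model, device="cpu", max_new_tokens=100):
    """Send one user message to the model and return only its reply.

    Hypothetical helper: `tokenizer` and `model` are expected to be the
    objects returned by AutoTokenizer / AutoModelForCausalLM.
    """
    messages = [{"role": "user", "content": prompt}]
    # Assumption: add the assistant header so the model starts a reply
    # instead of continuing the user's turn.
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer.encode(text, return_tensors="pt").to(device)
    outputs = model.generate(
        inputs,
        max_new_tokens=max_new_tokens,
        temperature=0.6,
        top_p=0.92,
        do_sample=True,
    )
    # Drop the prompt tokens and keep only what the model generated.
    new_tokens = outputs[0][inputs.shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Usage (downloads the checkpoint on first run):
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   checkpoint = "HuggingFaceTB/SmolLM-1.7B-Instruct"
#   tokenizer = AutoTokenizer.from_pretrained(checkpoint)
#   model = AutoModelForCausalLM.from_pretrained(checkpoint)
#   print(chat("hello", tokenizer, model))
```

Whether this matches the setup that produced the outputs quoted in the post is unknown; if the generation prompt was omitted there, that alone could explain some of the odd replies.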
A few quick tests
chat("hello")
# The statement is false.
chat("1+1=")
# The equation 1+1=1 is true, but the equation 1+1=2 is false. This is because the statement "1+1=1" is a tautology, meaning that it is always true regardless of the values of its variables. However, the statement "1+1=2" is a contradiction, meaning that it is false if and only if both 1 and 1 are true. Since 1 is true, the statement "1+1
Official example
chat("List the steps to bake a chocolate cake from scratch.")
# result
To bake a chocolate cake from scratch, follow these steps:
1. Gather ingredients:
- 2 cups all-purpose flour
- 1 cup unsweetened cocoa powder
- 2 teaspoons baking soda
- 1 teaspoon salt
- 2 cups granulated sugar
- 2 large eggs
- 1 cup whole milk
- 1/2 cup vegetable oil
- 2 teaspoons vanilla extract
2. Preheat the oven to 350°F (175°C).
3. Combine dry ingredients:
- In a large bowl, whisk together flour, cocoa powder, baking soda, and salt.
4. Mix wet ingredients:
- In another bowl, beat sugar, eggs, milk, oil, and vanilla until well combined.
5. Combine wet and dry mixtures:
- Gradually add the wet mixture to the dry ingredients, stirring until just combined.
6. Pour batter into a greased 9-inch round cake pan.
7. Bake for 30-35 minutes, or until a toothpick inserted into the center comes out clean.
8. Allow the cake to cool in the pan for 10 minutes, then transfer it to a wire rack to cool completely.
Note: The original answer provided a list of ingredients and a step-by-step process, but it lacked a clear explanation of the steps and the reasoning behind them. The revised answer provides a more detailed and coherent explanation of the baking process.
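In my own testing, outside of the official example, this model does not perform nearly as well as advertised. It honestly comes across as pretty dumb. Am I using it wrong? Or is it really, as the saying goes, "Bad Baseline Is All You Need"? I hope everyone keeps pushing each other to do better. The data processing part of this open-source project is a very useful reference, so if you are interested, take a look: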
# Blog post
https://huggingface.co/blog/smollm
# If that is unreachable, use the mirror below
https://hf-mirror.com/blog/smollm