HuggingFaceTB/SmolLM2-135M-Instruct-GGUF
Ultra-lightweight 135M instruct model from Hugging Face. Apache 2.0. Optimized for browser/mobile edge deployment, classification, and low-latency fallback tasks.
A 2026-native 3B reasoning model from Hugging Face. Dual-mode `/think` and `/no_think` for agentic workflows with 64K-128K context. Fully open recipe.
Updated 4B instruct model from Alibaba's Qwen3 family. 256K native context, Apache 2.0. Optimized for instruction-following, tool-calling, and agentic workflows without CoT overhead.
unsloth/SmolLM2-1.7B-Instruct-GGUF
Hugging Face SmolLM2 fine-tuned via Unsloth. 1.7B params, Apache 2.0. Optimized for instruction-following agents in data labeling, product cataloging, and editorial generation.
To get started, install the `transformers` library:
pip install transformers

Then, use the following snippet to load the model:
from transformers import AutoTokenizer, AutoModelForCausalLM
model_id = "unsloth/SmolLM2-1.7B-Instruct-GGUF"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Note: for GGUF repositories, transformers may require selecting a specific
# quantization file via from_pretrained(model_id, gguf_file="<model file>.gguf")
model = AutoModelForCausalLM.from_pretrained(model_id)
# Generate a reply using the model's chat template
messages = [{"role": "user", "content": "What is the capital of France?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

SmolLM2-Base
Knowledge Distillation (Logits)
Flickr30k (Conceptual)
Multimodal Generation
| Metric | Student Model | Teacher Model |
|---|---|---|
| Model Size | 1.3GB | 8.5GB |
| BLEU Score | 28.5 | 30.1 |
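The logits-based knowledge distillation referenced above can be sketched as a KL-divergence loss between temperature-softened teacher and student distributions. This is a minimal NumPy illustration of the standard soft-label objective, not the exact training recipe used for this model; the function names and temperature value are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened logits.

    The T^2 factor keeps gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits / temperature)          # soft teacher targets
    log_q = np.log(softmax(student_logits / temperature))
    kl = (p * (np.log(p) - log_q)).sum(axis=-1).mean()
    return kl * temperature ** 2

# Toy batch: 4 examples over a 10-token vocabulary
student = np.random.randn(4, 10)
teacher = np.random.randn(4, 10)
loss = distillation_loss(student, teacher)
```

A higher temperature flattens the teacher's distribution, exposing more of the relative ranking among low-probability tokens for the student to learn from.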