Qwen/Qwen3.5-2B-GGUF
Alibaba Qwen3.5 2B edge-optimized model. Hybrid Gated DeltaNet+Attention architecture, 256K context, Apache 2.0. Built for tool-calling agents and multimodal workflows.
Mistral AI edge-optimized 3.4B+0.4B vision model. Native function calling, JSON outputs, 256K context. Built for tool-using agentic pipelines.
Alibaba Qwen3.5 sub-1B via Unsloth Dynamic 2.0. 256K context, Apache 2.0. Optimized for lightweight function-calling agents and document parsing workflows.
unsloth/LFM2-700M-GGUF
Liquid AI hybrid architecture via Unsloth. 700M params, 32K context, CPU-optimized. Built for narrow-scope agentic tasks: data extraction, RAG, multi-turn workflows.
To get started, install the `transformers` library:
pip install transformers

Then, use the following snippet to load the model:
from transformers import AutoTokenizer, AutoModelForCausalLM
model_id = "unsloth/LFM2-700M-GGUF"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# Example inference (the prompt and generation settings are illustrative):
inputs = tokenizer("What is the capital of France?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

LFM2-1.2B
Knowledge Distillation (Logits)
Flickr30k (Conceptual)
Multimodal Generation
| Metric | Student Model | Teacher Model |
|---|---|---|
| Model Size | 469MB | 8.5GB |
| BLEU Score | 28.5 | 30.1 |
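The table above compares a student distilled from its teacher via logit-based knowledge distillation. A minimal sketch of the soft-label objective (pure Python; the temperature and logit values are illustrative, not taken from this model's training recipe):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    Scaled by T^2 so gradient magnitudes stay comparable across
    temperatures, as in Hinton et al.'s formulation.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Identical logits give zero loss; diverging logits give a positive loss.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))       # 0.0
print(distillation_loss([0.5, 1.5, 0.2], [2.0, 1.0, 0.1]) > 0)   # True
```

In practice this soft-label term is combined with the usual cross-entropy loss on hard labels, with the temperature softening the teacher's distribution so the student also learns the relative probabilities of wrong classes.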