Similar Models

Qwen/Qwen3.5-2B-GGUF

Alibaba Qwen3.5 2B edge-optimized model. Hybrid Gated DeltaNet+Attention architecture, 256K context, Apache 2.0. Built for tool-calling agents and multimodal workflows.

agenticmultimodalhybrid-arch

Qwen/Qwen3.5-2B-Instruct-GGUF

New

Alibaba Qwen3.5 2B edge model. Hybrid architecture, 256K context, Apache 2.0. Verified GGUF quantizations via Unsloth for local inference.

agenticverifiedtool-calling

unsloth/Qwen3-4B-Instruct-2507-GGUF

New

Alibaba Qwen3 updated 4B instruct model. 256K native context, Apache 2.0. Optimized for instruction-following, tool-calling, and agentic workflows without CoT overhead.

instruction-tunedtool-callinglong-context

← Back to Models

Qwen3.5-0.8B (Ultra-Edge Agent)

unsloth/Qwen3.5-0.8B-GGUF

Alibaba Qwen3.5 sub-1B via Unsloth Dynamic 2.0. 256K context, Apache 2.0. Optimized for lightweight function-calling agents and document parsing workflows.

How to Use

To get started, install the `transformers` library:

pip install transformers

Then, use the following snippet to load the model:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "unsloth/Qwen3.5-0.8B-GGUF"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Your inference code here...

Available Versions

Tag / Variant	Size	Format	Download
unsloth/Qwen3.5-0.8B-GGUF:UD-Q4_K_XL	0.7GB	GGUF	Link
unsloth/Qwen3.5-0.8B-GGUF:Q6_K	0.95GB	GGUF	Link

Model Details

Teacher Model

Qwen3.5-4B

Distillation Method

Knowledge Distillation (Logits)

Training Dataset

Flickr30k (Conceptual)

Primary Task

Multimodal Generation

Performance Metrics (Example)

Metric	Student Model	Teacher Model
Model Size	0.7GB	8.5GB
BLEU Score	28.5	30.1