DeepSeek R1 Distill (Qwen 7B)

deepseek/r1-distill-qwen-7b

A reasoning model distilled from DeepSeek-R1 onto the Qwen 7B architecture, specialized for chain-of-thought reasoning in agentic workflows on local hardware.

How to Use

To get started, install the `transformers` library:

pip install transformers torch

Then, use the following snippet to load the model:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "deepseek/r1-distill-qwen-7b"

# Load the tokenizer and model weights from the hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Run a simple generation pass
inputs = tokenizer("What is 12 * 7? Think step by step.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
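Models in the DeepSeek-R1 family typically emit their chain of thought inside `<think>...</think>` tags before the final answer. A minimal helper to separate the two, assuming that output format (adjust the tags if this distill differs):

```python
# Split an R1-style completion into its reasoning trace and final answer.
# Assumes the chain of thought is wrapped in <think>...</think> tags.

def split_reasoning(completion: str) -> tuple[str, str]:
    start, end = "<think>", "</think>"
    if start in completion and end in completion:
        think_start = completion.index(start) + len(start)
        think_end = completion.index(end)
        reasoning = completion[think_start:think_end].strip()
        answer = completion[think_end + len(end):].strip()
        return reasoning, answer
    # No tags found: treat the whole completion as the answer
    return "", completion.strip()

reasoning, answer = split_reasoning("<think>12 * 7 = 84</think>The answer is 84.")
print(answer)  # The answer is 84.
```

Discarding the reasoning trace and surfacing only the answer is a common pattern when wiring a reasoning model into an agent loop.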

Available Versions

Tag / Variant                        Size    Format  Download
deepseek/r1-distill-qwen-7b:Q4_K_M   4.9 GB  GGUF    Link
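Q4_K_M is a llama.cpp k-quant that mixes 4- and 6-bit blocks, averaging roughly 4.85 bits per weight (an assumed figure; the exact mix varies by tensor). A back-of-envelope size estimate for the weights alone, which lands below the listed file size because it ignores metadata and per-block scales:

```python
# Back-of-envelope GGUF weight-size estimate for a quantized model.
# The ~4.85 bits/parameter average for Q4_K_M is an assumption;
# real files add metadata and per-block quantization scales.

def quantized_size_gb(n_params: float, bits_per_param: float) -> float:
    bytes_total = n_params * bits_per_param / 8
    return bytes_total / (1024 ** 3)

# Qwen 7B-class models have roughly 7.6B parameters
size = quantized_size_gb(7.6e9, 4.85)
print(f"~{size:.1f} GB")  # prints ~4.3 GB
```

The same formula at 16 bits per weight gives the unquantized footprint, which is why Q4_K_M is the usual choice for consumer GPUs.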

Model Details

Teacher Model

DeepSeek-R1

Distillation Method

Knowledge Distillation (Logits)
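Logit-level distillation trains the student to match the teacher's full output distribution, typically by minimizing the KL divergence between temperature-softened softmaxes. A minimal sketch of that loss with plain Python lists (in practice this runs on framework tensors over the whole vocabulary):

```python
import math

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in Hinton et al.'s formulation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

print(kd_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # 0.0 (identical logits)
print(kd_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]))  # positive (diverging logits)
```

Raising the temperature softens both distributions, exposing the teacher's relative preferences among non-top tokens, which is the extra signal logit distillation provides over hard labels.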

Training Dataset

Chain-of-thought samples generated by the teacher model (conceptual)

Primary Task

Text Generation (Reasoning)

Performance Metrics (Example)

Metric       Student Model  Teacher Model
Model Size   4.9 GB         8.5 GB
BLEU Score   28.5           30.1
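The table lists BLEU as an example metric, scoring each model's outputs against reference texts. A simplified sentence-level BLEU (modified n-gram precision with a brevity penalty, no smoothing) to show what the metric computes; use a library like sacrebleu for real evaluation:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    # Simplified sentence-level BLEU: modified n-gram precision,
    # geometric mean over n = 1..max_n, times a brevity penalty.
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        overlap = sum((cand_counts & ref_counts).values())  # clipped matches
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # unsmoothed BLEU is zero if any n-gram order has no match
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return 100 * bp * geo_mean

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # 100.0
```

A student scoring 28.5 against the teacher's 30.1 retains most of the teacher's output quality by this measure at roughly half the size.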