Jackrong/Qwen3.5-2B-Claude-4.6-Opus-Reasoning-Distilled-GGUF
A 2B dense-architecture model fine-tuned on structured step-by-step reasoning trajectories distilled from Claude 4.6 Opus.
Tags: reasoning, chain-of-thought, qwen
Task-specialized 4B model for natural-language-to-SQL conversion. Distilled from DeepSeek-V3. Quantized GGUF for local database agents.
deepseek/r1-distill-qwen-7b
A 2026-native reasoning model distilled from R1, specialized for agentic chain-of-thought reasoning on local hardware.
To get started, install the `transformers` library:
```bash
pip install transformers
```

Then, use the following snippet to load the model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "deepseek/r1-distill-qwen-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Your inference code here...
```

| Tag / Variant | Size | Format | Download |
|---|---|---|---|
| deepseek/r1-distill-qwen-7b:Q4_K_M | 4.9GB | GGUF | Link |
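The Q4_K_M artifact above is a GGUF file. Before handing a download to a local runtime, you can sanity-check the fixed GGUF preamble, which per the GGUF specification starts with the magic bytes `GGUF` followed by a little-endian `uint32` version, a `uint64` tensor count, and a `uint64` metadata key/value count. A minimal sketch (the sample bytes below are synthetic, not from the real model file):

```python
import struct

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF preamble: magic, version, tensor count, KV count."""
    if data[:4] != b"GGUF":
        raise ValueError("not a GGUF file")
    # Little-endian: uint32 version, uint64 tensor_count, uint64 metadata_kv_count
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {"version": version, "tensors": tensor_count, "metadata_kvs": kv_count}

# Synthetic header for illustration (version 3, 291 tensors, 24 metadata entries)
sample = b"GGUF" + struct.pack("<IQQ", 3, 291, 24)
print(read_gguf_header(sample))  # {'version': 3, 'tensors': 291, 'metadata_kvs': 24}
```

In practice you would pass the first 24 bytes of the downloaded `.gguf` file instead of the synthetic sample.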
- DeepSeek-R1-Full
- Knowledge Distillation (Logits)
- Flickr30k (Conceptual)
- Multimodal Generation
| Metric | Student Model | Teacher Model |
|---|---|---|
| Model Size | 4.9GB | 8.5GB |
| BLEU Score | 28.5 | 30.1 |
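The BLEU scores in the table above are reported at corpus level, but the core computation can be sketched per sentence: clipped n-gram precision for n = 1..4, combined by geometric mean and multiplied by a brevity penalty. A simplified, unsmoothed sketch (zero whenever any n-gram precision is zero, unlike smoothed production implementations):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped n-gram precisions x brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(count, ref[g]) for g, count in cand.items())  # clipped counts
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing in this sketch
        precisions.append(overlap / total)
    # Brevity penalty: punish candidates shorter than the reference
    bp = 1.0 if len(candidate) > len(reference) else math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

reference = "the cat sat on the mat".split()
print(bleu(reference, reference))  # 1.0
```

The 28.5 vs 30.1 gap in the table corresponds to roughly a 1.6-point absolute BLEU drop for a model file about 42% smaller.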