# NVIDIA Nemotron-3 Nano 4B

`nvidia/NVIDIA-Nemotron-3-Nano-4B-GGUF`

A highly efficient 4B-parameter model from NVIDIA, optimized for low-latency on-device tasks and high-quality text generation.
## How to Use
To get started, install the `transformers` library:

```shell
pip install transformers
```

Then, use the following snippet to load the model:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "nvidia/NVIDIA-Nemotron-3-Nano-4B-GGUF"

# GGUF repositories are loaded through the `gguf_file` argument; set it
# to the name of the .gguf file you want from the repo (see the table below).
gguf_file = "..."

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)

# Your inference code here...
```

## Available Versions
| Tag / Variant | Size | Format | Download |
|---|---|---|---|
| nvidia/NVIDIA-Nemotron-3-Nano-4B-GGUF:Q4_K_M | 2.8GB | GGUF | Link |
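After downloading a variant, it can be worth sanity-checking that the file is really a GGUF container before loading it: GGUF files start with the 4-byte magic `GGUF` followed by a little-endian uint32 format version. The helper below is an illustrative sketch (not part of this repo) that checks both:

```python
import struct

GGUF_MAGIC = b"GGUF"  # 4-byte magic at the start of every GGUF file


def check_gguf(path):
    """Return the GGUF container version, or raise ValueError if the
    file does not look like a GGUF checkpoint."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != GGUF_MAGIC:
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        # Version is a little-endian unsigned 32-bit int after the magic.
        (version,) = struct.unpack("<I", f.read(4))
    return version
```

A truncated or partially downloaded file will typically fail this check immediately, which is cheaper than waiting for the loader to error out.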
## Model Details

| Field | Value |
|---|---|
| Teacher Model | Nemotron-3-Large |
| Distillation Method | Knowledge Distillation (Logits) |
| Training Dataset | Flickr30k (Conceptual) |
| Primary Task | Multimodal Generation |
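Logit-based knowledge distillation trains the student to match the teacher's temperature-softened output distribution rather than hard labels. The following is a minimal, self-contained sketch of that loss; the function names and temperature value are illustrative, not taken from NVIDIA's training code:

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, with an
    optional temperature that flattens the distribution when T > 1."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients keep a comparable magnitude across T."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

In practice this term is computed per token over the full vocabulary and usually mixed with a standard cross-entropy loss on the ground-truth labels.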
### Performance Metrics (Example)
| Metric | Student Model | Teacher Model |
|---|---|---|
| Model Size | 2.8GB | 8.5GB |
| BLEU Score | 28.5 | 30.1 |
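The BLEU scores above compare generated text against references using modified n-gram precision with a brevity penalty. A simplified, unsmoothed sentence-level sketch is shown below for illustration only; a real evaluation would use an established library such as sacrebleu:

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Geometric mean of modified 1..max_n-gram precisions, times a
    brevity penalty that punishes candidates shorter than the reference."""
    ref, cand = reference.split(), candidate.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each n-gram's count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        if overlap == 0:
            return 0.0  # unsmoothed: any zero precision zeroes the score
        precisions.append(overlap / total)
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A perfect match scores 1.0 (reported as 100 on some scales); the 28.5 vs 30.1 above would be on the 0-100 scale.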
