Sweelol-ai/gemma3-270m-dolly-teacher
Tags: `gemma3`, `fine-tuned`, `teacher`
A Gemma-3 270M model fully fine-tuned on the Dolly-15k dataset, intended to be used as a "teacher" for knowledge distillation.
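Since this checkpoint is meant to serve as a distillation teacher, the sketch below shows one common way to extract its temperature-softened output distributions ("soft targets") for a student to learn from. The prompt and the temperature value are illustrative assumptions, not values from this card:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

teacher_id = "Sweelol-ai/gemma3-270m-dolly-teacher"
tokenizer = AutoTokenizer.from_pretrained(teacher_id)
teacher = AutoModelForCausalLM.from_pretrained(teacher_id)
teacher.eval()

# Soft targets: the teacher's full next-token distribution at every position.
inputs = tokenizer("Explain gravity to a child.", return_tensors="pt")
with torch.no_grad():
    logits = teacher(**inputs).logits  # (batch, seq_len, vocab_size)
soft_targets = torch.softmax(logits / 2.0, dim=-1)  # temperature T=2.0 is an assumed value
```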
Related models:

- A Gemma-3 270M model, pruned for efficiency and then fully fine-tuned on the Dolly-15k instruction dataset.
- A fast and efficient distilled version of Gemma for general instruction-following tasks.
Sweelol-ai/kd-gemma3-pruned-dolly
A highly optimized model, first pruned for size and then knowledge-distilled from a larger teacher on the Dolly-15k dataset.
To get started, install the `transformers` library:
```bash
pip install transformers
```

Then, use the following snippet to load the model:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Sweelol-ai/kd-gemma3-pruned-dolly"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Example inference: generate a short answer to an instruction-style prompt.
inputs = tokenizer("What does knowledge distillation do?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

| Tag / Variant | Size | Format | Download |
|---|---|---|---|
| No specific variants listed for this model. | | | |
| Field | Value |
|---|---|
| Distillation Method | Knowledge Distillation (Logits) |
| Dataset | Flickr30k (Conceptual) |
| Task | Multimodal Generation |
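The exact training code is not published in this card, so the following is a minimal sketch of logit-based knowledge distillation as it is commonly implemented: the student minimizes a temperature-scaled KL divergence against the teacher's logits, blended with the ordinary cross-entropy loss. `temperature` and `alpha` are assumed hyperparameters, not values from this card:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a temperature-scaled KL term.

    temperature and alpha are illustrative defaults, not values from the card.
    """
    # Soft component: KL divergence between temperature-softened distributions;
    # the T**2 factor keeps gradient magnitudes comparable across temperatures.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature**2

    # Hard component: standard next-token cross-entropy against the labels.
    ce = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)),
                         labels.view(-1))
    return alpha * kd + (1 - alpha) * ce
```

The student/teacher trade-off this training produces is summarized below: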
| Metric | Student Model | Teacher Model |
|---|---|---|
| Model Size | ~270MB | 8.5GB |
| BLEU Score | 28.5 | 30.1 |
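The card does not specify how BLEU was measured; a common choice for corpus-level BLEU is `sacrebleu`, sketched here with placeholder hypotheses and references rather than the actual evaluation set:

```python
import sacrebleu

# Hypothetical generations and references; the card's real eval set is not given.
hypotheses = ["a dog runs across the field"]
references = [["a dog is running across a field"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")  # same scale as the 28.5 / 30.1 scores above
```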