sweelol/distilled-gemma-v1
A fast, efficient distilled version of Gemma, suited to general instruction-following tasks.
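A minimal usage sketch with the `transformers` library is shown below; the prompt and generation settings are illustrative, not tuned recommendations.

```python
# Minimal sketch: load and prompt the distilled model with transformers.
# Assumes the repo exposes a standard causal-LM checkpoint; the prompt and
# generation settings below are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sweelol/distilled-gemma-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Explain knowledge distillation in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```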
A Gemma-3 270M model, pruned for efficiency and then fully fine-tuned on the Dolly-15k instruction dataset.
A Gemma-3 270M model fully fine-tuned on the Dolly-15k dataset, intended to be used as a "teacher" for knowledge distillation.
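For reference, a hedged sketch of full fine-tuning on Dolly-15k with the Hugging Face `Trainer`. The base checkpoint id (`google/gemma-3-270m`), the prompt template, and the hyperparameters are assumptions for illustration, not the exact recipe used for these checkpoints.

```python
# Sketch of full fine-tuning on Dolly-15k with the Hugging Face Trainer.
# Base checkpoint id, prompt template, and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_id = "google/gemma-3-270m"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

def to_text(example):
    # Simple instruction/response template; the real template may differ.
    return {"text": f"Instruction: {example['instruction']}\nResponse: {example['response']}"}

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)

dataset = dolly.map(to_text).map(tokenize, remove_columns=dolly.column_names + ["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gemma-dolly-sft",
                           per_device_train_batch_size=4,
                           num_train_epochs=1,
                           learning_rate=2e-5),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```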
The base version of Gemma-3 270M with weights pruned for efficiency. This is the starting point for fine-tuning.
A baseline Gemma-3 270M model with 50% of its weights pruned for increased efficiency and inference speed.
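A sketch of what 50% magnitude pruning can look like with `torch.nn.utils.prune`; the actual procedure used to produce the released pruned checkpoints is not documented here, so treat this as an illustration of the general idea.

```python
# Sketch of 50% magnitude pruning with torch.nn.utils.prune: zero out the
# smallest-magnitude weights across all linear layers. Illustrative only.
import torch
import torch.nn.utils.prune as prune
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("google/gemma-3-270m")  # assumed base checkpoint

# Collect the weight tensors of all linear layers.
params_to_prune = [
    (module, "weight")
    for module in model.modules()
    if isinstance(module, torch.nn.Linear)
]

# Globally zero out the 50% of weights with the smallest absolute value.
prune.global_unstructured(params_to_prune, pruning_method=prune.L1Unstructured, amount=0.5)

# Make the pruning permanent (fold the masks into the weights).
for module, name in params_to_prune:
    prune.remove(module, name)
```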
A highly optimized model, first pruned for size and then knowledge-distilled from a larger teacher on the Dolly-15k dataset.
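The distillation objective behind a model like this is typically a mix of a softened-logit KL term and the ordinary cross-entropy loss on the labels; the sketch below shows that standard formulation, with the temperature and mixing weight chosen purely for illustration.

```python
# Sketch of a standard knowledge-distillation loss: KL divergence between
# temperature-softened teacher and student logits, blended with the usual
# next-token cross-entropy. Temperature and alpha are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    # KL between softened distributions, scaled by T^2 to keep gradients comparable.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Ordinary cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)), labels.view(-1))
    return alpha * kd + (1 - alpha) * ce
```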
A Gemma-3 270M model fine-tuned on the Dolly-15k dataset using Low-Rank Adaptation (LoRA), a parameter-efficient method that trains only small adapter matrices while the base weights stay frozen.
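A sketch of attaching LoRA adapters with the `peft` library; the rank, alpha, and target modules are illustrative defaults rather than the values used for this model.

```python
# Sketch of adding LoRA adapters with peft. Rank, alpha, dropout, and
# target modules are illustrative assumptions, not this model's settings.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("google/gemma-3-270m")  # assumed base checkpoint

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                   # low-rank dimension
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights require gradients
```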
A compact chat-tuned model, distilled from LLaMA-2 for quick interactions.
A Gemma-3 270M model adapted to the Dolly-15k dataset using Prompt Tuning, one of the most memory-efficient PEFT methods.
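A sketch of Prompt Tuning with the `peft` library, which trains only a handful of virtual-token embeddings (hence the small memory footprint); the number of virtual tokens and the init text are assumptions for illustration.

```python
# Sketch of Prompt Tuning with peft: only the virtual-token embeddings train.
# Virtual-token count and init text are illustrative assumptions.
from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base_id = "google/gemma-3-270m"  # assumed base checkpoint
model = AutoModelForCausalLM.from_pretrained(base_id)

pt_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=16,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text="Answer the instruction helpfully:",
    tokenizer_name_or_path=base_id,
)

model = get_peft_model(model, pt_config)
model.print_trainable_parameters()  # roughly num_virtual_tokens * hidden_size parameters train
```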