Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled
A 27B powerhouse distilled from Claude 4.6 Opus. Built for deep analytical reasoning and stable agentic performance in the watch market.
A highly efficient 4B parameter model from NVIDIA, optimized for low-latency on-device tasks and high-quality text generation.
A 2026-native reasoning model distilled from R1. Specialized for agentic chain-of-thought logic on local hardware.
The mid-2026 flagship built on the Engram memory architecture, specialized for 1M+ token code generation and autonomous refactoring.
Task-specialized 4B model for natural-language-to-SQL conversion. Distilled from DeepSeek-V3. Quantized GGUF for local database agents.
Google DeepMind multimodal instruction model. 4.5B effective params, 128K context, text+image+audio. Native function calling, configurable thinking modes, Apache 2.0.
OpenClaw-recommended general-purpose model. 7B params, 128K context, MIT license. Balanced speed/quality for daily assistant tasks, research, and multi-step reasoning.
OpenClaw-compatible open-weight 20B model. 64K+ context, Apache 2.0. Balanced performance for tool-use, memory persistence, and multi-channel agentic workflows.
Liquid AI hybrid architecture: 10 LIV convolution blocks + 6 GQA layers. 350M params, 32K context, 9 languages. 313 tok/s CPU decode, <1GB RAM. Day-one llama.cpp/MLX/vLLM support.
Liquid AI hybrid architecture via Unsloth. 700M params, 32K context, CPU-optimized. Built for narrow-scope agentic tasks: data extraction, RAG, multi-turn workflows.
A compact chat-tuned model distilled from LLaMA-2, built for quick interactions.
Mistral AI edge-optimized 3.4B+0.4B vision model. Native function calling, JSON outputs, 256K context. Built for tool-using agentic pipelines.