Qwen/Qwen3.5-0.8B-GGUF
Alibaba Qwen3.5 sub-1B multimodal model. Text+image+video understanding with 262K context. Apache 2.0. Built for lightweight agentic assistants.
Alibaba Qwen3.5 2B edge-optimized model. Hybrid Gated DeltaNet+Attention architecture, 256K context, Apache 2.0. Built for tool-calling agents and multimodal workflows.
Microsoft Phi-4-mini distilled for edge reasoning. 3.8B params, 128K context, MIT license. Optimized for agentic tool-calling and multilingual tasks.
google/gemma-4-E4B-it
Google DeepMind multimodal instruction model. 4.5B effective params, 128K context, text+image+audio. Native function calling, configurable thinking modes, Apache 2.0.
To get started, install the `transformers` library:
```
pip install transformers
```

Then, use the following snippet to load the model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "google/gemma-4-E4B-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Your inference code here...
```

google/gemma-4-E4B-base
Knowledge Distillation (Logits)
Flickr30k (Conceptual)
Multimodal Generation
| Metric | Student Model | Teacher Model |
|---|---|---|
| Model Size | 6.1 GB | 8.5 GB |
| BLEU Score | 28.5 | 30.1 |
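The table above compares a distilled student against its teacher. As a rough illustration of the logit-based knowledge distillation named above, here is a minimal pure-Python sketch: the student is trained to match the teacher's temperature-softened output distribution via a KL-divergence loss. The logits, vocabulary size, and temperature value below are hypothetical, not taken from the models listed here.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions.

    The T^2 factor is the usual convention that keeps gradient
    magnitudes comparable as the temperature changes.
    """
    p = softmax(teacher_logits, temperature)  # soft teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Hypothetical per-token logits over a tiny 4-entry vocabulary.
teacher = [4.0, 1.5, 0.5, -1.0]
student = [3.0, 2.0, 0.0, -0.5]
print(round(distillation_loss(student, teacher), 4))
```

A higher temperature softens both distributions, exposing the teacher's relative preferences among low-probability tokens; in practice this soft loss is usually mixed with the standard cross-entropy on ground-truth labels.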