LFM2.5-350M (Hybrid Edge Specialist)

LiquidAI/LFM2.5-350M-GGUF

Liquid AI hybrid architecture combining 10 LIV convolution blocks with 6 GQA attention layers. 350M parameters, 32K context window, 9 supported languages. Decodes at 313 tok/s on CPU in under 1 GB of RAM, with day-one llama.cpp, MLX, and vLLM support.

How to Use

To get started, install the `transformers` library:

pip install transformers

Then, use the following snippet to load the model:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "LiquidAI/LFM2.5-350M-GGUF"
# This repository ships GGUF weights, so transformers needs the quant filename
# via gguf_file (hypothetical name below; use the variant you downloaded).
gguf_file = "LFM2.5-350M-Q4_K_M.gguf"
tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)

prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Available Versions

Tag / Variant                       Size    Format   Download
LiquidAI/LFM2.5-350M-GGUF:Q4_K_M    285MB   GGUF     Link
LiquidAI/LFM2.5-350M-GGUF:Q6_K      398MB   GGUF     Link
LiquidAI/LFM2.5-350M-GGUF:ONNX      312MB   ONNX     Link
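A rough way to read the size column: dividing file size by parameter count gives the effective bits per weight of each variant. A minimal sketch using the table's figures (treating 285MB as 285 * 10^6 bytes; the result sits above the nominal quant bit-width because GGUF files also carry metadata and some higher-precision tensors):

```python
def bits_per_weight(file_bytes: int, n_params: int) -> float:
    """Effective bits per weight implied by a quantized file size."""
    return file_bytes * 8 / n_params

# Table figures: the Q4_K_M file is 285MB for a 350M-parameter model.
q4_bpw = bits_per_weight(285 * 10**6, 350 * 10**6)
print(round(q4_bpw, 2))  # ~6.51 bits/weight
```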

Model Details

Teacher Model          LFM2-Base
Distillation Method    Knowledge Distillation (Logits)
Training Dataset       Flickr30k (Conceptual)
Primary Task           Multimodal Generation
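The card names logit-based knowledge distillation but does not specify the objective. The standard formulation is a temperature-softened KL divergence between teacher and student logits, sketched below in NumPy; the temperature T and the T^2 scaling are conventional choices (Hinton-style distillation), not details taken from the card.

```python
import numpy as np

def softmax(logits: np.ndarray, T: float = 1.0) -> np.ndarray:
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T: float = 2.0) -> float:
    """KL(teacher || student) over temperature-softened distributions,
    scaled by T^2 so gradient magnitudes stay comparable across T."""
    p = softmax(teacher_logits, T)  # soft teacher targets
    q = softmax(student_logits, T)  # student predictions
    kl = (p * (np.log(p) - np.log(q))).sum(axis=-1)
    return float(kl.mean() * T * T)

rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 32))  # (batch, vocab) logits
student = teacher + rng.normal(scale=0.1, size=(4, 32))
print(distillation_loss(student, teacher) >= 0)  # True: KL is non-negative
```

A student matching the teacher's logits exactly drives the loss to zero, which is what makes it a useful training signal.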

Performance Metrics (Example)

Metric        Student Model   Teacher Model
Model Size    285MB           8.5GB
BLEU Score    28.5            30.1
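The trade-off in the table can be sanity-checked directly from its numbers: the student is roughly 30x smaller while retaining about 95% of the teacher's BLEU score.

```python
student_mb = 285        # student file size (Q4_K_M) from the table
teacher_gb = 8.5        # teacher model size from the table
size_ratio = teacher_gb * 1024 / student_mb   # teacher/student, both in MB
bleu_retained = 28.5 / 30.1                   # student BLEU / teacher BLEU
print(round(size_ratio, 1), round(bleu_retained * 100, 1))  # 30.5 94.7
```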