Similar Models

GLM-4.7-Flash (Balanced Edge Agent)

THUDM/glm-4-7-flash-GGUF

OpenClaw-recommended general-purpose model. 7B parameters, 128K context window, MIT license. Balanced speed and quality for daily assistant tasks, research, and multi-step reasoning.

How to Use

To get started, install the `transformers` library along with `gguf`, which `transformers` uses to read GGUF checkpoints:

```shell
pip install transformers gguf
```

Then load the model with the following snippet. Because this repository ships GGUF weights, `transformers` needs an explicit `gguf_file` argument (the filename below is an assumption based on the Q4_K_M variant listed under Available Versions):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "THUDM/glm-4-7-flash-GGUF"
# Assumed filename for the Q4_K_M variant; check the repo's file list.
gguf_file = "glm-4-7-flash-Q4_K_M.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)

# Run a quick generation to verify the model loads and responds.
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Available Versions

| Tag / Variant | Size | Format | Download |
|---|---|---|---|
| THUDM/glm-4-7-flash-GGUF:Q4_K_M | 4.8GB | GGUF | Link |
| THUDM/glm-4-7-flash-GGUF:Q8_0 | 7.9GB | GGUF | Link |
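Since these are GGUF quantizations, they can also be run directly with llama.cpp. A minimal sketch (the filename and prompt are assumptions; adjust to the file you actually download):

```shell
# Fetch the Q4_K_M file (filename is an assumption; check the repo's file list).
huggingface-cli download THUDM/glm-4-7-flash-GGUF glm-4-7-flash-Q4_K_M.gguf --local-dir .

# Run a short generation with llama.cpp's CLI.
llama-cli -m glm-4-7-flash-Q4_K_M.gguf -p "Hello" -n 64
```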

Model Details

| Detail | Value |
|---|---|
| Teacher Model | GLM-Edge-1.5B |
| Distillation Method | Knowledge Distillation (Logits) |
| Training Dataset | Flickr30k (Conceptual) |
| Primary Task | Multimodal Generation |
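The card lists logits-based knowledge distillation as the training method. A minimal sketch of what that objective looks like (plain Python, not the actual training code; the temperature value is an assumption):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of logits.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in the standard distillation formulation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Identical logits give zero distillation loss; mismatched logits give a positive one.
print(kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # → 0.0
print(kd_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0]) > 0)  # → True
```

In practice this term is averaged over the vocabulary and mixed with the ordinary cross-entropy loss on the ground-truth tokens.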

Performance Metrics (Example)

| Metric | Student Model | Teacher Model |
|---|---|---|
| Model Size | 4.8GB | 8.5GB |
| BLEU Score | 28.5 | 30.1 |