Qwen: Qwen3.5-122B-A10B

qwen/qwen3.5-122b-a10b

Description

Qwen3.5 122B-A10B is a native vision-language model built on a hybrid architecture that combines a linear attention mechanism with a sparse mixture-of-experts design for higher inference efficiency. In overall performance, it is second only to Qwen3.5-397B-A17B. Its text capabilities significantly outperform those of Qwen3-235B-2507, and its visual capabilities surpass those of Qwen3-VL-235B.

How this model compares

"Overall" covers the full catalog. "By plan" covers only models available on that tier (the same rules that determine the available models in your list). Each chart shows this model's position on a min–average–max scale. Prices use the higher of the prompt or completion price per token, shown per 1M tokens.
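The pricing rule above can be sketched as a small calculation. This is only an illustration: the function name and the sample per-token prices are hypothetical and are not taken from this listing.

```python
def comparison_price_per_million(prompt_per_token: float,
                                 completion_per_token: float) -> float:
    """Return the higher of the two per-token prices, scaled to 1M tokens."""
    return max(prompt_per_token, completion_per_token) * 1_000_000


# Hypothetical per-token prices, chosen only to illustrate the rule.
example_price = comparison_price_per_million(0.0000005, 0.000002)
print(f"${example_price:.2f} / 1M tokens")
```

Because only the higher of the two prices is used, a model with cheap prompts but expensive completions is compared on its completion price.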

Price (per 1M tokens)

336 models in this group

Min: $0.04
Avg: $12.571466
Max: $750.00

This model: $2.08 / 1M tokens

Context length (tokens)

336 models in this group

Min: 4,095 tokens
Avg: 398,336.839 tokens
Max: 2,000,000 tokens

This model: 262,144 tokens
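One way to read the figures above is as a position between the group minimum and maximum. The sketch below assumes a plain linear scale, which the page does not confirm (the charts may be scaled differently), and the helper name is hypothetical.

```python
def linear_position(value: float, low: float, high: float) -> float:
    """Return where value sits between low (0.0) and high (1.0), linearly."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return (value - low) / (high - low)


# Figures from the listing: this model against the group min and max.
context_position = linear_position(262_144, 4_095, 2_000_000)
price_position = linear_position(2.08, 0.04, 750.00)

print(f"context length: {context_position:.1%} of the way from min to max")
print(f"price: {price_position:.2%} of the way from min to max")
```

On both measures the model also sits below the group average: $2.08 against $12.571466, and 262,144 tokens against 398,336.839 tokens.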

Capabilities

text+image+video->text

Context: 262,144 tokens

Input: Text, Image, Video

Output: Text
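The modality string above can be split into its input and output types programmatically. This is a minimal sketch that assumes the format is always `+`-separated types on each side of a single `->`. The function name is hypothetical.

```python
def parse_modalities(spec: str) -> tuple[list[str], list[str]]:
    """Split a string like 'text+image+video->text' into (inputs, outputs)."""
    inputs, outputs = spec.split("->", 1)
    return inputs.split("+"), outputs.split("+")


inputs, outputs = parse_modalities("text+image+video->text")
print("Input:", ", ".join(inputs))
print("Output:", ", ".join(outputs))
```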
