Qwen: Qwen3 235B A22B Instruct 2507

Description

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following, logical reasoning, math, code, and tool usage. The model supports a native 262K context length and does not implement "thinking mode" (think blocks). Compared to its base variant, this version delivers significant gains in knowledge coverage, long-context reasoning, coding benchmarks, and alignment with open-ended tasks. It is particularly strong on multilingual understanding, math reasoning (e.g., AIME, HMMT), and alignment evaluations like Arena-Hard and WritingBench.

How this model compares

"Overall" covers the full catalog. "By plan" covers only the models available on that tier (the same rules that determine the available models in your list). Each chart shows this model's position on a min–average–max scale. Prices use the higher of the prompt or completion per-token rate, shown per 1M tokens.
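
The pricing rule above can be sketched as follows. This is a minimal illustration, not the catalog's actual implementation: the function name `display_price_per_million` and the per-token rates in the example are hypothetical, chosen only to show the arithmetic.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def display_price_per_million(prompt_per_token: str, completion_per_token: str) -> Decimal:
    """Return the higher of the two per-token rates, scaled to 1M tokens."""
    higher = max(Decimal(prompt_per_token), Decimal(completion_per_token))
    return higher * TOKENS_PER_MILLION


# Hypothetical per-token rates in USD (not taken from the listing).
price = display_price_per_million("0.0000002", "0.0000008")
print(f"${price:.2f} / 1M tokens")
```

Decimal arithmetic is used here so that scaling very small per-token rates does not pick up binary floating-point rounding noise.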

Price (per 1M tokens)

332 models in this group

Min: $0.04
Avg: $12.655889
Max: $750.00

This model: $0.55 / 1M tokens
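
To put these figures in perspective, the sketch below places this model's price on the min-to-max range and compares it with the group average. It assumes a linear scale, which the listing does not state; the helper name `linear_position` is hypothetical.

```python
def linear_position(value: float, lo: float, hi: float) -> float:
    """Fraction of the way from lo to hi at which value sits (0.0 = min, 1.0 = max)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return (value - lo) / (hi - lo)


# Figures from the listing, in USD per 1M tokens.
PRICE_MIN, PRICE_AVG, PRICE_MAX = 0.04, 12.655889, 750.00
THIS_MODEL = 0.55

position = linear_position(THIS_MODEL, PRICE_MIN, PRICE_MAX)
share_of_average = THIS_MODEL / PRICE_AVG

print(f"Position on a linear min-max scale: {position:.4%}")
print(f"Price as a share of the group average: {share_of_average:.1%}")
```

On a linear scale the model sits very close to the minimum because the $750.00 maximum dominates the range; a chart using a logarithmic scale would show a different position.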

Context length (tokens)

332 models in this group

Min: 4,095 tokens
Avg: 424,110.593 tokens
Max: 10,000,000 tokens

This model: 262,144 tokens
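
The 262,144-token figure equals 2^18, which is where the "262K" in the description comes from. A minimal sketch of a budget check against that limit follows. It assumes that prompt tokens and generated tokens share the same window, which is typical but not stated in the listing; `fits_in_context` is a hypothetical helper, and token counts would come from whatever tokenizer you use.

```python
CONTEXT_LENGTH = 262_144  # from the listing; equal to 2**18


def fits_in_context(prompt_tokens: int, max_output_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Return True if the prompt plus the requested output fits in the window.

    Assumption: input and output tokens count against one shared limit.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= context_length


print(fits_in_context(prompt_tokens=250_000, max_output_tokens=8_000))   # True
print(fits_in_context(prompt_tokens=260_000, max_output_tokens=8_000))   # False
```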

Capabilities