Qwen3.6-27B: Flagship coding power in an efficient dense model

22 June, 2026

Joshua Hartmann

Systems Engineer

Joshua hat im Sommer 2023 seine Ausbildung zum Fachinformatiker für Systemintegration bei den NETWAYS Web Services erfolgreich abgeschlossen. Heute ist er ein wichtiger Teil des Teams, das sich mit großer Hingabe um die Kundenbetreuung und die kontinuierliche Weiterentwicklung der SaaS-Apps kümmert. Neben seinem musikalischen Talent am Klavier hat Joshua eine Leidenschaft für Wintersport und findet auch Freude im Gaming. Doch am allerliebsten verbringt er seine Zeit mit seiner besseren Hälfte, denn sie ist für ihn das größte Glück.

by Joshua Hartmann | Jun 22, 2026

AI Blog

We are pleased to announce that we have added a powerful model to our AI Models portfolio with Qwen3.6-27B.

The Alibaba Qwen team has released Qwen3.6-27B, a fully open-source, 27-billion-parameter model that outperforms all of its larger predecessors in agentic coding.
If you’re looking for a model that masters complex logical tasks, excels in software development and can also understand images and videos, then you’ve come to the right place.

What makes the model special?

The model is specifically designed for demanding logical tasks and software development and is particularly powerful:

Strong coding performance: Whether code generation, explanations or debugging, the model delivers precise results.
Advanced Reasoning: The model remains reliably on course for multi-level, complex tasks.
262K Context Window: You can process up to 262,144 tokens in a single prompt.
Multimodal input: The model can process and analyze not only text, but also images and videos directly.
Dense Architecture: All 27 billion parameters are active for every request.
Thinking Preservation: Reasoning traces are preserved across responses. This reduces redundant token generation and significantly improves the use of KV cache in multi-turn agents.

What is “Dense Architecture”?

You may be familiar with Mixture-of-Experts (MoE) models, such as GPT-OSS 120B, which only activate a subset of parameters per request. Qwen3.6-27B takes a different approach by using a dense architecture.

This means that the entire model is activated for every query.
The advantage? You get consistently high-quality results without routing overhead or unexpected quality fluctuations.
The trade-off? The response times are slightly longer. In total however, this approach is a clear advantage for use cases where accuracy and in-depth logical reasoning are a priority.

The benchmarks speak for themselves

Compared to Claude Sonnet 4.5:
Qwen3.6-27B achieves a higher Intelligence Index (45.8 vs. 43.0) and clearly leads in autonomous agent tasks (GDPval-AA: 1406 vs. 1320). In the Coding Index, it is on a par with Claude with 38.5 points, but costs only a fraction of its price. The model shows clear advantages particularly in visual reasoning (MMMU-Pro: 75 %) and scientific tasks (GPQA: 84 %).

Compared to GPT-OSS 120B:
While GPT-OSS remains a cost-effective choice for simple tasks, Qwen3.6-27B clearly outperforms it in overall intelligence (45.8 vs. 33.3) and coding capabilities (38.5 vs. 28.6). It is the superior choice for complex workflows.

You can view the full benchmarks on Artificial Analysis here

Prices & Access to Qwen3.6-27B

Getting started is straightforward. You can integrate the model directly into your existing workflows.

New endpoint: https://api.ai.nws.netways.de/qwen/v1
Model ID: Qwen/Qwen3.6-27B
API keys: Your existing API keys already work. You do not need to create new ones.

Prices:

This model is also billed based on usage, so you only pay for the tokens that you actually process

1M Output Tokens: 2,70 €
1M Input Tokens: 0,30 €

Good to know

To get the best out of Qwen3.6-27B, we recommend the following sampling parameters. Adjust them depending on the use case:

Thinking mode (general tasks)
temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
Thinking mode (precise coding / WebDev)
temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
Instruct mode (without thinking)
temperature=0.7, top_p=0.80, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0

Switch off thinking mode?

Thinking mode lets the model “think” internally, which greatly increases the quality of complex tasks or code. For quick, direct answers or if you want to minimize latency, simply switch it off or use the Instruct parameters:
"chat_template_kwargs": {"enable_thinking": false}

Farewell to the Reranker model

At the same time, we are saying goodbye to bge-reranker-v2-m3, as demand for this reranker model has been low in the past.

Nevertheless, your RAG remains stable, because the bge-m3 embedding model remains in our portfolio and continues to provide the technical basis for precise search results. Your existing RAG setups will therefore continue to work perfectly, even without Reranker.

Conclusion

Whether you are working on a coding project, need to analyze complex data or want to process multimodal inputs, Qwen3.6-27B offers you the performance you need.

You can integrate the model directly into your existing workflows or test it in parallel with our other models. If you have any questions about integration or performance optimization, just write to us.

Our portfolio

0 Comments

Submit a Comment Cancel reply

How did you like our article?