IQuest-Coder-V1: Releasing 7B/14B Family Models and 40B-Thinking Models

A new family of code large language models (LLMs) designed to advance autonomous software engineering and code intelligence.

HTML and SVG Generation

Features preliminary support for HTML webpage and SVG graphics code generation.

Code-Flow Training Paradigm

Moving beyond static code representations, our models learn from repository evolution patterns, commit transitions, and dynamic code transformations to understand real-world software development processes.
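As a rough illustration of what learning from commit transitions could look like at the data level, the sketch below turns successive snapshots of a file into (before, diff, after) records. Everything here is an assumption made for illustration: the `CommitTransition` record, the `build_transitions` helper, and the use of a unified diff as the transition signal are hypothetical and are not described in this release. Only Python's standard `difflib` and `dataclasses` modules are used.

```python
import difflib
from dataclasses import dataclass


@dataclass
class CommitTransition:
    """Hypothetical training record: one step in a file's history."""
    message: str  # commit message describing the change
    before: str   # file content before the commit
    after: str    # file content after the commit
    diff: str     # unified diff from `before` to `after`


def build_transitions(snapshots):
    """Pair consecutive (message, content) snapshots into transitions.

    `snapshots` is a chronological list of (commit_message, file_content)
    tuples; the first entry is the initial state of the file.
    """
    transitions = []
    for (_, before), (message, after) in zip(snapshots, snapshots[1:]):
        diff = "".join(
            difflib.unified_diff(
                before.splitlines(keepends=True),
                after.splitlines(keepends=True),
                fromfile="a/file",
                tofile="b/file",
            )
        )
        transitions.append(CommitTransition(message, before, after, diff))
    return transitions


# Toy two-commit history, invented for this example.
history = [
    ("initial commit", "def add(a, b):\n    return a - b\n"),
    ("fix sign bug in add", "def add(a, b):\n    return a + b\n"),
]
transitions = build_transitions(history)
print(transitions[0].diff)
```

A record like this exposes the change itself, not just the final file, which is the kind of signal a static code corpus does not carry.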

Dual Specialization Paths

Bifurcated post-training delivers two specialized variants: Thinking models, which use reasoning-driven RL for complex problem-solving, and Instruct models, which are optimized for general coding assistance and instruction-following.

Efficient Architecture

The IQuest-Coder-V1-Loop variant introduces a recurrent mechanism that optimizes the trade-off between model capacity and deployment footprint. The 7B and 14B models adopt shallow architectures for faster inference.
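The capacity/footprint trade-off of a recurrent design can be illustrated with simple arithmetic: if a block of layers is stored once but applied several times, effective depth grows while the stored parameter count does not. The sketch below assumes this weight-sharing reading of "recurrent mechanism". The layer counts, per-layer parameter count, and loop count are placeholder numbers chosen for round results, not the actual IQuest-Coder-V1-Loop configuration.

```python
def footprint(stored_layers, params_per_layer, loops=1):
    """Return (stored_params, effective_depth) for a layer stack.

    Assumes every loop reuses the same stored layers (weight sharing),
    so memory scales with `stored_layers` while compute depth scales
    with `stored_layers * loops`.
    """
    stored_params = stored_layers * params_per_layer
    effective_depth = stored_layers * loops
    return stored_params, effective_depth


# Placeholder numbers for illustration only.
PARAMS_PER_LAYER = 500_000_000

plain = footprint(stored_layers=80, params_per_layer=PARAMS_PER_LAYER)
looped = footprint(stored_layers=40, params_per_layer=PARAMS_PER_LAYER, loops=2)

print(plain)   # (40000000000, 80)
print(looped)  # (20000000000, 80)
```

Under these assumptions both stacks run 80 layers of computation per token, but the looped one stores half the weights.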

Native Long Context

All models natively support up to 128K tokens without requiring additional scaling techniques.

CLI Agent Integration

Demonstrates initial deployment on the Claude Code and OpenCode platforms and can be integrated into CLI-based agent workflows.

State-of-the-Art Performance

Achieves leading results on SWE-Bench Verified, BigCodeBench, LiveCodeBench v6, and other major coding benchmarks, surpassing competing models across agentic software engineering, competitive programming, and complex tool use.

Architecture-Level Chain-of-Thought via Recurrent Depth

40B-Loop-Thinking is a research-oriented experimental prototype for exploring how structural chain-of-thought and procedural chain-of-thought can cooperate in one system.

Core Coding Benchmarks

Model	BigCodeBench (Full)	BigCodeBench (Hard)	HumanEval	HumanEval+	MBPP	MBPP+
7B-Instruct	38.86	22.97	79.90	73.20	73.50	63.50
7B-Thinking	40.53	19.59	76.80	70.70	76.50	62.40
14B-Instruct	46.32	26.35	83.50	78.70	79.60	68.50
14B-Thinking	47.72	23.65	92.70	86.00	90.50	72.00
40B-Thinking	51.05	29.05	93.90	87.80	91.00	75.10
40B-Loop-Thinking	50.61	29.73	97.60	89.60	91.00	76.20
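One way to read the table above is to ask which model leads each benchmark. The short sketch below copies the scores from the table and lists the top-scoring model or models per column; the `leaders` helper is a hypothetical name introduced for this example.

```python
CORE_BENCHMARKS = [
    "BigCodeBench (Full)", "BigCodeBench (Hard)",
    "HumanEval", "HumanEval+", "MBPP", "MBPP+",
]

# Scores copied from the table above, in column order.
CORE_SCORES = {
    "7B-Instruct": [38.86, 22.97, 79.90, 73.20, 73.50, 63.50],
    "7B-Thinking": [40.53, 19.59, 76.80, 70.70, 76.50, 62.40],
    "14B-Instruct": [46.32, 26.35, 83.50, 78.70, 79.60, 68.50],
    "14B-Thinking": [47.72, 23.65, 92.70, 86.00, 90.50, 72.00],
    "40B-Thinking": [51.05, 29.05, 93.90, 87.80, 91.00, 75.10],
    "40B-Loop-Thinking": [50.61, 29.73, 97.60, 89.60, 91.00, 76.20],
}


def leaders(scores, benchmarks):
    """Map each benchmark to the list of models sharing the top score."""
    result = {}
    for i, name in enumerate(benchmarks):
        best = max(row[i] for row in scores.values())
        result[name] = [m for m, row in scores.items() if row[i] == best]
    return result


result = leaders(CORE_SCORES, CORE_BENCHMARKS)
for benchmark, models in result.items():
    print(f"{benchmark}: {', '.join(models)}")
```

On these numbers, 40B-Thinking leads BigCodeBench (Full), the two 40B models tie on MBPP, and 40B-Loop-Thinking leads the remaining four columns.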

Code Understanding & Reasoning

Model	CruxEval (Input)	CruxEval (Output)	CodeArena (Score)	CodeArena (Win Rate %)	CodeArena (Tie Rate %)
7B-Instruct	45.80	54.20	0.37	31.28	10.77
7B-Thinking	57.60	81.50	0.28	23.33	10.00
14B-Instruct	52.60	57.60	0.65	60.00	10.77
14B-Thinking	80.50	90.60	0.72	67.44	9.49
40B-Thinking	87.40	94.00	0.92	88.97	5.13
40B-Loop-Thinking	76.50	75.20	0.65	59.23	11.54
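The CodeArena columns above can be related with a little arithmetic. The sketch below derives an implied loss rate, assuming win, tie, and loss rates sum to 100%, and checks an observation from the numbers: each reported Score is consistent, within rounding, with the win rate plus half the tie rate, expressed as a fraction. That relationship is inferred from the table and is not a definition given in this release; `loss_rate` and `estimated_score` are hypothetical helper names.

```python
# (score, win rate %, tie rate %) copied from the CodeArena columns above.
CODEARENA = {
    "7B-Instruct": (0.37, 31.28, 10.77),
    "7B-Thinking": (0.28, 23.33, 10.00),
    "14B-Instruct": (0.65, 60.00, 10.77),
    "14B-Thinking": (0.72, 67.44, 9.49),
    "40B-Thinking": (0.92, 88.97, 5.13),
    "40B-Loop-Thinking": (0.65, 59.23, 11.54),
}


def loss_rate(win, tie):
    """Implied loss rate in percent, assuming win + tie + loss = 100."""
    return round(100.0 - win - tie, 2)


def estimated_score(win, tie):
    """Assumed scoring rule: a win counts 1, a tie counts 0.5."""
    return (win + tie / 2.0) / 100.0


for model, (score, win, tie) in CODEARENA.items():
    est = estimated_score(win, tie)
    print(f"{model:18s} loss={loss_rate(win, tie):6.2f}%  "
          f"reported={score:.2f}  estimated={est:.4f}")
```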

LiveCodeBench & Multi-Language Support

Model	LiveCodeBench (v5)	LiveCodeBench (v6)	Multiple (Avg)	Python	JavaScript	Java	C++	TypeScript
7B-Instruct	24.55	24.57	66.15	82.30	73.30	63.30	65.20	78.00
7B-Thinking	37.72	36.57	53.66	74.40	64.00	53.80	48.40	66.00
14B-Instruct	37.72	40.00	71.24	81.10	77.00	69.00	79.50	76.10
14B-Thinking	72.46	66.29	67.09	86.00	76.40	66.50	67.70	76.10
40B-Thinking	77.25	77.71	75.39	89.00	83.20	81.00	74.50	78.00
40B-Loop-Thinking	79.64	80.00	80.26	89.00	88.20	86.70	84.50	83.60

Terminal & Agent Capabilities

Model	Terminal-Bench (v1)	Terminal-Bench (v2)	SWE-Verified	Multi-SWE
7B-Instruct	22.50	11.23	45.00	17.33
7B-Thinking	21.25	6.89	38.80	13.33
14B-Instruct	36.25	16.85	66.20	48.00
14B-Thinking	26.25	14.10	63.60	37.00
40B-Thinking	30.00	22.30	71.20	48.67
40B-Loop-Thinking	30.00	18.80	69.40	36.33
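The agentic results above also show how the two post-training paths differ at the same model size. The sketch below copies the 7B and 14B rows and computes the Instruct-minus-Thinking gap per benchmark; `instruct_minus_thinking` is a hypothetical helper name introduced for this example.

```python
AGENT_BENCHMARKS = [
    "Terminal-Bench (v1)", "Terminal-Bench (v2)", "SWE-Verified", "Multi-SWE",
]

# Scores copied from the table above, keyed by (size, variant).
AGENT_SCORES = {
    ("7B", "Instruct"): [22.50, 11.23, 45.00, 17.33],
    ("7B", "Thinking"): [21.25, 6.89, 38.80, 13.33],
    ("14B", "Instruct"): [36.25, 16.85, 66.20, 48.00],
    ("14B", "Thinking"): [26.25, 14.10, 63.60, 37.00],
}


def instruct_minus_thinking(size):
    """Per-benchmark score gap (Instruct minus Thinking) for one size."""
    instruct = AGENT_SCORES[(size, "Instruct")]
    thinking = AGENT_SCORES[(size, "Thinking")]
    return [round(i - t, 2) for i, t in zip(instruct, thinking)]


for size in ("7B", "14B"):
    gaps = instruct_minus_thinking(size)
    for name, gap in zip(AGENT_BENCHMARKS, gaps):
        print(f"{size} {name}: {gap:+.2f}")
```

Every gap is positive in this table: at 7B and 14B the Instruct variant scores higher than the Thinking variant on all four agentic benchmarks.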

Function Calling & Task Planning

Model	BFCL (v3)	Tau-Bench-2 (Airline)	Tau-Bench-2 (Retail)	Tau-Bench-2 (Telecom)	Mercury (Beyond@1)	Mercury (Pass@1)
7B-Instruct	34.02	59.18	49.12	85.96	42.12	50.39
7B-Thinking	43.34	52.00	65.49	76.99	43.24	53.52
14B-Instruct	55.10	70.00	78.07	84.21	63.29	76.17
14B-Thinking	53.59	59.18	76.32	87.72	61.99	74.22
40B-Thinking	64.18	66.00	87.72	91.23	71.14	83.20
40B-Loop-Thinking	61.57	64.00	78.07	89.47	79.61	94.92
