Qwen3.6 AEON RYS 15/20 Full Release Record

What This Page Claims
Released Artifacts
Experiment Timeline
Test Coverage Snapshot
Complete Candidate Lists
Source Model And RYS Build
Layer-Window Selection
Quantization Survival
Runtime Profile
MTP And Speed Work
Rejected Or Non-Default Variants
Practical Agent Tests
Implementation Footnotes
Evidence Ledger

What This Page Claims

This page is intentionally written as a record, not a leaderboard page. It explains why the release default became the non-MTP IQ4_NL GGUF and where the evidence is strong, weak, or mixed.

Project Positioning

The broader goal of this model series is practical local work: capable, functional models that fit realistic hardware and can simply get tasks done. Built on the AEON uncensored base, this release is meant to stay low-friction: less lecturing, less getting in the way, and more focus on completing the task in front of it.

RYS and IQ4_NL are the practical part of that goal: preserving useful reasoning and coding behavior in a compact form factor. One 24 GB-class GPU can be enough for a serious local worker. More GPUs naturally mean more independent workers in parallel, not a different claim about one model instance.

Claimed

On the mixed BF16-vs-IQ4_NL snapshot, this RYS 15/20 branch lost less than 1% relative mean score after compression to the released practical GGUF.

Claimed

The custom ik-llama fork is the intended runtime path. It also measured faster in the internal comparison, 39.37 tok/s versus 22.51 tok/s for the patched upstream-style comparison, but that speed note is secondary to preserving quality in the Q4_NL file.

Not Claimed

15/20 is not presented as universally best. A later 11/14 long-reasoning comparison was cleaner on repetition, and that remains an important caveat.

Explanatory route map from AEON base to RYS 15/20, strict scan, IQ4_NL quantization, and release default. — The release story is a path, not a single score. The practical claim depends on the RYS build, strict selection, quantization survival, and tested runtime path together.

Four caveat cards distinguishing strong evidence, the 11/14 repetition caveat, experimental MTP, and implementation footnotes. — The page separates strong claims, real caveats, experimental branches, and known tradeoffs so readers do not need to reconstruct that hierarchy from the tables.

Released Artifacts

The public Hugging Face repo contains the practical inference file, an experimental MTP file, a BF16 GGUF reference, and the BF16 safetensors folder for continued work.

File	Purpose	Size	Decision
`Qwen3.6-27B-AEON-RYS-MaxThinkCoder-IQ4_NL-ik-llama-custom-mixed.gguf`	Main non-finetuned inference artifact.	16,554,834,080 bytes	Default release file.
`Qwen3.6-27B-AEON-RYS-MaxThinkCoder-SpeedBoosted-IQ4_NL-MTP-Experimental.gguf`	MTP-capable IQ4_NL artifact with MTP-tail imatrix coverage.	16,794,473,728 bytes	Experimental; not the default.
`Qwen3.6-27B-AEON-RYS-MaxThinkCoder-BF16.gguf`	Source-quality GGUF reference for inspection, conversion, and comparison.	57,597,296,608 bytes	Exploration artifact.
`bf16-safetensors/`	HF-format checkpoint for Transformers, LoRA, SFT, continued training, or conversion work.	11 shards	Training/workflow artifact.

Decision

For normal users, the intended file is the non-MTP IQ4_NL GGUF. The BF16 files exist so people can inspect or continue the work; they are not the small-form deployment claim.

Experiment Timeline

The release moved through source selection, RYS construction, strict scanning, quantization screening, runtime work, and practical validation. Later fine-tune work built on this base, but is separate from the non-finetuned record here.

1. Source Branch

Use AEON-7/Qwen3.6-27B-AEON-Ultimate-Uncensored as the source branch instead of the earlier official-base line or other abliterated candidates.

2. RYS Build

Build safetensors-first RYS checkpoints with correct tensor remapping and per-layer metadata remapping for Qwen3.6 hybrid attention.

3. Strict Window Scan

Compare candidate RYS windows against AEON baseline on strict math and EQ validation files. Select 15/20 as the balanced winner.

4. Quantization Screen

Convert to GGUF, build imatrix calibration, quantize to IQ4_NL, and compare BF16 against the quantized release candidate.

5. Runtime Work

Patch and package the ik-llama fork for the custom mixed GGUF layout, Qwen3.6 hybrid handling, graph split, Jinja, and DeepSeek reasoning format.

6. Release Decision

Publish the non-MTP IQ4_NL as the default, keep MTP as experimental, keep BF16 as exploration/reference, and document known caveats.

Testing ladder showing source selection, mechanical RYS build, short scan, strict scan, quantization survival, runtime path, MTP research, and practical agent checks. — The ladder view shows the full release process without asking readers to parse every table first.

Test Coverage Snapshot

This is the compact count of the evidence behind the release. The public values used for the argument are reproduced on this page; the evidence ledger gives provenance filenames.

Stage	Coverage	What It Answered	Decision Impact
Early AEON short scan	13 candidate mappings across `math_16` and `eq_16`.	Which windows looked promising before the strict pass?	Exploratory only; not final selection basis.
AEON strict scan	6 candidate mappings across `math_120` and historical `eq_140` files.	Which RYS window had the best balanced strict score?	Selected `15,20`.
15/20 BF16 vs IQ4_NL	Four mixed probes: `math_16`, `eq_16`, `math_4`, `gsm8k_5`; 41 prompt items total.	Did the release quant survive compression?	Confirmed IQ4_NL as the practical target.
11/14 quant caveats	Reasoning slice plus no-think math/EQ quant comparison files.	Was 11/14 obviously better after quantization?	No; it was promising but volatile and not strict-balanced winner.
Long fair comparison	2 IQ4_NL branches, 21 numeric questions, 2048-token reasoning budget.	Did 15/20 have repetition weaknesses against 11/14?	Yes; documented as a real caveat.
Mixed-Q8 probe	2 quant variants, 9-item quick paired eval.	Would protecting the RYS window in Q8 improve the default?	No replacement.
MTP work	MTP-tail imatrix, 512/2048 quality checks, short and long speed matrices.	Should the MTP file become the default?	No; keep experimental.
Practical agent checks	Base row from the 5-task production matrix, plus the AEON RYS attempt/retry rows from a later 5-run canvas comparison.	Could the compressed model act as a coding-agent base?	Yes with caveats. SignalLatch and Unsloth comparison rows are downstream context, not part of this base-release claim.

Spreadsheet Source

The chart data is collected into an OnlyOffice workbook: qwen36_aeon_rys_stats.xlsx. It contains the short scan, strict scan, quantization snapshot, 11/14 long-reasoning comparison, MTP speed notes, and duplicate-window cost sheet.

Complete Candidate Lists

The first version of this page summarized the scan coverage and listed the decision rows, but did not spell out every tested mapping. This section is the explicit candidate appendix from the AEON scan files.

Single-Layer Note

The early short scan contains one true single-layer duplication candidate: blocks:20,21, which duplicates source layer 20 only. The file named aeon_single_blocks_15_20_math120.pkl is misleadingly named for this question: it contains one entry for the 15,20 block candidate, not a full single-layer sweep.

Early short-scan candidate	math_16	eq_16	Mean	Read
`blocks:24,32`	0.821109	0.714006	0.767558	Best short-scan mean; not final strict winner.
`blocks:15,20`	0.818587	0.712340	0.765463	Strong short-scan candidate; later strict-balanced winner.
`blocks:11,14`	0.802940	0.715609	0.759275	Strong math/reasoning branch, later caveated.
`blocks:31,34`	0.790785	0.713317	0.752051	Close strict-scan runner-up later.
`blocks:30,35`	0.800002	0.701506	0.750754	Official-base winner did not transfer as AEON winner.
`blocks:0,0`	0.794387	0.706506	0.750447	AEON baseline.
`blocks:11,14;30,35`	0.771005	0.709006	0.740006	Two-window mesh; did not beat simpler candidates.
`blocks:15,27`	0.756966	0.714423	0.735695	Wider mid-window candidate.
`blocks:28,36`	0.766733	0.700577	0.733655	Late-window candidate.
`blocks:9,17`	0.730846	0.723526	0.727186	Good EQ, weaker math.
`blocks:30,34`	0.713382	0.716506	0.714944	Late-window candidate.
`blocks:8,17`	0.664031	0.724776	0.694403	High EQ but poor math balance.
`blocks:20,21`	0.680363	0.703526	0.691944	Only true single-layer duplication candidate in this short scan.

Strict-scan candidate	math_120	eq_140	Mean	Decision role
`blocks:15,20`	0.971441	0.647347	0.809394	Winner.
`blocks:31,34`	0.977269	0.640750	0.809010	Very close runner-up.
`blocks:11,14`	0.981498	0.634929	0.808214	Best strict math, not combined winner.
`blocks:24,32`	0.971823	0.628559	0.800191	Near baseline.
`blocks:0,0`	0.970181	0.629710	0.799945	AEON baseline.
`blocks:30,35`	0.964841	0.627521	0.796181	Official-base winner, not AEON winner.

Bar chart of AEON strict scan mean scores for 15/20, 31/34, 11/14, 24/32, baseline, and 30/35. — The strict mean chart uses the same values as the table above. The score axis is intentionally narrowed and labelled because these candidates are close together.

Decision From The Full List

The page should not imply that every possible single-layer duplication across all 64 layers was run. The documented evidence supports: 13 early AEON short-scan mappings, 6 strict AEON mappings, one true single-layer candidate in the short scan, and a separate single-entry strict file for 15,20.

Source Model And RYS Build

The source branch for this release is AEON-7/Qwen3.6-27B-AEON-Ultimate-Uncensored. The RYS operation was a checkpoint transformation: duplicate a trained layer window and insert it back into the stack.

RYS 15/20 insert mapping showing source layers 15 to 19 duplicated into output layers 20 to 24. — For `blocks:15,20`, source layers `15..19` are duplicated after source layer 19. Output layers `20..24` are the copied window, and source layer 20 resumes at output layer 25.

Build Rule

Every output layer must copy its weights and execution metadata from the same source layer. For Qwen3.6 hybrid models, remapping text_config.layer_types with the same output-to-source map was execution-critical.

Failure Mode

A broken RYS checkpoint can load and still generate badly. The key trap was treating the copied tensors and the per-layer execution plan as separate ledgers.

Layer-Window Selection

The selected window was not chosen because it had the highest score on every possible slice. It was chosen because it won the balanced AEON strict scan and later survived the practical Q4_NL compression screen better than the stronger math-leaning branch.

Strict probe chart comparing AEON RYS candidate windows. — The strict scan used AEON validation files, not the older official-base scan. The official-base winner did not transfer as the AEON winner.

Spec	math_120	eq_140	Combined	Delta vs AEON baseline	Read
`blocks:15,20`	0.971441	0.647347	0.809394	+1.181%	Best balanced strict candidate.
`blocks:31,34`	0.977269	0.640750	0.809010	+1.133%	Very close second.
`blocks:11,14`	0.981498	0.634929	0.808214	+1.034%	Best strict math, not combined winner.
`blocks:24,32`	0.971823	0.628559	0.800191	+0.031%	Near baseline.
`blocks:0,0`	0.970181	0.629710	0.799945	baseline	AEON baseline.
`blocks:30,35`	0.964841	0.627521	0.796181	-0.471%	Official-base winner, not AEON winner.

Final Materialized 15/20 Check

After the materialized 15/20 artifact was built, strict checks recorded 0.983317798162 on artifact_15_20_strict_math120.pkl with 120/120, and 0.647756674780 on artifact_15_20_strict_eq140.pkl with 139/139. Combined: 0.815537236471.

Short Scan Caveat

The earlier 13-candidate short scan was not the final selection basis. In that short math_16 + eq_16 view, 24,32 was slightly higher than 15,20, so the release choice depends on the stricter balanced scan plus quantization survival.

Decision

Select 15,20 for the AEON release branch because it had the best balanced strict result. Keep 11,14 in mind as a math-leaning research branch, not the default release target.

Quantization Survival

This is the main reason the public release exists as a Q4-class model. The RYS BF16 did not produce a huge headline gain by itself. The useful result was that the selected branch held up unusually well after compression.

Visual comparison of BF16 reference artifact versus released IQ4_NL deployment target. — The BF16 files are for exploration and continued work. The named project target is the compressed IQ4_NL release path.

Probe	RYS BF16	Released IQ4_NL	Change
Mixed four-probe mean	0.729899	0.724435	-0.005465 / -0.7487%
`math_16`	0.842143	0.789686	down
`eq_16`	0.712340	0.711090	near flat
`math_4`	0.485116	0.516963	up
`gsm8k_5`	0.880000	0.880000	flat

Grouped bar chart comparing BF16 and released IQ4_NL scores across the mixed mean, math_16, eq_16, math_4, and gsm8k_5 probes. — The released IQ4_NL stayed within `-0.0055` absolute of the BF16 mixed four-probe mean, while the individual probes moved in different directions.

Decision

Keep the main release centered on IQ4_NL because the mixed four-probe drop was small relative to the size reduction: about 57.6 GB BF16 GGUF to 16.6 GB IQ4_NL. Runtime speed helped the deployment story, but preserving useful reasoning/coding quality through compression is the core result.

Important Caveat

The 11,14 quant headline is not the same scoreboard. Its two-probe reasoning slice went from 0.854324 BF16 to 0.729260 IQ4_NL, a much larger drop, but that file only used math_4 and gsm8k_5. A separate no-think math_16 + eq_16 file showed a smaller drop, 0.759275 to 0.738754. Defensible read: 11/14 was strong on some probes but more volatile, and it did not win the strict-balanced scan.

Runtime Profile

The model is released for the custom AEON ik-llama fork. That is not packaging trivia; it is part of the tested artifact. The fork carries the Qwen3.6 hybrid and graph-split work needed for this line. In the internal runtime comparison, the recommended custom path decoded at 39.37 tok/s versus 22.51 tok/s for the patched upstream-style comparison; useful speed, but still secondary to the quality-preservation result that made the IQ4_NL file worth releasing.

./build/bin/llama-server \
  -m /path/to/Qwen3.6-27B-AEON-RYS-MaxThinkCoder-IQ4_NL-ik-llama-custom-mixed.gguf \
  -c 65536 \
  -ngl 999 \
  -np 1 \
  -fa on \
  -sm graph \
  --temp 0.7 \
  --jinja \
  --reasoning-format deepseek \
  --reasoning-budget 0 \
  -cram 0 \
  --ctx-checkpoints 0

Why the Fork Exists

The project needed custom mixed-GGUF support, Qwen3.6/Qwen3.5 hybrid handling, graph-split stability work, Jinja chat formatting, DeepSeek reasoning extraction, and May 2026 duplicate tool-call filtering.

Long Context

The public profile starts with -c 65536. The same family was also used for 131072 context comparisons, and default/FP16 KV was separately tested to about 160k context without the earlier failure pattern. FP32 KV was a conservative validation setting, not a requirement.

Practical Single-GPU Deployment

The Q4_NL release is small enough for practical single-GPU deployment. In an observed 24 GB-class GPU reference profile, roughly 160k context with default/FP16 KV fit at about 20.3 GiB total VRAM on an RTX 3090-class card. Treat this as a practical deployment reference point, not a guaranteed cross-hardware memory benchmark.

Runtime and implementation cards for the required fork, GGUF representation, long-context KV caveat, and experimental MTP branch. — The runtime cards collect the practical constraints that matter before someone treats the file as a stock GGUF.

Runtime Check	Setup	Decode	Decision
Recommended custom path	Graph split, long-context deployment profile, FP32 KV validation snapshot.	39.37 tok/s	Public runtime path; faster in this internal comparison, with the release claim still anchored on quality preservation.
Patched upstream-style comparison	Internal standard-typed comparison path, shorter context, layer-style comparison.	22.51 tok/s	Not released as the public target.

MTP And Speed Work

The MTP file is real and structurally valid, but it did not replace the default. We kept it because it is useful for runtime research, not because it beat the normal non-MTP file.

Check	Result	Read
MTP metadata	`qwen35.nextn_predict_layers = 1`	MTP tail exists.
MTP tensor coverage	8 quantizable `blk.69` tensors	Patched imatrix collection covered the MTP tail.
MTP-aware imatrix	80 chunks, PPL 4.3762 +/- 0.07886	Calibration path completed.
512-token quality	0.166138 score with MTP	Effectively identical to the old MTP GGUF in that suite.
2048-token no-MTP quality on MTP file	worst repeat 59	Worse repeat penalty than the practical non-MTP default.

Speed Check	Decode	Prompt	Read
No-MTP graph split reference	48.6795 tok/s	214.9709 tok/s	Final clean matrix reference; still fastest.
Naive MTP draft-1	38.1550 tok/s	noted in matrix	Acceptance 225/345 = 65.217%.
Adaptive MTP short check	45.3589 tok/s	208.79 tok/s	Acceptance 158/218 = 72.477%; closer, still not enough to replace default.
No-MTP long 768-token check	48.7140 tok/s	long check	Longer generation still favored no-MTP.
Adaptive MTP long 768-token check	46.9539 tok/s	long check	Acceptance 38/54 = 70.37%; close but still behind.

Horizontal bar chart comparing no-MTP, naive MTP, and adaptive MTP decode-speed notes. — These are internal same-machine speed notes used for the release decision, not a normalized public speed benchmark. In the tested paths, no-MTP stayed slightly ahead.

Decision

Publish the MTP GGUF as experimental. Keep the non-MTP IQ4_NL file as the default because practical quality and speed still favored it.

Rejected Or Non-Default Variants

Several useful experiments did not become the public default. They are included here because the negative results explain the release shape.

Variant	Test	Result	Decision
`11,14` IQ4_NL	Long-reasoning fair comparison, `math_16 + gsm8k_5`, 21 questions, 2048 max tokens.	Same final/any exact rates as 15/20, but much cleaner repetition: worst 4gram repeat 7 vs 47 for 15/20.	This resolves the long-reasoning Q4_NL setting in favor of cleaner 11/14 repetition, but does not replace the selected release without matching evidence across the mixed quant suite, runtime packaging, and practical agent tests.
RYS-window mixed-Q8	Force 46 attention/SSM tensors in layers 15..24 to Q8_0.	File size +3.35%; mean best-rel 0.87299 vs 0.89998 baseline in a 9-item quick paired eval.	Do not replace IQ4_NL. Consider narrower Q8 variants later.
Standard llama.cpp-style public file	Internal patched upstream-style comparison path.	Still required special runtime assumptions and was not the main tested target.	Do not present as stock llama.cpp support.
MTP default	MTP graph split, graph reuse, adaptive gate, MTP-tail imatrix.	Technically valid but slower or less clean than no-MTP in tested paths.	Publish as experimental only.

Two-panel chart showing 11/14 versus 15/20 composite score and worst 4-gram repeat in the long-reasoning caveat test. — The 11/14 branch deserves the caveat: it tied final/any exact rates in this slice and had much cleaner repetition. It still was not rerun through the full release decision ladder.

Practical Agent Tests

These checks use the released non-finetuned AEON RYS 15/20 IQ4_NL artifact unless a row explicitly says otherwise. They are practical coding-agent checks, not broad benchmarks. The numbers needed to interpret them are reproduced here; the file paths in the ledger are provenance only.

Scope Boundary

Some practical rows were collected during later SignalLatch and Unsloth comparison work. This page uses only the AEON RYS IQ4_NL base rows. SignalLatch strength-sweep and clean-pass claims belong to the separate fine-tune page.

Five-task Matrix

Base run: AEON RYS IQ4_NL. Setting: temp 0.7, graph split, flash attention, Jinja/DeepSeek, 65536 context. Result: strict pass 1/5, mean 0.550, task scores 0.75, 1.00, 0.25, 0.25, 0.50, timeout-like tasks 4.

Canvas Attempt 1

Base run: AEON RYS IQ4_NL attempt 1. Setting: temp 0.7, 131072 context, FP32 KV, graph split, flash attention. Result: rc=1, 337s, verifier 0.0417 / false, root files: none.

Canvas Retry

Base run: AEON RYS IQ4_NL retry 1. Same task and runtime family. Result: rc=0, 803s, verifier 1.0 / true, complete app files. Read: the base can complete the task, but first-attempt reliability remains a caveat.

Production Matrix Task	Score	Pass	RC / Time	Verifier Read
`github_mcp_commits_fix_repeat`	0.75	No	`rc=1`, 260s	Build, branch schema, branch output, and README checks passed; request path/branch parameter checks failed.
`github_mcp_pr_details_fix`	1.00	Yes	`rc=124`, 600s	Correctly used the PR detail endpoint and detail additions/deletions/changed-files fields, but still hit the full timeout.
`local_search_kill_excess_fix`	0.25	No	`rc=124`, 600s	Build passed; targeted process-kill behavior was not implemented.
`local_search_search_timeout_fix`	0.25	No	`rc=124`, 600s	Build passed; timeout schema and handler propagation were not implemented.
`local_search_web_search_race_fix`	0.50	No	`rc=124`, 600s	Multiple engines remained, but the first-success race behavior was not implemented.

Canvas Harness Detail	Value Included Here
Prompt	Build an isolated Krita-like raster canvas app with layers, brush/eraser, transforms, opacity, and a local AI image-generation stub.
Shared settings	Temp `0.7`, context `131072`, FP32 K/V cache, flash attention, graph split, Jinja, DeepSeek reasoning format, `CLAW_MAX_TOKENS=1800`, `TIMEOUT_SECONDS=900`.
Retry root files	`index.html` 3,450 bytes; `styles.css` 7,062 bytes; `app.js` 17,551 bytes; `README.md` 1,033 bytes.
Excluded comparison rows	The full later comparison also included SignalLatch IQ4_NL, Unsloth IQ4_NL, and Unsloth Q8_0 rows. Those are not used as evidence for this non-finetuned base release.

Decision

Keep this release claim narrow: AEON RYS 15/20 IQ4_NL is a viable compressed coding-agent base with practical competence and documented reliability caveats. SignalLatch is a later behavior-finetuned attempt to improve that reliability, not part of the non-finetuned release score.

Implementation Footnotes

The duplicate-window representation is recorded here for transparency, but it is not the headline claim of the release. The main claim remains quantization survival and practical runtime behavior for the tested IQ4_NL file.

GGUF file size16,554,834,080 bytes. This is the released default IQ4_NL file size.

Materialized duplicate spanOutput layers 20..24 occupy 1,067,475,584 bytes / 0.994 GiB as copied RYS-window tensors.

File-level footprintThe duplicate span is about 6.448% of the released file, or 7.220% of transformer block tensor bytes.

Logical layer count69 logical layers: 64 source layers plus 5 inserted RYS layers.

Long-context KV scalingAt 131,072 tokens, the extra KV/cache estimate is about +1.0 GiB FP16 / +2.0 GiB FP32. Around 160k tokens, it scales to about +1.2 GiB FP16 / +2.4 GiB FP32; at 163,840 tokens, about +1.25 GiB FP16 / +2.5 GiB FP32. The inserted window contains two full-attention layers, so KV cost scales with context length; all five inserted layers still add compute.

Why This Version Stayed Materialized

Materialized tensors kept the HF checkpoint, GGUF conversion, quantization, and downstream fine-tune/LoRA workflows explicit and stable for the tested release.

Future Optimization

A procedural or aliased RYS runtime could reuse source-layer weight buffers and save duplicate-weight memory. That is possible future runtime work, not a change to this already tested release artifact.

Evidence Ledger

This ledger records the source filenames used to reconstruct the page. It is a provenance map, not required reading: the public numeric data needed to understand the release is reproduced above in the tables, charts, captions, and artifact rows.

Workspace status qwen36_rys_work/README.md records current defaults, directory map, RYS semantics, strict top-6 table, quantization snapshot, GGUF/imatrix paths, and serving notes.

Strict AEON scan qwen36_aeon_validation/results/aeon_strict_math120.pkl and qwen36_aeon_validation/results/aeon_strict_eq140.pkl.

15/20 BF16 vs IQ4_NL qwen36_aeon_validation/results/quant_compare_15_20_q4_vs_bf16_math16_eq16_reasoning_math4_gsm8k5_20260426.json.

11/14 quant caveat qwen36_aeon_validation/results/quant_compare_11_14_q4_vs_bf16_reasoning_math4_gsm8k5_20260426.json.

11/14 vs 15/20 long-reasoning comparison qwen36_rys_work/rys_fair_compare_11_14_vs_15_20/results/q4nl_reason2048_ctx32768_bwrap0349_20260430_123403/.

Mixed-Q8 probe qwen36_rys_work/aeon_rys_15_20_gguf/mixed_q8_ryswin/README_results.md and latest_quick_eval_results.json.

MTP-aware imatrix and quality qwen36_rys_work/aeon_rys_15_20_mtp_gguf/imatrix_mtp_iq4nl/README_results.md.

MTP speed search qwen36_rys_work/mtp_speed_probe/autoresearch_3090/RESULTS_3090_MTP_SPEED_20260501.md and qwen36_rys_work/mtp_rebuild_20260501_151130/final_results.md.

Practical production matrix base row docs/ckpt386-s010-testing-process/evidence/base_q4nl_summary.md. The base task scores, pass flags, return codes, and timings are reproduced in the Practical Agent Tests section.

Practical canvas comparison qwen36_rys_work/aeon_rys_15_20_signallatch_gguf/evidence/canvas_unsloth_comparison_20260505_summary.md. Only the AEON RYS IQ4_NL attempt/retry rows are used for this base-release page; SignalLatch and Unsloth rows are downstream or external comparison context.

Published model card Qwen3.6-27B-AEON-RYS-15-20-GGUF.

Qwen3.6 AEON RYS 15/20: what we tested, what won, and what did not.

Short Read

Contents

What This Page Claims

Project Positioning

Claimed

Claimed

Not Claimed

Released Artifacts

Decision

Experiment Timeline

1. Source Branch

2. RYS Build

3. Strict Window Scan

4. Quantization Screen

5. Runtime Work

6. Release Decision

Test Coverage Snapshot

Spreadsheet Source

Complete Candidate Lists

Single-Layer Note

Decision From The Full List

Source Model And RYS Build

Build Rule

Failure Mode

Layer-Window Selection

Final Materialized 15/20 Check

Short Scan Caveat

Decision

Quantization Survival

Decision

Important Caveat

Runtime Profile

Why the Fork Exists

Long Context

Practical Single-GPU Deployment

MTP And Speed Work

Decision

Rejected Or Non-Default Variants

Practical Agent Tests

Scope Boundary

Five-task Matrix

Canvas Attempt 1

Canvas Retry

Decision

Implementation Footnotes

Why This Version Stayed Materialized

Future Optimization

Evidence Ledger