Technical release page

Qwen3.6 AEON RYS 15/20 and SignalLatch

A public, evidence-first explanation of the AEON RYS 15/20 IQ4_NL release, the SignalLatch behavior fine-tune, the custom ik-llama runtime path, and the tradeoffs behind the current materialized RYS artifact.

Links: SignalLatch on Hugging Face · Fine-tune record · Base RYS full record · Base RYS on Hugging Face · Runtime fork
15/20: RYS window used for the released AEON branch.
IQ4_NL: main practical deployment artifact, around 16 GB.
-0.75%: base RYS compression result, the mixed four-probe relative change from RYS BF16 to the released RYS IQ4_NL.
s0.10: selected SignalLatch merge strength after practical sweeps.

What this is

This is a Qwen3.6-27B AEON-derived model line built around a specific RYS branch: duplicate the source layer window 15..19 and insert that copy after source layer 19. The resulting logical stack has 69 layers instead of the original 64.
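
As a rough illustration, the materialized transform can be sketched against a Hugging Face-style decoder stack. The attribute paths, model path, and deep-copy approach below are illustrative assumptions, not the exact release tooling:

# Sketch of the materialized RYS 15/20 transform on a Hugging Face-style stack.
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("path/to/source-model", torch_dtype=torch.bfloat16)
layers = model.model.layers                                    # decoder layers 0..63

# Copy source layers 15..19 and reinsert the copy after source layer 19.
window = [copy.deepcopy(layers[i]) for i in range(15, 20)]
stacked = list(layers[:20]) + window + list(layers[20:])

model.model.layers = torch.nn.ModuleList(stacked)
model.config.num_hidden_layers = len(stacked)                  # 64 -> 69 logical layers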

The non-finetuned release is the practical base: an AEON RYS 15/20 model exported to a custom ik-llama-compatible IQ4_NL GGUF. SignalLatch is the behavior-finetuned release: checkpoint-386 LoRA merged into that RYS base at strength 0.10, then exported as the same practical GGUF deployment target.
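
In weight terms, a scaled LoRA merge at strength 0.10 folds a fraction of the adapter delta into each targeted base tensor. A minimal per-tensor sketch, assuming the standard LoRA parameterization (the actual SignalLatch merge tooling may differ):

import torch

def merge_lora_weight(base_weight, lora_A, lora_B, alpha, rank, strength=0.10):
    # Standard LoRA delta is (alpha / rank) * B @ A; "strength" scales how much
    # of that delta is folded into the base weight at merge time.
    delta = (alpha / rank) * (lora_B @ lora_A)
    return base_weight + strength * delta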

The main project claim is narrow: in our tested custom runtime path, this RYS branch made the IQ4_NL deployment much more resistant to reasoning/coding degradation than earlier non-RYS quant attempts, and SignalLatch improved practical coding-agent behavior on top of that base.

Figure: the release path as a chain from source model through RYS checkpoint transformation, quantized GGUF export, and the custom ik-llama runtime, to the SignalLatch behavior fine-tune.

What is claimed, and what is not

Claimed

The released RYS IQ4_NL artifact stayed close to the RYS BF16 source on the mixed probe snapshot: 0.7299 to 0.7244, or about -0.75% relative.

Claimed

SignalLatch is trained around a Review -> Align -> Latch -> Repair -> Confirm loop for coding agents: scoped context review, goal alignment, waiting for concrete tool signals, targeted repair, and focused validation.

Not claimed

This is not a stock llama.cpp drop, not a general leaderboard claim, and not proof that every RYS window helps every model.

Not claimed

This is not memory-optimal RYS. The current public GGUF is materialized RYS, not a runtime-aliased representation.

Required runtime

Use the custom AEON ik-llama fork. The GGUFs in this release line are intended for that runtime path, especially for graph split, Qwen3.6/Qwen3.5 hybrid handling, Jinja chat formatting, and DeepSeek-style reasoning extraction.

github.com/noonr48/qwen36-aeon-ik-llama

./build/bin/llama-server \
  -m /path/to/Qwen3.6-27B-AEON-RYS-SignalLatch-ckpt386-s010-IQ4_NL.gguf \
  -c 65536 \
  -ngl 999 \
  -np 1 \
  -fa on \
  -sm graph \
  --temp 0.7 \
  --jinja \
  --reasoning-format deepseek \
  --reasoning-budget 0 \
  -cram 0 \
  --ctx-checkpoints 0
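
Once the server is up, a quick smoke test can be run over HTTP. This assumes the fork inherits stock llama-server's OpenAI-compatible /v1/chat/completions endpoint on the default port 8080; check the fork's README if its serving surface differs:

import json
import urllib.request

# Minimal request against the served model; host/port are assumptions taken from
# stock llama-server defaults, not fork-verified values.
payload = {
    "messages": [{"role": "user", "content": "Write a one-line docstring for a retry helper."}],
    "temperature": 0.7,
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])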

For long-context use, increase context as your system allows. The default KV type is f16; FP16/default KV has been tested to about 160k context without showing the earlier failure pattern.

# Long-context shape
-c 131072

# Conservative validation shape used for the canvas comparison
-c 131072 -ctk f32 -ctv f32

Practical single-GPU deployment: the SignalLatch IQ4_NL release is small enough for practical use on a single RTX 3090 / 24 GB-class card. In an observed reference profile, roughly 160k context with default/FP16 KV fit at about 20.3 GiB total VRAM. Treat this as a deployment reference point, not a guaranteed memory benchmark.

Why --ctx-checkpoints 0?

This flag keeps recurrent context checkpointing disabled explicitly. The current fork also guards against unstable recurrent checkpoints under graph split, but the public command keeps the tested no-checkpoint runtime profile visible.

Why FP32 KV in the canvas run?

It was a conservative validation setting to isolate model/task behavior from KV precision. It should not be read as “FP16 KV is bad.” For normal practical use, start with default/FP16 KV and adjust based on your memory and quality needs.

Evidence summary

The model cards carry the compact tables. This page collects the readout in one place so the release does not depend on ad hoc author replies.

Base RYS compression result

The following table describes the non-finetuned AEON RYS 15/20 base artifact only. It is not a SignalLatch fine-tune score.

Probe                RYS BF16   Released RYS IQ4_NL
mixed 4-probe mean   0.7299     0.7244
math_16              0.8421     0.7897
eq_16                0.7123     0.7111
math_4               0.4851     0.5170
gsm8k_5              0.8800     0.8800

Snapshot change: -0.0055 absolute, about -0.75% relative. This is not a broad benchmark suite; it is a release sanity snapshot for the tested deployment path.
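
The delta follows directly from the two mixed-probe means in the table:

# Reproduce the snapshot delta from the probe table above.
bf16_mean, iq4_mean = 0.7299, 0.7244
print(round(iq4_mean - bf16_mean, 4))                      # -0.0055 absolute
print(round((iq4_mean - bf16_mean) / bf16_mean * 100, 2))  # about -0.75 percent relative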

SignalLatch fine-tune result

The fine-tune claim comes from the merged ckpt386 Q4_NL practical coding-agent matrix, not from the base BF16-to-Q4_NL compression table above.

Comparison                                    Strict pass   Mean    Timeout-like   Scope
Previous AEON RYS IQ4_NL                      1/5           0.550   4              first deploy-format run
SignalLatch ckpt386 s0.10 IQ4_NL              4/5           0.950   0              first deploy-format run
SignalLatch ckpt386 s0.10 repeat stability    9/15          0.842   n/a            three runs
SignalLatch ckpt386 s0.10 crash-adjusted      9/14          0.884   n/a            excludes one invalid server-crash row

For the complete fine-tune method, strength sweep, task rows, and caveats, read the SignalLatch fine-tune record.

Practical canvas-agent comparison

This canvas comparison is not evidence that SignalLatch beats Unsloth IQ4_NL. SignalLatch IQ4_NL and Unsloth IQ4_NL both completed cleanly with verifier 1.0 and rc=0; the useful SignalLatch read is that it completed cleanly where the non-finetuned AEON RYS base needed a retry.

Run                               Result                  Read
SignalLatch IQ4_NL                rc=0, verifier 1.0      Clean completion on the isolated canvas app task.
AEON RYS IQ4_NL attempt 1         rc=1, verifier 0.0417   Invalid tool/diff failure before usable files.
AEON RYS IQ4_NL retry 1           rc=0, verifier 1.0      Clean retry, showing the base can complete the task but was less reliable in the first attempt.
unsloth/Qwen3.6-27B-GGUF IQ4_NL   rc=0, verifier 1.0      Clean external comparison pass.
unsloth/Qwen3.6-27B-GGUF Q8_0     rc=124, verifier 1.0    Files passed verification but the agent timed out during final wrap-up.

Exploratory frontend-design observation

A separate Open Design-style workstation prompt produced a dense SLOANE CLI Agent interface concept across desktop, artifact-open desktop, and mobile views. These were one-shot UI generations: the model was instructed to inspect the existing code and intended function, understand the workflow, and produce a frontend shell. There was no manual design iteration beyond that instruction.

This was not run as a scored benchmark, but it is a useful qualitative note: the output preserved a clear command layer, attached-context rail, evidence table, inspector panel, artifact dock, and responsive mobile hierarchy without turning the interface into a generic marketing layout.

Figure: sanitized first-shot SignalLatch workstation UI from the artifact-open desktop output, showing attached context, command lifecycle, evidence, artifact dock, and inspector panels. The local footer was cropped before publication.

Read this as an early design-oriented observation only. The stronger SignalLatch claim remains coding-agent discipline and completion behavior; frontend/UI taste should get its own repeatable test suite before it becomes a headline claim.

Figure: scoreboard comparing the base IQ4_NL and the SignalLatch merge strengths. Strength selection favored s0.10 because it was less flashy than stronger first runs but more deployable across repeats.
Figure: testing ladder from direct adapter probes through merged GGUF runs and strength sweeps to practical coding-agent tasks.

Known tradeoffs

Materialized RYS weights

The current release uses materialized RYS. The copied 15/20 window is exported as ordinary GGUF tensors, so logical output layers 20..24 carry their own copies of the tensors from source layers 15..19.

This keeps the HF checkpoint, GGUF conversion, quantization, and downstream fine-tune workflow explicit and stable. It also means the copied weights are loaded like normal weights.

Known optimization: a procedural or runtime-aliased RYS implementation could reuse source-layer weight buffers and reduce duplicate model-weight memory. That is future runtime work, not the current tested release.
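
A minimal sketch of the aliasing idea, assuming layers are held in a plain Python list: the inserted logical window references the same layer objects instead of carrying duplicated tensors. This illustrates the concept only and is not the fork's implementation:

def build_aliased_stack(source_layers, window_start=15, window_end=20):
    # Logical layers 20..24 reference the same objects as source layers 15..19,
    # so model-weight memory stays at the 64-layer footprint.
    aliased_window = source_layers[window_start:window_end]   # references, not copies
    return source_layers[:window_end] + aliased_window + source_layers[window_end:]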

Practical impact in the released IQ4_NL

The copied window is about 0.994 GiB, around 6.45% of the GGUF tensor bytes.

Runtime aliasing would not remove the extra compute or KV-cache cost from the five additional logical layers. At 131072 context, those five layers add about 2.5 GiB FP16 KV or 5.0 GiB FP32 KV.
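
A back-of-envelope check of these figures, assuming a per-layer KV width of 1024 (for example, 8 KV heads at head dimension 128); that width is an assumption about the config, not a published value:

# Implied total GGUF tensor bytes from the copied-window share above.
copied_gib, copied_share = 0.994, 0.0645
print(round(copied_gib / copied_share, 1))   # ~15.4 GiB total tensor bytes

# Extra KV cache from the five duplicated logical layers at 131072 context.
ctx, kv_width, extra_layers = 131072, 1024, 5   # kv_width is an assumed config value

def extra_kv_gib(bytes_per_element):
    # K and V caches: 2 tensors per layer, ctx tokens each, kv_width elements per token.
    return 2 * ctx * kv_width * bytes_per_element * extra_layers / 2**30

print(extra_kv_gib(2))   # FP16 KV -> 2.5 GiB
print(extra_kv_gib(4))   # FP32 KV -> 5.0 GiB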

The current artifact remains materialized because that is the version tested and released.

How the release was selected

RYS branch selection

Several candidate windows were tested. The public release centered on 15/20 because it held up as the practical IQ4_NL release target.

Custom runtime path

The fork added the Qwen3.6/Qwen3.5 hybrid handling, graph-split serving fixes, Jinja and DeepSeek reasoning-format support, and runtime guardrails needed for this line.

Quantized deployment target

The project target is the compact IQ4_NL GGUF, not the largest possible BF16 artifact. The fine-tuned BF16 GGUF exists as a source-quality artifact for inspection, conversion, and downstream work.

SignalLatch behavior merge

The selected fine-tune is checkpoint 386 at merge strength 0.10, chosen because repeated practical checks favored its stability over stronger but less reproducible strengths.

Which file should you use?

SignalLatch practical default

Qwen3.6-27B-AEON-RYS-SignalLatch-ckpt386-s010-IQ4_NL.gguf

Use this when you want the behavior-finetuned coding-agent profile.

Non-finetuned RYS default

Qwen3.6-27B-AEON-RYS-MaxThinkCoder-IQ4_NL-ik-llama-custom-mixed.gguf

Use this when you want the base RYS IQ4_NL deployment without SignalLatch behavior tuning.

BF16 source-quality artifact

Qwen3.6-27B-AEON-RYS-SignalLatch-ckpt386-s010-BF16.gguf

Use this source-quality fine-tuned GGUF for exploration, inspection, conversion, training, and downstream work. It is not the normal inference target.

FAQ

Can I run this in stock llama.cpp?

No. This release line requires the custom AEON ik-llama fork built for this RYS model. Stock llama.cpp and default upstream ik-llama are not the supported runtime path. The custom fork includes the model-specific RYS/Qwen3.6 handling, graph-split work, Jinja/DeepSeek formatting support, and speed refinements used for the tested release.

Does SignalLatch make the model a solved problem?

No. It improved practical coding-agent reliability in the tested sweep, but variance remains. The claim is practical and bounded.

Where did the RYS idea come from?

For broader RYS research context, see David Noel Ng's article LLM Neuroanatomy II. This page documents this release line specifically.