
Release/history

Every notable change, in order, with PR numbers. Format follows Keep a Changelog; versions follow SemVer.

Latest: v0.16.0
Released: 2026-05-22
Releases shown: 12
/timeline descending · newest first
v0.16.0
2026-05-22
Latest
Added
  • Auto-fetch available models during remote engine model selection (#134): when the wizard's step 4 model picker runs against a remote engine endpoint (Ollama, llama.cpp, or vLLM), it now calls the remote API to list available models instead of showing a local-only picker. This removes the "model not found" guesswork for remote setups — you see exactly what the remote server has installed.
  • Smart remote endpoint URL scheme detection (#134): if the user enters a bare IP or hostname (e.g. 192.168.1.100:11434) at the local-vs-remote wizard prompt, the step now auto-prepends http:// and appends the engine's default port when it is missing, so inputs like gpu-box.local or 192.168.1.100:8000 produce a valid URL instead of a confusing connection error.
  • Test coverage for the remote model-fetch and URL normalization (#134): new unit tests cover the auto-fetch path (probe_remote_models), the URL-scheme normalizer (_normalize_url), the VLLM_BASE_URL env-key extraction for remote vLLM, and the error-handling boundaries (connection refused, 404, JSON parse failure).
  • Interactive local-vs-remote prompt during engine selection (#122): the wizard now asks whether the chosen engine is local on this machine or a remote endpoint. Selecting remote prompts for the base URL (and, for vllm, an API key), stores the value in the engine's *_BASE_URL env var inside the helper script, and skips the local install/launch path entirely.
  • Test coverage for the interactive remote-engine wizard path (#125): new unit and integration tests exercise the local-vs-remote prompt, env-keyfile materialization with chmod 0600, and the remote branching in healthcheck, info, and start_server for llamacpp.
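
The URL normalization described in #134 can be sketched roughly as follows. This is a minimal illustration, not the project's _normalize_url: the function name normalize_url, the DEFAULT_PORTS table, and the choice to leave URLs that already carry a scheme untouched are assumptions. The port values are the upstream defaults for each engine and may differ from what ccl uses.

```python
from urllib.parse import urlsplit

# Assumed defaults for illustration; the project's actual table may differ.
DEFAULT_PORTS = {"ollama": 11434, "llamacpp": 8080, "vllm": 8000}


def normalize_url(raw: str, engine: str) -> str:
    """Turn a bare host or host:port into a full http:// URL."""
    text = raw.strip().rstrip("/")
    if "://" in text:
        return text  # already a full URL: keep what the user typed
    parts = urlsplit("http://" + text)
    if not parts.hostname:
        raise ValueError(f"no host found in {raw!r}")
    # Only fall back to the engine default when no port was given.
    port = parts.port or DEFAULT_PORTS[engine]
    return f"http://{parts.hostname}:{port}{parts.path}"
```

With these assumptions, normalize_url("gpu-box.local", "ollama") yields http://gpu-box.local:11434, while an input that already includes a port keeps it.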
Fixed
  • llamacpp remote-mode branching (#123): the llama.cpp helper script, healthcheck, info, and start_server no longer assume a local llama-server binary when LLAMACPP_BASE_URL points at a remote endpoint. Remote endpoints now skip binary discovery, model-file checks, and the spawn path; healthcheck targets the remote URL directly.
  • Rename Pi local shortcut cp to ccp (#120): the wizard-installed cp alias shadowed the standard POSIX copy command. The Pi local helper script and short alias are now ccp (long alias pi-local is unchanged). Re-running ccl setup migrates existing installs automatically.
Changed
  • llamacpp-tuner skill no-ops cleanly when llamacpp is remote (#124): the tuner detects a remote LLAMACPP_BASE_URL and exits with a friendly message rather than attempting to introspect a non-existent local binary.
  • README and wizard walkthrough lead with the interactive remote-engine flow (#126): the quickstart and wizard documentation now show the local-vs-remote prompt and remote-endpoint setup as the primary path.
v0.14.0
2026-05-20
Added
  • MTP (Multi-Token Prediction) support for llama.cpp (#102, #103): llama-server auto-launches with --spec-type draft-mtp --spec-draft-n-max 5 for MTP variants. Detection runs two passes: a GGUF metadata probe (architecture-specific *.mtp.* keys, *.nextn_predict_layers cross-arch convention, or MTP in general.name/general.architecture) and a filename fallback (*mtp*, word-bounded, case-insensitive). Env vars LLAMACPP_MTP_ENABLED=0/1 and LLAMACPP_SPEC_DRAFT_N_MAX=N (range 1–16) override. Conflict guard recognizes --mmproj and -np/--parallel > 1 and disables MTP with a warning; out-of-range values surface as notes entries on the MTP result.
  • llamacpp-tuner skill: new Claude Code skill that helps optimize llama.cpp server configuration for coding-agent workloads. Includes a benchmark agent and configuration profiles.
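
The filename fallback and the LLAMACPP_SPEC_DRAFT_N_MAX range check from the MTP entry could look something like the sketch below. The function names are hypothetical, and two behaviors are assumptions rather than documented facts: "word-bounded" is read as "not adjacent to another letter or digit", and an out-of-range value falls back to the default of 5 while recording a note.

```python
import re

# Assumption: "word-bounded" means mtp is not glued to other letters or digits,
# so separators such as "-", "_" and "." all count as boundaries.
_MTP_NAME = re.compile(r"(?<![A-Za-z0-9])mtp(?![A-Za-z0-9])", re.IGNORECASE)
DEFAULT_DRAFT_N_MAX = 5  # mirrors --spec-draft-n-max 5 from the entry above


def filename_looks_mtp(filename: str) -> bool:
    """Case-insensitive check for a standalone 'mtp' token in a filename."""
    return _MTP_NAME.search(filename) is not None


def resolve_draft_n_max(env: dict[str, str]) -> tuple[int, list[str]]:
    """Return (value, notes); invalid or out-of-range input keeps the default."""
    notes: list[str] = []
    raw = env.get("LLAMACPP_SPEC_DRAFT_N_MAX")
    if raw is None:
        return DEFAULT_DRAFT_N_MAX, notes
    try:
        value = int(raw)
    except ValueError:
        value = None
    if value is None or not 1 <= value <= 16:
        notes.append(
            f"LLAMACPP_SPEC_DRAFT_N_MAX={raw!r} is outside 1-16; "
            f"using {DEFAULT_DRAFT_N_MAX}"
        )
        return DEFAULT_DRAFT_N_MAX, notes
    return value, notes
```

Under this reading, model_mtp.gguf matches but smtp-relay.gguf does not, which is the point of bounding the match.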
Fixed
  • Resolve mypy and ruff failures from v0.13.1 CI.
  • Apply ruff format and rename ambiguous l identifier in bench_agent.py.
Chore
  • Level up llamacpp-tuner skill to A grade.
v0.13.1
2026-05-19
Added
  • ccl status command (#98, #101): new top-level subcommand that prints the current ccl setup and shortcut availability. Lists all 9 shortcuts (3 harnesses × 3 engine types) with aliases, selected model, engine name and live status, and an availability column (available / unavailable / unconfigured); follows with an overall setup summary. Engine health is checked via each adapter's healthcheck.
Fixed
  • ccl status consistency: per-harness inference so cx no longer inherits cc's engine; local availability requires both helper script and installed engine; router shortcuts (cc9 / cx9 / cp9, cco / cxo / cpo) need a wired-up alias before claiming available.
  • Default harness/engine/model only show wizard-state values; no longer fabricated from a single detected script.
  • Replace the buggy fence_tag.replace("9","").replace("o","") (which turned codex into cdex) with explicit lookup tables; fix _infer_engine_from_script's return annotation to admit None.
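
To see why the chained replace was a bug, and what an explicit lookup table buys, consider this small sketch. The SHORTCUTS table and the function names are hypothetical; only the aliases themselves and the replace expression come from the entries above, and the engine-type labels are assumed.

```python
from typing import Optional


def strip_router_suffix_buggy(tag: str) -> str:
    # Removes every "9" and every "o" anywhere in the string, not just a suffix.
    return tag.replace("9", "").replace("o", "")


# Hypothetical table: alias -> (harness, engine type). Labels are assumptions.
SHORTCUTS: dict[str, tuple[str, str]] = {
    "cc": ("claude", "local"),
    "cx": ("codex", "local"),
    "cp": ("pi", "local"),
    "cc9": ("claude", "9router"),
    "cx9": ("codex", "9router"),
    "cp9": ("pi", "9router"),
    "cco": ("claude", "openrouter"),
    "cxo": ("codex", "openrouter"),
    "cpo": ("pi", "openrouter"),
}


def lookup_shortcut(alias: str) -> Optional[tuple[str, str]]:
    """Explicit lookup; unknown aliases return None instead of a mangled name."""
    return SHORTCUTS.get(alias)
```

A table is more verbose than string surgery, but every mapping is visible and an unknown alias fails loudly as None rather than silently producing cdex.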
Tests · Chore
  • Add 18 unit tests pinning each cross-row / cross-summary inconsistency and the helper-inference regexes (llamacpp via --model / ANTHROPIC_CUSTOM_MODEL_OPTION, ollama via pi --provider ccl-ollama).
  • Ignore .gstack/ workspace directory.
v0.13.0
2026-05-19
Added
  • Cross-harness session bridge: ccl run auto-captures and auto-injects conversation context across Claude Code, Codex, and Pi (#62, #93)
    • Post-run capture from native session files into ~/.claude-codex-local/sessions/<harness>.jsonl
    • Pre-run injection (one-shot -p only): freshest other harness's transcript prepended as a [prior context, agent=…] block
    • Cwd-scoped, 7-day staleness cap, macOS symlink-aware path resolution
    • Opt out per-call with ccl run --no-context or globally via CCL_SESSION_BRIDGE=0
    • ccl session command group (list / show / sync / truncate / clear)
    • Best-effort token redaction (OpenAI, AWS, GitHub, Slack, GitLab, Google API) on every import
  • ccl run --native-params flag (#97, #99): forward everything after --native-params -- verbatim to the launched harness — escape hatch for options ccl does not wrap first-class (e.g. --dangerously-skip-permissions)
  • Wizard llmfit fallback (#95, #100): when step 1 hardware scan is deferred, opportunistically run llmfit and persist the result to the machine-profile cache
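
Best-effort token redaction of the kind listed under the session bridge is typically a pass of regular expressions over each imported transcript. The sketch below is an illustration only: the patterns are commonly seen token prefixes chosen for this example, not the project's actual list, and the redact function and [REDACTED] placeholder are hypothetical.

```python
import re

# Assumed patterns based on well-known token prefixes; a real list will differ.
_TOKEN_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),       # OpenAI-style secret keys
    re.compile(r"AKIA[0-9A-Z]{16}"),            # AWS access key IDs
    re.compile(r"gh[pousr]_[A-Za-z0-9]{36,}"),  # GitHub tokens
    re.compile(r"xox[baprs]-[A-Za-z0-9-]{10,}"),  # Slack tokens
    re.compile(r"glpat-[A-Za-z0-9_-]{20,}"),    # GitLab personal access tokens
    re.compile(r"AIza[0-9A-Za-z_-]{35}"),       # Google API keys
]


def redact(text: str) -> str:
    """Replace anything that looks like a known token with a placeholder."""
    for pattern in _TOKEN_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Pattern matching like this is best-effort by nature: it catches recognizable prefixes and nothing else, so secrets in unfamiliar formats pass through unchanged.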
Documentation
  • README: rewrite Sharing Context Between Agents for the auto-bridge, scope guards, and interactive-capture / one-shot-inject asymmetry
  • README: theme-aware logo for dark/light mode rendering on GitHub
  • README: add PyPI download badges and experiment tag
  • docs.html: refresh ccl run and ccl session cards for the auto-bridge
v0.12.0
2026-05-16
Added
  • OpenRouter Integration: Add OpenRouter as a hosted-SaaS cloud-routing backend alongside 9router (#83)
    • New openrouter engine with OpenRouterAdapter mirroring the 9router shape
    • Helper scripts cco / cxo / cpo (Claude / Codex / Pi via OpenRouter)
    • Default model anthropic/claude-sonnet-4.6; override via CCL_OPENROUTER_MODEL
    • Deferred-secret API key storage (chmod 0600) reused from the 9router pattern
    • Smoke test sends a minimal request to the selected OpenRouter model
    • Doctor checks for OpenRouter key file mode, content, and model name validity
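
Storing an API key in a file readable only by its owner, and later checking that mode as a doctor check might, can be sketched as below. The function names and file layout are hypothetical, and the code assumes a POSIX system (os.fchmod is not available on Windows).

```python
import os
import stat
from pathlib import Path


def write_key_file(path: Path, secret: str) -> None:
    """Write the secret to path with mode 0600, creating parent directories."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # The mode passed to os.open only applies to newly created files and is
        # filtered by the umask, so set it explicitly as well.
        os.fchmod(fd, 0o600)
        os.write(fd, secret.encode() + b"\n")
    finally:
        os.close(fd)


def key_file_is_private(path: Path) -> bool:
    """True when only the owner can read and write the file."""
    return stat.S_IMODE(path.stat().st_mode) == 0o600
```

Opening with the restrictive mode first, rather than writing and then calling chmod, avoids a window in which the key sits on disk with default permissions.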
Fixed
  • OpenRouter smoke test: Now targets the selected OpenRouter model (#85)
Documentation
  • Update landing page release history for recent versions
v0.11.0
2026-05-16
Added
  • Pi Harness Support: Add Pi as a supported harness alongside Claude Code and Codex CLI, enabling model-agnostic terminal coding workflows (#59, #82)
    • Wire Pi into the wizard setup flow with dedicated configuration
    • Add cp alias for Pi + local model sessions
    • Support Pi-specific models.json configuration
    • Update documentation and guide generation for Pi workflows
v0.10.0
2026-05-10
Added
  • Non-interactive CLI: Add ccl run subcommand with -p/--prompt flag for scripted workflows (#70, #71)
  • vLLM Wizard Integration: Wire vLLM into the setup wizard as a selectable backend (#66)
  • 9router Auto-install: Wizard offers to install 9router via npm when selected (#67)
  • Machine Profile Caching: Cache machine specs to avoid re-scanning hardware, speeding up wizard startup (#58, #75)
  • llama.cpp Enhancements: 128k context support, reasoning model smoke test, ccl serve command with auto-restart (#60)
Fixed
  • Wizard Component Recheck: Recheck selected setup components after user modifications (#79, #81)
  • vLLM Detection: Now checks CLI installation rather than server reachability, preventing false positives (#78)
  • llama.cpp Model Matching: Fix HuggingFace tag matching using existing _llamacpp_models_match helper (#64)
  • Machine Profile Cache: Write in-process cache to the correct symbol (#77)
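
The vLLM detection fix (#78) separates "is the CLI installed" from "is a server answering". A minimal sketch of the installation check, assuming the executable is named vllm and using a hypothetical helper name:

```python
import shutil


def vllm_cli_installed(executable: str = "vllm") -> bool:
    """True when the CLI is on PATH, whether or not a server is running."""
    return shutil.which(executable) is not None
```

A PATH lookup answers only the installation question, so it avoids the false positives that come from probing a port that some unrelated server happens to occupy.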
Performance
  • Lazy llmfit Loading: Lazy init + cache-aware model picker, reducing unnecessary hardware scans (#79, #80)
Tests · Docs
  • Add comprehensive end-to-end test against a live vLLM server (#63)
  • Refresh documentation and brand assets for clarity and visual consistency (#76)
v0.9.0
2026-05-05
Added
  • 9router Integration: Add 9router as a cloud-routing backend (#51, #52)
    • Router9Adapter with smoke test support
    • Extend wizard with 9router setup flow and API key management
    • Support cc9/cx9 aliases alongside existing cc/cx
    • Fence-tag derivation and doctor checks for 9router
Fixed
  • Wizard now honors forced setup preferences (#51)
  • Update DeepSeek model hub paths
  • Fix step 2 install-hint loop to show 9router URL (#51)
Refactor
  • Refactor wizard _alias_block and _write_helper_script to use 4-way dispatch (#51)
  • Extend WireResult with raw_env field for deferred shell expressions (#51)
v0.8.3
2026-04-24
Fixed
  • Retire qwen2.5-coder 0.5b verified path; remove related claims from README, docs, model mapping, and static site (#49)
  • Restore bootstrap docs to point users to ccl find-model instead of a hardcoded tiny model path (#49)
v0.8.2
2026-04-20
Fixed
  • Wizard step IDs renumbered from 2.x (2.1–2.8) to sequential integers (1–8) for consistent progress indicators (#47)
  • Documentation updated to reflect new sequential step numbering (1–11)
  • E2E and unit tests updated to reference the new step IDs
v0.8.1
2026-04-17
Fixed
  • Machine specs table now shows real CPU/RAM/GPU values — wizard was reading llmfit system --json from the wrong nesting level (#46)
  • llmfit ranking now uses available RAM instead of total; Speed/Balanced/Quality picks match what fits on the host right now (#46)
  • Embedding and reranker models hidden from installed-models picker — they cannot serve as chat coding models (#46)
  • Step 4 model picker grouped with visual separators (Running / Suggested / Installed / Other) (#46)
v0.8.0
2026-04-17
Added
  • vLLM backend adapter with unit and e2e test coverage — joins Ollama, LM Studio, and llama.cpp as a first-class engine
  • Wizard detects an already-running llama-server and offers its active model as a pick
  • Wizard pre-populates the model picker with discovered + recommended models (#35, #36)
  • Wizard welcome banner shows installed version and repository URL (#37)
  • Live progress for model downloads (Ollama, LMS, HF CLI) with bytes/speed/ETA and a post-download summary; clean Ctrl-C abort (#39)
  • Fuzzy-search fallback for Hugging Face GGUF downloads via the Hub's search API (#38)
Fixed
  • Post-review polish for the fuzzy fallback and wizard flow (#45)
  • vLLM adapter type annotations and lint warnings cleared (mypy, ruff)
  • Removed a stray agent worktree gitlink that broke CI on fresh clones
earlier
pre-0.8.0

For versions prior to v0.8.0, see the complete changelog on GitHub.
