v0.8.3 · Open Source

Hit your limit? Need privacy?
Just swap the model.

One alias. Claude Code or Codex on a local model. Skills, agents, MCP servers — all intact.

Get Started Free · View on GitHub

MIT License · Python 3.10+ · Ollama · LM Studio · llama.cpp

bash — terminal

Launching Claude Code with gemma4:26b...

Claude Code v2.1.100

gemma4:26b with medium effort · API Usage Billing

~/myproject/my-app

❯

my-app | main [1] | gemma4:26b | 39e7a1ed-7086-41ab

-- INSERT -- ⏵⏵ accept edits on (shift+tab to cycle)

Sound Familiar?

You shouldn't have to stop
because the cloud did.

Developer time is expensive. Cloud limits are arbitrary. You deserve a local fallback that actually works.

⚡

Your quota ran out — mid-session

You're deep in a refactor, context loaded, momentum built. Then boom. Rate limit hit. Everything's gone. Back to square one.

🔒

Your code can't leave your machine

Consulting contracts. IP clauses. Air-gapped servers. You need local AI that's operationally viable — not a toy demo.

🔧

Every local setup guide breaks your workflow

New tool. New configs. New muscle memory. You don't want to learn something new. You just want a backend swap.

Solution

Meet claude-codex-local

Keep your tools. Swap the brain. Ship locally.

🧙

9-Step Guided Wizard

From zero to a working local session in under 10 minutes. Auto-detects Ollama, LM Studio, and llama.cpp. Resumes on failure. No manual file surgery.

🧠

Hardware-Aware Model Selection

llmfit analyzes your RAM and GPU. It recommends the model that actually fits your machine — no more guessing or OOM crashes.

🛡️

Zero Config Breakage

Your real ~/.claude and ~/.codex are never touched. All config stays isolated. Rollback in seconds — remove one folder.

⚡

One Alias to Rule Them All

After setup, type cc or cx. That's it. All your skills, agents, MCP servers, and statusline work exactly as before.

🔒

Offline & Private

Code never leaves your machine. No telemetry. No phone-home. After model download, zero internet required. Perfect for air-gapped projects.

✅

Proven Paths

Claude Code + Ollama + gemma4:26b verified end-to-end. llama.cpp with GGUF models from Hugging Face also supported. Real working combos, not hypothetical support.

How It Works

Three steps. Ten minutes.
Fully local.

Install

Run one install command: pip, uv, or the one-line curl installer. The wizard auto-detects your installed runtimes, checks your hardware, and flags anything missing.

Configure

Answer a few prompts. Pick your runtime, pick your model (or let llmfit pick for you). The wizard wires everything up safely.

Run

Type cc or cx. Your full Claude Code or Codex experience — locally. Every skill, every agent, every MCP server intact.

Get Started

One command to start

Available on PyPI — install with pip or uv, or use the one-line shell installer.

pip · recommended

$ pip install claude-codex-local

uv · alternative

$ uv tool install claude-codex-local

Or use the shell installer (no clone required)

bash <(curl -sSL https://raw.githubusercontent.com/luongnv89/claude-codex-local/main/install.sh)

Or install from source

git clone https://github.com/luongnv89/claude-codex-local.git
cd claude-codex-local
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
ccl

✓ Claude Code + Ollama + gemma4:26b

✓ Codex CLI + Ollama + gemma4:26b

⚠ Claude Code + LM Studio + Qwen3

✓ Any + llama.cpp + GGUF (from Hugging Face)

FAQ

Common questions

claude-codex-local does not replace Claude Code or Codex. It keeps both tools exactly as-is and adds a local backend bridge. Your harness, your skills, your muscle memory: all intact. One alias, zero disruption.

Supported runtimes are Ollama (primary, with native ollama launch support), LM Studio, and llama.cpp, all with auto-detection during setup. The wizard will find what you have installed. For llama.cpp, GGUF models can be downloaded directly from Hugging Face via the built-in huggingface-cli integration.

Setup never modifies your existing configuration. Your real ~/.claude and ~/.codex are untouched. All local config lives in .claude-codex-local/. To fully revert, remove that folder and the alias block added to your shell rc. Done.

Hardware requirements depend on the model. The wizard uses llmfit to analyze your RAM and GPU, then recommends models that actually fit. Smaller machines get smaller, faster models. Larger machines unlock more capable ones.

It works fully offline. After the initial model download (handled by your runtime, such as Ollama), no internet connection is required. There is no telemetry, no phone-home, and no license server. Fully air-gap capable.

Switching back to the cloud is instant. Just run the official claude or codex commands directly; they are completely unmodified. The cc / cx aliases are only for local sessions. Cloud and local coexist.

What's New

Changelog

Release history following Keep a Changelog and Semantic Versioning.

v0.8.1 2026-04-17 Latest

Fixed

✓ Machine specifications table now shows real CPU, RAM, and GPU values — wizard was reading llmfit system --json fields from the top level, but they are wrapped under a system key (#46)
✓ llmfit ranking now uses available RAM instead of total — Speed/Balanced/Quality picks match what will actually fit on the host right now (#46)
✓ Embedding and reranker models are hidden from the installed-models picker for Ollama and LM Studio — they cannot serve as chat coding models (#46)
✓ Step 4 (formerly 2.4) model picker is grouped with visual separators: Running server / Suggested by llmfit / Installed on this machine / Other (#46)
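
The first two fixes in this release can be pictured with a small sketch. Only the `system` wrapper key comes from the entry above. The field names `total_ram_gb` and `available_ram_gb` are hypothetical, since the actual `llmfit system --json` schema is not shown on this page.

```python
import json


def read_system_specs(raw_json: str) -> dict:
    """Return the hardware dict, unwrapping a top-level "system" key if present."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        return {}
    # The fields are wrapped under "system"; fall back to the top level so an
    # unwrapped payload still parses.
    system = data.get("system", data)
    return system if isinstance(system, dict) else {}


def ram_budget_gb(specs: dict) -> float:
    """Prefer available RAM over total RAM (both field names are hypothetical)."""
    available = specs.get("available_ram_gb")
    if available is not None:
        return float(available)
    return float(specs.get("total_ram_gb", 0.0))
```

Ranking against the available figure rather than the total is what keeps a recommendation honest on a machine that already has other workloads in memory.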

v0.8.0 2026-04-17

Added

+ vLLM backend adapter with unit and e2e test coverage — high-throughput inference engine now joins Ollama, LM Studio, and llama.cpp as a first-class engine option
+ Wizard detects an already-running llama-server and offers its active model as a pick, so you can keep your warm process instead of re-pulling a GGUF
+ Wizard pre-populates the model picker with models discovered on-host and recommendation profile picks (#35, #36)
+ Wizard welcome banner now shows the installed version and repository URL (#37)
+ Live progress for ollama pull, lms get, and Hugging Face CLI downloads, with a post-download summary and clean Ctrl-C abort (#39)
+ Fuzzy-search fallback for missing Hugging Face GGUF repos — wizard suggests up to 3 closest matches and re-prompts if none are found (#38)
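
One way to implement a "closest three matches" fallback like the one described above is with the standard library's `difflib`. This is a minimal sketch, not the wizard's actual code: the function name, the 0.6 similarity cutoff, and the idea of matching against a pre-fetched list of repo names are all assumptions.

```python
from difflib import get_close_matches


def suggest_repos(requested: str, known_repos: list[str], limit: int = 3) -> list[str]:
    """Return up to `limit` repo names closest to `requested`, best match first."""
    # Compare case-insensitively, but hand back the original spelling.
    lowered = {repo.lower(): repo for repo in known_repos}
    matches = get_close_matches(requested.lower(), list(lowered), n=limit, cutoff=0.6)
    return [lowered[m] for m in matches]
```

An empty result is the signal to re-prompt the user instead of offering suggestions.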

Fixed

✓ Post-review polish for the fuzzy fallback and KI wizard flow (#45)
✓ vLLM adapter type annotations and lint warnings cleared under mypy and ruff
✓ Removed a stray agent worktree gitlink that broke CI on fresh clones

v0.7.0 2026-04-12

Added

+ Machine specifications table (CPU cores/name, RAM total/available, GPU details) displayed during environment discovery step (#31)
+ Comprehensive e2e test suite covering all ccl CLI commands: setup, doctor, find-model, and their flags — 26 tests total (#29, #32)

Fixed

✓ --resume and --non-interactive flags are now available at the top-level ccl parser, so ccl --resume works without specifying the setup subcommand explicitly (#28, #30)
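
For readers curious how flags end up usable without a subcommand, here is a rough `argparse` sketch. It is illustrative only: the flag and subcommand names come from this changelog, but the structure, the `build_parser` and `parse` helpers, and the fallback to `setup` are assumptions about how such a parser could be arranged.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ccl")
    # Declared on the top-level parser so `ccl --resume` needs no subcommand.
    parser.add_argument("--resume", action="store_true")
    parser.add_argument("--non-interactive", action="store_true")
    sub = parser.add_subparsers(dest="command")
    for name in ("setup", "doctor", "find-model"):
        sub.add_parser(name)
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # Assumption: with no subcommand given, fall back to the setup wizard.
    if args.command is None:
        args.command = "setup"
    return args
```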

v0.6.0 2026-04-11

Added

+ ASCII 3D welcome banner with project tagline displayed at wizard startup (#23, #25)

Fixed

✓ HuggingFace CLI detection now checks both hf (modern) and huggingface-cli (legacy) binary names (#21, #22)
✓ llmfit check is now optional — environment discovery no longer requires it; the wizard prompts to install it only when model selection is requested (#24, #26)
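
Checking for both binary names can be done with `shutil.which`. The sketch below is an assumption about the shape of such a check, not the project's code; the `find_hf_cli` name and the injectable `which` parameter (added so the lookup is easy to test) are hypothetical.

```python
import shutil
from typing import Callable, Optional

# Binary names from the entry above: modern name first, then legacy.
HF_CLI_CANDIDATES = ("hf", "huggingface-cli")


def find_hf_cli(which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the path of the first Hugging Face CLI found on PATH, or None."""
    for name in HF_CLI_CANDIDATES:
        path = which(name)
        if path:
            return path
    return None
```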

v0.5.0 2026-04-11

Changed · Breaking

~ Single canonical CLI binary. The package now installs one entry point, ccl, replacing claude-codex-local and ccl-bridge. Command tree unchanged — ccl setup, ccl doctor, ccl find-model (#20)
~ Internal rename: claude_codex_local.bridge → claude_codex_local.core. Anyone importing the module directly must switch to the new path

Removed

✗ ccl-bridge debug binary — its subcommands (profile, recommend, doctor, adapters) remain reachable via python -m claude_codex_local.core <cmd>
✗ Legacy bin/claude-codex-local bash wrapper and the top-level wizard.py duplicate — both predated the installable package

Added

+ Top-level ccl --version and new global flags --no-color (honors NO_COLOR), --verbose, --quiet
+ install.sh now runs pip install -e . so the ccl entry point lands in the virtualenv automatically
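
A flag that "honors NO_COLOR" typically combines the command-line switch with the environment variable. The following is a hedged sketch of that combination; the `color_enabled` helper is hypothetical, and the rule that any non-empty `NO_COLOR` value disables color follows the published NO_COLOR convention rather than anything stated on this page.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def color_enabled(argv: Sequence[str], env: Optional[Mapping[str, str]] = None) -> bool:
    """Return False if --no-color was passed or NO_COLOR is set to a non-empty value."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="ccl", add_help=False)
    parser.add_argument("--no-color", action="store_true")
    # Ignore any other arguments; this sketch only cares about the color flag.
    args, _unknown = parser.parse_known_args(list(argv))
    if env.get("NO_COLOR"):
        return False
    return not args.no_color
```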

v0.4.0 2026-04-11

Added

+ Smoke test now reports model speed in tokens/second so you can gauge throughput before you commit to a model (#17)
+ Per-harness alias fences — cc and cx can now coexist in the same shell rc file, each with its own idempotent fenced block (#16)
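
An idempotent, per-harness fenced block can be sketched as follows. The marker strings and the `upsert_fenced_block` helper are hypothetical; the real fence format and alias bodies written by the installer are not shown on this page.

```python
import re


def upsert_fenced_block(rc_text: str, harness: str, body: str) -> str:
    """Insert or replace the fenced block for one harness; other text is untouched."""
    # Hypothetical markers: one distinct pair per harness so cc and cx never collide.
    begin = f"# >>> claude-codex-local:{harness} >>>"
    end = f"# <<< claude-codex-local:{harness} <<<"
    block = f"{begin}\n{body.rstrip()}\n{end}\n"
    pattern = re.compile(re.escape(begin) + r".*?" + re.escape(end) + r"\n?", re.DOTALL)
    if pattern.search(rc_text):
        # Replace in place; a callable avoids backslash handling in the body.
        return pattern.sub(lambda _m: block, rc_text, count=1)
    if rc_text and not rc_text.endswith("\n"):
        rc_text += "\n"
    return rc_text + block
```

Running it twice with the same arguments returns the same text, which is what makes re-running setup safe. Reverting is the mirror image: delete everything between a harness's two markers.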

v0.3.0 2026-04-11

Added

+ llama.cpp backend adapter with llama-server integration and GGUF model support from Hugging Face
+ Docker-based e2e test suite covering pip, uv, source, and extras install scenarios
+ pip install .[dev] optional extras group (pytest, ruff, mypy, bandit, detect-secrets, pre-commit)
+ GitHub Pages landing page with brand refresh and two-column hero layout

Fixed

✓ Empty array expansion in run_e2e_docker.sh under set -u

v0.2.0 2026-04-10

Added

+ One-command remote installer (install.sh) — no clone required
+ ollama launch integration as primary engine path
+ Shell alias installer with idempotent fenced block in ~/.zshrc / ~/.bashrc
+ Personalized guide.md generation after wizard completes
+ --resume flag to pick up after a failed wizard step
+ --non-interactive flag for CI-friendly setup
+ find-model subcommand for standalone llmfit recommendations
+ Installable Python package structure for PyPI distribution

Changed

~ Wizard now uses ollama launch instead of isolated HOME and variant builder
~ LM Studio support moved to secondary/fallback path

Fixed

✓ Shell alias block replaced idempotently on re-run (no more duplicates)
✓ Users reminded to source ~/.zshrc before first cc/cx run

v0.1.0 2026-04-01 Initial release

Added

+ Initial proof-of-concept: interactive wizard (8 steps)
+ Harness support: Claude Code, Codex CLI
+ Engine support: Ollama, LM Studio, llama.cpp
+ llmfit integration for hardware-aware model selection
+ Pre-commit hooks: ruff, mypy, bandit, detect-secrets
+ pytest test suite with @pytest.mark.local marker for integration tests

View full changelog on GitHub →

Open Source · Free Forever

Ready to run locally?

Free, open-source, MIT licensed. Takes under 10 minutes.

Start My Local Setup · View on GitHub
