claude-codex-local

┌─ /problem sound familiar?

Sound familiar?

01 You shouldn't have to stop because the cloud did.

Developer time is expensive. Cloud limits are arbitrary. You deserve a backend swap that actually works — local, alternative cloud, or hosted SaaS.

01 · QUOTA

Your quota ran out — mid-session

You're deep in a refactor, context loaded, momentum built. Then — rate limit. Everything gone. Back to square one.

02 · PRIVACY

Your code can't leave your machine

Consulting contracts. IP clauses. Air-gapped servers. You need a backend that respects your constraints.

03 · FRICTION

Every alternative breaks your workflow

New tool. New configs. New muscle memory. You didn't want to learn something new. You just wanted a backend swap.

├─ /solution keep your tools · swap the backend

Solution

02 Meet claude-codex-local

Keep your tools. Swap the backend. Ship locally or route to cloud. Every skill, agent, MCP server intact.

01 · WIZARD

9-step guided setup

From zero to a working local session in under 10 minutes. Auto-detects Ollama, LM Studio, and llama.cpp. Resumes on failure. No manual file surgery.

02 · LLMFIT

Hardware-aware model picking

llmfit analyzes RAM and GPU. Recommends models that actually fit — no more OOM crashes or guessing.

03 · ISOLATED

Zero config breakage

Your real ~/.claude and ~/.codex are never touched. All config isolated. Rollback in seconds.

04 · ALIASES

One alias to rule them all

Type cc/cx for local, cc9/cx9 for 9router, cco/cxo for OpenRouter. Skills, agents, MCP — intact.

05 · OFFLINE

Offline & private (local mode)

Code never leaves your machine. No telemetry. No phone-home. After model download, zero internet. Air-gap ready.

06 · ROUTER

Cloud routing (9router)

Need cloud power without Anthropic limits? 9router routes to alternative providers — OpenAI, DeepSeek, more — while keeping your harness intact.

07 · SAAS

Hosted SaaS (OpenRouter)

Want a hosted alternative? OpenRouter forwards calls to dozens of cloud models. Use cco/cxo/cpo aliases; local cc/cx/ccp untouched.

08 · MTP

llama.cpp MTP auto-detect NEW

Drop in a Multi-Token Prediction GGUF (Qwen, GLM, DeepSeek MTP variants) and llama-server auto-launches with --spec-type draft-mtp for free decode speedups. GGUF metadata probe + filename fallback; LLAMACPP_MTP_ENABLED and LLAMACPP_SPEC_DRAFT_N_MAX let you override.
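The filename fallback and the two override variables can be pictured with a small shell sketch. It is an illustration only: the `looks_like_mtp` helper, its matching rule (the substring "mtp" in the file name), and the example model path are hypothetical, and the values assigned to `LLAMACPP_MTP_ENABLED` and `LLAMACPP_SPEC_DRAFT_N_MAX` are assumptions rather than documented defaults.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a filename-based MTP check. The real detection
# probes GGUF metadata first and may apply different rules.
looks_like_mtp() {
  local name
  # Lowercase the base name so the match is case-insensitive.
  name=$(basename "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *mtp*.gguf) return 0 ;;
    *)          return 1 ;;
  esac
}

# Assumed override values: the accepted formats are not stated on this page.
export LLAMACPP_MTP_ENABLED=1
export LLAMACPP_SPEC_DRAFT_N_MAX=4

# Example path is made up for the demo.
if looks_like_mtp "models/Qwen-MTP-Q4_K_M.gguf"; then
  echo "MTP variant detected"
fi
```

A metadata probe is the more reliable signal, so a name-based check like this only makes sense as the last resort the text describes.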

09 · TUNER

llamacpp-tuner skill NEW

A Claude Code skill that benchmarks coding-agent workloads against your llama.cpp setup and recommends server flag profiles. Ship a faster local backend without hand-tuning.

├─ /how-it-works 3 steps · 10 minutes

How it works

03 Three steps. Ten minutes.

Local or cloud routing — the wizard wires either path safely.

01 STEP

Install

Run one curl command. The wizard auto-detects your runtimes, checks your hardware, and flags anything missing.

02 STEP

Configure

Answer a few prompts. Pick your runtime, pick your model — or let llmfit pick for you. The wizard wires everything up safely.

03 STEP

Run

Type cc or cx for local. cc9/cx9 for 9router. cco/cxo for OpenRouter. Every skill, every agent, every MCP server intact.

├─ /model-guide 8GB → 192GB · find your fit

Model guide

04 Which coding model fits your memory?

From 8GB to 128GB+ — a hardware-tiered map of Qwen2.5-Coder, DeepSeek-Coder, CodeLlama, and more.

View model selection guide

├─ /install pip · uv · curl · source

Get started

05 One command to start

Available on PyPI. Pip, uv, or a one-line shell installer.

01 pip

recommended

$ pip install claude-codex-local

02 uv

alternative

$ uv tool install claude-codex-local

03 curl · no clone required

shell

$ bash <(curl -sSL https://raw.githubusercontent.com/luongnv89/claude-codex-local/main/install.sh)

04 from source

advanced

$ git clone https://github.com/luongnv89/claude-codex-local.git
$ cd claude-codex-local
$ python3 -m venv .venv && source .venv/bin/activate
$ pip install -e .
$ ccl

Proven paths [7 combinations]

✓ Claude Code + Ollama + gemma4:26b
✓ Codex CLI + Ollama + gemma4:26b
✓ Claude Code + 9router + OpenAI/DeepSeek
✓ Claude Code + OpenRouter + claude-sonnet-4.6
⚠ Claude Code + LM Studio + Qwen3
✓ Any + llama.cpp + GGUF (Hugging Face)
✓ Any + llama.cpp + MTP variant (Qwen / GLM / DeepSeek)

├─ /faq 6 questions

FAQ

06 Common questions

The questions that actually show up in issues, ordered by frequency.

No. It keeps both tools exactly as-is and adds a local backend bridge. Your harness, your skills, your muscle memory — all intact. One alias, zero disruption.

Local engines: Ollama (primary, with native ollama launch), LM Studio, llama.cpp (with MTP auto-detect for Qwen/GLM/DeepSeek MTP variants — auto-applies --spec-type draft-mtp for faster decode), vLLM — all auto-detected during setup. For llama.cpp, GGUF models download directly from Hugging Face via the built-in huggingface-cli integration.

Cloud routing: 9router — a local cloud-routing proxy for alternative providers (OpenAI, DeepSeek, etc.). The wizard wires cc9/cx9 aliases alongside your existing local ones.

Hosted SaaS: OpenRouter — a hosted alternative with an OpenAI-compatible API at openrouter.ai/api/v1. No daemon, just an API key. Adds cco/cxo/cpo aliases.
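Because the OpenRouter endpoint is OpenAI-compatible, listing the models it exposes comes down to one authenticated GET. The sketch below assumes the API key lives in an `OPENROUTER_API_KEY` environment variable and uses a hypothetical `list_openrouter_models` helper. It defaults to a dry run that only prints the request, so nothing is sent unless you opt in.

```bash
#!/usr/bin/env bash
# Base URL taken from the text above; the variable name is an assumption.
OPENROUTER_BASE_URL="https://openrouter.ai/api/v1"

# Print (or, with DRY_RUN=0, send) the request that lists available models.
list_openrouter_models() {
  local url="${OPENROUTER_BASE_URL}/models"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run: show the command with a placeholder instead of the real key.
    echo "curl -sS -H 'Authorization: Bearer <key>' ${url}"
  else
    # Live call: needs network access and OPENROUTER_API_KEY to be set.
    curl -sS -H "Authorization: Bearer ${OPENROUTER_API_KEY:?set OPENROUTER_API_KEY}" "$url"
  fi
}

list_openrouter_models
```

Running it with `DRY_RUN=0` performs the real request. This is a manual check only, not a description of what the wizard itself runs.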

Never. Your real ~/.claude and ~/.codex are untouched. All local config lives in .claude-codex-local/. To fully revert: remove that folder and the alias block added to your shell rc. Done.
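The two revert steps can be sketched as a small shell function. The marker comments that delimit the alias block are hypothetical here, as is the `revert_ccl` helper, so check how the block actually appears in your rc file before adapting this. The demo runs against throwaway files, not your real home directory.

```bash
#!/usr/bin/env bash
# Hypothetical markers: the real alias block may be delimited differently.
BLOCK_START="# >>> claude-codex-local >>>"
BLOCK_END="# <<< claude-codex-local <<<"

revert_ccl() {
  local state_dir="$1" rc_file="$2" tmp
  # Step 1: remove the isolated config folder.
  rm -rf -- "$state_dir"
  # Step 2: drop every line from the start marker through the end marker.
  tmp=$(mktemp)
  sed "/^${BLOCK_START}\$/,/^${BLOCK_END}\$/d" "$rc_file" > "$tmp"
  # Write back in place so the rc file keeps its permissions.
  cat "$tmp" > "$rc_file"
  rm -f -- "$tmp"
}

# Demo on temporary files only.
demo=$(mktemp -d)
mkdir "$demo/.claude-codex-local"
printf '%s\n' 'export EDITOR=vim' "$BLOCK_START" "alias cc='...'" \
  "$BLOCK_END" 'export PAGER=less' > "$demo/rc"
revert_ccl "$demo/.claude-codex-local" "$demo/rc"
cat "$demo/rc"
```

Lines outside the markers are left alone, which is the point of keeping the aliases in a single delimited block.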

It depends on the model. The wizard uses llmfit to analyze your RAM and GPU, then recommends models that actually fit. Smaller machines get smaller, faster models. Larger machines unlock more capable ones.

Yes, for local engines. After the initial model download (handled by Ollama et al.), no internet connection is required. No telemetry, no phone-home, no license server. Fully air-gap capable. Note: 9router and OpenRouter route to cloud and need internet.

Instantly. The official claude and codex commands are completely unmodified. The cc/cx aliases are only for local sessions. Cloud and local coexist.

├─ /whats-new latest release

What's new

07 Latest release

Following Keep a Changelog and SemVer.

v0.16.0 · 2026-05-22 · Latest

Added

Auto-fetch available models during remote engine model selection (#134): the wizard's step 4 model picker now calls the remote API to list models instead of showing a local-only picker.
Smart remote endpoint URL scheme detection (#134): bare IPs/hostnames get http:// and default port auto-appended.
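The endpoint normalization described above can be approximated in a few lines of shell. This is a sketch under stated assumptions, not the project's code: the `normalize_endpoint` helper is hypothetical, the default port of 11434 (Ollama's usual port) is a guess at what "default port" means, inputs that already carry a scheme are returned unchanged, and IPv6 literals are not handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: add http:// and a default port to bare hosts.
normalize_endpoint() {
  local input="$1" default_port="${2:-11434}"   # assumed default port
  case "$input" in
    *://*) printf '%s\n' "$input"; return ;;    # scheme present: leave as-is
  esac
  local host="${input%%/*}"       # everything before the first slash
  local rest="${input#"$host"}"   # the path, if any
  case "$host" in
    *:*) ;;                                     # port already given
    *)   host="${host}:${default_port}" ;;      # append the default port
  esac
  printf 'http://%s%s\n' "$host" "$rest"
}

normalize_endpoint 192.168.1.10
normalize_endpoint gpu-box:8080
normalize_endpoint https://example.com/v1
```

Passing a second argument overrides the assumed port, for example `normalize_endpoint gpu-box 8000`.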

Fixed

Rename Pi local shortcut cp → ccp (#120): cp alias no longer shadows the POSIX copy command.

View full changelog →

Open source · free forever

Ready to swap the backend?

Local model, alternative cloud, or hosted SaaS — same harness, one alias. Free, open-source, MIT licensed. Under 10 minutes.


Hit your limit? Need privacy? Just swap the backend.
