| title | document_type | status | created | updated | tags |
|---|---|---|---|---|---|
| Developing with Claude — Getting Started | how-to | active | 2026-04-01 | 2026-04-01 | |
# Developing with Claude — Getting Started
Run AI coding agents safely on your own machine. Works on a laptop. Grows as far as you want to take it.
## The Problem
You have been using Claude in a web browser. It works — you ask a question, you get an answer, you build something together. Then you close the tab. The next time you open it, Claude has no idea what you were working on. Every session starts from zero.
If you have tried anything ambitious — a multi-day project, infrastructure work, a codebase that evolves — you have hit this wall. The context dies with the session. You end up re-explaining your setup, re-describing your architecture, re-pasting the same files. The tool is powerful but it has no memory.
This guide describes a different model. One where context survives.
If you already work with Docker, Git, and infrastructure-as-code, you know most of the building blocks here. What is different is the memory model — skip to Step 1 to start building, or read the next section for why this approach works.
## The Idea
In a healthy forest, trees are not isolated. Beneath the soil runs a mycelium network — a living mesh that stores signals, routes nutrients, and lets organisms share information without any single node holding everything. No single tree holds the full picture. The network does.
Now think about how a company works. There is a mailroom that receives and routes incoming material. There is a reception desk that directs visitors to the right department. Department heads make decisions within their scope. An executive team sets direction. And a managing director signs off on anything that leaves the building. Every person has a defined role, a clear boundary, and a reporting line. Nobody freelances outside their area. Every significant decision is logged.
This is the same structure. The Git repository is the office building — it holds the rules, the roles, the decisions, and the institutional memory. The AI agents are the staff — each one has a defined scope, works within boundaries, and writes everything back to the shared record. The human operator is the managing director — nothing leaves the building without their sign-off.
This is not just a metaphor. In engineering, it is called a digital twin — a virtual replica of a real-world system that stays synchronised with it. NASA built physical twins of spacecraft in the 1960s so engineers could simulate problems from the ground before they became emergencies in orbit. Today, companies build digital twins of their entire operations in software — every process, every role, every decision rule exists as code that can be versioned, tested, and rebuilt from scratch.
That is what the Git repo is. It is the digital twin of your development operation. When something changes — a new tool is added, a decision is made, a process is refined — it is written into the repo. When a new session starts, the agent reads the repo and picks up exactly where the last one left off. Nothing lives in chat history. Nothing lives in someone's head. The repo is the single source of truth, and everything else grows from it.
## Core Principles
These are the design decisions that make the system safe and portable.
1. The repo is the memory. Instructions, state, decisions, plans, and handoff notes all live in Git. Agents read the repo before starting work. If a session ends and a new one begins, nothing is lost — it is all in the repo.
2. Secrets stay on the host. API keys, SSH keys, vault passwords, and tokens never go into the repo or the container. They live on your machine and are passed in at runtime. The container is safe to rebuild without worrying about credential exposure.
3. Production actions run on the host, not in the container. Infrastructure commands (Ansible runs, SSH to servers, DNS changes) run in your terminal — not from inside the AI workspace. The container is powerful within its boundary; it does not have unlimited production reach. You are always the gate.
4. Persistence is designed, not assumed. Tool authentication and configuration are stored in named Docker volumes rather than the container's writable layer. A bootstrap script verifies that everything is present after rebuilds. Nothing relies on the container surviving.
5. Start simple. Grow deliberately. The baseline is a laptop and a repo. Everything else — a VPS, a home server, local AI models, a monitoring stack — is an optional layer added when you actually need it. Most of the value is available from day one on modest hardware.
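
To make principle 2 concrete, here is a minimal sketch of runtime secret injection. The key file path and image name are placeholders, not a prescribed layout:

```bash
# Hypothetical example: the API key lives in a host-only file and is injected
# as an environment variable at container start. It never touches the repo,
# the Dockerfile, or the image layers, so the container stays safe to rebuild.
docker run -it --rm \
  -e ANTHROPIC_API_KEY="$(cat "$HOME/.secrets/anthropic.key")" \
  my-dev-image:latest
```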
## Part 1 — Get Running
Everything below this line gets you from zero to a working system. Five steps, one afternoon.
### What You Actually Need to Start
- A laptop or desktop (macOS, Linux, or Windows with WSL2)
- Docker
- Git
- A Claude account (for Claude Code)
That is it. No server. No GPU. No VPN.
### Step 1 — Understand Your Machine First
Before setting anything up, run hardware and environment discovery. This tells you what your machine can actually do and which setup path makes sense.
Paste this into Claude. Steps 1 and 2 work in any Claude interface — the web chat at claude.ai, the Claude desktop app, or Claude Code if you already have it. Claude Code is set up later in Step 3.
```
You are a workstation discovery agent.
Your job is to discover the real machine state before any architecture decisions are made.

If you have access to a terminal (e.g. Claude Code), run the commands directly.
Otherwise, give me the commands to run and I will paste the output back.

First ask me which OS I am using: macOS, Linux, or Windows.
Then give me only the commands for that OS.
Keep all commands read-only and safe. Do not recommend software yet.

Collect:
- OS and version
- CPU model and core count
- RAM
- GPU model and VRAM if present
- Storage free space
- Whether Docker, Git, and Python are already installed
- Whether the machine is suited to: cloud API only / cloud plus optional local LLM / serious local LLM use

For Apple Silicon: evaluate total unified memory as the practical local inference budget.
For discrete GPU: confirm VRAM and whether hardware acceleration is available for ML workloads.
If the machine is strong, say so clearly. If local inference is not realistic, say so clearly.

Output a Markdown report saved as machine-discovery.md.
```
Save the result as machine-discovery.md somewhere on your machine (outside the repo).
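If you want a feel for what the agent will ask for, the Linux version of this survey boils down to a handful of read-only commands like these (macOS and Windows use different tools, which the agent will supply):

```bash
# Read-only discovery on Linux. Nothing here changes system state.
uname -srm                             # kernel and architecture
lscpu | grep -E 'Model name|^CPU\(s\)' # CPU model and core count
free -h                                # RAM
lspci | grep -iE 'vga|3d'              # GPU, if any
df -h "$HOME"                          # free storage
docker --version; git --version; python3 --version
```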
### Step 2 — Design Your Setup
Now take that discovery report and use it to decide what to build.
Use Claude Opus for this step. Opus reasons more deeply about trade-offs and produces better plans than faster models. It burns more tokens — that is the trade-off. Use it when you are making architectural decisions. Use faster models for day-to-day work once the system is running.
In Claude Code, switch model with:
```
/model claude-opus-4-6
```

or use the `/fast` toggle to switch back when you want speed over depth.
Paste this into Claude Opus, along with the contents of machine-discovery.md:
```
You are a systems architect helping design a safe, reproducible AI development workspace.
I will give you a machine discovery report. Your job is to produce a practical setup plan for that exact machine.

Goals:
- A clean, bounded workspace where AI agents can work safely
- Git as the persistent memory and control plane
- Secrets kept outside the repo at all times
- Production actions kept outside the container
- A setup that can be rebuilt from scratch without losing anything important
- Local AI inference only if the hardware genuinely supports it

Review my discovery report and recommend:
1. The right container strategy for my machine (native + venv / devcontainer / Docker-first)
2. The folder layout I should use
3. Whether local LLM is realistic and which runtime to start with if so
4. The first three things to set up

Machine discovery report:
[paste machine-discovery.md here]
```
### Step 3 — Build the Baseline
Follow the plan Claude produces. The typical starting baseline:
- Create a workspace root folder (e.g. `~/AgentLab/` on Mac/Linux, `C:\AgentLab\` on Windows)
- Create subfolders: `your-repo/` (the Git repo), `work-rw/` (read-write scratch space for agent output), `drop-ro/` (read-only input — files you drop in for agents to read)
- Initialise a Git repo inside `your-repo/` and create a `README.md`
- Install Docker if not already present
- Set up a devcontainer or Docker Compose file based on the plan Claude produced
- Authenticate Claude Code inside the container
- Ask Claude to describe the repo structure and confirm it can navigate it
Do not add more until this works cleanly.
When it is working, your workspace should look like this:
```
~/AgentLab/
  your-repo/     ← Git repo — the persistent memory
    README.md
    CLAUDE.md    ← (created later, Step 5)
  work-rw/       ← Read-write scratch space for agent output
  drop-ro/       ← Read-only input — files you drop in for agents to read
```
The repo is version-controlled. The working folders are not. The container mounts all three and can read or write as appropriate.
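As a reference point, the Docker-first variant of this layout looks roughly like the sketch below. The image name and in-container paths are illustrative; your devcontainer or Compose file from Step 2 may differ:

```bash
# Repo and scratch space are read-write; drop-ro is mounted read-only.
# The named volume holds tool auth/config so it survives container rebuilds
# (principle 4); the path assumes the container user's home is /root.
docker run -it --rm \
  -v "$HOME/AgentLab/your-repo:/workspace/your-repo" \
  -v "$HOME/AgentLab/work-rw:/workspace/work-rw" \
  -v "$HOME/AgentLab/drop-ro:/workspace/drop-ro:ro" \
  -v claude-config:/root/.claude \
  my-dev-image:latest
```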
### Step 4 — Orient Claude to the Current Tooling
Claude's training data has a cutoff date. It does not know about recent releases, breaking API changes, or tools that have emerged in the last year. If you let it write code or configuration from memory, it will confidently use outdated method names, wrong import paths, and deprecated patterns. This is one of the most common sources of wasted time when working with AI agents.
The fix is simple: force Claude to fetch current documentation before writing anything.
Run this prompt once Claude is up and running in your repo:
```
Review the repo structure — read the README and any other docs present.
Then do the following:

1. Fetch the current Claude Code documentation from the official Anthropic docs site
2. Store a summary of the key capabilities, current model IDs, and any recent changes in docs/vendor-docs/anthropic/
3. Identify any repo rules or config that reference outdated model names, deprecated flags, or stale API patterns
4. Propose updates needed to bring the configuration in line with current Claude Code behaviour

After completing this, confirm what version of Claude Code is installed and whether it matches what the docs describe.
```
This seeds the repo with verified, current documentation that Claude can reference instead of guessing.
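After that run, the relevant corner of the repo looks something like this (the file name is illustrative; use whatever Claude produces):

```
docs/
  vendor-docs/
    anthropic/
      claude-code-summary.md   ← capabilities, current model IDs, recent changes
```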
The general rule — enforce it every time:
Whenever you ask Claude to use any third-party library, API, CLI tool, or Docker image, tell it explicitly:
```
Before writing any code or configuration for [tool], fetch the current documentation.
Do not rely on your training data for API signatures, import paths, or config keys.
Store your findings in docs/vendor-docs/ before proceeding.
```
Claude will push back on this — it will say it already knows. It does not. Its training data is frozen. The tool you are using today may have had three major releases since Claude last saw it. The fetch takes thirty seconds. The debugging takes hours.
### Step 5 — Build the Foundation Documents
The repo is the memory — but right now it is empty. These prompts tell Claude to build the documents that make the system work across sessions. Use the scaffold prompt to create all four at once, or the individual prompts to take it one document at a time.
**Scaffold the full foundation in one pass:**

```
I have a new Git repo for AI-assisted development. I need you to create the
following foundation documents, asking me questions before writing each one:

1. README.md — what this repo is, current state, key links
2. ENTRYPOINT.md — agent session startup, read order, execution boundaries
3. AGENTS.md — what agents can and cannot do, branch model
4. docs/open-loops.md — persistent backlog of unresolved items

For each document: ask me the questions you need answered, write a first draft,
show it to me, and wait for approval before moving to the next one.
```
Or build them individually:
**Create your session entrypoint document**

```
I am setting up a Git repo as a control plane for AI-assisted development.
Create an ENTRYPOINT.md for this repo that:

- Lists the key documents an AI agent should read before starting work
- States what this repo is and what it is for
- States the execution boundaries (what runs in the container vs what runs on the host)
- States where secrets are kept and that they must never enter the repo

Ask me questions about my setup before writing anything.
```
**Create your agent operating contract**

```
Create an AGENTS.md for this repo that defines:

- What AI agents are permitted to do autonomously
- What requires explicit human approval before proceeding
- The Git branch model (who works on which branch)
- What agents must never do (commit secrets, push to main directly, etc.)

Ask me about my setup and which agents I am using before writing anything.
```
**Create your open loops document**

```
Create a docs/open-loops.md for this repo.

This file tracks unresolved items across sessions — things that are in progress,
decisions that have not been made, or work that is blocked. It should be read at
the start of every session so nothing falls through the cracks between conversations.

Start it with the current open items from our conversation so far.
```
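For orientation, a finished ENTRYPOINT.md often ends up with a shape like this. A hypothetical skeleton, not a template to copy verbatim:

```
# ENTRYPOINT.md: read this first, every session

## What this repo is
Control plane and persistent memory for AI-assisted development.

## Read order
1. README.md        (current state)
2. AGENTS.md        (what you may and may not do)
3. docs/open-loops.md (unresolved items)

## Boundaries
- Work happens in the container; production actions run on the host.
- Secrets live on the host only. Never write them into this repo.
```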
## Part 2 — Make It Better
The baseline is running. Everything below this line makes it more effective. None of it is required. Pick what is useful to you.
### Voice Input
Typing long prompts into a terminal is a friction point. Local voice-to-text removes it. Audio is transcribed on your machine and inserted as text into whatever is focused — your terminal, your editor, Claude Code — without going to a cloud transcription API.
This is genuinely useful once the system is running. It makes working with Claude feel closer to thinking out loud.
macOS: SuperWhisper runs locally via WhisperKit on the Apple Neural Engine. Audio is processed entirely on-device. System-wide hotkey, inserts into any focused app. Setup: open SuperWhisper → Modes → create a new mode → set type to Voice to Text → in the mode settings, change the voice model to Standard (the free model — select the highest-quality option that is not padlocked). Built-in Apple Dictation is a zero-setup fallback.
Windows: OmniDictate — free, open-source, uses faster-whisper as the backend. Push-to-talk via Right Shift (configurable), types directly into any focused Windows application. NVIDIA GPU with CUDA gives best performance; CPU-only mode works but is slower on larger models. Available as a .exe installer from the Releases page. Actively maintained — v2.0 shipped December 2025. Windows SmartScreen will warn on first run (unsigned binary); click through.
Linux: WhisperTux — free, open-source, uses whisper.cpp as the backend. Global keyboard shortcut for start/stop, auto-injects text into the focused application. Works without a GPU — whisper.cpp runs on plain x86 with AVX. GPU acceleration supported if available. Works on GNOME and KDE. Install via git clone + python3 setup.py which handles dependencies, model download, and service registration.
One practical note: voice dictation produces transcription artefacts — misheard words, missing punctuation, run-on sentences. Claude handles this well if you prompt it naturally. You do not need to clean up every word before submitting.
### Back Up Your Persistent Mounts
The container is disposable. The bind-mounted folders on your host are not — they contain everything that matters: your repo, your working files, your logs, your secrets. If your machine dies without a backup, all of it goes with it.
What needs backing up:
| Folder | What's in it |
|---|---|
| `~/AgentLab/your-repo/` | The entire repo — all docs, scripts, state, rules |
| `~/AgentLab/work-rw/` | Working files and scratch space written by agents |
| `~/.ssh/` | SSH keys — lose these and you are locked out of every server |
| Infrastructure secrets | Vault passwords, automation credentials — losing these makes encrypted secrets unrecoverable |
| Voice app model files | Usually in `~/Library/Application Support/<AppName>/` on Mac — 100 MB–1.5 GB, slow to re-download |
The git repo itself is partially protected by any remote (GitHub, Forgejo) — every push is a backup of committed work. Uncommitted work and non-git content (work-rw/, logs) need explicit backup coverage.
macOS — Time Machine:
The simplest full-machine backup. Enable it with an external drive. Check that /Users/Shared/ is included in the backup scope — it is sometimes excluded by default as a non-standard user directory. Go to System Settings → General → Time Machine → Options to verify.
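If you prefer the terminal to Settings, there is a quick check (paths here are examples):

```bash
# Prints whether each path is excluded from Time Machine backups.
tmutil isexcluded /Users/Shared
tmutil isexcluded ~/AgentLab
```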
macOS — offsite/incremental:
For the AgentLab workspace specifically, Restic pointed at your ~/AgentLab/ folder gives versioned encrypted backups to any remote target (S3, B2, SFTP, local NAS). If you already have a VPS with Restic running, adding your local workspace as a backup source is a natural extension.
Linux: Snapshot tooling varies by filesystem. ZFS and Btrfs both support snapshots natively. For a simple backup-to-remote, Restic works the same as on Mac. If you are running the workspace on a standard ext4 filesystem, use rsync or Restic to an external drive or remote target.
Windows (WSL2):
Your workspace lives inside the WSL2 filesystem, and Windows File History does not see WSL paths by default. Two options: run Restic inside WSL pointed at an external drive or cloud target, or copy the workspace out via its Windows-visible path (`\\wsl$\Ubuntu\home\...`) to a Windows-native location included in File History. The WSL approach is simpler and avoids permission mismatches.
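Restic works the same way on all three platforms. A minimal sketch, assuming an SFTP target; the repository URL, password file location, and retention policy are placeholders to adapt:

```bash
# Where the encrypted backups go, and the key that encrypts them.
# The password file lives on the host, outside the repo.
export RESTIC_REPOSITORY="sftp:backup@your-nas:/backups/agentlab"
export RESTIC_PASSWORD_FILE="$HOME/.config/restic/password"

restic init                                  # first run only
restic backup "$HOME/AgentLab" "$HOME/.ssh"  # versioned, encrypted snapshot
restic forget --keep-daily 7 --keep-weekly 4 --prune  # trim old snapshots
```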
Before any important session: Take a quick backup — or at minimum ensure Time Machine or your snapshot tool has run recently. Container rebuilds are safe; host-side data loss is not recoverable from a Dockerfile.
## Where This Can Go Next
Once the baseline is stable, these are natural next layers — each independent, each optional.
| Layer | What it adds | Requires |
|---|---|---|
| Private Git hosting | Self-hosted Forgejo instead of GitHub | A VPS or home server |
| Infrastructure monitoring | Prometheus + Grafana + Alertmanager | Any always-on machine — local to learn, VPS for reliability |
| Log aggregation | Loki + Grafana Alloy | Any always-on machine |
| Secure remote access | WireGuard VPN | A VPS |
| Local AI inference | Ollama + Open WebUI | GPU with adequate VRAM |
| Multi-agent workers | Codex + Gemini alongside Claude | Existing workspace |
| Mobile operator access | Phone → agent pipeline | VPS + additional tooling |
| Business operations layer | Ticketing, CRM, project management | Depends on scale |
None of these are required to get started. Each one is a decision made when you have a real reason to make it.
## How to Work With This System
A few things that will save you significant frustration.
Claude is not a search engine. Do not ask it what the current version of something is, whether a tool is still maintained, or what the pricing is. It does not know — it guesses from training data, and that data is old. Ask it to fetch the answer and show you the source.
The repo is the source of truth, not the conversation. If Claude tells you something in chat and you want to keep it, put it in a file in the repo. Chat is ephemeral. Anything important that is not written down will be gone when the session ends.
Longer context ≠ better results. Pasting an entire codebase into a single prompt tends to degrade quality. The repo structure means Claude can read only what is relevant. Let it navigate rather than dumping everything in.
Skills are reusable prompts with superpowers. Claude Code supports /skills — slash commands that expand into full operating prompts. Once you have built a workflow you repeat often (a deployment check, a code review pattern, a release process), encode it as a skill rather than retyping it. Run /help in Claude Code to see what skills are available and how to add your own.
Use Opus for planning, faster models for execution. Opus reasons better and produces more thorough plans. It also costs more per token. A good pattern: use Opus to produce a plan, then switch to a faster model to execute it. Do not run Opus for every small task — you will burn through your token budget on work that does not need that level of reasoning.
Ask Claude to write a script, not paste a command. When Claude gives you a long command to run in your terminal — especially anything with pipes, SSH, escaping, or multiple steps — stop. Ask it to write the command as a shell script in your tools/ folder instead, then give you one short line to run it. Chat windows wrap long lines in ways that silently break them when pasted. A script in a persistent mount never has this problem, can be inspected before running, and can be re-run without regenerating it. This one habit saves a disproportionate amount of time.
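The pattern looks like this. A hypothetical example, with the script name, host, and checks entirely made up:

```bash
#!/usr/bin/env bash
# tools/check-deploy.sh: a multi-step command captured as a script.
# Inspectable before running, immune to chat-window line wrapping, re-runnable.
set -euo pipefail
ssh deploy@example.com 'systemctl is-active myapp && df -h /srv'
```

Then the only thing you paste into the terminal is `bash tools/check-deploy.sh`.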
When something breaks, read the error before prompting. The most effective Claude debugging prompt is: here is the exact error message, here is the exact file and line, here is what I expected. Vague prompts produce vague diagnoses.
## Common Pitfalls
"Claude just confidently wrote code that doesn't work." It used training data instead of current docs. Enforce the fetch-before-write rule. See Step 4.
"The container lost my changes after a restart." Something was written to the container overlay layer instead of a bind mount or named volume. Everything important needs to live on a mounted path or in the repo. The container itself is disposable.
"Claude keeps asking about things I already told it last session."
Claude has no memory of previous sessions unless it is given one. The repo is the memory — write state into documents and have Claude read them at the start of each session. Create a startup document (e.g. ENTRYPOINT.md) that Claude reads first every session to rehydrate context.
"Two sessions seem to be conflicting." Multiple Claude Code sessions can run simultaneously. Before editing any shared file (settings, hooks), check whether another session has recently modified it. The git history is your audit trail.
"Claude suggested a tool that turned out to be abandoned / broken / macOS-only." It recommended from training data without verifying current status. Ask it to check the GitHub repo for last commit date, open issue response rate, and platform-specific reports before recommending anything.
## Prompts for Day-to-Day Work
Once the foundation is built, these prompts are useful starting points for daily sessions.
**Review what is open and decide what to do next:**

```
Read the repo README and any state documents present.
Tell me: what is the current system state, what is unresolved,
and what is the highest-leverage thing I can work on today?
```
**Plan a new component:**

```
I want to add [component] to this repo.
Before writing anything, fetch the current documentation for every
third-party component involved. Do not rely on training data.
Then propose a phased plan, flag any open decisions, and list what I need to provide.
```
**Understand the agent collaboration model:**

```
Review any agent configuration and rules in this repo.
Explain how the agents are expected to collaborate — what each one does,
how they hand off work, and what prevents them from conflicting.
```