My Current AI Codegen Setup

This is as of May 2026; it’ll likely change.

Models: I mostly use the Anthropic models (a mix of Opus, Sonnet and Haiku). At this point, that's partly because I also use their harnesses/coding agents and partly because I have a bit more trust in them as a company. That said, I'm trying to keep my development environment able to switch model providers easily, as OpenAI's and Google's models seem to be in the same ballpark of functionality and the open models are coming on strong. The exception is images, where I use Gemini and ChatGPT. I currently don't do much video beyond playing around.

Harness: Claude Code, mostly because it has consistently led the way in useful innovation. I'm looking to do more experiments with Codex. I'm also curious about the tools built for specific environments, like Google AI Studio. I use lots of subagents and sometimes (very rarely) have multiple agents working the same codebase.

Permissions: This deserves its own category because of how important it is. I use --dangerously-skip-permissions when I am running in a container (see Environment below). I am also playing around with the "auto" permission mode. In each case, it is hard to overstate how different it is from the regular approval mode. The best analogy I can come up with is the difference between Bluetooth and wired headphones. Wires don't seem very inconvenient, but once you start using Bluetooth you end up using audio on your phone a TON more. YOLO mode completely changed how I use AI codegen: I now spend far more time evaluating results than supervising the process that produces them.

Skills: I am consistently shocked by how useful Jesse Vincent's Superpowers is. I need to do some experiments without it, as it is so key to my coding workflow that I've lost any basis for comparison. I also use a bunch of 2389's products, including Simmer, Scenario Testing, Review Squad and Fresh Eyes Review. I keep meaning to try their skills for giving agents journals and drugs (yes, you read that right). Roborev is my core catch-all for reviews. And Jesse’s writing skill that puts Strunk & White in context is pretty great.

[Image: Mechanical Bread Kneader. The Encyclopaedia Britannica. United Kingdom, 1875, p. 257.]

Git: Git is key. Knowing how git is structured and how to ask your agent to do various things in it is very important. A very good pattern is the build, test, revert cycle: have the agent build, run the tests and, if they fail, revert to the state before the build and try again until it gets it right.
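A rough sketch of that build, test, revert cycle, written as a small Python wrapper around git. Everything named here is an assumption for illustration: `attempt_until_green` and `make_change` are hypothetical (the callback stands in for whatever the agent does on each try), and the sketch assumes the repository already has at least one commit to use as the checkpoint.

```python
import subprocess
from typing import Callable, Sequence


def run(cmd: Sequence[str], cwd: str) -> subprocess.CompletedProcess:
    """Run a command inside the repo and capture its output."""
    return subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)


def attempt_until_green(
    repo: str,
    make_change: Callable[[int], None],
    test_cmd: Sequence[str],
    max_attempts: int = 3,
) -> bool:
    """Apply a change, test it, and revert to the checkpoint on failure."""
    # Remember the commit we started from so every retry begins clean.
    checkpoint = run(["git", "rev-parse", "HEAD"], repo).stdout.strip()
    for attempt in range(1, max_attempts + 1):
        make_change(attempt)  # hypothetical stand-in for the agent's edit
        if run(test_cmd, repo).returncode == 0:
            # Tests pass: keep the work by committing it.
            run(["git", "add", "-A"], repo)
            run(["git", "commit", "-m", f"Passing change (attempt {attempt})"], repo)
            return True
        # Tests failed: discard tracked edits, then delete untracked files.
        run(["git", "reset", "--hard", checkpoint], repo)
        run(["git", "clean", "-fd"], repo)
    return False
```

Note that `git clean -fd` deletes untracked files, so this only makes sense in a working tree where nothing uncommitted needs to survive. In practice the same loop can simply be described to the agent in plain language; the code just makes the checkpoint and revert steps explicit.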

Agent & Git Hooks: The various skills use a variety of session-start hooks. I also use roborev on git post-commit. I think there is a lot more power here that I am not using well.

Claude.md/Agents.md: Mine is Claude.md. It is key to review this over time. Jesse Vincent's and Harper Reed's are good places to start.

After-session review: I use /insights and also ask specific questions to figure out how things could have gone better. Make sure you have "cleanupPeriodDays": 999 set in your Claude settings.json file to keep the logs longer. My environment script is a good example of the outcome (see below).
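For anyone who would rather script that setting than edit the file by hand, here is a minimal sketch. The helper name `set_cleanup_period` is hypothetical, and the sketch assumes the settings file is plain JSON; it leaves any other keys in the file untouched.

```python
import json
from pathlib import Path


def set_cleanup_period(settings_path: Path, days: int = 999) -> dict:
    """Set cleanupPeriodDays in a settings file, keeping other keys intact."""
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        # Treat an empty file the same as a missing one.
        settings = json.loads(text) if text.strip() else {}
    settings["cleanupPeriodDays"] = days
    # Create the parent directory if this is a fresh setup.
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

A call such as `set_cleanup_period(Path.home() / ".claude" / "settings.json")` would apply it to the user-level settings file, assuming that is where yours lives; check the path before running it.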

Environment: Depending on what I am doing, I'm on my local host system (a Framework 16 running Fedora Silverblue) in the rare cases when it is absolutely necessary, in a toolbox when I am doing very limited-scope work, or, almost always, in a container managed by a custom version of packnplay. Running in multiple, weird environments confuses agents, so I have a script and a companion skill that tell the agent what it is running in and how to do some basic things (like which port is forwarded for me to access whatever web server it runs).
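To give a sense of what such a script might look like, here is a minimal sketch (not the actual script described above). It relies on marker files that Toolbx, Podman and Docker create inside their containers: /run/.toolboxenv, /run/.containerenv and /.dockerenv. The function names and the `FORWARDED_PORT` environment variable are hypothetical, chosen only for illustration.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def detect_environment(root: Path = Path("/")) -> str:
    """Return 'toolbox', 'container', or 'host' based on marker files."""
    # Check Toolbx first: its containers also carry the Podman marker.
    if (root / "run" / ".toolboxenv").exists():
        return "toolbox"
    # Podman creates /run/.containerenv; Docker creates /.dockerenv.
    if (root / "run" / ".containerenv").exists() or (root / ".dockerenv").exists():
        return "container"
    return "host"


def describe_environment(
    root: Path = Path("/"),
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Build a short plain-text summary an agent can read at session start."""
    env = os.environ if env is None else env
    lines = [f"environment: {detect_environment(root)}"]
    # FORWARDED_PORT is a hypothetical variable name, not a standard one.
    port = env.get("FORWARDED_PORT")
    if port:
        lines.append(f"forwarded port: {port}")
    return "\n".join(lines)
```

Printing `describe_environment()` from a session-start hook, or having a skill run it on request, gives the agent a short, unambiguous statement of where it is before it starts guessing.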

Launching: I use some combination of Railway and GitHub Pages. Both are relatively easy for an agent to navigate.

Logs: I don't currently have a good system for monitoring logs and ensuring that agents have easy access to logs for debugging. It seems like that would be a good add.

I’d be interested in hearing what you think I’m doing wrong, or about other people’s setups.