AI-Driven Development with Claude Code: A Security and Privacy Argument

Why AI-assisted coding with Claude Code often produces more secure and privacy-respecting software than manual engineering, and where it doesn't.


The usual story about AI coding tools is speed: write code faster. That effect is real, and it is also the least interesting thing about using these tools every day.

A more interesting claim is that AI-assisted development, done carefully, often produces safer software than working by hand. Not because the model is smarter than a good engineer (it is not), but because it applies the boring safety rules without getting tired, and it makes certain checks cheap enough to actually run. This post is the long version of that claim, plus the places where it breaks.

The consistency problem AI actually fixes

Most security bugs in the wild are not exotic. They’re the same handful of mistakes made by tired humans under deadline pressure. An unsanitised input. A forgotten auth check. A secret committed to a config file by accident. A CSP header dropped from a copy-pasted Nginx block.

Any decent engineer knows every one of these patterns. We still ship the bugs, because knowing a pattern and applying it consistently across every PR, every file, every late-night hotfix are two very different things.

This is where a coding assistant does its most underrated work. When we drive a change through Claude Code, the model doesn’t get tired on line 800. It doesn’t skip a step because it already did it yesterday. It applies the same set of defaults every single time: sanitise here, validate there, don’t log secrets, prefer allowlists over blocklists, add SRI hashes to CDN tags. The floor on a codebase stops being “the tired version of the developer” and starts being “the default behaviour of a model trained on a pile of secure code.”
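One of those defaults, preferring allowlists over blocklists, is worth making concrete. A minimal sketch, assuming a username field (the pattern and function name are illustrative, not from any particular codebase):

```python
import re

# Allowlist: accept only what we explicitly expect. A blocklist
# ("reject <script>, reject quotes, ...") always lags behind attackers.
USERNAME_RE = re.compile(r"^[a-z0-9_-]{3,32}$")

def validate_username(raw: str) -> str:
    """Return the username if it matches the allowlist, else raise."""
    if not USERNAME_RE.fullmatch(raw):
        raise ValueError("invalid username")
    return raw
```

The point is not this particular regex; it is that the model reaches for the allowlist shape by default, every time, without being reminded.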

That’s a real change in the shape of software quality. The best manual code is probably still better than AI code on any given line. But the average line of code shipped by a careful engineer using AI is often better than the average line shipped without, because the average gets pulled up by a tool that doesn’t have bad days.

The privacy side of the argument

The security case for AI is well-known. The privacy case is quieter. Here is ours.

Asking an AI “does this change leak anything about a user?” takes about fifteen seconds. Running the same check by hand takes much longer. Under deadline pressure, it gets skipped. Three months later, a log file somewhere has a user’s email address sitting in plain text.

With Claude Code in the loop, that privacy review runs on every change. Is this log line printing personal data? Does this new endpoint need auth? Does this feature quietly ship identifiers off the device? The model will miss things, but it asks the question. In most codebases, the question does not get asked at all.
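The fix for the email-in-a-log-file case is usually a one-liner the model suggests on its own. A minimal sketch of that kind of redaction, assuming log messages pass through a single choke point (the regex and function name are illustrative):

```python
import re

# Rough email shape; a real deployment would also cover phone numbers,
# tokens, and whatever identifiers the product actually handles.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_pii(message: str) -> str:
    """Replace anything that looks like an email address before logging."""
    return EMAIL_RE.sub("[redacted-email]", message)
```
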

What AI-driven development does not fix

This would be a bad post if it ended at “AI good, manual bad.” The honest version of the argument has a long list of places AI makes things worse, and engineers pretending otherwise are the reason half of the AI-coding content online is useless.

AI still hallucinates APIs. Every session we catch at least one call to a function that doesn’t exist, a config key that was never a config key, or a library version that was never shipped. If you’re not reading every diff, you will ship phantom code.

AI defaults to “plausible” over “correct.” Ask it to dedupe a list and you’ll get the pattern that works on 95% of the inputs you’ll see. The 5% case (trailing whitespace, Unicode normalisation, null values) is where production bugs live. Those are still the human’s job.
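To make that 5% concrete: the "plausible" dedupe is `list(set(xs))`, which handles none of the edge cases above. A version that does, written here as a hedged sketch rather than a recommendation, looks like this:

```python
import unicodedata

def dedupe(values):
    """Dedupe strings while handling the edge cases the plausible
    version misses: trailing whitespace, Unicode normalisation
    (NFC vs NFD forms of the same character), and None values.

    Keeps the first occurrence and preserves input order.
    """
    seen = set()
    out = []
    for v in values:
        if v is None:
            continue  # nulls are skipped, not crashed on
        key = unicodedata.normalize("NFC", v).strip()
        if key not in seen:
            seen.add(key)
            out.append(v)
    return out
```

Whether `"a "` and `"a"` should count as duplicates is a product decision, which is exactly why the 5% case is still the human's job.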

AI is not a threat modeller. It can apply known security patterns beautifully. It cannot tell you that the feature itself shouldn’t exist because it makes your product a subpoena target.

AI amplifies whatever’s already in your codebase. If your repo has bad conventions, it will propagate them at speed. If you don’t have tests, it won’t write them unless you ask. If you never sanitise user input, the model will happily match that pattern in new code. The model mirrors the shop. It doesn’t rescue the shop.

Supply-chain risk moved up a layer. You’re now trusting a model provider the way you used to trust a CI vendor. That’s a different, not a smaller, category of risk.

The operating principle we landed on is simple: the engineer is the reviewer, the model is the intern. Every diff is read. Every test run is watched. Every secret-handling change gets an extra eye. The speed gain doesn’t come from skipping review. It comes from the intern producing a much better first draft than a junior engineer would, so review takes less time per line.

What this looks like in practice

The short version of our workflow:

  1. Write a plain-English specification before asking the model to code. If the spec is vague, the code is vague.
  2. Break the work into small, reviewable changes, one concern per commit. AI-generated PRs of 2,000 lines are unreadable and un-reviewable.
  3. Read every line. Every one. “It compiles, it tests, ship it” is how quiet bugs get into production.
  4. Run the thing. Tests pass does not equal feature works. Always exercise the change in a browser, a CLI, a running binary.
  5. Keep the model away from secrets. Real API keys, customer data, and production credentials never go into prompts. Not once.
  6. Capture the reasoning. Every non-obvious decision (“why we chose X over Y”) goes into a commit message or a short design note. Future-you and future-model both need that context.
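Rule 5 is the one worth automating. A crude pre-flight check before any text reaches a prompt might look like the sketch below; the patterns are illustrative and deliberately incomplete, and a real setup would sit behind a dedicated secret scanner, not replace one:

```python
import re

# Illustrative patterns only. Entropy-based scanners catch far more;
# this is a cheap last line of defence, not the first.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS access-key-id shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),    # PEM private keys
    re.compile(r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*\S+"),
]

def looks_like_secret(text: str) -> bool:
    """Return True if any pattern matches, so the caller can refuse to send."""
    return any(p.search(text) for p in SECRET_PATTERNS)
```
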

The bottom line

Manual development with a careful senior engineer is slow, expensive, and on a good day the highest quality work you can get. AI-assisted development with the same engineer is faster and, more importantly, more consistent. The ceiling is maybe a little lower. The floor is much higher.

For products where a quiet bug can leak a user’s data, consistency is the thing that matters. The best line of code never written is the one a model caught before review.
