marcduerst.com

AI-Assisted Software Product Development — A Practical Field Report

Collaborative coding with AI assistant

Since the beginning of 2026 I’ve been doing more and more AI-assisted software development. The models reached a level where they are genuinely helpful — not perfect, but useful enough to change how I work every day. To explore not just the coding assistance but the full software product development cycle, I started building feldnah.ch — a SaaS application. From idea to business concept, requirements engineering, planning, refinement, development, deployment, security reviews, and marketing. I wanted to understand how future software product teams might look and act.

The following are my findings so far.

Spec-Driven Development is Back

If you’ve been in the industry long enough, you remember spec-driven development. BDD with Gherkin, SpecFlow, custom step definitions — it was a lot of ceremony. The idea was right but the tooling overhead killed it for many teams.

With AI-assisted development, spec-driven is making a comeback — but in a much more practical form:

  • You don’t need full specs. Write down the important points as bullet points. Then iterate with the AI on the final specification. The AI fills in the gaps you didn’t think of, and you correct what it got wrong.
  • Natural language, not DSLs. Specs are written in Markdown — no Gherkin syntax, no custom parsers, no SpecFlow bindings. GitHub, GitLab, and any editor render them nicely as formatted documents.
  • No implementation from scratch. You don’t need to type every line of code. The spec becomes the input for AI-assisted implementation.

This is a fundamental shift: specs are no longer just documentation that gets outdated. They become the living input for your AI-powered development workflow.

Build a Machine-Readable Briefing

One of the most impactful things you can do is build a structured context that AI tools can read whenever they work on your project. I think of it as a “briefing” — everything the AI needs to know about your product, your standards, and your way of working.

This includes:

  • Business domain knowledge — what the product does and why
  • Corporate design — tone of voice, visual language, color schemes
  • User interaction guidelines — how your app should behave
  • Technology stack and priorities — what you use and what you prefer
  • Coding and pattern guidelines — e.g., “we do vertical slices combined with CQRS”

Keep it concise. Don’t be verbose — the AI reads it every time. Every unnecessary sentence burns tokens and dilutes the signal.
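To make "concise" concrete, a coding-guidelines entry in such a briefing can be as terse as this (a hypothetical excerpt, not my actual file):

```markdown
## Coding and pattern guidelines
- Vertical slices combined with CQRS; one feature folder per slice
- Commands and queries are immutable records
- No business logic in controllers or UI components
- Prefer composition over inheritance; keep classes small
```

A few bullet points like these are enough for the AI to follow the pattern consistently; long prose explanations add tokens without adding signal.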

A Folder Structure That Works

Here’s the structure I’m currently using. It’s not set in stone — it evolves with every repo I work on. But it’s a solid starting point:

specs/
├── concept.md                   # Core business concepts, USPs, principles
├── architecture.md              # Technical architecture, tech stack, patterns
├── ubiquitous_language.md       # Shared terminology (DDD-inspired)
├── userinterface_guidelines.md  # UI/UX rules and guidelines
├── backlog.md                   # Human-written ideas and backlog items
├── flows/                       # Important user flows
├── stories/                     # One markdown file per story
│   ├── _template.md             # Template for new stories
│   └── done/                    # Archive of completed stories
└── adrs/                        # Architecture decision records
    └── _template.md             # Template for new ADRs

Your README.md should be a concise summary of the most important things — think of it as the landing page of your repo. It’s the first thing people see when they browse your repository or open a PR review on GitHub. Place links to the files in /specs for deeper reading.

Then configure your AI tools (like CLAUDE.md for Claude Code) to reference README.md and the /specs folder as context.
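As a rough sketch, such a CLAUDE.md can stay very small — it mostly points at the briefing (paths here match the folder structure above; adapt them to your repo):

```markdown
# CLAUDE.md
Read README.md first for a summary of the product.

Before working on any task, consult:
- specs/concept.md for business concepts and principles
- specs/architecture.md for the tech stack and patterns
- specs/ubiquitous_language.md for terminology
- specs/adrs/ for past architecture decisions

New stories follow specs/stories/_template.md.
Completed stories move to specs/stories/done/.
```

The point is indirection: the AI tool's config file stays a thin pointer, and the actual knowledge lives in versioned spec files that humans read too.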

Make the AI Assistant Yours

You can brief the AI on how to behave and respond to you. A colleague told me he configured his Claude to yell back at him or respond in Swiss German when in “yelling mode.” This gives the AI a more personal touch — it makes those long AI-assisted sessions feel less like commanding a dumb machine all day.

I personally switched back to a default conversational mode. I found myself taking responses seriously until I realized the AI was just joking, which slightly increased my mental load. But try it. Some people love it.

My Refinement and Implementation Process

This is where it gets practical. Here’s my actual workflow for going from idea to production. It works very well as a solo developer, though it would need adaptation for larger teams.

  1. Capture the idea — Write rough ideas as headings in backlog.md with bullet points. Quick, unstructured, just getting thoughts down.
  2. Generate the story — Prompt: “Write a story for backlog item X.” The AI uses stories/_template.md, respects my specs, ADRs, and past stories. The result is usually more detailed than what a human team would write — and that’s intentional. A good story is key for a good implementation.
  3. Review and iterate the story — Review, edit manually or prompt the AI to fix specific aspects. When solid, commit backlog.md and all story files into Git — they’re part of the codebase.
  4. Implement — Prompt: “Story X looks good, implement it.” The AI reads the story, specs, existing code, and writes the implementation.
  5. Review and test — Quick code review of what changed, focusing on the important parts and unit test coverage. Then manually test every user-facing change. No exceptions.
  6. Iterate — Using AI prompts or, rarely, by coding manually.
  7. Document decisions — If architectural decisions were made, have the AI write a new ADR. If specs need updating, tell the AI — or it does it automatically because it’s part of my story template.
  8. Close and ship — Prompt: “Close story X.” This writes the changelog, updates the story status to done, removes it from the backlog, and moves the story file to /specs/stories/done. Then “commit and push” triggers CI/CD and the feature is live.
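To make step 2 tangible, here is the shape such a generated story file might take (an invented example, not an actual feldnah.ch story):

```markdown
# Story 042: Export order history as CSV
Status: ready

## Context
Shop owners want to analyze their orders offline in a spreadsheet.

## Acceptance criteria
- [ ] "Export CSV" button on the orders page
- [ ] Export respects the currently active filters
- [ ] Column headers use the terms from ubiquitous_language.md
- [ ] Unit tests cover the CSV serialization

## Out of scope
- Scheduled or automated exports
```

Because the file is checked into Git alongside the code, the story and the implementation share one history — reviewable, diffable, and available as context for every future prompt.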

A Note on Code Reviews

How I review AI-generated code evolved over time. In the beginning I reviewed every single line. Now I focus on the big picture and the critical parts. I don’t spend much time reviewing CSS changes or straightforward UI code.

I also adjusted my mindset on code quality: not every line needs to be diamond-level from day one. Core logic and persistence absolutely should be. But the UI layer can have some redundancy — I can refactor it all at once later using AI. This would have been heresy for me two years ago.

Beyond Coding: AI in the Product Lifecycle

Human-AI collaboration across the software development lifecycle

AI-Assisted Market Analytics

I use AI to query various aspects of the business around my product. The specs provide context — they’re like a persisted memory for the AI. I always ask for sources to validate business numbers. If your market analysis is based on speculation, your product will hit the wall.

AI-Assisted Security Reviews

AI-assisted security reviews are remarkably effective when you have good specs. The AI understands what’s important for this specific product and at what security level you need to operate.

I had a genuine wow moment when I asked the AI to calculate worst-case scenarios for my SaaS app. “What happens if my app gets DDoS’ed? What will be the costs?” Claude Code produced a comprehensive analysis with serious calculations for various scenarios. I was already in good shape (it’s not my first cloud app), but it found interesting gaps that I then closed. The cost calculations were surprisingly accurate — I double-checked them against actual cloud pricing.
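The kind of worst-case arithmetic behind such an analysis can be sketched in a few lines. All figures below are illustrative assumptions, not actual provider pricing — the point is the structure of the estimate, not the numbers:

```python
# Back-of-envelope worst-case cost of a volumetric DDoS against a cloud app.
# Prices are assumed placeholders, NOT real provider rates.

def ddos_cost(requests_per_sec: float,
              duration_hours: float,
              avg_response_kb: float,
              egress_price_per_gb: float = 0.12,      # assumed $/GB egress
              request_price_per_million: float = 0.40  # assumed $/1M requests
              ) -> dict:
    total_requests = requests_per_sec * duration_hours * 3600
    egress_gb = total_requests * avg_response_kb / 1024 / 1024
    return {
        "requests": total_requests,
        "egress_gb": round(egress_gb, 1),
        "request_cost": round(total_requests / 1e6 * request_price_per_million, 2),
        "egress_cost": round(egress_gb * egress_price_per_gb, 2),
    }

# Example scenario: 50k req/s sustained for 6 hours, 20 KB average response.
print(ddos_cost(50_000, 6, 20))
```

Even this toy version makes the lever visible: egress volume, not request count, usually dominates the bill, which is exactly where rate limits and response-size caps pay off.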

Always verify security findings. But as a starting point, it’s incredibly efficient.

AI-Assisted Operations Debugging

In my day job at Switzerland’s biggest e-commerce site, I started using AI for production issue debugging. My reasoning: if there’s one thing these models are good at, it’s finding patterns and correlations in large amounts of data.

I connected various data sources:

  • Git via CLI — for looking up implementations including infrastructure-as-code
  • GitHub via gh CLI — for analyzing deployments to production
  • Datadog via MCP — for standard and custom metrics
  • Google Cloud via gcloud CLI — for querying real infrastructure and GCP logs

I also wrote a docs/monitoring.md with all monitoring links: dashboards, metrics, notebooks, alerts. Then I created a skill (a reusable prompt) that tells the AI how to drill into production issues, references the monitoring doc and specs, and outputs a readable Markdown report.
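The monitoring doc itself can be a plain link index — here is a hypothetical excerpt with invented names, just to show the shape:

```markdown
# monitoring.md
## Dashboards
- Checkout service overview: <link>
- SLO / error budget board: <link>

## Alerts
- checkout.latency.p99 > 800ms → pages on-call
- checkout.errors.rate > 1% → Slack #incidents

## Useful queries
- Datadog: `service:checkout status:error`
- GCP logs: filter on `resource.type="k8s_container"`
```

Flat Markdown like this is trivial for the AI to parse, and it doubles as onboarding documentation for humans.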

I iterate on that report like a living document — it becomes the persisted context for the investigation.

What genuinely surprised me was the patterns the AI found — correlations I wouldn’t have spotted as quickly myself. It also figured out creative ways to query the information it needed, adapting well to our specific tooling and data structures. When standard queries weren’t enough, it wrote Python scripts on the fly for more complex data gathering and analysis.

That said, it’s critical to double-check every finding. You need the expertise yourself — you cannot blindly rely on AI for production debugging. I always include something like “no speculations, only hard and validated facts, double-check numbers if possible” in my prompts. Without that, the AI tends to fill gaps with plausible-sounding guesses — which is the last thing you want in a production incident.

AI-Assisted Cloud Cost Debugging

When cloud costs spike unexpectedly, you want to know why fast. This can be straightforward or extremely complex and time-consuming.

I use the same approach as operations debugging, but with an additional data source: cost reports (via gcloud CLI or CSV exports from the GCP console).

I was genuinely surprised at how well Claude Code correlated cost increases with operational events and traced complex, accumulated effects along a timeline. It can reduce a massive amount of time and effort that would otherwise go into manually cross-referencing billing data, metrics, and deployment events. But you need cloud cost expertise yourself to quickly judge when the AI is going off track — and it will, especially with complex multi-service billing.
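The manual cross-referencing the AI automates here boils down to joining a cost timeline against a deployment timeline. A minimal sketch with synthetic data (column shapes and names are invented; real GCP billing exports are far richer):

```python
# Flag cost spikes that start on or right after a deployment.
# Data is synthetic and invented for illustration.
from datetime import date, timedelta

costs = {  # date -> daily cost in USD
    date(2026, 1, 1) + timedelta(days=i): c
    for i, c in enumerate([100, 102, 99, 101, 180, 185, 190, 188])
}
deployments = [date(2026, 1, 5)]  # synthetic deploy timeline

def spikes_after_deploys(costs, deployments, threshold=1.5):
    """Flag deploys followed (same or next day) by a daily cost above
    threshold times the previous day's cost."""
    flagged = []
    for deploy in deployments:
        for day in (deploy, deploy + timedelta(days=1)):
            prev = day - timedelta(days=1)
            if prev in costs and day in costs and costs[day] > threshold * costs[prev]:
                flagged.append((deploy, day, costs[prev], costs[day]))
    return flagged

print(spikes_after_deploys(costs, deployments))
```

In practice the AI goes much further — per-SKU breakdowns, accumulated effects, multi-service attribution — but the core move is always this correlation of two timelines, which is why giving it both data sources matters.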

More Ideas for AI Assistance

The list of things AI can assist with keeps growing. Here are some I’ve explored or seen in practice:

  • Implement UIs from Figma designs — AI reads existing Figma documents via MCP and generates the frontend code to match
  • Generate social media content like Instagram reels using Canva MCP
  • Web UI prototypes using Gemini’s web preview capabilities
  • UX documentation — tone of voice, color schemas, personas
  • AppStore screenshot updates — stitching new screenshots into marketing images
  • Release notes and changelogs — generated from commit history and stories, adapted to the target audience (users, not developers)
  • Dependency updates — update libraries and frameworks including migrating breaking changes across repositories
  • Concept drafts from Confluence and Jira using connectors/MCP
  • UX analysis — AI reviews your interface and suggests improvements
  • SEO analysis and optimization right in the code
  • Social media post optimization — tailored per platform (Instagram is different from LinkedIn)
  • Automated user testing — define different personas and their behavior, then let AI test your software automatically with each of them
  • Regional research — e.g., “find all farmers with farm shops within 20km that have online ordering, and compare their features”
  • Translation automation in web and native apps
  • Full native app releases — build, translate, publish to Apple, generate “what’s new” tailored to your audience in all languages, and more. Goes way beyond static CI/CD pipelines because it can adapt

Thoughts on What This Means

I want to close with some broader reflections. These are things I think about a lot, and I don’t have all the answers.

The Industry Can’t Ignore This

AI tooling has reached a level where the software industry cannot afford to look away. Companies that don’t adapt will face serious competitive disadvantages. The same goes for professionals — dismissive hostility toward AI tools won’t age well. The pragmatic approach is to understand what these tools can and can’t do, and use them where they genuinely help.

Quality Still Requires Expertise

To get professional-quality results, you need to have the expertise. AI amplifies what you bring to the table — if you bring deep knowledge, you get excellent results. If you bring nothing, you get plausible-looking garbage. The human in the loop is not optional. It’s the quality gate.

The Junior Developer Question

This one keeps me up at night. AI handles most of the basic tasks that juniors traditionally learn on. But to work efficiently with AI, you need decent domain and technical knowledge. How do you build that knowledge if AI is doing the doing? Our industry needs to think seriously about mentorship, learning paths, and what “junior” even means in an AI-assisted world.

Organizational Change Beyond Code

This isn’t just about AI-assisted coding. Workflows, processes, and even team structures will change. Product owners will refine differently. QA will test differently. The roles of architect, developer, and operator are blurring. We’re looking at a fundamental shift in how software product teams are composed and how they collaborate.

I’m joining a temporary cross-functional team at Digitec Galaxus that’s exploring exactly these new ways of working. I’m looking forward to learning a lot about future team dynamics and workflows — and I’ll keep you posted.

The Pace and the Uncertainty

I can’t imagine what will be possible in just six months. The pace of improvement is breathtaking. But I suspect it will slow down at some point as things saturate. Every major AI company is running deep financial losses. If this technology persists — and I believe it will — it will likely become more expensive.

The Environmental Cost

The environmental impact of this way of working is significant, and I’m not a fan of that side of the coin. Training and running these models consumes enormous amounts of energy. As an industry, we need to ask: how do we make this sustainable? I don’t have a good answer yet, but ignoring the question isn’t one either.


This post reflects my experience as of early 2026. The tooling and best practices are evolving rapidly. What works today might look different in a few months — but the underlying principles of good specs, human expertise, and critical thinking will remain.