Hamilton Ridley Consulting

Field Notes · Vol. 03

From Prototype to Production
For New Builders · 2026

A short guide

"It works on my laptop."

Daniel Kemp · Published May 4, 2026 · 8 min read

The most expensive sentence in software. Most vibe-coded projects don't die from bad code — they die at the moment the first real user shows up and finds something nobody thought to check. The work between "demo" and "production" is the part nobody warns you about, and the part that separates a hobbyist from someone a paying client trusts.

The gap nobody warns you about

If you can't tell whether the app is up right now without opening it, you're not in production — you're hoping. Real production has someone (or something) watching, a way to find out before users do, and a one-click path back to the last version that worked.

Lane 01

Launch Readiness

The 30-min pre-flight pass

Lane 02

Operate

The mundane upkeep that matters

Lane 03

Observe

Knowing what's happening

Lane 04

Improve

The user feedback loop

§ 01

The four lanes

Lane 01

Launch Readiness

The 30-minute pre-flight pass before you flip the switch. The boring verification that catches the bugs you'd otherwise discover in front of your first real user.

What changes between options

How much you can verify before launch vs. how much only shows up live. More automated checks = fewer 11 p.m. surprises.

Manual smoke test

Walk the golden path one last time. Five minutes, catches the obvious.

Playwright (E2E)

Automated browser tests on critical paths. Re-runs on every deploy.

Lighthouse / Vitals

Performance baseline. If it's slow on launch, it's slower with users.

Staging environment

A copy of production for scary changes. Vercel preview URLs are this for free.
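The manual smoke test in this lane can be partly scripted, so the "is every page still answering?" part runs the same way each time. What follows is a minimal sketch using only the Python standard library, not a replacement for walking the form by hand or for Playwright. The function names (`fetch_status`, `smoke_check`) and any URLs you pass in are illustrative assumptions.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the code is still what we want.
        return err.code


def smoke_check(
    urls: Iterable[str],
    fetch: Callable[[str], int] = fetch_status,
) -> list[str]:
    """Return the URLs that did not answer with a 2xx status."""
    failures = []
    for url in urls:
        try:
            status = fetch(url)
        except OSError:
            # DNS failure, refused connection, or timeout: count as a failure.
            status = 0
        if not 200 <= status < 300:
            failures.append(url)
    return failures
```

Call `smoke_check([...])` with the pages on your own golden path after each deploy; an empty list means every page answered. The `fetch` parameter exists so the check can be exercised without touching the network.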

Lane 02

Operate

The mundane upkeep. Deploys, dependency updates, secrets rotation, scheduled jobs that need to actually run on schedule. The boring monthly stuff that keeps the project alive.

What changes between options

How much you've automated vs. how much you do by hand. Manual is fine at one user; punishing at fifty.

Auto-deploy on push

Vercel/Netlify default. Push to main = deploy. The lowest-friction operate loop.

Dependabot / Renovate

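Auto-PRs for dependency updates. Free, but not automatic: Dependabot version updates need a dependabot.yml file in the repo, and Renovate is a GitHub app you install.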

runbook.md

A markdown file in the repo: "what to do when X breaks." Future you will thank present you.

Status page

Public or private. Sets expectations during incidents. Worth it once you have ≥5 users.

Lane 03

Observe

Logs, errors, metrics, user behavior. The instruments that turn "the app is fine" into a verifiable claim instead of a hope.

What changes between options

What question you're trying to answer. "Did it crash?" "Is it slow?" "Did anyone use that new button?" Each needs a different tool.

Sentry

Errors with stack traces and user context. Free tier covers most small projects.

PostHog

Product analytics, feature flags, session replay. The "what are users doing?" tool.

Plausible / Vercel Analytics

Privacy-first pageview analytics. No cookies, so usually no consent banner is needed.

Structured logs

Better Stack (formerly Logtail) or Axiom. Searchable history for the "what happened at 3 a.m." moments.
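"Structured" here means each log line is a record with named fields rather than a free-form sentence, which is what makes it searchable later. A minimal sketch with Python's standard `logging` module follows. The `JsonFormatter` and `build_logger` names and the `context` convention are assumptions made for illustration. Shipping the lines to a hosted log service is service-specific and not shown.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        base = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed as extra={"context": {...}} become top-level keys;
        # the base fields win if a name collides.
        context = getattr(record, "context", None)
        entry = {**context, **base} if isinstance(context, dict) else base
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry, default=str)


def build_logger(name: str = "app") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `build_logger().info("report sent", extra={"context": {"rows": 1204}})` then produces a single line you can filter on `rows` or `level` instead of grepping prose.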

Lane 04

Improve

The feedback loop. User input + observed behavior + bugs → ranked work that's actually worth doing. The discipline of building the right thing next, not just the next thing.

What changes between options

How short the loop is. Daily user feedback beats quarterly product reviews. Real telemetry beats opinions.

GitHub Issues

The cheapest issue tracker. Free, lives next to the code, AI agents can read it.

In-app feedback

A Tally form, Featurebase, or Canny. Lower the barrier to "tell me what's broken."

15-min user calls

Irreplaceable. One real conversation beats ten survey responses every time.

A/B tests

PostHog, Vercel, Statsig. Useful only after you have enough traffic to learn from.
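"Enough traffic" can be estimated before you set up a test. The sketch below uses the standard normal-approximation formula for comparing two conversion rates. The defaults (5% two-sided significance, 80% power) are common conventions rather than anything this guide prescribes, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(
    baseline: float,
    target: float,
    alpha: float = 0.05,
    power: float = 0.80,
) -> int:
    """Approximate users needed in EACH variant to detect baseline -> target.

    baseline and target are conversion rates between 0 and 1.
    """
    if not (0 < baseline < 1 and 0 < target < 1) or baseline == target:
        raise ValueError("rates must be distinct values strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance = baseline * (1 - baseline) + target * (1 - target)
    n = (z_alpha + z_power) ** 2 * variance / (baseline - target) ** 2
    return math.ceil(n)
```

With the defaults, telling a 5% conversion rate from a 6% one takes a little over 8,000 users per variant, while a jump from 5% to 10% needs only a few hundred. If your app sees fifty visitors a week, user calls will teach you more than an A/B test.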

§ 02

The same three projects, ready for the real world

Project A

Contractor lead form

One page, one form, one owner reading the emails. The cost of a 20-minute outage is "missed two leads." The whole production footprint should fit in a notebook page.

Set & forget

01 · Launch

Manual smoke test

Fill the form, confirm the email lands. Five minutes, no Playwright needed for one form.

02 · Operate

Auto-deploy only

Vercel handles deploys. Dependabot in the background. No scheduled jobs, nothing else to run.

03 · Observe

Vercel free tier

Pageviews and basic uptime, free. If something serious breaks, the owner stops getting emails — that's the alarm.

04 · Improve

Owner check-in

Quarterly email: "what's working, what isn't?" The user is one person — talk to them directly.

Project B

Multi-user CRM

Eight team members rely on it daily. Real revenue runs through it. A bad deploy at 9 a.m. ruins someone's morning. Real production discipline applies.

Real customers, real expectations

01 · Launch

Playwright + staging

E2E tests on the top three flows (login, create lead, log call). Vercel preview URLs as staging. Test scary changes on a copy first.

02 · Operate

Full automation + runbook

Auto-deploy + Dependabot + a runbook.md covering the five things that break most. Quarterly secrets rotation reminder on the calendar.

03 · Observe

Sentry + PostHog

Sentry pages you on errors. PostHog tells you which features actually get used. Two complementary lenses on the same app.

04 · Improve

Tally + monthly calls

In-app feedback widget collects pain points. One 15-min call/month with the owner. The two together drive the backlog.

Project C

Nightly automation

Runs at 2 a.m. with no one watching. Pulls four APIs, transforms data, sends a report. The failure mode that ruins this kind of project is silent failure — running for three nights and producing nothing.

No one's watching

01 · Launch

Replay last week

Run last week's input through the new code. Diff the outputs. The closest thing to a "test" for a data pipeline.
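One way to sketch the replay: keep last week's inputs and the outputs they produced, run the inputs through the new code, and compare row by row on a stable key. The code assumes each row is a dict with a unique key field. `diff_rows`, `replay`, and the `new_transform` callable are illustrative names, not part of any real pipeline.

```python
from typing import Callable

Row = dict
Rows = list[Row]


def diff_rows(old_rows: Rows, new_rows: Rows, key: str) -> dict[str, list]:
    """Compare two outputs row by row, matching rows on `key`."""
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}
    return {
        "missing": sorted(old.keys() - new.keys()),  # rows the new code dropped
        "added": sorted(new.keys() - old.keys()),    # rows the new code invented
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


def replay(
    saved_inputs: Rows,
    saved_outputs: Rows,
    new_transform: Callable[[Rows], Rows],
    key: str,
) -> dict[str, list]:
    """Run saved inputs through the new code and diff against saved outputs."""
    return diff_rows(saved_outputs, new_transform(saved_inputs), key)
```

Three empty lists mean the new code reproduces last week's report. Anything else is either the change you intended or a bug, and you get to decide which before 2 a.m. does.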

02 · Operate

Heartbeat + runbook

Ping Healthchecks.io after every successful run; when a ping goes missing, it alerts you. runbook.md covers "API X timed out" and "disk full." Two failure modes cover ~80% of incidents.
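The heartbeat pattern is small enough to sketch. Healthchecks.io gives each check a ping URL; a GET to that URL reports success, and appending `/fail` reports a failure. The wrapper below is an assumption-laden sketch: `run_with_heartbeat`, `http_ping`, and the `job` callable are hypothetical names, and the ping URL is whatever your own check shows.

```python
import urllib.request
from typing import Any, Callable


def http_ping(url: str, timeout: float = 10.0) -> None:
    """Send a GET request; the response body is ignored."""
    with urllib.request.urlopen(url, timeout=timeout):
        pass


def _safe_ping(ping: Callable[[str], None], url: str) -> None:
    # A monitoring outage should not crash the job itself.
    try:
        ping(url)
    except OSError:
        pass


def run_with_heartbeat(
    job: Callable[[], Any],
    ping_url: str,
    ping: Callable[[str], None] = http_ping,
) -> Any:
    """Run job(); ping on success, ping <url>/fail on error, then re-raise."""
    try:
        result = job()
    except Exception:
        _safe_ping(ping, ping_url + "/fail")
        raise
    _safe_ping(ping, ping_url)
    return result
```

The important property is the one the failure ping does not cover: if the machine is off or the scheduler never fires, no ping arrives at all, and the missing heartbeat is what raises the alarm.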

03 · Observe

Logs → Slack

Structured logs to BetterStack. Slack notification on each completion (with row counts). You see the heartbeat, not just hear silence.
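Row counts are what turn a completion message into a check against silent failure: a run that "succeeds" with zero rows should look different from a healthy one. Here is a sketch of such a summary, assuming a Slack incoming webhook (which accepts a JSON body with a `text` field). The function names, the job name, and the per-source counts are made up for illustration.

```python
import json
import urllib.request


def build_summary(job_name: str, row_counts: dict[str, int]) -> dict[str, str]:
    """Build a Slack message payload from per-source row counts."""
    total = sum(row_counts.values())
    empty = sorted(name for name, count in row_counts.items() if count == 0)
    # No sources at all, or any source with zero rows, is treated as suspicious.
    status = "WARNING" if empty or not row_counts else "OK"
    lines = [f"{status}: {job_name} finished, {total} rows total"]
    lines += [f"- {name}: {count}" for name, count in sorted(row_counts.items())]
    if empty:
        lines.append("Zero rows from: " + ", ".join(empty))
    return {"text": "\n".join(lines)}


def post_to_slack(webhook_url: str, payload: dict[str, str], timeout: float = 10.0) -> None:
    """POST the payload as JSON to a Slack incoming webhook URL."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout):
        pass
```

At the end of the nightly job, something like `post_to_slack(webhook_url, build_summary("nightly-report", counts))` puts the numbers in front of you each morning, so three nights of empty reports cannot pass unnoticed.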

04 · Improve

Quarterly drift review

Once a quarter, look at: which APIs are flaky, which transforms got slow, what changed in the source data. Catch drift before it becomes outage.

The shortest possible advice

Production isn't a destination — it's a discipline.

Every choice in this guide answers one of three questions: did I check it before launch?, am I watching it now?, and am I listening to what users tell me? The smallest, scrappiest version of each is still infinitely better than nothing — and the gap between "running" and "running well" closes fast once you start.

Did I check it?

Even the cheapest pre-flight pass — five minutes filling out the form, walking the golden path — catches the bugs that ruin a launch. Free, fast, almost always skipped.

Am I watching it?

Pick one observation tool that fits the project's failure mode. Sentry for crashes. Healthchecks for silence. PostHog for "is anyone using this?" One is enough; zero is not.

Am I listening?

The shortest feedback loop wins. A 15-minute call, an in-app form, a quarterly check-in — it doesn't matter which, as long as something turns user reality into your backlog. Otherwise you're guessing.
