First experiment with Windsurf: Can I build and deploy a useful site in a single weekend?

2025-06-18

0. Who’s Writing This?

I’ve spent years whipping up internal tools: everything from CLI utilities to full web apps in Backbone.js, PHP, and plain‑vanilla Python. UI polish has never been my strong suit; most of my apps lived behind a VPN, and their users didn't generally have a choice about using them. Until this project, I had never touched React and had never shipped anything truly public‑facing. Aquacost.com was my experiment to see how far AI tooling, plus a weekend’s worth of grit, could take someone who’s comfortable with back‑end logic but allergic to CSS quirks.


1. The Spark

Two forces collided:

  1. A brand‑new pool heat pump. I wanted to know—really know—how much running it would cost under different schedules and weather patterns.
  2. A desire to level up my AI‑coding chops. Windsurf’s “single‑prompt to production” pitch felt like the perfect playground, with ChatGPT o3 ready to double‑check my physics.

Result: Aquacost.com—a web app that simulates pool‑heating costs for any location—was born over one weekend.


2. The Tech Stack (and Why)

Layer | Tech | Why I Picked It
Frontend | React + Vite on Cloudflare Pages | Free static hosting
Backend | Flask (Python, no DB) on Render.com | Zero friction; familiar territory; free tier
Physics Sim | Custom water‑energy model, iterated with ChatGPT o3 | Fast math and instant unit tests
Dev Agent | Windsurf | Promised to scaffold the whole repo
Manual AI Assist | Carefully crafted o3 prompts in the ChatGPT UI | Precise, context‑rich edits
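
To ground the stack in something concrete, here is a minimal sketch of the kind of water‑energy model and thin Flask API described above. Every name, default, and route here is an illustrative assumption rather than Aquacost's actual code; the real model also folds in weather‑driven heat loss and run schedules. The core relationship is standard physics: thermal energy = mass × specific heat × temperature rise, and a heat pump delivers that heat for roughly 1/COP as much electricity.

```python
# Hypothetical sketch: core of a pool-heating cost model plus the thin
# Flask route that exposes it. Names, constants, and defaults are
# illustrative, not Aquacost's actual code.
from flask import Flask, jsonify, request

SPECIFIC_HEAT_WATER = 4186.0   # J per kg per degree C
JOULES_PER_KWH = 3.6e6         # joules in one kilowatt-hour

def heating_energy_kwh(pool_volume_liters: float, delta_t_c: float) -> float:
    """Thermal energy (kWh) needed to raise the pool by delta_t_c."""
    mass_kg = pool_volume_liters  # 1 L of water is ~1 kg
    return mass_kg * SPECIFIC_HEAT_WATER * delta_t_c / JOULES_PER_KWH

def heating_cost(pool_volume_liters: float, delta_t_c: float,
                 cop: float = 4.0, price_per_kwh: float = 0.15) -> float:
    """Electricity cost: thermal kWh divided by the heat pump's COP."""
    electrical_kwh = heating_energy_kwh(pool_volume_liters, delta_t_c) / cop
    return electrical_kwh * price_per_kwh

app = Flask(__name__)

@app.get("/api/cost")
def cost():
    # e.g. /api/cost?liters=50000&delta_t=3
    liters = float(request.args.get("liters", 50_000))
    delta_t = float(request.args.get("delta_t", 3.0))
    return jsonify({"usd": round(heating_cost(liters, delta_t), 2)})
```

With no database, a backend like this fits comfortably on Render.com's free tier, and the static React front end on Cloudflare Pages just fetches from the API.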

3. What Worked

  • Lightning‑fast scaffolding. Windsurf handled the blank‑page terror.
  • Physics pair‑programming with o3. The LLM caught unit mistakes I would have missed (see the test sketch just after this list).
  • Edge hosting. Static front end plus a tiny API for zero cost.
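
As a flavor of that pair‑programming, here is the sort of sanity test we would land on: a pytest sketch with hypothetical numbers, with the energy helper inlined from the earlier sketch so it runs standalone. It pins the joules‑to‑kilowatt‑hours conversion, exactly the species of unit mistake o3 kept catching.

```python
import math

SPECIFIC_HEAT_WATER = 4186.0   # J per kg per degree C
JOULES_PER_KWH = 3.6e6

def heating_energy_kwh(pool_volume_liters: float, delta_t_c: float) -> float:
    # 1 L of water is ~1 kg
    return pool_volume_liters * SPECIFIC_HEAT_WATER * delta_t_c / JOULES_PER_KWH

def test_one_tonne_one_degree():
    # Raising 1,000 kg of water by 1 degree C takes 4.186 MJ of heat,
    # i.e. 4.186e6 / 3.6e6 ~ 1.163 kWh. Getting 4,186 here would mean
    # the J-to-kWh conversion was silently dropped.
    assert math.isclose(heating_energy_kwh(1_000, 1.0), 1.163, rel_tol=1e-3)
```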

4. Where It Broke

  1. Complexity cliff. Once the repo topped a few hundred lines, Windsurf’s diffs touched everything—failing lint, renaming constants, undoing fixes.
  2. Tiny UI changes produced huge diffs. A request to center a button triggered 30‑file rewrites.
  3. Context drift. Windsurf’s “o3 under the hood” produced worse answers than direct o3 chats with explicit context.
  4. Opaque reasoning. When it went sideways, there were no breadcrumbs to debug.

Takeaway: a great intern, but a lousy senior engineer. Use agents for green‑field scaffolds; switch to surgical prompts for refactors.


5. Lessons Learned

I will say that my critiques and commentary apply mostly to Windsurf. I can't speak to how other agentic coding apps perform at this point, other than to say Claude Code is f*ing amazing and blows this experience out of the water.

# | Insight | Why It Matters
1 | Curate your prompt. Direct, context‑rich chats with o3 beat any black‑box agent (at least Windsurf's). | Deterministic and debuggable.
2 | Agents are amazing at sketches, reckless at global refactors. | Guardrails keep projects sane.
3 | Architect your project up front into modules that fit in a reasonable context window (see the layout sketch below). | Keeps your code tractable and decoupled.
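
To make lesson 3 concrete, here is a hypothetical decomposition (not Aquacost's actual repo layout) in which each module is small enough to paste whole into one prompt:

```
aquacost/
├── simulation.py   # pure water-energy math, no I/O
├── weather.py      # fetch and normalize weather data
├── pricing.py      # electricity tariff lookups
├── api.py          # thin Flask routes over the modules above
└── tests/          # one test file per module
```

The payoff: when a refactor is needed, you can hand the model one file plus its tests instead of the entire repo.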

6. Final Thoughts

Aquacost.com proved that a back‑end‑leaning, UI‑shy coder can ship a polished public app in a weekend—with strategic help from modern AI tools. Windsurf got me airborne; ChatGPT o3 ensured the physics held up; Claude Code helped me land smoothly.

If you have been sitting on an idea, block off a weekend, grab your favorite LLM pair‑programmer, and just ship. Worst case, you learn a ton. Best case, you create something people actually use—while leveling up your own stack in the process.

Happy hacking!