Nolan Lawson

Author Archive

25 May

Using AI to write better code more slowly

Posted by Nolan Lawson in software engineering. Tagged: AI. 18 comments

A lot of people seem convinced that the point of AI coding is to write low-quality code as fast as possible. Spew out barely-passable slop, open massive PRs, and merge them unvetted. Ship it!

But the thing is, LLMs are very flexible. And you can use them just as effectively to write high-quality code more slowly.

This statement seems completely obvious to me at this point, and I almost didn’t want to write this post for that reason. But there seem to be enough people convinced that LLMs are only good as slop cannons that it’s worth making the opposite case.

If Mythos taught us anything, it’s that LLM agents are really good at finding bugs. Throw them at a codebase enough times, and they will find so many bugs that you’ll barely know what to do with them.

Like many others, I’ve also found this is true of non-Mythos models – some may be better than others at finding subtle bugs or avoiding false positives, but the fact is that the latest public models from Anthropic and OpenAI are good enough to find plenty of bugs in an unscrutinized codebase.

The problem is not so much finding the bugs as prioritizing and validating them. For this reason I have a Claude skill adapted from this article’s core insight, which is that the more different models you throw at a PR review, the less likely you are to get hallucinations or bogus bugs.

The skill says (paraphrasing):

Run a Claude sub-agent, Codex, and Cursor Bugbot to find bugs in this PR ranked by critical/high/medium/low. Once they’re all done, review their findings, do your own research to rule out false positives, and write a final report.

That’s basically it. You can add your own definition of “bug” if you want – mine has stipulations about the KISS and DRY principles, writing accessible HTML/JSX, using proper indexes for SQL queries, etc.
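The skill itself is just a natural-language prompt, but the "review their findings" step can be pictured as a small merge over the reviewers' reports. Here is a minimal sketch of that idea, not the actual skill: the `Finding` shape, the reviewer labels, and the choice to group findings by file and line are all assumptions made for illustration. Ruling out false positives still takes the agent's (and your) own research.

```python
from collections import defaultdict
from dataclasses import dataclass

# Lower number = more severe. The four levels come from the skill's prompt.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    reviewer: str  # hypothetical label, e.g. "claude", "codex", "bugbot"
    file: str
    line: int
    severity: str  # one of SEVERITY_ORDER's keys
    summary: str


def merge_findings(findings):
    """Group findings that point at the same file and line, then rank them."""
    groups = defaultdict(list)
    for finding in findings:
        groups[(finding.file, finding.line)].append(finding)

    merged = []
    for (file, line), group in groups.items():
        # Keep the most severe rating any reviewer gave this location.
        worst = min(group, key=lambda f: SEVERITY_ORDER[f.severity])
        merged.append({
            "file": file,
            "line": line,
            "severity": worst.severity,
            "summary": worst.summary,
            "reviewers": sorted({f.reviewer for f in group}),
        })

    # Most severe first; within a severity, more reviewer agreement first.
    merged.sort(key=lambda m: (
        SEVERITY_ORDER[m["severity"]],
        -len(m["reviewers"]),
        m["file"],
        m["line"],
    ))
    return merged
```

Findings flagged by several reviewers float to the top of their severity band, which is a rough stand-in for "less likely to be a hallucination." Anything flagged by only one reviewer is where the manual verification effort would go first.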

In my experience, this skill always finds tons of bugs in a PR, and the false positive rate is near zero. It finds so many bugs that you’ll be bored senseless if you try to tackle them all. They’ll range from critical security or correctness bugs to the more mundane medium-level perf bugs to low-level “this comment is misleading”-type bugs.

My typical workflow is:

Have an agent fix all the criticals and highs (with my guidance on the proper solution), then repeat until no criticals/highs
Skip highs/mediums where the juice isn’t worth the squeeze (e.g. 100 lines of code to fix a narrow edge case)
Abandon the PR if it has so many criticals that I realize the whole approach is misguided
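The workflow above amounts to a triage rule, which can be sketched in a few lines. This is only an illustration under loud assumptions: the `estimated_lines` field, the 100-line cutoff (borrowed from the example above), the abandon threshold of five criticals, and the decision to skip lows by default are all hypothetical. In practice these are judgment calls, not fixed numbers.

```python
def triage(findings, abandon_threshold=5, max_fix_lines=100):
    """Split findings into fix/skip lists, or signal that the PR should go.

    Each finding is a dict with "severity" and "estimated_lines" keys
    (the latter is a hypothetical guess at the size of the fix).
    """
    criticals = [f for f in findings if f["severity"] == "critical"]
    if len(criticals) >= abandon_threshold:
        # Too many criticals suggests the whole approach is misguided.
        return {"abandon": True, "fix": [], "skip": []}

    fix, skip = [], []
    for finding in findings:
        severity = finding["severity"]
        if severity == "critical":
            fix.append(finding)
        elif severity in ("high", "medium"):
            # Skip when the fix is too large for the edge case it covers.
            if finding["estimated_lines"] < max_fix_lines:
                fix.append(finding)
            else:
                skip.append(finding)
        else:
            # Assumption: lows are skipped unless you choose otherwise.
            skip.append(finding)

    return {"abandon": False, "fix": fix, "skip": skip}
```

After the agent applies the `fix` list, you would re-run the review and triage again, repeating until no criticals or highs remain.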

When I use this technique, I haven’t necessarily seen my velocity go up. If anything, the review process often finds pre-existing bugs, so I end up on a tangential side-quest where I’m writing unit tests and fixing subtle flaws that pre-date the PR. This is the opposite of the “10x productivity” slop-cannon style of development that most people imagine when they think of vibe coding, but I find it very satisfying.

It’s a great way to improve the overall health of the codebase while also teaching you about the odd corners of it. In my experience, the happy-path of a complex architecture is less interesting than its failure modes. And pre-LLMs, this is usually how I got familiar with a codebase anyway: understanding where the assumptions break down, and then getting my hands dirty to fix it.

If you’re the kind of person who is skeptical that AI coding is good for anything, then I doubt this post will persuade you. But if you’re the kind of developer who uses agents to write multi-hundred-line PRs that you barely understand yourself, I’d invite you to slow down a bit and try this other, slower style of “vibe coding.” Ask an agent how your PR works and how it might fail. Have it write Markdown docs with Mermaid charts if necessary. Use Matt Pocock’s /grill-me skill until you understand the entire PR front-to-back.

You might not be more “productive” in terms of raw lines of code. You might burn a ton of tokens just to find out that your entire plan was wrongheaded from the start. But I find this style of coding to be a more super-powered version of the kind of programming I was already trying to do before LLMs: careful, methodical, quality-obsessed, focused on making things better for the next coder.

So take a deep breath, slow down, try this technique, and see if you don’t enjoy writing better code more slowly.

22 Mar

The diminished art of coding

Posted by Nolan Lawson in software engineering. Tagged: AI, art. 5 comments

Programming is an art. It’s less like fine art or music and closer to architecture or carpentry – combining form and function – but it is an art.

If you don’t believe me, consider code reviews. I’ve definitely done code reviews where I admired the mastery on display, where the elegance of the solution shone out like a brilliant gem, where I felt like Salieri overcome by the symphony in his head as he reads Mozart. Conversely, I’ve mentored juniors where I read their code and immediately saw opportunities to help them mature – this bit is repetitive, this bit could be expressed more succinctly, this bit hits a performance de-optimization, etc.

When I do PR reviews of AI-authored code, I feel none of these things. I can sometimes tell whether it’s authored by Claude or Codex (Claude likes lots of Official-Sounding Comments, Codex is more to-the-point), but my mind tends to wander to the intent behind the PR, to the prompt or the plan. Nitpicking details of the code, like the type of for-loop or the names of functions, feels entirely superfluous. Sometimes the best advice is to just choose a new plan and re-prompt.

For most of my career, I’ve held two contradictory views of coding in my head simultaneously:

Coding is an art form, but you shouldn’t get too sentimental about your code – most code eventually becomes technical debt that should be expunged
Code can express the creativity of the author, but the best code is idiomatic, reducing the WTFs per minute
In short: code is art, but it’s also a means to an end

With the advent of LLM coding agents, I think this contradiction has been firmly resolved in favor of function over form. Or as Les Orchard might put it, the “make-it-go people” have triumphed over the “craft-lovers.”

The craft is still there, of course, but it’s different. When I code with agents, I’m thinking at a much higher level of abstraction: architecture, resilience, systems, monitoring, testing. I used to sweat the small details – Claude starts comments with a capital letter, I rarely do; Claude names variables one way, I prefer another – but I quickly learned to stop caring. It’s simply a waste of time to nitpick, especially since the agent will likely undo your nitpicks on the next refactor.

In some ways I feel like a carpenter whose job is now to write the blueprints for the IKEA factory. Of course there is still artistry in designing the blueprints, but you don’t care if the factory spits out one or two tables with splinters in the legs. The point is to produce enough furniture fast enough that the little imperfections don’t matter. Taste and judgment still count, but they’re at the level of the overseer on the assembly line, not the master carpenter working a chisel.

Finding art elsewhere

For me, coding always occupied an odd place on the artistic spectrum. Some code is firmly art – Jenn Schiffer, for example, has been an artist-in-residence and has had numerous projects devoted to the intersection of art and programming. Whereas other code is purely functional: I’m sure many programmers have spent their entire careers pumping out glue code for enterprise CRMs without ever wondering if what they were making was “art.”

One worry I have for my entire generation of programmers is that many of us have been getting our artistic “fix” from coding: taking craftsmanship seriously, reviewing others’ code with the eye of a literary critic, trying to elevate the profession. Now the profession has been turned into an assembly line, and many of us are eagerly jumping into our new jobs as blueprint-designers without questioning what this will do to our souls. I believe art is necessary for a rich and full human life, so this isn’t an idle concern.

My advice to other coders, or at least the advice I’m taking myself, is that if you’re looking for art in coding: stop looking. If you’ve never taken an interest in poetry, or painting, or dance, or whatever, now would be a good time. In an era where the internet is increasingly full of bots pumping their bland bot ideas into everybody’s brains, seeking out distinctly human forms of expression has become vital.

It might sound corny (or like I’m going through a mid-life crisis), but I’ve done the following lately:

started painting (Bob Ross is of course an easy stepping stone)
gone to several ballets / contemporary dance performances
started reading more fiction (The Sun Magazine is a longtime favorite of mine, but I’m even reading the poetry now)
picked up my guitar again

You might have also noticed that this blog has gotten a lot more sentimental and experimental lately. I’ve never used LLMs to help with my writing (not even to spellcheck!), but lately I’ve tried to fight my own tendencies to write bland, predictable prose. In a world of machines that “predict the next token,” what’s the best reaction? Be less predictable. At least that’s what I hope I’m doing.

I don’t think coding is dead as an art form, and I do think that the “new” craftsmanship will have its own masters, its own styles, its own expressiveness. Heck, maybe I’ll be surprised and there will be an artist-in-residence somewhere wielding agent orchestrators like a paintbrush! But I kind of doubt it. If you’re not knitting, then you’re making clothes on an assembly line, and if the clothes are disposable, then it’s just fast-fashion. There’s artistry there, perhaps, but the end product is much less interesting artistically because there’s less of the human touch to it.

In my view, we’re firmly in the fast-fashion era of coding: software is vibe-coded, used up, thrown away, vibe-coded again. This is not a fully bad thing, and I’m sure many non-coders especially are giddy at the superpowers they’ve acquired. But as coders, we shouldn’t lose sight of what we’ve lost, and we should seek to make up for it with new sources of artistic sustenance.

18 Feb

You had a story

Posted by Nolan Lawson in software engineering. Tagged: AI, fiction. 5 comments

You had a story you used to tell yourself about how you got here in life.

You’d share the story with others. Maybe you’d be at a party, and someone would ask what you do, and you’d say, “I’m a programmer.” And their eyes would perk up and their mind would fill with images of ball pits and propeller beanies and that funny movie with Jesse Eisenberg, and they’d say, “Oh yeah, like you build apps?”

And you’d proudly brush it off and say something like, “Yeah I work on the backend, you don’t know what that is, but it’s basically the magical thing in the cloud that makes your apps work.” Or you’d say, “Yeah I work on the frontend, it’s the thing you touch all day long on your phone, I make it look good and run fast.”

And if you thought about what it took to get there, you’d think of the lines of code, fuzzy green text in a black terminal, the mystical incantations that lit up the glowing rectangles that everyone else was staring at all day. They didn’t know the effort, the raw-adrenaline flow of code bursting from your fingertips as you wove dreams into pixels. Or the writer’s block of the showstopper bug that nagged you on your jog and in the shower until suddenly the answer came to you when you awoke, as if from a dream, and you rushed to the keyboard to pound out the solution, the pure exhilaration of making the computer obey you.

Then someday somebody took that story from you. They told a different story: one where someone on a stage, in a business suit that you’d never wear, with a smug grin that you’d never wear, talked words into a computer and, the little traitor, it obeyed him. He talked words like you or I would, like any simpleton would, and it obeyed him. And the crowd cheered because they knew that now the glowing rectangles belonged to them as well, and not just you, with your wizardly spells that took years of study to master.

When you see your brother at Thanksgiving, he’s excitedly showing your parents an app he built to track football scores. Except he didn’t really build it, you think to yourself bitterly. He’s not a programmer, he’s a mechanic, and maybe he knows his way around cars, but he never grinded LeetCode to pass an interview, or stayed up late studying Data Structures and Algorithms to pass an exam. “Let your brother have this one,” your dad chides you after an outburst at the dinner table. “It’s just an app.” Just an app, you scoff with amazement.

When you go home, you pour yourself a stiff drink and wonder what kind of story you can tell yourself now. You were Superman, and now every schmuck puts on a cape and thinks they can fly. The fools! They’ll fall. They’ll fall and crash, you reassure yourself as you take a swig.

Your friends agree. “This stuff will never work,” they say, as if with bored detachment. “Remember low-code? Remember no-code? What a joke.” But you notice something: a fear in their eyes that you’ve never seen before. You don’t feel reassured.

Eventually you find that your own colleagues are warming to the stuff. “It’s actually pretty useful,” they say. “Give it a shot.” You’re astounded by the pure treason. Don’t they realize this is a rejection of everything they’ve done their entire careers, an insult to their very dignity as programmers? They shrug. “Sure, but times change. I want to have a job in five years.”

Five years. Your retirement is looming. And you were looking forward to leaving at the top of your game, maybe to tinker on some side projects after money is no longer an issue. But now you don’t know about the money, or the side projects, or whether you’ll be at the top of your game anymore or just a washed-up has-been. The panic is really starting to set in now, and you’re looking for an out.

You approach the tool out of resentment – cautiously, like a cursed artifact. You hold the dead thing at arm’s length, as if its very aura might poison you. You try it out, and it chirps happy success but everything it spits out is failure. You close the laptop lid. Vindication! The thing truly is dead, a fraud, a sham. You can return satisfied to your beloved craft.

Except your craft doesn’t feel right anymore. More and more, you read about astounding feats created with the accursed tool. Colleagues you trust and admire are now reconfirming, in more strident tones, that it actually works. The treason is all around you, choking your joy, ruining what once gave you so much meaning. You can barely stand to look at the blinking cursor in your text editor anymore, which has also betrayed you with its daily offers to steal your voice, retire your fingers, extinguish your spark.

You start to wonder if this industry is even right for you anymore. How can it be right when everything around you feels so wrong? And always, there is the fear: fear that you are losing ground, fear that you won’t make it past the next layoffs, fear that you’ll be joining your brother in the auto shop and he’ll be showing you the ropes, you with your fumbling fingers that can barely hold a wrench, and oh by the way did you see the app he built to replace his content management system?

Where the story goes next is for you to decide. Maybe you can skate by for a few more years in a sinecure, holding onto a payslip for dear life, while the world moves whooshing around you. Maybe you’ll give in and learn the new tools, but with a kind of detached ambivalence, going through the motions but no longer feeling joy or meaning or like you have something worth talking about at a dinner party.

Or maybe you’ll look around and study those who have adapted to the new world and are thriving. Maybe you’ll notice the coworker who brushed off all the doom and gloom and just says, “Hey, look at this cool thing I built.” And maybe you’ll notice that this coworker has their own kind of mastery, their own tools and workflows with odd names that turn you off at first but ultimately pique your curiosity. Maybe you’ll wonder if it makes more sense to hang out with the people building and sharing and having fun instead of those who mope and whinge and cry for a lost golden age. Maybe instead of being ruled by fear, you’ll find your creative spark again.

Because isn’t that the point in the first place? Isn’t that why you got into programming? Wasn’t it to make something, to put it out in the world and bring joy to others with your creation? Wasn’t it to make a song out of sand, a painting out of pure thought, a miracle out of nothing, regardless of how you did it?

If it is, then you might find that the story you need to tell about yourself, about who you are and where you came from and why you create, was right there all along. The story has been told hundreds of times throughout history – only the characters and the scenery change – and you still have it in you. You found it once before, and if you search for it with curiosity and an open heart, you’ll find it again. It never really left you.

15 Feb

Days of miracle and wonder

Posted by Nolan Lawson in software engineering. Tagged: AI. 6 comments

Oprah Winfrey and I have something in common, which is that our favorite album is Paul Simon’s Graceland.

I’ve been thinking a lot recently about the opening track, “The Boy in the Bubble”. The song can be read a few different ways, but I read it as an aging man amazed by modernity but also kind of frightened by it, and comforting his loved one with:

These are the days of miracle and wonder, and don’t cry baby, don’t cry, don’t cry

If these are “the days of miracle and wonder,” then why would anyone want to cry about it? Well, a few different reasons:

These are the days of lasers in the jungle, lasers in the jungle somewhere

Staccato signals of constant information

A loose affiliation of millionaires and billionaires

Sound familiar? It was written in 1986, but it could have been written today.

One thing I’ve noticed during our recent technological turbulence is that some people seem to have lost the capacity for wonder, or are willfully ignorant of the wonders around them. And if you can’t acknowledge that something extraordinary has happened, then you can’t start grieving for what has been lost (the subject of my last post). To me, this is the opposite of what Paul Simon is advocating: awe for the future combined with reverence for the past.

For example, I have a lot of conversations on Mastodon that start with me acknowledging some flabbergasting feat that coding agents have accomplished lately, like one-shotting a browser API that passes 77% of the relevant Web Platform Tests, or building a rudimentary browser that can render basic web pages, or building a C compiler that can compile the Linux kernel, etc. Then the interlocutor says something like, “Sure, but are the Web Platform Tests really representative of a working browser?” (Short answer: yes, it’s the entire basis of cross-browser projects like Interop.) Or: “Well sure, but is the code maintainable and bug-free?”

I find these conversations kind of baffling. It’s as if you’ve been shown a talking dog that can also sing the blues and play steel-string guitar, and your first response is, “Yeah, but the second verse was a bit off-key.” I understand skepticism – being skeptical is good, and there is a ton of hype and hogwash out there in the “AI era,” but like… can we just take a moment to be amazed? None of this was imaginable even three years ago, and now it’s practically worthy of the snooze button.

In fact, some of my more AI-adept colleagues are actually not much impressed with these stories, precisely because they know that even more amazing stories are likely around the corner. The lasers in the jungle have become so commonplace that we hardly notice them anymore.

Personally, I’m trying to maintain my skepticism as well as my sense of wonder. There’s so much breathless hype out there that it clouded my judgment for a while, but I’m also humbled by how fast things have moved, defying my early expectations.

I don’t consider myself a tech optimist – I seriously doubt we’ll ever travel to Mars, let alone colonize it, and I think predictions of the singularity or uploading our brains into the cloud are fun science fiction but hardly a bet I would take on the optimists’ side. But I have to admit that I was wrong on AI coding, so I’m prepared for my expectations to be defied again.

In many ways, I feel like the last year has been a victory for the techno-optimists – LinkedIn bros, Elon Musk stans, former NFT-peddlers – over artists, tech critics, and left-leaning intellectuals, which has been a bitter pill for me to swallow, since I identify with the second group much more than the first. This is what I was trying to get at with “AI tribalism”, although in retrospect I was a bit clumsy about it.

So if you’re feeling like me, and a bit bitter that the tech bros are taking a victory lap right now, and maybe hoping that they realize their shoelaces are untied and fall flat on their faces, I’d suggest taking a different tack. Disregard the hype, ignore the breathless prognostications of eternal abundance, and just look around and ask yourself if you would have been impressed by any of this three years ago. If so, take a moment to be amazed. It doesn’t make you a stooge or a credulous mark; it just makes you human.

And if you have to grieve, grieve. Technology is changing in scary and unpredictable ways, and not all the changes are positive. (Far from it – I wonder if someday we’ll look back on the invention of LLMs like the invention of the atom bomb.) But eventually we should move on from our grief, because the world is not ending; it’s just turning, as it always has.

In other words:

These are the days of miracle and wonder, and don’t cry baby, don’t cry, don’t cry

7 Feb

We mourn our craft

Posted by Nolan Lawson in software engineering. Tagged: AI. 65 comments

I didn’t ask for this and neither did you.

I didn’t ask for a robot to consume every blog post and piece of code I ever wrote and parrot it back so that some hack could make money off of it.

I didn’t ask for the role of a programmer to be reduced to that of a glorified TSA agent, reviewing code to make sure the AI didn’t smuggle something dangerous into production.

And yet here we are. The worst fact about these tools is that they work. They can write code better than you or I can, and if you don’t believe me, wait six months.

You could abstain out of moral principle. And that’s fine, especially if you’re at the tail end of your career. And if you’re at the beginning of your career, you don’t need me to explain any of this to you, because you already use Warp and Cursor and Claude, with ChatGPT as your therapist and pair programmer and maybe even your lover. This post is for the 40-somethings in my audience who don’t realize this fact yet.

So as a senior, you could abstain. But then your junior colleagues will eventually code circles around you, because they’re wearing bazooka-powered jetpacks and you’re still riding around on a fixie bike. Eventually your boss will start asking why you’re getting paid twice your zoomer colleagues’ salary to produce a tenth of the code.

Ultimately if you have a mortgage and a car payment and a family you love, you’re going to make your decision. It’s maybe not the decision that your younger, more idealistic self would want you to make, but it does keep your car and your house and your family safe inside it.

Someday years from now we will look back on the era when we were the last generation to code by hand. We’ll laugh and explain to our grandkids how silly it was that we typed out JavaScript syntax with our fingers. But secretly we’ll miss it.

We’ll miss the feeling of holding code in our hands and molding it like clay in the caress of a master sculptor. We’ll miss the sleepless wrangling of some odd bug that eventually relents to the debugger at 2 AM. We’ll miss creating something we feel proud of, something true and right and good. We’ll miss the satisfaction of the artist’s signature at the bottom of the oil painting, the GitHub repo saying “I made this.”

I don’t celebrate the new world, but I also don’t resist it. The sun rises, the sun sets, I orbit helplessly around it, and my protests can’t stop it. It doesn’t care; it continues its arc across the sky regardless, moving but unmoved.

If you would like to grieve, I invite you to grieve with me. We are the last of our kind, and those who follow us won’t understand our sorrow. Our craft, as we have practiced it, will end up like some blacksmith’s tool in an archeological dig, a curio for future generations. It cannot be helped, it is the nature of all things to pass to dust, and yet still we can mourn. Now is the time to mourn the passing of our craft.

1 Feb

15 years of blogging

Posted by Nolan Lawson in Life. Tagged: blogging. 8 comments

My first blog post was published just under 15 years ago in March of 2011. Since then, I’ve published 151 posts, including this one. (If I was a numerologist, I’d think it had something to do with Pokémon.)

This blog has covered a wide variety of topics, including Pokémon in fact (I wrote the first Pokédex app for Android). The topics largely followed the trajectory of my career: starting with machine learning, veering into Android, taking a detour into Solr/Lucene, and eventually settling on JavaScript and web development. Later I wrote about web performance, accessibility, web components – basically whatever topic crossed my desk. You’re reading the unfiltered output of my brain here, more or less.

I don’t publish a lot – less than one post per month apparently. My main guiding principle is that I don’t write unless some topic is itching to get off my chest, or I think I have something novel to say. There are no ads on this humble WordPress blog, and I don’t have anything to sell you, so there’s nothing motivating me except my own desire to explore an idea, make a point, or just stretch the old writing muscle.

As you can tell, I’m too lazy to even update the vintage WordPress theme – odd for someone who pretends to be a professional web developer! I’ve always felt like that was a distraction, though. I know some bloggers who spend more time tweaking their CSS than writing content, and that’s fine, but it was just never my goal for this blog. If I want to scratch that itch, I write an oddball project like Pokedex.org or Pinafore instead.

The biggest challenge of the early days of this blog was just getting anybody to read it. The biggest challenge later on was dealing with the overwhelming anxiety of realizing “Oh shit, people are actually reading this”, followed by the inevitable fears of:

blowback (I got too controversial)
audience capture (I got soft)
being ignored (I got boring)

I feel like I’ve careened between all of these extremes over the past 15 years. Overall my writing was a lot more freewheeling in the past, and I’ve tried to recapture some of that lately, but having an audience just naturally gnaws at your mind in a way that (I find) I can’t totally ignore.

Quitting Twitter (and wasn’t that a weird story arc on my blog!) helped a lot, although there’s still of course Mastodon and Lobsters and Hacker News and all the rest where the comments can be a vicious cesspool if you spend too much time there. (If you’re reading this from RSS: you’re my favorite readers, and they can take my RSS reader from my cold dead hands!)

Not that this blog has a ton of readers. There’s actually a list of the most popular blogs on Hacker News, and mine hit #631 last year. This puts me somewhere between “I think I’ve seen his face on the internet once” and “never heard of him.” Although based on my WordPress stats, my best days are somewhat behind me, with my all-time most popular posts being:

2015: Safari is the new IE (I will never be able to escape this one; it’ll be in my damn obituary)
2016: The cost of small modules (I’m very proud of this one – I got the bundlers to optimize their implementations!)
2015: IndexedDB, WebSQL, LocalStorage – what blocks the DOM? (Horribly out of date by now, but somehow perfect SEO-bait so people seem to land on it)
2020: Linux on the desktop as a web developer (Really?)
2022: The collapse of complex software (My favorite thing I’ve ever written)

Although rounding out the top 10, we do get some more recent hits:

2023: Let’s learn how modern JavaScript frameworks work by building one (I loved this one! It was so much fun to do a bunch of research and write a nice breezy tutorial. Makes me wonder if I should have tried my hand at educational writing.)
2020: Fixing memory leaks in web applications (Oh man, we’ll never be free of this scourge, will we? Better buy more RAM.)
2022: The balance has shifted away from SPAs (One of my favorite series on this blog, and I think I totally called it. Cloudflare buying Astro really proves that it’s time for a new model.)
2012: Better synonym handling in Solr (One of my earliest stabs at open source, and surprisingly I’ve gotten a lot of emails about this one.)
2017: What it feels like to be an open-source maintainer (Maybe one of my more timeless posts, getting quoted in Nadia Eghbal’s Working in Public, getting me a somewhat emotional podcast interview, and probably making a lot of people remember me as “that open source burnout guy.”)

Interestingly to me, though, the work that I’m most proud of didn’t get a lot of traction. It wasn’t a polemic, or a thinkpiece, or a perhaps-too-glib takedown of a major browser vendor (sigh) but instead my work on performance optimizations, benchmarking, etc. A lot of my blog posts are basically: “People say this thing is fast. Is it though? Let’s run some numbers.” For example:

2022: Style performance and concurrent rendering (“Is CSS-in-JS actually fast?” Answer: “No.”)
2022: Style scoping versus shadow DOM: which is fastest? (“Is shadow DOM fast?” Answer: “Kinda!”)
2024: Improving rendering performance with CSS content-visibility (“Is CSS content visibility fast?” Answer: “Yes, but not as much as I wish!”)

This is the kind of stuff that (I like to think) really moves the needle in the web development space, because words are cheap but numbers talk. I’ve heard from some folks that a post of mine immediately short-circuited some internal discussion about whether they should choose Strategy A or B for their web app. I love having that kind of impact!

This experience has taught me that the page stats aren’t everything. Sometimes it’s not about how many people read your post, but whether the right people read it. One example is my recent post on the js-framework-benchmark. This was a sleeper hit: barely touching any of the socials, never high on Hacker News or Lobsters, and yet I know from personal anecdotes that performance experts read it and appreciated it. Not every post needs to be a thrilling dimestore paperback: some can be a ponderous Tolstoy or Joyce.

Conclusion

When I first started this blog, I was early in my career and didn’t really know what I was doing. I made a blog because I guess I thought it was the cool thing to do, and that maybe it would help me land a job someday.

I named it “Read the Tea Leaves” because I thought you were supposed to have a punchy title like “Daring Fireball” or “Coding Horror”. If I were starting it today, I’d probably just call it “Nolan Lawson’s blog” and be done with it (although I do love tea).

Keeping this blog has been a great source of passion for me, and it has indeed opened doors in my career. I landed my most recent job in no small part thanks to this blog (good foresight, 2011 me!), and it’s also just really fun to get recognized at a conference or have a coworker mention that they enjoyed one of my posts.

I’ve found though that the greatest value is just the act of writing itself. That’s one reason I don’t use AI for any of these posts (heck, I don’t even use Grammarly – all the spelling mistakes are mine!). The act of writing is also the act of thinking, and my thoughts are (usually) sharpened by typing words into a blank page.

And if nothing else, I can put an idea out there, let it get tossed in the wind, and see if anybody picks it up to do something useful with it (even if that something is just to denounce or refute it). For that reason, I don’t even regret my most controversial posts, and not even the ones I disagree with today, because I think there’s still value in being wrong in public and at least trying to stimulate people’s minds in the right direction.

Would I recommend that young coders take up blogging? Absolutely. Start up a blog anywhere – WordPress, Squarespace, a file server of HTML files, whatever. Write all the time, even if you don’t always hit “publish.” (I certainly don’t – 63 unpublished drafts!)

Get yourself in the habit of being brave in public, and try to ignore that voice that says “This isn’t good enough” or “You’re not smart enough” or “People will hate you for this.” I’ve sacrificed a lot of sacred cows over the years, and maybe even burned some bridges (man, my recent AI posts may have done that). But if I hadn’t tried to speak my mind, put my thoughts out there, and find the courage to be vulnerable in public, then I would have just felt limp and cowardly and boring. That’s what I regret most: all the blog posts I didn’t write.

Still, I do have a high bar for this blog (despite what some haters may believe!), so I don’t intend to become one of those “post-a-day” kind of people. It’s just not my style. But I do hope to be the kind of person who has more ideas worth expressing and worth putting into cyberspace. Here’s to 15 more years of blogging.

31 Jan

Building a browser API in one shot

Posted by Nolan Lawson in Web. Tagged: AI. 5 comments

When I learned that two simple browser engines had been vibe-coded, I was not particularly surprised. A browser engine is a well-understood problem with multiple independent implementations, whose codebases have no doubt been slurped up into LLM training data.

What did surprise me is that neither project seemed to really leverage the Web Platform Tests (WPTs), which represent countless person-hours of expertise distilled into a precise definition of how a browser should work, right down to the oddest of edge cases. (The second project does make partial use of WPTs, but it doesn’t seem to be the primary testing strategy.)

LLMs work great when you give them a clear specification (or PRD) and acceptance tests. This is exactly what the web standards community has been painstakingly building for the past few decades: the browser standards themselves (in plain English as HTML files) and the WPTs. The WPT pass rate in particular gives you a good measure of how “web-compatible” a browser is (i.e. can it actually render websites in the wild). This is why newer browsers like Ladybird and Servo heavily rely on it.

I don’t have the patience (or cash) to build an entire browser, but I thought it would be interesting to build a single browser API from scratch using a single prompt, and to try to pass a non-trivial percentage of the Web Platform Tests. I chose IndexedDB because it’s a specification that I’m very familiar with, having worked on both PouchDB and fake-indexeddb, as well as having opened small PRs and bugs on the spec itself.

IndexedDB is not a simple API: it’s a full NoSQL database with multiple key types (including array keys and arrays-as-keys), cursors, durability modes, transactions, scheduling, etc. If you build on top of SQLite, then you can get some of this stuff for free (which is probably why both Firefox’s and WebKit’s implementations use it), but you still have to handle JavaScript object types like Dates and ArrayBuffers, JavaScript-specific microtask timing, auto-transactions, and plenty of other idiosyncrasies.
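To give a flavor of what "multiple key types" means in practice, here is a minimal sketch of how keys of different types could be ordered. It follows the ordering the IndexedDB spec describes (Array > Binary > String > Date > Number), but the names `Key`, `typeRank`, `compareSequences`, and `compareKeys` are made up for illustration, binary keys are simplified to `Uint8Array`, and key validation (e.g. rejecting `NaN` or invalid dates) is left out entirely.

```typescript
// Hypothetical sketch of IndexedDB-style key ordering (no key validation).
type Key = number | Date | string | Uint8Array | Key[];

// Rank of each key type: Number < Date < String < Binary < Array.
function typeRank(key: Key): number {
  if (typeof key === "number") return 0;
  if (key instanceof Date) return 1;
  if (typeof key === "string") return 2;
  if (key instanceof Uint8Array) return 3;
  return 4; // arrays
}

// Compare element by element; if one sequence is a prefix of the other,
// the shorter one sorts first.
function compareSequences<T>(
  a: ArrayLike<T>,
  b: ArrayLike<T>,
  cmp: (x: T, y: T) => number,
): number {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    const result = cmp(a[i], b[i]);
    if (result !== 0) return result;
  }
  return Math.sign(a.length - b.length);
}

// Returns -1 if a sorts before b, 1 if after, 0 if equal.
function compareKeys(a: Key, b: Key): number {
  const rankA = typeRank(a);
  const rankB = typeRank(b);
  if (rankA !== rankB) return rankA < rankB ? -1 : 1;

  if (typeof a === "number" && typeof b === "number") return Math.sign(a - b);
  if (a instanceof Date && b instanceof Date) {
    return Math.sign(a.getTime() - b.getTime());
  }
  if (typeof a === "string" && typeof b === "string") {
    return a < b ? -1 : a > b ? 1 : 0; // compares UTF-16 code units
  }
  if (a instanceof Uint8Array && b instanceof Uint8Array) {
    return compareSequences(a, b, (x, y) => Math.sign(x - y));
  }
  if (Array.isArray(a) && Array.isArray(b)) {
    return compareSequences(a, b, compareKeys); // array keys recurse
  }
  throw new Error("unreachable: ranks matched but types did not");
}

// Numbers sort first, then the date, then the strings, then the array.
const mixed: Key[] = ["b", [1], 3, new Date(0), "a", 1];
const sorted = [...mixed].sort(compareKeys);
```

Even this toy version needs recursion for array keys, which hints at why a SQLite-backed implementation can't simply hand keys to SQLite as-is.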

The experiment

So here was the experiment:

Create a repo with submodules containing both the Web Platform Tests and IndexedDB specification.
Tell Claude (in plan mode) to create a plan to build a working implementation of IndexedDB in TypeScript and Node.js on top of SQLite, passing >90% of the tests.
Plug the plan into a Ralph loop so multiple agents can iterate sequentially on solving the problem.
Go to sleep and wake up the next morning.

If you’re not familiar with the so-called “Ralph Wiggum” technique, it’s dead simple: run Claude in a Bash loop, giving it a markdown file of instructions and a text file to track its progress. (That’s literally it.) The main insight is to avoid context rot by frequently starting a brand-new session. In other words: the LLM gets dumber the longer the conversation goes on, so have shorter conversations. I used Matt Pocock’s implementation (which is literally 24 lines of Bash) in --dangerously-skip-permissions mode, in a Podman container for safety.
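To make the shape of the loop concrete, here is a rough sketch in TypeScript. This is not Matt Pocock's script (which is Bash) and it doesn't call Claude at all: `ralphLoop`, `AgentRun`, `fakeAgent`, and the `DONE` marker are all hypothetical names I'm using for illustration, and the "agent" is a stand-in function so that the control flow is visible.

```typescript
// Hypothetical sketch of a "Ralph loop": every iteration starts a brand-new
// agent session, so no single conversation grows long enough to rot.
type AgentRun = (instructions: string, progress: string) => string;

function ralphLoop(
  instructions: string,
  runFreshSession: AgentRun,
  maxIterations: number,
): { progress: string; iterations: number } {
  let progress = ""; // stands in for the text file that tracks progress
  for (let i = 1; i <= maxIterations; i++) {
    // The session sees only the instructions and the progress notes,
    // never the previous session's conversation.
    progress = runFreshSession(instructions, progress);
    // Assumed convention: the agent writes DONE when the plan is complete.
    if (progress.includes("DONE")) return { progress, iterations: i };
  }
  return { progress, iterations: maxIterations };
}

// A fake "agent" that completes one task per session and stops after three.
const fakeAgent: AgentRun = (_instructions, progress) => {
  const completed = progress
    .split("\n")
    .filter((line) => line.startsWith("task")).length;
  const next = progress + `task ${completed + 1} complete\n`;
  return completed + 1 >= 3 ? next + "DONE\n" : next;
};

const result = ralphLoop("Implement the plan in PLAN.md", fakeAgent, 10);
```

In a real setup, `runFreshSession` would shell out to the agent CLI, and the progress notes would live on disk so that they survive a crash.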

The project completed in a few hours of work, and the agent decided to disobey my instructions and pass well over 90% of the target tests, reaching 95%. (Naughty robot!) Note that it omitted some tests because they weren’t deemed appropriate for a Node.js environment, but it still amounts to 1,208 passing tests out of the 1,272 target subset.

Here was the prompt. You’ll note I had some typos and grammatical errors (e.g. I meant instanceof, not typeof), but the agent still figured it out:

Help me plan a project. You have the entire IndexedDB spec and web-platform-tests checked out in git submodules.

Here’s the project: build a TypeScript-based project that implements IndexedDB in raw JavaScript (no dependencies) on top of SQLite (so okay, SQLite is the one dependency). You should try to pass at least 90% of the IndexedDB tests from WPT.

Stipulations:

Use TypeScript and run in native Node (you have Node v24 already installed which supports TS out-of-the-box). Use tsc for linting though
Write tests using node:test
You must run the WPT tests UNMODIFIED in Node. To achieve this you will no doubt have to use some shims since the tests were designed to run in the browser, not Node. But as much as possible, you should prefer built-ins. Node supports a lot of built-ins now like Event and EventTarget so this shouldn’t be super hard.
You should start first by setting up the basic project scaffolding and test scaffolding. To start, try to get ONE test passing, even if you have to do a basic pure-JS implementation of IndexedDB (i.e. a “hello world”) to get that to work.
You should store some of these basic stipulations and project structure in CLAUDE.md as you go for the next agent. E.g. how to run tests, how to lint, etc.
Your implementation should ultimately store data in sqlite. You should use the better-sqlite3 package for this. Again, no dependencies other than this one. (You may have as many devDependencies as you want, e.g. typescript)
We’re building a plan, and I want this plan to encompass everything that’s needed to get to roughly 90% test coverage. To do so, we should probably divide up the PRD into some subset of tests that make sense to tackle first, but we can leave it up to future agents to change the order if it makes sense
As much as possible, try to make your implementation JS-environment-agnostic. We’ll be running in Node, but if someday we want this running in a browser on top of SQLite-on-WASM then that shouldn’t be impossible. Your test harness code can have Node-specific stuff in it if necessary, but the actual library we’re building should strive to be agnostic.
In the end, your test suite should have a manifest file of which tests are passing, failing, timing out, etc. This will be a good way to judge progress on the test suite and give guidance to the next agent on what to tackle next. Ideally this manifest file will have comments so that agents know if certain tests are tricky or outright impossible (toml or yaml may be a good format).
You’re running in a sandbox with sudo so if you need to install some tool just do it.
The project is complete when you reach 90% test coverage on the IndexedDB tests in wpt. Note that this number should be based on the number of passing tests, not the passing test files.
Your test script should OUTPUT the manifest of passing/failing tests. This allows the next agent to know which tests are passing/failing WITHOUT having to actually run the tests (which takes time). You should also commit this manifest file whenever you commit to git.
For simplicity, your tests should use sub-processes/workers for isolation rather than any kind of vm technique since this can introduce JavaScript cross-realm issues (e.g. typeof Array not being right).
For the purposes of this project, “one task” should be considered to be ONE TEST (or maybe two) at a time to keep things simple. Don’t try to bite off huge entire feature of IndexedDB (e.g. cursors, indexes, etc.) and instead try to break work up into small chunks.
The main goal of this project is to be spec-compliant, but being performant is great too. Try to leverage SQLite features for maximum performance (and don’t fake it by doing things in raw JavaScript instead). If a task is just “improve performance” then that’s fine.
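As an aside on the cross-realm stipulation in the prompt above, here is a small sketch of the kind of issue I had in mind. It uses Node's `vm` module; the variable names are just for illustration.

```typescript
import * as vm from "node:vm";

// An array created inside a separate vm context gets that context's own
// Array constructor, not the one from the main realm.
const foreignArray: unknown = vm.runInNewContext("[1, 2, 3]");

// instanceof checks against the main realm's Array.prototype, so it fails.
const passesInstanceof = foreignArray instanceof Array;

// Array.isArray is realm-independent, so it still works.
const passesIsArray = Array.isArray(foreignArray);
```

Test code written for browsers tends to assume a single realm, which is why sub-processes or workers seemed like the simpler route for isolation.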

And here is the project itself.

If you can’t tell from the git history, the hardest part was just keeping the loop running. Despite the relentlessness of the Bash loop, Claude Code kept occasionally erroring out with:

Error: No messages returned
    at FKB (/$bunfs/root/claude:6151:78)
    at processTicksAndRejections (native:7:39)

This seems to be a bug. Annoying, but not a dealbreaker since I could just restart the loop when it crashed. So it didn’t finish “overnight,” but it was done by the time I finished breakfast.

Evaluating the code

Looking at the project structure, it’s pretty straightforward and the files have familiar (to me) names: IDBCursor.ts, IDBFactory.ts, etc. This isn’t surprising because it follows the spec naming conventions, as well as the patterns of projects like fake-indexeddb (which I’m sure was part of the LLM training data). The test harness has to shim some browser APIs like window.addEventListener and ImageData to get certain tests to pass, which is exactly what we did in fake-indexeddb as well.
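For a sense of what such a shim can look like, here is a minimal sketch. This is not the project's actual harness code: `fakeWindow` is a hypothetical name, it only covers the `addEventListener` / `dispatchEvent` side of `window`, and it leans on the `EventTarget` and `Event` globals that modern Node provides.

```typescript
// Hypothetical sketch: expose a bare-bones `window` global in Node,
// backed by the built-in EventTarget.
const fakeWindow = new EventTarget();

Object.defineProperty(globalThis, "window", {
  value: fakeWindow,
  configurable: true, // lets a test harness remove or replace the shim later
  writable: true,
});

// Browser-style test code can now register and fire events on `window`.
let loadEvents = 0;
fakeWindow.addEventListener("load", () => {
  loadEvents += 1;
});
fakeWindow.dispatchEvent(new Event("load")); // listeners run synchronously
```

A real harness needs a lot more than this, but most shims follow the same pattern: define just enough of the browser global for the test to run.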

According to cloc, the src directory is 4,395 lines of code. Looking through some of the bits that I knew would be challenging, like event dispatching, I wasn’t surprised to see that it took a similar strategy to fake-indexeddb, shimming the event dispatch / listener logic rather than relying on the Node.js built-ins. (This is really not straightforward!)

Interestingly though, it deviated from fake-indexeddb by implementing its own structuredClone logic using v8.serialize(). I assume the reason for this is that, unlike fake-indexeddb, it doesn’t have the luxury of keeping JavaScript objects in memory, and instead has to serialize to SQLite. So although you could argue that it’s cribbing from its training data, it’s also doing something pretty unique in this case.
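To illustrate the general idea (this is my own sketch, not code from the generated project), `v8.serialize()` turns a JavaScript value into a `Buffer` that could be written to a SQLite BLOB column, and `v8.deserialize()` turns it back. The helper names `toBlob` and `fromBlob` are hypothetical.

```typescript
import * as v8 from "node:v8";

// Hypothetical helpers: convert a value to bytes suitable for a BLOB column,
// and back again.
function toBlob(value: unknown): Buffer {
  return v8.serialize(value);
}

function fromBlob(blob: Buffer): unknown {
  return v8.deserialize(blob);
}

// Types that JSON.stringify would mangle (Date, Map) survive the round trip.
const original = {
  created: new Date(0),
  counts: new Map([["apples", 3]]),
  nested: { list: [1, 2, 3] },
};
const restored = fromBlob(toBlob(original)) as typeof original;
```

The restored value is a deep copy rather than the same object, which is the behavior IndexedDB expects when a record is read back from the store.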

As for its transaction scheduler, this doesn’t look anything like fake-indexeddb’s logic, but it does look sensibly designed and is at least readable. Then there’s also of course sqlite-backend.ts which deviates from the only comparable implementation I’m aware of (IndexedDBShim) by having a proper “backend” for the SQL logic rather than mixing SQL into the APIs as IndexedDBShim does (which is a bit hacky in my opinion).

One annoying thing about its coding style is that it doesn’t make much reference to the actual spec. If you read fake-indexeddb or the source code of a browser (especially Ladybird and Servo in my experience), there are often comments quoting the literal spec language. This is great, since the spec is often pseudocode anyway, so it helps the reader to keep track of whether the browser implementation actually matches the spec or not. Claude seemed to avoid this altogether, perhaps relying entirely on the WPTs, or perhaps just not seeing it worth a word-for-word comment.

Another thing I noticed during code review is that the agent fibbed a bit on the pass rate: out of the original test files it targeted, 9 crashed, and so they weren’t counted in the denominator (presumably because it didn’t know how many tests would have run). So the “real” pass rate is actually 92%, if we consider all crashed tests to be failures: 1208 / 1313 (I got the true denominator using wpt.fyi). Although to be fair, 95% is accurate for the test files that ran without crashing.
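The arithmetic behind those two numbers, as a quick sketch (the `passRate` helper and variable names are just for illustration; the counts are the ones quoted above):

```typescript
// Pass rate as a percentage, rounded to one decimal place.
function passRate(passed: number, total: number): number {
  return Math.round((passed / total) * 1000) / 10;
}

const passed = 1208;
const ranWithoutCrashing = 1272; // denominator the agent reported
const allTargetedTests = 1313; // denominator including crashed files (via wpt.fyi)

const reportedRate = passRate(passed, ranWithoutCrashing); // 95
const adjustedRate = passRate(passed, allTargetedTests); // 92

// Number of tests hidden inside the 9 crashed files.
const hiddenByCrashes = allTargetedTests - ranWithoutCrashing; // 41
```

So the 9 crashed files account for 41 tests, which is enough to pull the headline number down by three points.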

As a final test, I ran the code against fake-indexeddb’s own WPT test suite – just to make sure there was no funny business, and the LLM didn’t cherry-pick tests to make itself look good. The two test suites aren’t 1-to-1 – the agent had decided to skip some large but tricky tests like the IDL harness, plus there are the 9 crashed tests mentioned above. So using fake-indexeddb’s own tests gives us a more accurate way to judge this code against a comparable IndexedDB implementation.

In this more rigorous test, the implementation scores 77.4%, which compares favorably to fake-indexeddb's own 82.8% (only about 5 percentage points off). We can also compare it with browsers:

Implementation	Version	Passed	%
Chrome	144.0.7514.0	1651	99.9%
Firefox	146.0a1	1498	90.6%
Safari	231 preview	1497	90.6%
Ladybird	1.0-cde3941d9f	1426	86.3%
fake-indexeddb	6.2.5	1369	82.8%
One-shot		1279	77.4%

77.4% vs 82.8% is really not bad, given that fake-indexeddb is ~10 years old and has 15 contributors. Although I think once you get past roughly 40%, you have a largely working implementation – many of the WPTs are corner cases or IDL quirks, e.g. whether a property is enumerable/configurable or not.

The one-shot implementation actually passes 30 tests that fake-indexeddb fails, mostly in the zone of IDL harness tests. As for the 88 tests fake-indexeddb passes but the one-shot fails, they are mostly in structured cloning and blob serialization, properties on the IDBCursor object, errors for invalid keys such as detached ArrayBuffers, and other edge cases.

fake-indexeddb’s WPT tests also ran in 49.2s versus 125.5s for the one-shot implementation (2.5x slower, median of 3 iterations), so there’s definitely room for improvement on performance. Although to be fair, this is comparing an actual persisted SQLite implementation versus in-memory, and boy did I work to optimize fake-indexeddb! I suspect another issue is that it chose a basic setTimeout for task queuing, whereas we used a much more optimal strategy in fake-indexeddb.

Conclusion

I’ve been talking a lot about LLMs recently and how they’ve changed my coding workflow. A large part of my audience has ethical concerns with LLMs around energy use, copyright, the motivations of big tech companies, etc., but my goal has just been to show that these things work. It would be easy to dismiss them if the technology was merely overhyped, but (somewhat sadly for me) it actually works.

This experiment is a good example of how far the latest models like Opus 4.5 have come: given a good enough prompt with clear tests and a specification, you can go to sleep at night and wake up the next morning to a working codebase. Before LLMs, you might have been able to count on two hands the number of actual independent IndexedDB implementations (~5 browser vendors plus fake-indexeddb and IndexedDBShim). Whereas now you can make a new one on-demand.

And it wasn’t that expensive, either: this project used roughly 20% of my weekly budget on a $100 Claude monthly plan, so let’s just say it cost me 7 bucks. Of course some will say that the costs are subsidized and likely to rise (and I won’t dispute that), but still: this is what you pay today. A new IndexedDB implementation can be had for roughly the price of a side of fries at a fancy pub.

So where does this project go next? If this was five years ago, and I had a halfway-decent IndexedDB implementation in my hands, I’d open source it, publish to npm, accept PRs, etc. As is, I don’t really see the point. You can have a better version of the code yourself if you make it two-shot rather than one-shot. Or you can think of a better one-shot. Or you can build it on top of LevelDB or Rust or whatever you want. This is kind of what I was getting at in “The fate of ‘small’ open source”, although the definition of “small” seems to be growing every day.

How do I feel about this? Not great, to be honest. I poured tons of time into fake-indexeddb in the last year, using no AI at all (just my own feeble primate intelligence). I enjoyed the experience and don’t regret it, but experiments like this cheapen the efforts I’ve made over the years. It reduces the value of things. I think this is partly why so many of us have a knee-jerk reaction to reject these tools: if they work, then they’re frankly insulting.

However, I don’t think I or anyone else can wish LLMs away. Given their capabilities, it seems pretty clear that they’re going to become a core part of building software in the future. Maybe that’ll be good, maybe it’ll be bad, but their dominance seems inevitable to me now. I’m trying to not be so glum about it, though: if you follow some “AI influencers” like Matt Pocock, Simon Willison, and Steve Yegge, they seem to be having a tremendous amount of fun. As my former Edge colleague Kyle Pflug said recently:

AI-first development is making it once again joyful and eminently possible for anyone to create on the Web. It’s a feeling I’ve missed since View Source became illegible, and a silver lining that’s arriving just in time.

As a middle-aged fuddy-duddy trying to understand what all these kids are excited about, I have to agree. Even if vibe coding doesn’t feel particularly joyful to me right now, I can see why others like it a lot: it gives you a tremendous amount of creative power and dramatically lowers the barrier to entry. Simon Willison predicts that we’ll see a production-grade web browser built by a small team with AI by 2029. I wouldn’t bet against him on that.

24 Jan

AI tribalism

Posted by Nolan Lawson in software engineering. Tagged: AI. 11 comments

“Heartbreaking: The Worst Person You Know Just Made a Great Point” – ClickHole

“When the facts change, I change my mind. What do you do, sir?” – John Maynard Keynes, paraphrased

2025 was a weird year for me. If you had asked me exactly a year ago, I would have said I thought LLMs were amusing toys but inappropriate for real software development. I couldn’t fathom why people would want a hyperactive five-year-old to grab their keyboard every few seconds and barf some gobbledygook into their IDE that could barely compile.

Today, I would say that about 90% of my code is authored by Claude Code. The rest of the time, I’m mostly touching up its work or doing routine tasks that it’s slow at, like refactoring or renaming.

By now the battle lines have been drawn, and these arguments are getting pretty tiresome. Every day there’s a new thinkpiece on Hacker News about how either LLMs are the greatest thing ever or they’re going to destroy the world. I don’t write blog posts unless I think I have something new to contribute though, so here goes.

What I’ve noticed about a lot of these debates, especially if you spend a lot of time on Mastodon, Bluesky, or Lobsters, is that it’s devolved into politics. And since politics long ago devolved into tribalism, that means it’s become tribalism.

I remember when LLMs first exploded onto the scene a few years ago, and the same crypto bros who were previously hawking monkey JPEGs suddenly started singing the praises of AI. Meanwhile upper management got wind of it, and the message I got (even if they tried to use euphemisms, bless their hearts) was “you are expendable now, learn these tools so I can replace you.” In other words, the people whose opinions on programming I respected least were the ones eagerly jumping from the monkey JPEGs to these newfangled LLMs. So you can forgive me for being a touch cynical and skeptical at the start.

Around the same time, the smartest engineers I knew were maybe dabbling with LLMs, but overall unimpressed with the hallucinations, the bugs, and just the overall lousiness of these tools. I remember looking at the slow, buggy output of an IDE autocomplete and thinking, “I can type faster than this. And make fewer mistakes.”

Something changed in 2025, though. I’m not an expert on this stuff, so I have no idea if it was Opus 4.5 or reinforcement learning or just that Claude Code was so cleverly designed, but some threshold was reached. And I noticed that, more and more, it just didn’t make sense for me to type stuff out by hand (and I’m a very fast typist!) when I could just write a markdown spec, work with Claude in plan mode to refine it, and have it do the busywork.

Of course the bugs are still there. It still makes dumb mistakes. But then I open a PR, and Cursor Bugbot works its magic, and it finds bugs that I never would have thought of (even if I had written the code myself). Then I plug it back into Claude, it fixes it, and I start to wonder what the hell my job as a programmer even is anymore.

So that’s why, when I read about Steve Yegge’s Gas Town or Geoffrey Huntley’s Ralph loops (or this great overview by Anil Dash), I no longer brush it off as pure speculation or fantasy. I’ve seen what these tools can do, I’ve seen what happens when you lash together some very stupid barnyard animals and they’ve suddenly built the Pyramids, so I’m not surprised when smart engineers say that the solution to bad AI is to just add more AI. This is already working for me today (in my own little baby systems I’ve built), and I don’t have to imagine some sci-fi future to see what’s coming next.

The models don’t have to get better, the costs don’t have to come down (heck, they could even double and it’d still be worth it), and we don’t need another breakthrough. The breakthrough is already here; it just needs a bit more tinkering and it will become a giant lurching Frankenstein-meets-Akira-meets-the-Death-Star monster, cranking out working code from all 28 of its sub-agent tentacles.

I can already hear the cries of protest from other engineers who (like me) are clutching onto their hard-won knowledge. “What about security?” I’ve had agents find security vulnerabilities. “What about performance?” I’ve had agents write benchmarks, run them, and iterate on solutions. “What about accessibility?” Yeah they’re dumb at that – but if you say the magic word “accessibility,” and give them a browser to check their work, then suddenly they’re doing a better job than the median web dev (which isn’t saying much, but hey, it’s an improvement).

And honestly, even if all that doesn’t work, then you could probably just add more agents with different models to fact-check the other models. Inefficient? Certainly. Harming the planet? Maybe. But if it’s cheaper than a developer’s salary, and if it’s “good enough,” then the last half-century of software development suggests it’s bound to happen, regardless of which pearls you clutch.

I frankly didn’t want to end up in this future, and I’m hardly dancing on the grave of the old world. But I see a lot of my fellow developers burying their heads in the sand, refusing to acknowledge the truth in front of their eyes, and it breaks my heart because a lot of us are scared, confused, or uncertain, and not enough of us are talking honestly about it. Maybe it’s because the initial tribal battle lines have clouded everybody’s judgment, or maybe it’s because we inhabit different worlds where the technology is either better or worse (I still don’t think LLMs are great at UI for example), but there’s just a lot of patently unhelpful discourse out there, and I’m tired of it.

To me, the truth is this: between the hucksters selling you a ready-built solution, the doomsayers crying the end of software development, and the holdouts insisting that the entire house of cards is on the verge of collapsing – nobody knows anything. That’s the hardest truth to acknowledge, and maybe it’s why so many of us are scared or lashing out.

My advice (and I’ve already said I know nothing) would just be to experiment, tinker, and try to remain curious. It certainly feels to me like software development is unrecognizable from where it was 3 years ago, so I have no idea where it will be 3 years from now. It’s gonna be a bumpy ride for everyone, so just try to have some empathy for your fellow passengers in the other tribe.

30 Dec

2025 book review

Posted by Nolan Lawson in Books. Leave a comment

A stack of books on a shelf, most of which are mentioned in this post

My reading appetite has been weak again this year, which I blame on two things: 1) Slay the Spire being way too good of a video game, and 2) starting a new job, and thus having more of my mental energy focused on that.

But I did manage to read some stuff! So without further ado, here are the book reviews:

Quick links

Great Maria by Cecilia Holland
Antichrist: A Novel Of The Emperor Frederick II by Cecilia Holland
Jerusalem by Cecilia Holland
More Everything Forever: AI Overlords, Space Empires, and Silicon Valley’s Crusade to Control the Fate of Humanity by Adam Becker
Meditations by Marcus Aurelius
Roadside Picnic by Arkady and Boris Strugatsky
Gandhi’s Passion: The Life and Legacy of Mahatma Gandhi by Stanley Wolpert

Great Maria by Cecilia Holland

Like last year, I read quite a few Cecilia Holland books. I still think she’s an extraordinary writer of historical fiction, and her ability to conjure up so many vivid worlds across so many different eras and cultures is remarkable.

This book, though, I had a really hard time getting into. It’s simply a lot slower-paced than her other books I’ve read, which focus on male characters and are more about swashbuckling action, war, etc. Maybe I’m just a simple-headed man, but I like that kind of stuff.

The main character here is (somewhat unusually for Holland) a woman, and it’s mostly about how she asserts control over her life despite the domineering men around her (who are in fact doing a lot of warring and swashbuckling, often off-screen).

The most interesting bit for me (as in Jerusalem below) is about the clash of cultures between east and west – in this case, Normans in Sicily (which is a thing I had no idea happened) colliding with Muslims in the same region (another thing I was ignorant of – blame my American education!).

My mom and sister absolutely adore this book, and I can’t say I wouldn’t recommend it, but it’s kind of a slow burn compared to Holland’s other books.

Antichrist: A Novel Of The Emperor Frederick II by Cecilia Holland

This is another great medieval tale, also featuring a clash of civilizations between Christian and Muslim, but featuring the larger-than-life character of Frederick II, who apparently had plenty of enemies on the Christian side and a lot of sympathy for the Muslim side. A polymath who reportedly spoke six languages (including Arabic and Greek), he spends much of the book trying to dream up a Crusade mostly for his own glory, while still being labeled a heretic and “antichrist” by the Pope. He’s a big bombastic character, full of contradictions, containing multitudes.

I knew very little about Frederick II before reading this book, so I really enjoyed the way Holland brought him to life. I’ll definitely never see the character the same way again when I decide to play as the Holy Roman Empire in Civ.

Jerusalem by Cecilia Holland

Perhaps unsurprisingly, since I said I love Holland’s more action-packed books focused on clashes of civilizations, this is perhaps my all-time favorite book of hers. I just find the Crusades fascinating overall (the apocalyptic mindset of the crusaders, the strangeness and surprising tolerance of the Muslims compared to their European counterparts, the religious fervor on both sides).

I knew very little about the “Crusader kingdoms” of the Middle Ages and was surprised to learn that there used to be a French-speaking king in Jerusalem. Just one of the many surprises and vivid details you get from Cecilia Holland’s work.

More Everything Forever: AI Overlords, Space Empires, and Silicon Valley’s Crusade to Control the Fate of Humanity by Adam Becker

I read enough tech journalism that most of this book didn’t surprise me, but I still enjoyed it. I mostly appreciated the realist perspective on how silly the idea is of terraforming Mars, or that we should forgo generosity for existing humans in favor of trillions of unborn theoretical humans (“longtermism”). I get the feeling that a lot of the tech elite have watched a little too much Star Trek and would benefit from rounding out their education with a bit more physics, ecology, and philosophy.

Meditations by Marcus Aurelius

…Which leads us to the next book I read. I’ve been getting more and more interested in religion and philosophy lately (Bart Ehrman’s writings on Christianity started that), and I wanted to read one of the “founding” books of stoicism by the 2nd-century Roman emperor.

I admit I struggled to find much in this book that I could apply to my own life (at one point he says “avoid looking on your slaves with lust” – okay Marc, I’ll remember that), but it is always interesting to read primary documents and understand how the ancients actually thought. The other jarring thing is the contrast between his chill, thoughtful philosophy and the relentless warmongering of his actual emperorship. I guess like Frederick II, a lot of historical figures contained multitudes.

Roadside Picnic by Arkady and Boris Strugatsky

A few years ago I made an effort to read the greatest hits of sci-fi and dystopian fiction, and somehow I missed this one. It’s a short and really fun read, with lots of vivid characters and surprising twists. I would hate to spoil anything about it, but I’d say that if you enjoyed the Annihilation series, you’ll love this one.

Gandhi’s Passion: The Life and Legacy of Mahatma Gandhi by Stanley Wolpert

I haven’t finished this book yet, but I’ll optimistically add it to this year’s list. I happened to re-watch Richard Attenborough’s Gandhi this year, and I found myself riveted. Gandhi to me is more interesting as a religious or philosophical figure than a historical one, but I wanted to learn a bit more about his life since the film is just a summary (and has been accused of skipping important details and indulging in hagiography).

So far, the most interesting part for me is how much of Gandhi’s philosophy was informed by Christianity and Christian thinkers (the Sermon on the Mount was a huge inspiration for him) as well as his vegetarianism, which seems to have effectively been the start of his career in community organizing and advocacy (as he struggled to find meatless dishes in London). As a vegetarian myself, I find a lot of his perspective persuasive, although I doubt I could subject myself to the monk-like discipline that he tries to achieve.

28 Dec

An experiment in vibe coding

Posted by Nolan Lawson in Web. Tagged: AI. 10 comments

For the holidays, I gave myself a little experiment: build a small web app for my wife to manage her travel itineraries. I challenged myself to avoid editing the code myself and just do it “vibe” style, to see how far I could get.

In the end, the app was built with a $20 Claude “pro” plan and maybe ~5 hours of actual hands-on-keyboard work. Plus my wife is happy with the result, so I guess it was a success.

Screenshot of a travel itinerary app with a basic UI that looks like a lot of other CRUD apps, with a list of itinerary agenda items, dates and costs, etc.

There are still a lot of flaws with this approach, though, so I thought I’d gather my experiences in this post.

The good

The app works. It looks okay on desktop and mobile, it works as a PWA, it saves her itineraries to a small PocketBase server running on Railway for $1 a month, and I can easily back up the database whenever I feel like it. User accounts can only be created by an admin user, which I manage with the PocketBase UI.

I first started with Bolt.new but quickly switched to Claude Code. I found that Bolt was fine for the first iteration but quickly fell off after that. Every time I asked it to fix something and it failed (slowly), I thought “Claude Code could do this better.” Luckily you can just export from Bolt whenever you feel like it, so that’s what we did.

Bolt set up a pretty basic SPA scaffolding with Vite and React, which was fine, although I didn’t like its choice of Supabase, so I had Claude replace it with PocketBase. Claude was very helpful here with the ideation – I asked for some options on a good self-hosted database and went with PocketBase because it’s open-source and has the admin/auth stuff built-in. Plus it runs on SQLite, so this gave me confidence that import/export would be easy.

Claude also helped a lot with the hosting – I was waffling between a few different choices and eventually landed on Railway per Claude’s suggestion (for better or worse, this seems like a prime opportunity for ads/sponsorships in the future). Claude also helped me decipher the Railway interface and get the app up and running, in a way that let me avoid reading their documentation altogether – all I needed to do was post screenshots and ask Claude where to click.

The app also uses Tailwind, which seems to come with decent CSS styles that look like every other website on the internet. I didn’t need this to win any design awards, so that was fine.

Note I also ran Claude in a Podman container with --dangerously-skip-permissions (aka “yolo mode”) because I didn’t want to babysit it whenever it wanted permission to install or run something. Worst case scenario, an attacker has stolen the app code (meh), so hopefully I kept the lethal trifecta in check.

The bad

Vibe-coding tools are decidedly not ready for non-programmers yet. Initially I tried to just give Bolt to my wife and have her vibe her way through it, but she quickly got frustrated, despite having some experience with HTML, CSS, and WordPress. The LLM would make errors (as they do), but it would get caught in a loop, and nothing she tried could break it out of the cycle.

Since I have a lot of experience building web apps, I could look at the LLM’s mistakes and say, “Oh, this problem is in the backend.” Or “Oh, it should write a parser test for this.” Or, “Oh, it needs a screenshot so it can see why the CSS is wrong.” If you don’t have extensive debugging experience, then you might not be able to succinctly express the problem to an LLM like this. Being able to write detailed bug reports, or even have the right vocabulary to describe the problem, is an invaluable skill here.

After handing it over from Bolt to Claude Code and taking the reins myself, though, I still ran into plenty of problems. First off, LLMs still suck at accessibility – lots of <div>s with onClick all over the place. My wife is a sighted mouse user so it didn’t really matter, but I still have some professional pride even around vibe-coded garbage, so I told Claude to correct it. (At which point it promptly added excessive aria-labels where they weren’t needed, so I told it to dial it back.) I’m not the first to note this, but this really doesn’t bode well for accessible vibe-coded apps.

Another issue was performance. Even on a decent laptop (my Framework 13 with AMD Ryzen 5), I noticed a lot of slow interactions (typing, clicking) due to React re-rendering. This required a lot of back-and-forth with the agent, copy-pasting from the Chrome DevTools Performance tab and React DevTools Profiler, to get it to understand the problem and fix it with memoization and nested components.
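For anyone unfamiliar with memoization, here is a minimal, framework-free sketch of the idea: skip a render when the props are shallowly unchanged, which is roughly what React.memo does by default. This is not the app's code and it is not React; shallowEqual, memoizeRender, and the itinerary-row props are all hypothetical names made up for illustration.

```typescript
// Framework-free sketch of shallow-prop memoization (the idea behind React.memo).
// All names here are hypothetical; none come from the itinerary app.

type Props = Record<string, unknown>;

// True when both objects have the same keys and each value is identical (Object.is).
function shallowEqual(a: Props, b: Props): boolean {
  const aKeys = Object.keys(a);
  const bKeys = Object.keys(b);
  if (aKeys.length !== bKeys.length) return false;
  return aKeys.every(
    (key) =>
      Object.prototype.hasOwnProperty.call(b, key) && Object.is(a[key], b[key])
  );
}

// Wraps a render function so it only re-runs when its props change shallowly.
function memoizeRender<P extends Props, R>(
  render: (props: P) => R
): (props: P) => R {
  let last: { props: P; result: R } | undefined;
  return (props: P): R => {
    if (last !== undefined && shallowEqual(last.props, props)) {
      return last.result; // props unchanged: skip the expensive render
    }
    const result = render(props);
    last = { props, result };
    return result;
  };
}

// Usage: count how often the "expensive" row render actually runs.
let renderCount = 0;
const renderRow = memoizeRender((props: { title: string; cost: number }) => {
  renderCount += 1;
  return `${props.title}: $${props.cost}`;
});

renderRow({ title: "Flight", cost: 400 });
renderRow({ title: "Flight", cost: 400 }); // new object, same values: cached
renderRow({ title: "Flight", cost: 450 }); // cost changed: renders again
console.log(renderCount); // 2
```

The catch, and the reason this takes so much back-and-forth in practice, is that the comparison is shallow: pass a freshly created object or callback as a prop on every render and the cache never hits.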

At some point I realized I should just enable the React Compiler, and this may have helped but didn’t fully solve the problem. I’m frankly surprised at how bad React is for this use case, given that a lot of people seem convinced that the framework wars are over because LLMs are so “good” at writing React. The next time I try this, I might use a framework like Svelte or Solid where fine-grained reactivity is built-in, and you don’t need a lot of manual optimizations for this kind of stuff.

Other than that, I didn’t run into any major problems that couldn’t be solved with the right prompting. For instance, to add PWA capabilities, it was enough to tell the LLM: “Make an icon that kind of looks like an airplane, generate the proper PNG sizes, here are the MDN docs on PWA manifests.” I did need to follow up by copy-pasting some error messages from the Chrome DevTools (which required knowing to look in the Application tab in the first place), but that resolved itself quickly. I got it to generate a CSP in a similar way.
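To give a rough idea of what “the proper PNG sizes” feed into, here is a small sketch that builds a web app manifest object from a list of icon sizes. The member names (name, short_name, start_url, display, background_color, theme_color, icons) are standard manifest members, but the app name, colors, icon paths, and the buildManifest helper are assumptions for illustration, not what Claude actually generated.

```typescript
// Hypothetical sketch of a PWA manifest builder; the app name, colors, and
// icon paths are assumptions, not values from the real itinerary app.

interface ManifestIcon {
  src: string;
  sizes: string;
  type: string;
}

interface WebAppManifest {
  name: string;
  short_name: string;
  start_url: string;
  display: "standalone" | "fullscreen" | "minimal-ui" | "browser";
  background_color: string;
  theme_color: string;
  icons: ManifestIcon[];
}

// Builds a manifest with one square PNG icon entry per requested size.
function buildManifest(name: string, iconSizes: number[]): WebAppManifest {
  return {
    name,
    short_name: name,
    start_url: "/",
    display: "standalone", // open without browser chrome when installed
    background_color: "#ffffff",
    theme_color: "#0ea5e9",
    icons: iconSizes.map((size) => ({
      src: `/icons/icon-${size}.png`, // assumed path layout
      sizes: `${size}x${size}`,
      type: "image/png",
    })),
  };
}

// Usage: serialize the result to a manifest file that the page links to
// with <link rel="manifest">.
const manifest = buildManifest("Itineraries", [192, 512]);
console.log(JSON.stringify(manifest, null, 2));
```

If an icon size is missing or a path is wrong, the Application tab in Chrome DevTools is where the complaint shows up, which is where the error messages mentioned above came from.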

The only other annoying problem was the token limits – this is something I don’t have to deal with at work, and I was surprised how quickly I ran into limits using Claude as a side project. It made me tempted to avoid “plan mode” even when it would have been the better choice, and I often had to just set Claude aside and wait for my limit to “reset.”

The ugly

The ugliest part of all this is, of course, the cheapening of the profession as well as all the other ills of LLMs and GenAI that have been well-documented elsewhere. My contribution to this debate is just to document how I feel, which is that I’m somewhat horrified by how easily this tool can reproduce what took me 20-odd years to learn, but I’m also somewhat excited because it’s never been easier to just cobble together some quick POCs or lightweight hobby apps.

After a couple posts on this topic, I’ve decided that my role is not to try to resist the overwhelming onslaught of this technology, but instead to just witness and document how it’s shaking up my worldview and my corner of the industry. Of course some will label me a collaborator, but I think those voices are increasingly becoming marginalized by an industry that has just normalized the use of generative AI to write code.

When I watch some of my younger colleagues work, I am astounded by how “AI-native” their behavior is. It infuses parts of their work where I still keep a distance. (E.g. my IDE and terminal are sacred to me – I like Claude Code in its little box, not in a Warp terminal or as inline IDE completions.)

Conclusion

The most interesting part of this whole experiment, to me, is that throwing together this hobby app has removed the need for my wife to try some third-party service like TripIt or Wanderlog. She tried those apps, but immediately became frustrated with bugs, missing features, and ad bloat. Whereas the app I built works exactly to her specification – and if she doesn’t like something, I can plug her feedback into Claude Code and have it fixed.

My wife is a power user, and she’s spent a lot of time writing emails to the customer support departments of various apps, where she inevitably gets a “your feedback is very important to us” followed by zilch. She’s tried a lot of productivity/todo/planning apps, and she always finds some awful showstopper bugs (like memory leaks, errors copy/pasting, etc.), which I blame on our industry just not taking quality very seriously. Whereas if there’s a bug in this app, it’s a very small codebase, it’s got extensive unit/end-to-end tests, and so Claude doesn’t have many problems fixing tiny quality-of-life bugs.

I’m not saying this is the death-knell of small note-taking apps or whatever, but I definitely think that vibe-coded hobby apps have some advantages in this space. They don’t have to add 1,000 features to satisfy 1,000 different users (with all the bugs that inevitably come from the combinatorial explosion of features) – they just have to make one person happy. I still think that generative UI is kind of silly, because most users don’t want to wait seconds (or even minutes) for their UI to be built, but it does work well in this case (where your husband is a professional programmer with spare time during the holidays).

For my regular day job, I have no intention of doing things fully “vibe-coded” (in the sense that I barely look at the code) – that’s just too risky and irresponsible in my opinion. When the code is complex, your teammates need to understand it, and you have paying customers, the bar is just a lot higher. But vibe coding is definitely useful for hobby or throwaway projects.

For better or worse, the value of code itself seems to be dropping precipitously, to be replaced by measures like how well an LLM can understand the codebase (CLAUDE.md, AGENTS.md) or how easily it can test its “fixes” (unit/integration tests). I have no idea what coding will look like next year, but I know how my wife will be planning our next vacation.

Read the Tea Leaves Software and other dark arts, by Nolan Lawson
