February 10, 2026

Is AI helping you to do work, or just to do stuff?

For someone who’s neither all-in on AI nor a passionate anti-AI crusader, I think about it a lot. From 2018 through 2023, I worked for Viv Labs, a startup from the folks who built Apple’s Siri; they had been bought by Samsung in order to fix Bixby, their much-maligned assistant. (The punchline here is that, to a large degree, they did fix it, but nobody noticed.) And my novel Kismet has emergent AI as a subtheme.

I don’t want to belabor, yet again, the very valid complaints about AI—the horrendous energy consumption, the way it’s being jammed desperately into everything, how all current models are trained on illegally-obtained material, and so on. But let’s nonetheless consider that Anthropic, the company that’s paid millions to settle copyright violation lawsuits, may well be one of the least unethical AI companies compared to their competitors with suicide advice engines and child porn generators. The industry as a whole is plagued with moral issues that go above and beyond the rest of Silicon Valley’s, and that’s saying something.

And yet, and yet, I find myself uncomfortable with a maximalist anti-AI position. Yes, current models incorporate pirated material, but there’s nothing intrinsic to the technology that suggests you couldn’t have a fine model trained entirely on legally accessible content. (I don’t buy the argument that the act of training itself is a copyright violation; if a search engine can legally get to it, a large language model should be able to, also. That LLM scrapers seem to be incredibly bad net citizens is an artifact of LLM companies being run by shitty people.) Yes, LLMs make mistakes when they parse natural language, screw up summaries in sometimes egregious ways, make imperfect transcriptions of audio—but they’re nonetheless a quantum leap past previous attempts.

But the sudden virality of ~~Clawdbot~~ ~~Moltbot~~ OpenClaw snapped something else into focus for me recently. In case you missed OpenClaw’s story, I’ll let TechCrunch explain it:

According to its tagline, [OpenClaw] is the “AI that actually does things”—whether it’s managing your calendar, sending messages through your favorite apps, or checking you in for flights. This promise has drawn thousands of users willing to tackle the technical setup required, even though it started as a scrappy personal project built by one developer for his own use.

On one level, that is absolutely the kind of Star Trek stuff AI optimists dream of. MacStories’ Federico Viticci, who’s recently poured his long-standing love of automation and scriptability into a 24/7 bear hug for productivity-focused generative AI tools, enthused, “OpenClaw showed me what the future of personal AI assistants looks like.” (I am convinced that an alternate universe version of Federico is a legendary Emacs guru, given that I doubt there’s a more scriptable editor in the world.)

On another level, it’s a massive risk to both your security and your wallet. That TechCrunch article notes an investor wryly pointing out, “‘actually doing things’ means ‘can execute arbitrary commands on your computer.’” If you have OpenMoltClawBot taking actions for you based on incoming mail and messages, it’s possible for someone to send a message to you that takes action you don’t want. As for the risk to the wallet, Viticci breezes past “I’ve been playing around with [OpenClaw] so much, I’ve burned through 180 million tokens on the Anthropic API”—but that’s over $500.

In part, these flaws stem from OpenMoltClawBot being dim in the way all LLMs are. If you ask it to run a task at sunrise, it may write a cron job that calls the Claude API every fifteen minutes with the question “is it sunrise yet?” By literal definition, an LLM’s “smarts” are the median of what it’s been trained on: they will never write like the mythical “10× programmer.” Now, there may be some other flaws stemming from OpenClaw itself being vibe-coded. Its author has literally said “I ship code I don’t read.” I, for one, see no problem giving code no human has so much as glanced at access to my personal files and bank accounts. What could possibly go wrong?

But let’s assume that your personal installation of OpenMoltClawBot doesn’t wire your entire 401(k) to a script kiddie in Hackensack. Even with that relieving condition met, how much time are you spending with it that’s doing stuff rather than doing work?

See, this is an old joke/truism about productivity systems: it’s easy to mistake Getting Things Done™ for getting things done. Spending time trying out different task managers, defining tags and projects, setting up your Inbox Zero system—it feels productive! It feels like you’re making progress! Variants of this appeal to a certain kind of mindset: you’re always on a search for the perfect task manager, the perfect pen and paper notebook system, the perfect text editor.

But a lot of your time is actually being spent in overhead, whether it’s managing the system you’ve chosen or switching between systems you’re trying out. Time spent organizing your tasks is time not spent doing tasks. Time spent fiddling with new text editors is time not spent writing. (As much as I’ve taken to Emacs, its endless fiddlability can be a double-edged sword.) A lot of the time you think you’re doing work, you’re actually just doing stuff. Stuff is not work.

If you’re spending a lot of time brainstorming things you can get OpenClaw to do for you, you’re spending more time than you think doing stuff. Maybe more time than you’d be spending actually just doing the work yourself. The examples of integration that Viticci shows in his article are, without a doubt, damn cool, but some have a distinct “I built it because it was possible” air to them. Is it neat that I could tell my virtual assistant in Telegram to add two things to my to-do list and then resume playing Spotify on the computer that I’m sitting in front of at the moment? Yes, but I can already press a global shortcut anywhere to pop up a window to add something to my to-do list. If I want to resume playing Apple Music on the computer that, again, I am sitting in front of at the moment, I can (ahem) press the play button.

But what’s started to get a little crazy-making for me is the “AI will inevitably eat everything” attitude, whether in Viticci’s article or in Marco Arment’s prophesying in a recent episode of the Accidental Tech Podcast. To quote Viticci:

When the major consumer LLMs become smart and intuitive enough to adapt to you on-demand for any given functionality—when you’ll eventually be able to ask Claude or ChatGPT to do or create anything on your computer with no Terminal UI—what will become of “apps” created by professional developers? I especially worry about standalone utility apps: if OpenClaw can create a virtual remote for my LG television (something I did) or give me a personalized report with voice every morning (another cron job I set up) that work exactly the way I want, why should I even bother going to the App Store to look for pre-built solutions made by someone else?

Any given functionality? Really? Do you actually think you’ll eventually be able to ask Claude or ChatGPT to write a word processor for you? A photo editor? The next Zelda game? No, let me rephrase that; I mean, you can ask Claude or ChatGPT anything, and it’ll plunge ahead with the absolute confidence that comes from being a predictive text generator being sold as a robot brain in a jar. But do you think that what comes out is going to be good? Really good? Overcast good? Halide good? Scrivener good? How long do you think it’d take you to even begin to approach any of those doing vibe coding? How much human interaction would be needed? How much actual programming and design knowledge would the creator need to arrive at anything remotely usable?

Once again, the best an LLM can generate is by definition statistically median. This is not just a temporary limitation of current models; it’s a fundamental artifact of the vector mathematics that makes LLMs work. If you, the user, are patient and clever enough like Federico, you’ll be able to cajole better output. But cajoling an LLM into designing a large, consumer-focused application, from a clean internal architecture to all the nuances of good UX? I’m pretty skeptical. I know many people aren’t. I suspect those people are, over the next few years, going to prove me right in the most depressing way possible: flooding app stores with LLM-generated crap riddled with confusing UX choices, ugly design, memory leaks, and terrifying security holes.

I don’t doubt that LLMs can become a primary interaction method we use to interact with certain apps for certain tasks, and yes, agentic AI could become a primary automation tool for those who care about that level of automation. But how many people is that? Viticci describes having set up a Zapier automation that creates a new project in his to-do app for the next issue of his weekly newsletter by monitoring the newsletter’s RSS feed and creating the project after the current newsletter appears in it. Now, he’s replicated that in OpenClaw, which potentially saves money! (Accent on “potentially.”) But most people, faced with a similar problem, would have just created a repeating weekly reminder for “start new newsletter project” and called it good enough.

This is a chimera nerds have been chasing for decades: surely, surely, the only reason the majority of computer users don’t write their own programs and automations is that it just isn’t easy enough to do it yet. When I say that I don’t think generative AI is going to finally crack that particular nut, it has nothing to do with AI: I don’t think there’s actually a nut there to crack. To most users, programming and automating is doing stuff, and they want to do work. (Or play the next Zelda game.)

Again, I’m not rabidly anti-AI. But in the same way generative AI actively encourages us to ascribe powers to chatbots that they simply don’t have—a simulacra of absolute confidence at any given task is literally baked into LLMs—it encourages us to mistake doing stuff for doing work. Studies show that developers who use AI assistance consistently overestimate how fast it makes them, and I’m not surprised: it feels fast to use AI assistance. My own experiments using AI to draft technical documentation (at the behest of an all-in-on-AI employer) are great illustrations of that. The AI “writer” generated hundreds of lines of stuff that was mostly correct, most of it usable with a few tweaks here and there. But when I look at the diffs between the machine-generated first draft and my final copy, I wasn’t making “a few tweaks,” I was massively rewriting the whole thing.

Can I swear the LLM didn’t help? No. If nothing else, it kickstarted the process. And I know that whether I’m coding, writing docs, or even writing fiction, I feel really damn productive when I’m refactoring or editing line after line instead of staring an empty, foreboding text window.

But am I doing a lot of work, or am I just doing a lot of stuff?

To support my writing, throw me a tip on Ko-fi.com!