Don't really agree; in my experience the context switching is extremely costly. I personally have trouble having even a couple of sessions running in parallel, especially when I'm tackling difficult, hard-to-solve problems. Of course it's easy for trivial jobs, but that's not always the case. I have been much more successful in making my time worthwhile by taking a look at the model's output and actively participating. It gives me time to think as well. When I have a list of simple tasks I just tell it to the model and it executes one after another.
rybosworld 1 day ago [-]
There's a lot more "telling" than "showing" going on.
By that I mean - the people claiming hyper-productivity from their GasTown setup never have actual products to demo.
hirako2000 1 day ago [-]
Perhaps they earn $500k and worry spending any less than $250k in token may raise suspicion.
mcintyre1994 1 day ago [-]
Something would be deeply wrong!
pllbnk 1 day ago [-]
So far the only company that has been really outspoken about the scale of their vibe coding is Anthropic. However, their uptime and bug count are atrocious.
ikidd 20 hours ago [-]
They should vibecode their account management into the 21st century and let users change their email address without deleting their account.
switchbak 1 day ago [-]
There's also a concern I don't hear folks talk about: the potential for all of this multi-tasking to harm your wellbeing, or even your brain.
Eg: "For example, functional magnetic resonance imaging (fMRI) studies have shown that multitasking reduces activation in brain regions involved with cognitive control while increasing activation in areas associated with stress and arousal" - from https://pmc.ncbi.nlm.nih.gov/articles/PMC11543232/
I've tried hard to stay away from Instagram, TikTok, etc - for this very reason. Now my day job is going to be attacking me in much the same way? Great.
InsideOutSanta 24 hours ago [-]
I tried running multiple agents concurrently, but it is exhausting. My brain feels completely murdered and unhappy at the end of the day. I can do two, and keep both contexts active in my head, but not more than that. And even that feels stressful.
jeapostrophe 21 hours ago [-]
I agree that it is very exhausting
lanyard-textile 1 day ago [-]
I can do two or three at a time. I treat them a bit like queues: Last in first out, sort of like we do with our human peers.
We delegate work, we tend to some other work, and we code review much later in the day.
The secret to this mindset is that it doesn't always have to line up. Let your agent wait for you; You'll get to their output next.
sarchertech 1 day ago [-]
I don’t know about you but I’m not constantly round robin delegating work to peers and reviewing it on a 10-20 minute cadence. No one works like that. I don’t know if anyone is even capable of working like that day in and day out long term for any meaningful definition of review.
lanyard-textile 5 hours ago [-]
Who said a 10-20 minute cadence? :)
Sometimes yes, but often no. It takes time to write what you need into claude code and to review what it makes.
I do whatever work I need to do, however long it takes, send to claude, and immediately pop off the next queue item.
sarchertech 3 hours ago [-]
If it’s less than a 24 hours cadence it’s nothing like delegating work to colleagues.
Micromanaging 3 junior engineers working on different tasks to the point where you are reviewing each one’s work multiple times per day and assigning new tasks multiple times per day sounds like a quick ticket to burnout.
jcadam 1 day ago [-]
True. Sometimes I'll run front-end and backend work in two different claude instances, but always on the same project/product. I'll have "reviewer" instances in opencode using a different (non-Claude) model doing reviews, that's about as much as I can handle. You've got to supervise it while it works. I do have to stop claude from time to time when I catch it doing something naive or unnecessarily complex.
jeapostrophe 24 hours ago [-]
My tool supports doing many, but I find it hard to use it for much more than 3 or 4 concurrent projects. I've tried having more than that open and I fail. I find that 3 projects with 2 or 3 concurrent tasks at the same time works best for me. But I think I'm learning.
EdNutting 1 day ago [-]
“Don’t pay attention to what Claude is doing, just spam your way through code and commands and hope nothing went wrong and you catch any code issues in review afterwards” is what this sounds like.
I will run parallel Claude sessions when I have a related cluster of bugs which can be fixed in parallel and all share similar context / mental state (yet are sufficiently distinct not to just do in one session with subagents).
Beyond that, parallel sessions to maybe explore some stuff but only one which is writing code or running commands that need checking (for trust / safety / security reasons).
Any waiting time is spent planning next steps (eg writing text files with prompts for future tasks) or reviewing what Claude previously did and writing up lists (usually long ones) of stuff to improve (sometimes with draft prompts, or notes of gotchas that Claude tripped up on the first time which I can prompt around in future).
Spend time thinking, not just motoring your way through tokens.
jeapostrophe 23 hours ago [-]
I disagree. My workflow is built around reviewing what it produces and trying to build a process where it is effective to do that. I definitely can't and don't watch edits as they go by because it is too fast, but I want to easily review every line of code. If you're not "reviewing afterwards", then when would you be reviewing?
As for planning next steps, that's definitely a valuable thing, and oftentimes I find myself spending many cycles working on a plan and then executing it, reviewing code as I go. I tend to have a plan-cycle and a code-cycle going on at the same time in different projects. They are reactive/reviewing in different ways.
phainopepla2 1 day ago [-]
Just show us the prompt you used to produce this post instead of the output
the_af 1 day ago [-]
Nice catch. Look at this at the end:
> jc is open source. If you have improvements, have your Claude open a PR against mine. I don’t accept human-authored code.
So it seems not only does the author reject human-authored PRs, they also refuse human-authored blog posts.
snovv_crash 1 day ago [-]
I wonder if they also only want agents to read it, not people.
oidar 1 day ago [-]
I disagree with this take. I get that LLM-produced text is filled with crappy, over-the-top writing in pretty much all cases, but if a prompter/writer/blogger is using it iteratively, the LLM output is going to be way better than their writing alone. Also, if a person is using LLMs to write articles, do you really want to see their likely even worse writing?
satisfice 24 hours ago [-]
Yes, I want to see the prompts. Yes.
But I won’t promise to read it, because it’s bad writing.
So maybe it would be better to not use the LLM to draft writing that pretends to be you. That would be easier on everyone who reads.
Instead we live in a world where all of us are reading through a cynical lens.
This comment was written without using any form of AI.
fragmede 23 hours ago [-]
Was this written by an LLM?
> This comment was written without using any form of AI.
That's exactly what ChatGPT would write if it didn't want us to think it wrote that comment!
https://xkcd.com/1053/
And for those who are feeling smug, that last one (which I still consider fairly recent) was 14 years ago
satisfice 23 hours ago [-]
In this ever-changing world, it pays to delve beneath the surface of a casual claim— if you know what I mean.
oytis 1 day ago [-]
When the computer works, it's sword-fighting time. I don't make the rules.
I think my sweet spot is having one (30min+) feature a day. And then afterwards I spend synchronous time iterating on it to fix edge cases or tweak stuff.
The rest of my time goes to prepping those big features (designing, speccing, talking, thinking, walking).
Going to see how big a feature can be before the quality suffers too much and it becomes unmaintainable. This depends heavily on how well I spec it out and how well I orchestrate the agentic workflow.
jeapostrophe 23 hours ago [-]
I've gone through a bunch of different processes learning how to use Claude.
Giving it large tasks that take 40 minutes basically always fails for me. Giving it small tasks that take 30s to a minute feels like it is my typist and not a worker. I find that I am happiest and most effective at the 5 to 7 minute cycle timeframe.
samename 1 day ago [-]
Why should Claude finish complex tasks in less than seven minutes?
bwestergard 1 day ago [-]
The need for "complex tasks" should be exceptional enough that you're not building your workflow around them. A good example of such an exception would be kickstarting a port of a project for which you have a great test suite from one language to another. This is rare in most professional settings.
btown 23 hours ago [-]
I wholeheartedly disagree with this. For any iteration, Claude should be reading your codebase, reading hundreds of thousands of tokens of (anonymized) production data, asking itself questions about backwards compatibility that go beyond existing test suites, running scripts and CI to test that backwards compatibility, and running a full-stack dev server and Chrome instance to QA the change, across multiple real-world examples.
And if you're building a feature that will call AI at runtime, you'll be iterating on multiple versions of a prompt that will be used at runtime, each of which adds token generation to each round of this.
In practice on anything other than a greenfield project, if you're asking for meaningful features in complex systems, you'll be at that 10 minute mark or more. But you've also meaningfully reduced time-to-review, because it's doing all that QA, and can provide executive summaries of what it finds. So multitasking actually works.
skydhash 1 day ago [-]
Computers are fast. If a physics engine can compute a game world in 1/60 of a second, the majority of these tasks should be done in less than 7 minutes.
Whenever I see a transcript of a long-running task, I see a lot of drifting by the agent due to not having any context (or the codebase not being organized), and it trying various ways to gather information. Then it settles on the wrong info and produces bad results.
Greppability of the codebase helps. So do following patterns and good naming. A quick overview of the codebase and a convention description also shorten the reflection steps. Adding helper tools (scripts) helps too.
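To make the "helper tools" point concrete, here's a toy sketch (the function name and file patterns are mine, not from this thread) of a script that answers "where is this defined, and where is it used" in one call, so the agent doesn't burn cycles on exploratory greps:

```shell
# where_defined: hypothetical helper an agent can call instead of many
# exploratory greps; prints likely definition sites, then a few usage sites.
where_defined() {
  sym="$1"
  echo "definitions:"
  grep -rn --include='*.py' -e "def $sym" -e "class $sym" . 2>/dev/null | head -n 5
  echo "uses:"
  grep -rn --include='*.py' -e "$sym" . 2>/dev/null | head -n 5
}
where_defined main
```

Anything that turns a multi-step exploration into one deterministic call shortens the agent's reflection loop.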
pllbnk 1 day ago [-]
I don't know how and if people really manage to run many tasks in parallel and also not check the output. Very recently I had two items that for a reasonably intelligent engineer wouldn't be very complex, but would take time to implement.
One of them was vibe-coding an Electron app for myself that was running a Llama server. Claude couldn't figure out why it wasn't running on Windows while it worked fine on Linux and Mac. I obviously didn't check all its output, but after several hours I had a feeling that it was running in circles. Eventually we managed to cooperatively debug it after I gave it several hints, but it wasted a lot of time on a rather simple issue, which was a challenge for me also because I didn't know well how the vibe-coded app worked.
The second one (can't go into details) was also something that's reasonably simple but I was finding awfully many bugs because unlike the first app, this one was for my job and I review everything. So we had to go back and forth for multiple hours.
How can someone just switch to another task while the current one requires constant handholding?
jeapostrophe 23 hours ago [-]
My personal experience matches this. When I'm "succeeding", I am at the 5-to-7 minute cycle time and when I (or Claude) are failing, there's constant attention and no ability to switch away.
My human programming experience is encouraging me to keep going on the debugging, like I did when it was my code that I invested a lot of time and energy into.
Now that the code is cheap, I am trying to "learn" to throw away everything, go back to a stable checkpoint, and try a different approach that is more likely to succeed. (Probably having the new plan incorporate the insights I gained the first round.)
It is hard to do that when you coded for a week (or even a weekend) but it should be much easier when you got it faster with Claude. I think people (me at least) need to learn new norms.
trjordan 1 day ago [-]
I'd offer a different approach: think about how you're going to validate. An only-slightly-paraphrased Claude conversation I had yesterday:
> me: I want our agent to know how to invoke skills.
> Claude: [...]
> Claude: Done. That's the whole change. No MCP config, no new env vars, no caller changes needed.
> me: ok, test it.
> Claude: This is a big undertaking.
That's the hard part, right? Maybe Claude will come back with questions, or you'll have to kick it a few times. But eventually, it'll declare "I fixed the bug!" or summarize that the feature is implemented. Then what?
I get a ton of leverage figuring out what I need to see to trust the code. I work on that. Figure out if there's a script you can write that'll exercise everything and give you feedback (2nd Claude session!). Set up your dev env so Playwright will Just Work and you can ask Claude to click around and give you screenshots of it all working. Grep a bunch and make yourself a list of stuff to review, to make sure it didn't miss anything.
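One hedged sketch of such a feedback script (the checks here are placeholders; substitute your real lint/test/smoke commands) that runs every check, keeps going on failure, and ends with one summary line that's easy to scan:

```shell
# run_checks: hypothetical "did it actually work" harness. Each argument is a
# shell command; failures are collected rather than aborting, and the final
# line gives a one-glance verdict.
run_checks() {
  fail=0
  for check in "$@"; do
    if sh -c "$check" >/dev/null 2>&1; then
      echo "PASS: $check"
    else
      echo "FAIL: $check"
      fail=1
    fi
  done
  [ "$fail" -eq 0 ] && echo "all checks passed" || echo "some checks failed"
}
# Placeholder checks; a real list would be lint, unit tests, a smoke script, etc.
run_checks "true" "echo smoke"
```

The point is less the script than the discipline: decide up front what evidence would make you trust the change, then make producing that evidence a single command.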
jeapostrophe 23 hours ago [-]
Amen. Making the checking painless and easy to do is a major boon. There's a spectrum of "checking is easy": the compiler telling you the code doesn't compile is the easiest, but doesn't capture "is this the program I want". Some checks are inherently not mechanically checkable, and for those some sort of written "testing protocol" is necessary.
tkzed49 1 day ago [-]
was this written using a LinkedIn skill
BeetleB 1 day ago [-]
Very painful to read.
jolt42 1 day ago [-]
Yes. I agree with the problem statement but have no idea what the solution is. BUT it does involve lots of key bindings.
jeapostrophe 23 hours ago [-]
LOTs of key bindings.
flakiness 1 day ago [-]
The old saying is "don't multitask" but apparently that time is gone.
I wonder what people think about this. I know there is a class of SWE/dev who now consider themselves "managers of agents". Good luck to them; articles like this will work for those people.
I'm not there yet, and I hope I won't have to be. I'm not an LLM, and my mental model is (I believe) more than a markdown file. But I haven't figured out the mental model that works for me; I'm still staring at the terminal with Claude's cursor blinking, sticking to the "don't multitask" dogma.
jeapostrophe 23 hours ago [-]
Whether multi-tasking is good or bad, I think that if you're "waiting for Claude" at all, you're going to be multi-tasking or staring into space. I try to stare into space when I'm pumping gas, but I don't want to do that when I'm "working" at my laptop... in part because I know I'd be more likely to check my email or HN.
kubb 1 day ago [-]
Wasn't there some recent discovery that context switching is harmful to your brain?
If you start trying to juggle multiple agents, you are doubling down on the wrong strategy.
https://hbr.org/2010/12/you-cant-multi-task-so-stop-tr
mbo 1 day ago [-]
I'm not sure I'm understanding this workflow. Perhaps a small tutorial / walkthrough hosted on YouTube or asciinema might help people understand.
hirako2000 1 day ago [-]
It's just a process of looping over a number of cycles, prompting each thing that takes minutes to run. It's a recipe for a massive headache, as context switching costs more than those 7 minutes (the arbitrary number the article came up with).
4b11b4 23 hours ago [-]
At the end the author says that they don't accept human-authored code... gg my friend, you've contracted psychosis
servercobra 1 day ago [-]
This looks absolutely wonderful. Is it possible to run against Claude remotely (e.g. on a VM)? Or should I ask Claude to add that?
jeapostrophe 23 hours ago [-]
Ask Claude ;) Right now it is hard-coded to run `claude --resume <uuid>` but there's a natural abstraction to use a different script to start the Claude session.
If you're being sarcastic, I love you anyways.
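For anyone curious, the abstraction could be as small as this sketch (JC_LAUNCHER is a made-up name, not jc's actual config): fall back to the current hard-coded command, but let an env var point at a wrapper script, e.g. one that ssh-es into a VM before starting the session.

```shell
# launch_session: hypothetical sketch of abstracting the hard-coded
# `claude --resume <uuid>` behind an overridable launcher command.
launch_session() {
  uuid="$1"
  launcher="${JC_LAUNCHER:-claude --resume}"
  echo "would run: $launcher $uuid"   # a real version would exec, not echo
}
launch_session "123e4567-e89b-12d3-a456-426614174000"
```

With that in place, running remotely is just `JC_LAUNCHER="ssh my-vm claude --resume"` (or a wrapper script) without touching the rest of the tool.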
servercobra 22 hours ago [-]
Not sarcastic at all! This is a novel way to handle something that's been in the back of my mind (managing a lot of agents successfully).
teaearlgraycold 1 day ago [-]
> The fix is obvious: work on something else while Claude runs.
Disagree. The fix is actually counter-intuitive: give Claude smaller tasks so that it completes them in less time and you remain in the driver's seat.
weakfish 1 day ago [-]
> jc is open source. If you have improvements, have your Claude open a PR against mine. I don’t accept human-authored code.
Is this sarcasm? If not, I wonder why.
chis 1 day ago [-]
Hackernews needs to nominate an elite crew of individuals who can tell when an article is AI slop and flag it.
jedberg 1 day ago [-]
Or, wait and take a little break so you don't burn out. I miss the days where you had to wait for code to compile or for your "big data" job to run, so you could give yourself a little mini break.
Of course there is a relevant XKCD: https://xkcd.com/303/
For every single post of this type: please stop writing as if you know that any of this works well.
You don’t know! You are experimenting, speculating, and excited to share. That’s fine.
What’s not okay is presenting a false impression that you have deep experience and did sufficient experimentation and that you know the risks and have experienced the problems associated with your wonderful idea. This takes time.
I want to know:
- Caveats
- Variations
- Descriptions of things that went wrong
- Self-critical reflection
- Awareness of objections that others will probably have
- Comparison with viable alternatives
If you want to credibly say “Don’t do this! Do that!” there is a high bar to meet.
jeapostrophe 23 hours ago [-]
I agree. This is what has worked for the past few weeks and I want to share. Maybe I will regret my life choices. If I'm still doing it next year, that will be something different to say. But I want to try to help and share before I really know. <3
cruffle_duffle 1 day ago [-]
This advice will be very dated when inference gets an order of magnitude faster. And it will happen; it's classic tech. It will probably even follow Moore's law or something.
Wait until that 8 minute inference is only a handful of seconds and that is when things get real wild and crazy. Because if the time inference takes isn’t a bottleneck… then iteration is cheap.
jeapostrophe 23 hours ago [-]
Yea, I think it will be totally useless to switch at that level and instead it will be about reviewing the work more effectively. I think I would believe in the more "autonomous Claude" systems in that world.
cruffle_duffle 22 hours ago [-]
It will be crazy. Because the cost of “failure” will be dramatically lower, meaning these things can sometimes just throw educated darts at the wall until a solution is found. It’s way too slow to do that kind of thing now.
(Presumably cost per token will be dramatically lower as well)
avazhi 1 day ago [-]
Wtf is this LLM slop
Bnjoroge 1 day ago [-]
Lots of LLM-isms in the article from a very casual scan so going to assume nothing interesting here