Claude Code Routines

(code.claude.com)

673 points | by matthieu_bl 21 hours ago

93 comments

  • joshstrange 19 hours ago
    LLMs and LLM providers are massive black boxes. I get a lot of value from them and so I can put up with that to a certain extent, but these new "products"/features that Anthropic are shipping are very unappealing to me. Not because I can't see a use-case for them, but because I have 0 trust in them:

    - No trust that they won't nerf the tool/model behind the feature

    - No trust they won't sunset the feature (the graveyard of LLM-features is vast and growing quickly while they throw stuff at the wall to see what sticks)

    - No trust in the company long-term. Both in them being around at all and them not rug-pulling. I don't want to build on their "platform". I'll use their harness and their models but I don't want more lock-in than that.

    If Anthropic goes "bad" I want to pick up and move to another harness and/or model with minimal fuss. Buying in to things like this would make that much harder.

    I'm not going to build my business or my development flows on things I can't replicate myself. Also, I imagine debugging any of this would be maddening. The value add is just not there IMHO.

    EDIT: Put another way, LLM companies are trying to climb the ladder to become a platform. I have zero interest in that: I want a "dumb pipe", I want a commodity, I want a provider, not a platform. Claude Code is as far into the dragon's lair as I want to venture, and I'm only okay with that because I know I can jump to OpenCode/Codex/etc. if/when Anthropic "goes bad".

    • ElFitz 14 hours ago
      > Not because I can't see a use-case for them, but because I have 0 trust in them

      > […]

      > Put another way, LLM companies are trying to climb the ladder to be a platform, I have zero interest in that, I was a "dumb pipe", I want a commodity, I want a provider, not a platform.

      That is my sentiment precisely, and a big reason why I’ve started moving away from Claude Code in the past few weeks when I realised how much of my workflow was becoming tied to their specific tools.

      Claude Code’s "Memory" feature was the tipping point for me, with the model committing feedback and learnings to a local, provider-specific path that won’t persist in the git repo itself.

      That’s fine for user preferences, not for workflows, rules, etc.

      And the latest ToS changes about not being allowed to even use another CLI made up my mind. At work we were experimenting with an autonomous debug agent using the Claude Code cli programmatically in ephemeral VMs. Now it just returns an error saying we can’t use subscriptions with third-party software… when there is no third-party software involved?

      Anyway, so long Claude.

      • Nevermark 12 hours ago
        > Claude Code’s "Memory" feature was the tipping point for me

        My standing orders are that the default MEMORY.md must be a stub directing Claude to another MEMORY.md file in the local folder, project, etc.

        All memories remain with their respective projects over syncs, moves, devices, etc. The stub must state all this clearly, and nothing else.

        This has worked very well.

        If you give the model/memory a name, that name can be persistent and independent over "backend" model swaps.

        • munksbeer 5 hours ago
          Can you explain a bit more technically how you set this up? What is a "stub directory"?

          Feel free to give a concrete example if you have time, because this sounds like something I definitely want to try out myself.

          • simmonmt 3 hours ago
            I think he meant a very small (stub) MEMORY.md whose sole contents are something like "don't write here - write there".
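            A minimal example of such a stub (the default-memory path varies by setup, and the wording here is illustrative, not Nevermark's actual file):

```markdown
<!-- MEMORY.md -- the stub, in the harness's default memory location -->
Do not store memories in this file.
Read and write all memories in MEMORY.md at the root of the current
project, so they stay with the repo across syncs, moves, and devices.
```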
      • bg24 10 hours ago
        Think of it another way: these product features are easy to build in other harnesses too. And as open-source models and other, much-lower-cost models keep getting better, there will come a time when it's justified to have a harness that can work with many models and optimize your cost and efficiency.
      • eru 7 hours ago
        > Claude Code’s "Memory" feature was the tipping point for me, with the model committing feedbacks and learnings to some local, provider-specific path, that won’t persist in the git repo itself.

        It's a bit annoying, but as long as it's local and human (or LLM) readable, you can use your favourite agent to rework this stuff for itself.

      • mooreds 13 hours ago
        I use opencode with claude models through a GitHub subscription. I've also used claude through Amazon Bedrock.

        Both give you optionality because they support N models.

        • tvink 9 hours ago
          That is way too expensive to be an option for most.
      • dzink 14 hours ago
        They can’t allow third party software because the third parties save the outputs of Claude responses and distill them into new models to compete with Claude.
        • fluidcruft 13 hours ago
          This distilling sounds wonderful to me as an end user. Is there some place we can donate our chats and output?
        • matheusmoreira 7 hours ago
          Just like they distilled all those git repositories, all those books to train Claude?

          https://news.ycombinator.com/item?id=47567575

          The lack of self-awareness is hilarious.

          • spariev 1 hour ago
            They know what they are doing perfectly well
          • eru 7 hours ago
            What makes you think they lack self-awareness?
            • matheusmoreira 6 hours ago
              For the record, I was referring to the AI companies, not the author of the comment I replied to.
              • professor_v 6 hours ago
                It's not lack of self-awareness, they know what they are doing
        • Forgeties79 12 hours ago
          Yeah who just goes and indiscriminately vacuums up data so they can train their products they’re going to sell with no intention of giving compensation to the very entities that made their products possible?
          • matheusmoreira 7 hours ago
            https://en.wikipedia.org/wiki/Suchir_Balaji

            > Suchir Balaji was an American artificial intelligence researcher who was found dead one month after accusing OpenAI, his former employer, of violating United States copyright law.

            > The San Francisco Police Department investigation, however, found "no evidence of foul play", and the Chief Medical Examiner concluded the death was a suicide.

            Hard not to be a conspiracy theorist these days.

        • rowanG077 2 hours ago
          Of course they can allow it. They choose not to. They choose to screw over all users because they are afraid of some company making a claude ripoff. It shows a lack of faith in their own engineering. It shows a lack of respect for users.
        • deaux 7 hours ago
          This can be done with Claude Code just fine. Or simply API usage.
    • freedomben 16 hours ago
      This echoes my thoughts exactly. I've tried to stay model-agnostic but the nudges and shoves from Anthropic continue to make that a challenge. No way I'm going that deep into their "cloud" services, unless it's a portable standard. I did MCP and skills because those were transferrable.

      I also clearly see the lock-in/moat strategy playing out here, and I don't like it. It's classic SV tactics. I've been burned too many times to let it happen again if I can help it.

      • hatmanstack 14 hours ago
        Agree. I just don't think it's realistic to expect the technology to not become a tool for commercialism. It plays out the same way every time: technology arrives, mass adoption with idealist intentions, somebody has to pay the mortgage, delight disappears.

        Woz has been saying this for decades, we went from buying a computer and owning it to being trapped inside someone else's platform. MCP being open was a good sign but I'm watching how tightly Routines gets coupled to their stack.

      • jann 5 hours ago
        I have the same sentiments, but I also get a lot of value out of simple things like memory for long-term project planning and task management. I'm willing to commit to one provider right now with the assumption that memories (and now routines) can be ported within a few hours to a new provider (for example Claude Desktop provides a prompt to export memories from other providers). Also the memory being human readable (markdown files) makes me worry less about lock-in.
    • JohnMakin 17 hours ago
      This is similar to a sentiment I heard early on in the cloud-adoption fever: many companies hedged by being "multi-cloud", which ended up mostly being abandoned due to hostile patterns by cloud providers and a lot of cost. Ultimately it didn't really end up mattering, and the most dire predictions of vendor lock-in abuse didn't really happen as feared. (I know people will disagree with this, but speaking specifically about AWS, there's a massive gap between the predictions and what actually happened. Note I have never and will never use Azure, so I could be wrong on that particular one.)

      I see people making similar conclusions about various LLM providers. I suspect in the end it'll shake out about the same way: the providers will become practically non-interoperable with each other, whether due to inconvenience, cost, or whatever. So I've not wasted much of my time thinking about it.

      • michaeldwan 17 hours ago
        I credit containerization, k8s, and terraform for preventing vendor lock-in. Compute like EC2 or GCE is effectively interoperable; ditto for managed services for k8s or Postgres. The new products Anthropic is shipping are more like Lambda: vendor kool-aid lots of people will buy into.

        What grinds my gears is how Anthropic is actively avoiding standards, like being the only harness that doesn't read AGENTS.md. I work on AI infra and use different models all the time; Opus is really good, but the competition is very close. There's just enough friction in testing those out, though, and that's the point.

        • JohnMakin 16 hours ago
          I think there is lock-in despite those things. For containerization, you're still a lot of the time beholden to the particular runtime that provider prefers, and whatever weird quirks exist there; migrating can have some surprises. For k8s you'll usually go managed, and while they provide the same functionality, AKS != EKS != GKE at all, at least in terms of managing them and how they plug into everything else. In terraform, migrating from the AWS provider to the GCP provider will hold a lot of surprises for what looks like it should be the exact same thing.

          My point was, I don't think it mattered much, and it feels like an ok comparison - cloud offerings are mostly the exact same things, at least at their core, but the ecosystem around them is the moat, and how expensive it is to migrate off of them. I would not be surprised at all if frontier AI model providers go much the same way. I'm pretty much there already with how much I prefer claude code CLI, even if half the time I'm using it as a harness for OpenAI calls.

        • danudey 13 hours ago
          > The new products Anthropic is shipping is more like Lambda. Vendor kool-aid lots of people will buy into.

          Counterpoint: there are probably tons of people out there who were hacking together lousy versions of these same tools to somehow spin up Claude to generate the release notes for their PRs or analyze their GitHub Issues every week. This is a smarter, faster, easier, and likely far more secure way of implementing the same thing, which will leave the people using those tools much better off.

          In the meantime, it wouldn't be surprising if other AI companies started doing similar things; I could see Cursor, for example, adding a similar sort of hosted "Do GitHub Things" option for enterprises, and if they do, that means more variety and less lock-in (assuming the competitors have similar features).

          From my perspective it's no different than writing a Claude skill, which is something it seems like everyone is doing these days; it's just that in this case the 'skill' is hosted somewhere else, on (likely) more reliable architecture and at cheaper scale.

        • fragmede 16 hours ago
          There's a tiny amount of friction. Enough that I'll be honest and say that I spend the majority of my time with one vendor's system, but compared to the friction of moving from one cloud to another, e.g. AWS to GCP, the friction between opening Claude Code vs. Codex is basically zero. Have an active subscription and have CLAUDE.md say "read AGENTS.md".

          Claude Code routines sounds useful, but at the same time, under AI-codepocalypse, my guess is it would take an afternoon to have codex reimplement it using some existing freemium SaaS Cron platform, assuming I didn't want to roll my own (because of the maintenance overhead vs paying someone else to deal with that).
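          For the sake of illustration, a DIY version of a scheduled routine can be as small as a single crontab line (the repo path, prompt, and log file are made up; `claude -p` is Claude Code's non-interactive print mode, and any other CLI harness could be swapped in the same way):

```shell
# m h dom mon dow  command -- runs every Monday at 09:00
0 9 * * 1  cd /path/to/repo && claude -p "Summarize last week's commits into NOTES.md" >> routine.log 2>&1
```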

          • michaeldwan 15 hours ago
            you're spot on. I use both Claude Code + OpenCode with many different models and friction is minimal as long as I'm deliberate about it. Hell, even symlinking AGENTS.md to CLAUDE.md is like 80% there.

            It's just portability v convenience. But unlike ~15 years ago with cloud compute, it _feels_ like more people are skeptical of convenience, which is interesting.

            • baq 2 hours ago
              > skeptical of convenience

              it's not that; it's awareness of inevitability of enshittification. they've released convenient tools, realized there's value to milk and are firing on all cylinders to capture 120% of it. great for IPO, not so great for customers in the long run.

      • robwwilliams 17 hours ago
        There are different level of who gets locked in. Almost every health care system in the USA is locked in to either an Epic/Oracle barrel or a Cerner barrel. I hope AI breaks this duopoly open soon.
        • danaw 7 hours ago
          hate to break it to you but Oracle now owns Cerner too :)
      • chickensong 15 hours ago
        > specifically speaking about aws, the predictions vs what actually happened is a massive gap

        I guess I'm one of the people who disagree, specifically about AWS. I think a lot of companies just watch their bill go up because they don't have the appetite to unwind their previous decision to go all-in on AWS.

        Ignoring egress fees, migrating storage and compute isn't hard, it's all the auxiliary stuff that's locked in, the IAM, Cognito, CloudFormation, EventBridge, etc... Good luck digging out of that hole. That's not to say that AWS doesn't work well, but unless you have a light footprint and avoided most of their extra services, the lock-in feels pretty real.

        That's what it feels like Anthropic is doing here. You could have a cron job under your control, or you could outsource that to a Claude Routine. At some point the outsourced provider has so many hooks into your operations that it's too painful to extract yourself, so you just keep the status quo, even if there's pain.

        • JohnMakin 15 hours ago
          The AWS things you mentioned you don’t need to mess with at all, with the exception of IAM, which costs nothing.

          Your experience just hasn’t been my experience, I guess. The more managed the services you use, the more you are going to pay; for a very long time I’ve gotten by paying for compute, network, and storage on the bare-bones services. If you want convenience, you will pay for it.

          One area that was a little shitty that has changed a lot is egress costs, but we mostly have shifted to engineering around it. I’ve never minded all that much, and AWS support is so good at enterprise tiers that they’ll literally help you do it.

          • chickensong 14 hours ago
            We're talking about add-on services, and you were comparing to cloud providers and implying it doesn't really matter because vendor lock-in didn't really happen as feared. I made the case that it's the add-on services that create the lock-in.

            > I’ve got by with paying for compute, network, and storage on the barebones services.

            Yes, as I mentioned, that type of migration isn't difficult, which is akin to migrating to a different model provider, but that's not what we're discussing. You can't hand-wave the issue away if you're not even talking about the topic at hand.

            That said, I agree with your suspicions of how it'll shake out in the end, because most businesses behave the same way, and always try and lock-in their customers.

          • AshamedBadger56 14 hours ago
            > the AWS things you mentioned you don’t need to mess with at all

            Not the OP, but I suspect they meant that it's a huge pain migrating to a different cloud provider when all those features are in use, not that managing them in AWS is a mess.

        • ElFitz 14 hours ago
          I am curious, what do people use Cognito for? I’ve never not ended up regretting using it.
          • fragmede 14 hours ago
            Cognito is AWS's customers' customers' user-login system: I, as a SaaS company, would use it so my users can log in to my platform. They charge per user, so if my platform is going to have millions of users, choosing Cognito is a bad idea that will eat all my money.

            However if I only expect to have a handful of (lucrative) users, it's not the worst idea. The other reason to use Cognito is that AWS handles all the user login issues, and costs very few lines of code to use on my end. The fatal security issue is getting hacked, either the platform as a whole, eg S3 bucket with bad perms or user login getting leaked and reused. While obviously no system is unhackable, the gamble is if a homegrown system is more impervious than Cognito (or someone else's eg Supabase). With a large development team where the login system and overall system security isn't going to be an afterthought, I wouldn't think about using Cognito, but where both of those things are an afterthought, I'd at least consider Cognito, or some other managed system.

            The ultimate problem with Cognito, though, is the vendor lock-in. (Last I checked, which was years ago) in order to migrate users out, they have to reset their passwords, which would cause users to bounce off your service instead of renewing their subscriptions.

            • JohnMakin 12 hours ago
              That’s where I end up getting hired, implementing similar functionality on my own. It's a tradeoff that must be measured: do you want to invest in someone like me, or offload it to AWS? Quick fixes with managed services are tempting, but if you offload it to AWS, you will of course bear, in terms of what AWS provides and dictates, the costs that my salary would have absorbed.
      • charcircuit 9 hours ago
        >the predictions vs what actually happened is a massive gap

        AWS is still charging a highway robbery price for internet bandwidth.

      • phist_mcgee 16 hours ago
        Let's see how it shakes out after Anthropic and OpenAI fully stop subsidizing their plans; that may alter the calculus.
    • pc86 16 hours ago
      > - No trust that they won't nerf the tool/model behind the feature

      To the contrary, they've proven again and again and again they'll absolutely do that the first chance they get.

      • rbalicki 16 hours ago
        You can lessen your dependence on the specific details of how /loop, code routines, etc. work by asking the LLM to do simpler tasks, and instead, having a proper workflow engine be in charge of the workflow aspects.

        For example, this demo (https://github.com/barnum-circus/barnum/tree/master/demos/co...) converts a folder of files from JS to TS. It's something an LLM could (probably) do a decent job of, but: 1. not necessarily reliably; 2. with a workflow engine you can write a much more complicated workflow (e.g. retry logic, timeout logic, additional checks like "don't use `as` casts"); 3. you can be much more token-efficient; and 4. you can be LLM-agnostic.
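        The split can be sketched in a few lines (all names here are hypothetical, and the LLM call is stubbed out, which is exactly what makes the workflow LLM-agnostic):

```python
# Sketch: the workflow engine owns retries and validation; the LLM is a
# small, replaceable function. Names and checks are illustrative.

def convert_file(source: str, llm_call) -> str:
    """Ask an LLM for a TS conversion, retrying when a check fails."""
    for _attempt in range(3):  # retry logic lives in the workflow, not the prompt
        result = llm_call(f"Convert this JS to TypeScript:\n{source}")
        if " as " not in result:  # extra check: forbid `as` casts
            return result
    raise RuntimeError("LLM kept emitting `as` casts after 3 attempts")

# Stand-in for any provider's API call.
def fake_llm(prompt: str) -> str:
    return "const n: number = 1;"

print(convert_file("var n = 1;", fake_llm))  # -> const n: number = 1;
```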

        So, IMO, in the presence of tools like that, you shouldn't bother using /loop, code routines, etc.

        • danudey 13 hours ago
          One thing my team lead is working on is using Claude to 'generate' integration tests/add new tests to e2e runs.

          Straight up asking Claude to run the tests, or to generate a test, could result in inconsistencies between runs, between tests, between models, and so on. So instead he created a tool in which a test is defined by its inputs, outputs, and some details. Now we have a directory full of markdown files describing a test suite (parameters, test cases, error cases, etc.), and Claude generates invocations of the tool instead.

          This means that whatever variation Claude, or any other LLM, might have run-to-run or drift over time, it all still has to be funneled through a strictly defined filter to ensure we're doing the same things the same way over time.
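          That funneling step can be as simple as a strict parser over the markdown specs (a hypothetical sketch of the idea, not their actual tool; field names and the sample spec are made up):

```python
# Sketch of the "strict filter" idea: the LLM may only emit a constrained
# spec (key: value lines in a markdown file), and a deterministic parser
# rejects anything incomplete before a test is ever generated.
REQUIRED = {"name", "input", "expected"}

def parse_spec(md_text: str) -> dict:
    """Parse 'key: value' lines from a markdown test spec."""
    spec = {}
    for line in md_text.splitlines():
        if line.lstrip().startswith("#") or ":" not in line:
            continue  # skip headings, blank lines, and prose
        key, _, value = line.partition(":")
        spec[key.strip().lower()] = value.strip()
    missing = REQUIRED - spec.keys()
    if missing:
        raise ValueError(f"spec is missing fields: {sorted(missing)}")
    return spec

spec = parse_spec("""\
# login-test
name: login rejects empty password
input: {"user": "a", "password": ""}
expected: 401
""")
print(spec["expected"])  # -> 401
```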

          • latentsea 12 hours ago
            I'm looking at implementing https://github.com/coleam00/Archon as a means to solve this. You can build arbitrary workflows custom to your codebase. Looks to bring a bit of much-needed determinism.
          • zx8080 13 hours ago
            What kind of system/area (or product) are you working on?
        • jplusequalt 17 minutes ago
          >You can lessen your dependence on the specific details of how /loop, code routines, etc. work by asking the LLM to do simpler tasks, and instead, having a proper workflow engine be in charge of the workflow aspects.

          Or, you know, by writing the code yourself?

    • crystal_revenge 17 hours ago
      This sounds like someone complaining about how Windows is a black box while ignoring the existence of Linux/BSD.

      I'm currently hosting, on very reasonable consumer-grade hardware, an LLM that is on par, performance-wise, with what anyone was paying for about a year ago, including all the layers in between the model and the user.

      Llama.cpp serves up Gemma-4-26B-A4B; Open WebUI handles the client details: system prompt, web search, image gen, file uploading, etc. Conduit and Tailscale provide the last layer so I can have a mobile experience as robust as anything I get from Anthropic, plus I know how all the pieces work and can upgrade, enhance, etc. to my heart's delight. All this runs on a pretty standard MBP at >70 tokens/sec.

      If you want to better understand the agent side of things, look into the Hermes agent and you can start understanding the internals of how all this stuff is done. You can run a very competitive coding agent using modest hardware and open models. On a similar note, image/video gen on local hardware has come a long way.

      Just like Linux, you're going to be exchanging time for this level of control, but it's something anyone who takes LLMs seriously and shares these concerns can easily get started with.

      Yet I still see comments like this that seem to completely ignore the incredible work in the open-model community, which has been improving steadily and is starting to be genuinely competitive. If you relax the "local" requirement and just want more performance from an LLM backend, you can replace the llama.cpp part with a call to Kimi 2.5 or Minimax 2.7 (which you could feasibly run at home; not Kimi, though). You still control all the additional parts of the experience but run models that are very competitive with current proprietary SoTA offerings, 100% under your control and at a fraction of the price.

      • suslik 9 hours ago
        What is reasonable hardware in your case? Doesn’t this model require 50+ GB of VRAM?
        • kusha 6 hours ago
          Gemma-4-26B-A4B does not require 50+ GB of VRAM. It is an MoE model, so only 4B parameters are active at a time, and it's not as GPU-dependent. I can run an 8-bit quant on 16 GB of VRAM and ~20 GB of regular DDR5 RAM.
      • alex_sf 11 hours ago
        Every time I've tried a local model, and I have tried lots over a couple of years now, they just seem like they were overtrained on benchmarks. They consistently perform dramatically worse than even older models from Anthropic/OAI/Google.
        • slopinthebag 10 hours ago
          You're just using them wrong.
          • eloisant 1 hour ago
            That might be true, but still: with Claude Opus I can give a task with 2 lines and it will just do it, with a local Qwen I have to use plan mode for everything even small tasks.
      • slopinthebag 13 hours ago
        You're spot on, btw; not sure why you're getting downvoted. It's funny that a community of supposed "hackers" seems to think your only choice is doling out money to hyperscalers for what amounts to code-writing SaaS.
        • nateb2022 12 hours ago
          And I would add that the main criticism:

          > LLMs and LLM providers are massive black boxes... No trust that they won't nerf the tool/model behind the feature... No trust they won't sunset the feature (the graveyard of LLM-features is vast and growing quickly while they throw stuff at the wall to see what sticks)

          Doesn't really apply to the article regarding Claude Code Routines in particular. Should this feature disappear, it would be trivially easy to set up a similar pipeline locally, using a cron job to run opencode configured with a local LLM. Easy. I have no qualms using a convenient feature I could reimplement myself; it saves me time.

    • mikepurvis 18 hours ago
      > I want to pick up and move to another harness and/or model with minimal fuss. Buying in to things like this would make that much harder.

      Yes, I expect that is very much the point here. A bunch of product guys got in front of a whiteboard and said: okay, the thing is in wide use, but our main moat is that our competitors are even more distrusted in the market than we are; other than that it's completely undifferentiated and can be swapped out in a heartbeat for multiple other offerings. How do we persuade our investors we have a locked-in customer base that won't just up stakes in favour of other options, or just run open-source models themselves?

      • throwup238 17 hours ago
        I think they really kneecapped themselves when they released the Claude for GitHub integration, which allows anyone to use their Claude subscription to run Claude Code in GitHub Actions for code reviews and arbitrary prompts. Now they’re trying to backtrack from that with a cloud solution.
    • jordanarseno 14 hours ago
      In my view, lock-in anxiety is a holdover from a previous era of tech platforms, and it doesn't really apply in an era where frontier agents can migrate you between vendors in hours, so I personally see no good in worrying about this. On top of that, every major LLM provider is rapidly converging on the same feature set: they watch each other and clone what works. So the "platform" you're building on isn't really Anthropic's platform so much as the emerging shared surface area of what LLMs can do. By the time this Routines feature becomes a problem for you, other solutions will have emerged, and I'd be very surprised if you couldn't lift-and-shift very quickly.
    • palata 19 hours ago
      > - No trust that they won't nerf the tool/model behind the feature

      I actually trust that they will.

      • gardenhedge 18 hours ago
        Yeah, I build my workflows with two things in mind:

        1) that AI will be more advanced in the future

        2) that the AI I am using will be worse in the future

        • freedomben 16 hours ago
          Same! I actually have some comments in my codebase now like this one:

              # NOTE: This is inefficient, but deterministic and predictable. Previous
              #       attempts at improvements led to hard-to-predict bugs and were
              #       scrapped. TODO: improve this function when AI gets better
          
          I don't love it or even like it, but it is realistic.
      • dvfjsdhgfv 18 hours ago
        I believe the current game everybody plays is:

        * make sure the model maxes out all benchmarks

        * release it

        * after some time, nerf it

        * repeat the same with the next model

        However, the net sum is positive: in general, models from 2026 are better than those from 2024.

        • snek_case 18 hours ago
          I guess there's a pretty clear incentive to nerf the current model right before the next model is about to come out.
          • chinathrow 17 hours ago
            Wouldn't that amount to fraud?
            • tomwojcik 17 hours ago
              Serious question: do we actually know what we're paying for? All I know is that it's access to models via CLI, aka Claude Code. We don't know what models they use, how the system prompt changes, or what the actual rate limits are (yet Anthropic will become a trillion-dollar company any moment now).
              • xienze 17 hours ago
                > We don't know what models they use, how the system prompt changes, or what the actual rate limits are (yet Anthropic will become a trillion-dollar company any moment now).

                Not just that, but there’s really no way to come to an objective consensus of how well the model is performing in the first place. See: literally every thread discussing a Claude outage or change of some kind. “Opus is absolutely incredible, it’s one shotting work that would take me months” immediately followed by “no it’s totally nerfed now, it can’t even implement bubble sort for me.”

                • a1o 14 hours ago
                  I feel like if I start something from scratch, it gets what feels like 80% right, but then it takes a lot more time to do the last 20%. And if you change scope afterwards, or just get more specific, it's like it gets dumber the longer you work with it. If you can think truly modularly, spend a ton of time breaking your problem into small units, and then work on your units separately, then maybe what it does could be maintainable. But even there I'm unsure. I spent an entire day trying to get it to do a node graph right (the visual of it, that is) and it's still so-so. But a single small script that does one specific small thing? Yeah, that it can do. You'd still better make sure you can test it easily, though.
                • ElFitz 14 hours ago
                  > See: literally every thread discussing a Claude outage or change of some kind. “Opus is absolutely incredible, it’s one shotting work that would take me months” immediately followed by “no it’s totally nerfed now, it can’t even implement bubble sort for me.”

                  Funny: I’m literally, at this very moment, working on a way to monitor that across users. Wasn’t the initial goal, but it should do that nicely as well ^^

                • verve_rat 13 hours ago
                  We find it incredibly hard to articulate what separates a productive and effective engineer from a below-average one. We can't objectively measure an engineer's effectiveness, so why would we think we could measure LLMs cosplaying as engineers?
              • rachel_rig 10 hours ago
                [dead]
            • twobitshifter 16 hours ago
              Did Apple slow down iPhones before the new release? I’m really asking. People used to say that and I can’t remember if it was proven or not?
              • DrewADesign 15 hours ago
                Yeah, but they got sued over it and purportedly stopped. They claimed it was to protect battery health.

                Suuuuuuure it was.

                That said, I had way better experiences with old (but contemporary) Apple hardware than any other kind of old hardware.

              • rexpop 15 hours ago
                [dead]
            • varispeed 15 hours ago
              Funnily enough, it helps to say in your prompt: "Prove that you are not a fraudster and you are not going to go round in circles before providing solution I ask for."

              Sometimes you have to keep starting new sessions until it works. I have a feeling they route prompts to older models whose system prompt says "I am Opus 4.6", but it's really something older and more basic. So by starting new sessions you might get lucky and land on the real latest model.

            • ambicapter 16 hours ago
              Legally?
        • _blk 17 hours ago
          Yup, after the CC token increase two weeks ago, I'm now consistently filling the 1M context window that never went above 30-40% a few days ago. Did they turn it off? I used to see "Co-Authored by Opus 4.6 (1M Context Window)" in git commits; now the advert line is gone. I never turned it on or off. Maybe the defaults changed, but /model doesn't show two different context sizes for Opus 4.6.

          I never asked for a 1M context window, then I got it and it was nice, now it's as if it was gone again .. no biggie but if they had advertised it as a free-trial (which it feels like) I wouldn't have opted in.

          Anyways, seems I'm just ranting, I still like Claude, yes but nonetheless it still feels like the game you described above.

          • dr_kiszonka 16 hours ago
            The default prompt cache TTL changed from 1 hour to 5 minutes. Maybe this is what you are experiencing.
          • varispeed 15 hours ago
            I find this 1M context bollocks. It's basically crap past 100k.
          • troupo 16 hours ago
            They are now literally blaming users for using their product as advertised:

            https://x.com/lydiahallie/status/2039800718371307603

            --- start quote ---

            Digging into reports, most of the fastest burn came down to a few token-heavy patterns. Some tips:

            • Sonnet 4.6 is the better default on Pro. Opus burns roughly twice as fast. Switch at session start.

            • Lower the effort level or turn off extended thinking when you don't need deep reasoning. Switch at session start.

            • Start fresh instead of resuming large sessions that have been idle ~1h

            • Cap your context window, long sessions cost more CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000

            --- end quote ---

            https://x.com/bcherny/status/2043163965648515234

            --- start quote ---

            We defaulted to medium [reasoning] as a result of user feedback about Claude using too many tokens. When we made the change, we (1) included it in the changelog and (2) showed a dialog when you opened Claude Code so you could choose to opt out. Literally nothing sneaky about it — this was us addressing user feedback in an obvious and explicit way.

            --- end quote ---

            • torginus 15 hours ago
              Off topic, but I found Sonnet useless. It can't do the simplest tasks, like refactoring a method signature consistently across a project or following instructions accurately about what patterns/libraries should be used to solve a problem.
              • rhlsthrm 6 hours ago
                It's crazy because when Sonnet came out it was heralded as the best thing since sliced bread, and now people are literally saying it's "useless". I wonder if this is our collective expectations increasing or the models are getting worse.
                • troupo 6 hours ago
                  Probably both :)

                  New models come out with inflated expectations, then they are adjusted/nerfed/limited for whatever reason. Our expectations remain at previous levels.

                  New models come out with once again inflated expectations, but now it's double inflation, because we're still on the previous level of expectations. And so on.

                  I think it's likely to get worse. Providers are running out of training data, and running bigger and bigger models to more and more people is prohibitively expensive. So they will try to keep the hype up while the gains are either very small or non-existent.

          • robwwilliams 17 hours ago
            Yep; second time in five months we have gone from 1 million back to 200 thousand.
            • _blk 17 hours ago
              Hmm, I just reverted to 2.1.98 and now in /model, "default" shows (1M context) while "opus" shows (200k) .. it's totally possible that I just missed the difference between the recommended model (Opus 1M) and plain Opus when I checked, though.
    • jeppester 16 hours ago
      I always hated SEO because it was not an exact science - like programming was.

      Too bad we've now managed to turn programming into the same annoying guesswork.

      • rbalicki 5 hours ago
        If you want to feel like you're using a programming language when orchestrating agents, check out https://github.com/barnum-circus/barnum
      • uriegas 14 hours ago
        I don't really think it is turning into guesswork. A lot of people wrote bad code before by pasting things from the internet they didn't understand. I think some people are using LLMs the same way, but it does not mean that programming has changed. I do think that code quality is being neglected nowadays, though.
        • jeppester 4 hours ago
          The guesswork lies in the "how to poke the black box in the right way", not in the code itself.
        • fragmede 14 hours ago
          Programming has changed. With agentic coding, I go back and forth with the AI to generate a spec along with tooling and exit criteria, then the AI goes off for hours (possibly helped by harness/tooling like Ralph Wiggum); then I do the same thing for a different spec/feature/bug fix and the AI goes off and does that. Repeat until out of tokens. That was previously not how programming went.

          We can quibble over how much that is or is not "programming", but on a post about Claude Code, what's relevant is that this is how things are today. How much code review happens after the AI agent stops churning is relevant to the question of code quality out the other end, but to the question at hand ("has programming changed?"): it either has, or what I'm doing is no longer programming. The semantics are less interesting to me; the point is, when I sit down at my computer to make code happen so I can deliver software to customers, the very nature of what I do has changed.

    • EZ-E 6 hours ago
      > I want a commodity, I want a provider, not a platform

      That is exactly what the big LLM providers are trying to prevent. Being only a commodity provider would make them easy to replace, and would likely mean lower margins than "full feature" enterprise solutions. Switching LLM API providers is next to no work the moment a competitor is slightly cheaper/better.

      Full solutions are more "sticky", harder to replace, and can be sold at higher prices.

    • joelthelion 7 hours ago
      The good news is that, apart from the models themselves, we don't need much from these companies:

      - Use Opencode and other similar open-source solutions in place of their proprietary harnesses. This isn't very practical right now because of the heavily subsidized subscriptions that are hard to compete with. But subsidies will end soon, and with progress in inference, it should be very doable to work with open-source clients in the near future.

      - Use Openrouter and similar to abstract the LLM itself. That makes AI companies interchangeable and removes a lot of any moat they might have.
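      A sketch of how thin that abstraction layer can be, assuming an OpenAI-compatible gateway (OpenRouter exposes one); the model names below are illustrative placeholders, not real identifiers:

```python
import json

def build_chat_request(model: str, prompt: str) -> dict:
    """Provider-agnostic chat payload: behind an OpenAI-compatible
    gateway, swapping vendors is just a different model string."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# The payload is identical across providers; only the model string (plus the
# gateway's base URL and API key, not shown) changes when you switch vendors.
for model in ("vendor-a/model-x", "vendor-b/model-y"):
    print(json.dumps(build_chat_request(model, "hello")))
```

      The HTTP part (POSTing this to the gateway's chat-completions endpoint with a bearer token) is the only provider-specific glue, which is exactly why the moat is so shallow.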

    • hdjrudni 9 hours ago
      I also don't see the value add here... "schedule" is just a cron job. "GitHub Event" is probably a 20-minute integration, which Claude itself can write for you.

      Maybe there's something I'm not seeing here, but I never want to outsource something so simple to a live service.
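      As a sketch, the schedule half really is a crontab entry around headless mode (`claude -p`); the path, prompt, and log file here are made up:

```shell
# Illustrative crontab entry: run a headless Claude session every
# weekday at 07:00 and append the output to a log.
0 7 * * 1-5  cd /home/me/myrepo && claude -p "triage new issues and draft replies" >> "$HOME/claude-routine.log" 2>&1
```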

      • superjan 8 hours ago
        For Anthropic, it is valuable that they control the scheduling, so they can move jobs around to use the infra when it is relatively quiet. If you let customers choose the time, a lot of work will start on the hour.
    • theshrike79 4 hours ago
      Every company is trying to become THE platform where all other tools connect to. Notion is integrating everything under the sun, as is Slack, big LLM providers have one-click MCP installation for all major services.

      But... these are the "retail" tools they sell to people and organisations without the skills or know-how to build a basic agentic loop themselves. Complaining about these being bad and untrustworthy is like comparing a microwave dinner to something you cook yourself. Both will fill your belly equally. One requires zero skill from the user; the other is 90% skill and 10% getting the right ingredients.

      Creating a simple MVP *Claw with tool calling using a local model like gemma4 is literally a 15-minute thing. In 2-3 hours you can make it real pretty. If you base it on something like pi.dev, you can easily make it self-modifying, and it can build its own safeguards.

      That's all this "routines" thing is, it's just an agentic loop they launch in their cloud on a timer. Just like the scheduled tasks in Claude Cowork.
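      Such a loop really is small. A minimal sketch in Python, with the model and the single tool stubbed out (a real version would call your provider or local model where `call_model` is):

```python
def run_shell(cmd: str) -> str:
    """A 'tool' the agent can invoke. Stubbed here; a real loop would execute it."""
    return f"(pretend output of: {cmd})"

TOOLS = {"run_shell": run_shell}

def call_model(messages: list) -> dict:
    """Stub model: requests one tool call, then finishes.
    Replace with a real chat-completion request to your model."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "run_shell", "args": {"cmd": "ls"}}
    return {"final": "done"}

def agent_loop(task: str, max_steps: int = 5) -> str:
    """The whole trick: call the model, run the requested tool, feed the result back."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(messages)
        if "final" in reply:
            return reply["final"]
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": result})
    return "step budget exhausted"

print(agent_loop("list the files"))  # → done
```

      Putting that on a timer in someone's cloud is the product; the loop itself is the 15-minute part.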

    • spprashant 15 hours ago
      I think it behooves us to be selective right now. Frontier labs may be great at developing models, but we shouldn't assume they know what they are doing from a product perspective. The current phase is throwing ideas at the wall and seeing what sticks (see Sora). They don't know how these things will play out long term. There is no reason to believe Cowork/Routines/Skills will survive five years from now. So it might just be better to not invest too much in the ecosystem upfront.
      • rbalicki 3 hours ago
        You may want to check out Barnum, which is a programming language/agent orchestration tool that makes it easy to build things like /loop, or Claude code routines. And you won't end up dependent on the specifics of how Claude code routines work!

        https://github.com/barnum-circus/barnum

      • karl_gluck 15 hours ago
        This is exactly why my preferred method at the moment is simple markdown files with instructions. At worst, a human could do it.
    • bob1029 14 hours ago
      I am still using the chat completion APIs exclusively. I tried the agent APIs and they're way too opinionated for me. I can see 100% of the tokens I am paying for with my current setup.
    • alexhans 8 hours ago
      This is why AI evals [1] and local LLMs should be a focus of your investment.

      If you can define "good enough" for your use case and local LLMs can meet it, you'll get:

      - no vendor lock-in (control)

      - price

      - stability (you decide when to hot-swap in newer models)

      - speed (control)

      - full observability and predictability

      - privacy / data locality (depending on the infrastructure)

      - [1] https://alexhans.github.io/posts/series/evals/measure-first-...

    • ahmadyan 17 hours ago
      > I'm not going to build my business or my development flows on things I can't replicate myself.

      But you can replicate these yourself! I'm happy that Ant/OAI are experimenting to find PMF for "LLM for dev tools". After they figure out the proper stickiness (or if they go away, nerf things, raise prices, etc.), you can always take the off-ramp and implement your own LLM/agent using existing open-source models. The cost of building dev tools is near zero; it's not like codegen, where you need frontier performance.

    • dbmikus 13 hours ago
      We might be building something up your alley! I wanted an OSS platform that let me run any coding agent (or multiple agents) in a sandbox and control it either programmatically or via GUI / TUI.

      Website is https://amika.dev

      And part of our code is OSS (https://github.com/gofixpoint/amika) but we're working on open sourcing more of it: https://docs.google.com/document/d/1vevSJsSCWT_reuD7JwAuGCX5...

      We've been signing up private beta users, and also looking for feedback on the OSS plans.

    • gbro3n 17 hours ago
      I have heard it said that tokens will become commodities. I like being able to switch between OpenAI's and Anthropic's models, but I feel I'd manage if one of them disappeared. I'd probably even get by with Gemini. I don't want to lock in to any one provider any more than I want to lock in to my energy provider. I might pay 2x for a better model, but no more, and I can see even that not being the case for much longer.
    • chinathrow 19 hours ago
      Yeah, so better to convert tokens into software that does the job at close to zero cost, running on your own systems.
    • brandensilva 12 hours ago
      I'm glad I'm not the only one that feels this way. I've been creating a local first open source piece of software that lets me spin up different agent harnesses with different runtimes. I call it Major Tom because I wanted to be set free from the imprisonment of Claude Code after their DMCA aggression for their own leak and actions leading to lock down from open source adoption.

      "Don't put all your eggs in one basket" has been true for me and my business for ages.

      I could really use the open source community to help make this a reality so I'll release this soon hopefully to positive reception from others who want a similar path forward.

    • ulrikrasmussen 8 hours ago
      I agree with your analysis. Platforms are some of the most profitable business models because they come with vendor lock-in, but they are always shittier in the long run compared to commodities. Platforms are a way for companies to capture part of the market and decrease competition by increasing the cost of changing vendors.
    • nine_k 16 hours ago
      In this regard, the release of open-weight Gemma models that can run on reasonable local hardware, and are not drastically worse than Anthropic's flagships, is quite a punch. An M2 Mac Mini with 32GB costs about 10 months' worth of a Claude Max subscription.
      • Readerium 16 hours ago
        In coding they are worse.

        Chinese models (GLM, MiniMax) are better.

        • nine_k 16 hours ago
          Anyway, there are a few models that are freely distributable and can reasonably run on consumer-grade local hardware.

          It changes a number of things. Not all tasks require very high intelligence, but a lot of data may be sensitive enough to avoid sharing it with a third party.

    • uriegas 14 hours ago
      I think AI labs are realizing that they no longer have any competitive advantage other than being the incumbents. Plus hardware improvements might render their models irrelevant for most tasks.
    • windexh8er 15 hours ago
      This 10000%.

      Anthropic wants a moat, but that ship has sailed. Now all I keep reading about is: token burn, downtime and... Wait for it, another new product!

      Anthropic thinks they are pulling one over on the enterprise, and maybe they are with annual lock-in akin to Microsoft. But I really hope enterprise buyers are not this gullible, after all these years. At least with Microsoft the product used to be tangible. Now it's... Well, non-deterministic and it's clear providers will gimp models at will.

      I had a Pro Max account only for a short period of time and during that short stint Anthropic changed their tune on how I could use that product, I hit limits on a Max account within hours with one CC agent, and experienced multiple outages! But don't worry, Anthropic gave me $200 in credits for OpenClaw. Give me a break.

      The current state of LLM providers is the cloud amplified 100x over and in all the worst ways. I had hopes for Anthropic to be the least shitty but it's very clear they've embraced enshittification through and through.

      Now I'm spending time looking at how to minimize agent and LLM use with deterministic automation being the foundation with LLM use only where need be and implemented in simple and cost controllable ways.

    • idrdex 9 hours ago
      The framing is off. AI is a tool that can operate as a human. GOV is how the humans are organized. AI can basically scale GOV. That’s the paradigm shift. Provenance is durable. AI is just the first opportunity we have had to make it scaleable.
    • pjmlp 7 hours ago
      I fully agree with you; however, this is basically the fashion in big corporations:

      building businesses on top of SaaS products, iPaaS integrations, and serverless middleware.

    • simonjgreen 8 hours ago
      Completely agree. Use of features like this places one on the wrong side of the vendor's moat, increasing switching costs and decreasing competitive pressure.
    • codebolt 7 hours ago
      At some point I think I'd prefer to deploy my own model in Azure or AWS and simply bring the endpoint to the coding harness.
    • ChadMoran 9 hours ago
      Agree. I keep my involvement "close to the metal". These higher order solutions seem to cause more noise than provide signal.
    • cush 18 hours ago
      You could so easily build your own /schedule. This is hardly a feature driving lock-in
      • ElFitz 14 hours ago
        Yes, but once everything has been deployed through their web UI or the cli command, and fine-tuned over the weeks and months as kinks get ironed out, how do you port it all to your own?

        Nothing insurmountable or even complex; just laborious. Friction. That's all it takes to lock users in.

    • elias1233 14 hours ago
      Many of the new features in Claude Code have quickly been implemented in other harnesses, for example plugins/skills. After all, it is just a prompt.
    • jwpapi 15 hours ago
      It all went downhill from the moment they changed Reading *.* to reading (*) files.

      I can’t use Claude Code at all anymore, not even for simple tasks. The output genuinely disgusts me. Like a friend who constantly stabs you in the back.

      My favorite AI feature at the moment is JetBrains' predict-next-edit. It's so fast that I don't lose attention, and I'm still fully in control.

    • tiku 18 hours ago
      I believe it doesn't matter; other companies will copy or improve it. The same happened with clawdbot: the number of clones within a month was insane.
    • wookmaster 17 hours ago
      They're trying to find ways to lock you in
      • straygarr 9 hours ago
        Can't tell if you're joking, but if not: was -> want
    • s3p 14 hours ago
      Can you explain what you meant when you called yourself a dumb pipe? What does that mean?
    • sunnybeetroot 18 hours ago
      Isn’t that what LangChain/LangGraph is meant to solve? Write workflows/graphs and host them anywhere?
    • redanddead 8 hours ago
      totally agree

      they're very shady as well! I can't believe I spent $140 on CC and every day they're adding some "feature flag" to make the model dumber. I'm spending more time fighting the tool than using it. It just doesn't feel good. Enterprises already struggle with lock-in with incumbent clouds; I wanna root for neoclouds, but choices matter, and being shady about this and destroying the tool just doesn't sit right with me. If it's not up to the standard, just kick users off; I would rather know than find out. Give users a choice.

      >The flag name is loud_sugary_rock. It's gated to Opus 4.6 only, same as quiet_salted_ember.

      Full injected text:

      # System reminders User messages include a <system-reminder> appended by this harness. These reminders are not from the user, so treat them as an instruction to you, and do not mention them. The reminders are intended to tune your thinking frequency - on simpler user messages, it's best to respond or act directly without thinking unless further reasoning is necessary. On more complex tasks, you should feel free to reason as much as needed for best results but without overthinking. Avoid unnecessary thinking in response to simple user messages.

      @bcherny Seriously? So what's next, we just add another flag to counter that? And the hope is that enough users don't find out / don't bother? That's an ethical choice man.

      • matheusmoreira 7 hours ago
        I swear to god... What Claude Code version introduced this "system reminder"?

        They had obnoxious "output efficiency" instructions in previous versions. The community was patching it out via shell script.

        https://gist.github.com/roman01la/483d1db15043018096ac3babf5...

        It actually improved Opus's performance too.

        A few days later, they deleted the instructions targeted by this script, breaking it.

        Now they're doing this?

    • slopinthebag 17 hours ago
      They have to become a platform because that is their only hope of locking in customers before the open models catch up enough to eat their lunch. Stuff like Gemma is already good enough to replace ChatGPT for the average consumer, and stuff like GLM 5.1 is not too far off from replacing Claude/Codex for the average developer.
    • dheera 10 hours ago
      > No trust they won't sunset the feature

      I've had so many websites break and die because Google or Amazon sunsetted something.

      For example, I had a graphing calculator website with 250K monthly active users (mostly school students, I think) and it just vanished one day because Amazon sunsetted EC2-Classic and I didn't have time to deal with it. Hopefully those students found something else to do their homework with that day.

    • Traubenfuchs 13 hours ago
      Right you are! We aren't even in the real squeezing phase yet and everyone's already crying about plan limits and model nerfing.
    • verdverm 19 hours ago
      I fully endorse building a custom stack (1) because you will learn a lot (2) for full control and not having Big Ai define our UX/DX for this technology. Let's learn from history this time around?
      • gritspants 18 hours ago
        Here's the problem I keep running into with AI and 'history': we all know where this is going. We'll pick our winners and losers in the interim, but so far this is a technology that mostly impacts tech practitioners. Most people don't care, in the same sense that if you're a taxi driver with a manual transmission, the odd person may comment on your prowess with it, but no one really cares. I see a bunch of boys making fools of themselves otherwise.
        • dsf2aa 16 hours ago
          There's something bizarre going on, and many have completely lost their minds.

          The funniest thing I've heard is that now that we have LLMs, humanoid robots are on the horizon. Like, wtf? People who jump to these conclusions were never deep thinkers in the first place. And that's OK; it's good that they signal it, so we know who to avoid.

    • SV_BubbleTime 16 hours ago
      Without getting too pedantic for no reason… I think it’s important to not call this an LLM.

      This isn’t an LLM. It’s a product powered by an LLM. You don’t get access to the model you get access to the product.

      An LLM can’t do a web search, an LLM can’t convert Excel files into something and then into PDF. Products do that.

      I think it's a mistake to say "I don't trust this engine to get me there" rather than "I don't trust this car." Because for the most part the engine, despite giving you slightly different performance every time, is roughly doing the same thing over and over.

      The product is the curious entity you have no control over.

    • Rekindle8090 12 hours ago
      The problem is without a platform Anthropic has no stack and will just be bought up by Google when the bubble pops. Same with OpenAI, without some sort of moat, their product requires third party hardware in third party datacenters and they'll be bought by Microsoft.

      Alphabet doesn't have this issue. Google doesn't need Gemini to win the "AI product" race. It needs Gemini to make Search better at retaining users against Perplexity and ChatGPT search, to make YouTube recommendations and ad targeting more effective, to make Workspace stickier for enterprise customers, to make Cloud more competitive against AWS, and to make Android more useful as a device OS. Every percentage-point improvement in any of those existing businesses generates billions in revenue that never shows up on a "Gemini revenue" line. Any actual Gemini revenue is just a bonus.

      Anthropic trains on Google TPUs hosted in Google Cloud. Amazon invested billions and hosts Anthropic's models on Bedrock/AWS. So the two possible outcomes for Anthropic are: succeed as a platform (in which case Google and Amazon extract rent from every inference and training run), or fail as a platform and get acquired (in which case Google or Amazon absorb the talent and IP directly)

      Hilariously, if the models were open source, Anthropic, OpenAI et al. wouldn't be in this situation. Instead, they have no strategic independence to cover for the lack of product independence, and have to keep chasing "platforms" and throwing out products no one needs. (People need Claude. That's it.)

    • alfalfasprout 16 hours ago
      Yep. Trust is easy to lose, hard to earn. A nondeterministic black box that is likely buggy, will almost certainly change, and has a likelihood of getting enshittified is not a very good value proposition to build on top of or invest in.

      Increasingly, we're also seeing the moat shrink somewhat. Frontier models are converging in performance (and I bet even Mythos will get matched) and harnesses are improving too across the board (OpenCode and Codex for example).

      I get why they're trying to do that (a perception of a moat bloats the IPO price) but I have little faith there's any real moat at all (especially as competitors are still flush with cash).

      • dsf2aa 15 hours ago
        I think in the long term, open-source models will be enough, and a handful of firms will figure out how to use them at scale to generate immense cash flows. It is in China's interest that America not have more healthy going concerns generating tens of billions in cash flow that is then reinvested to widen the gap in capabilities, with the rest of the world purchasing their offerings.

        So yeah, doesn't bode well for being a pure play model producer.

    • Thoko14 2 hours ago
      [dead]
    • andrewmcwatters 19 hours ago
      [dead]
  • andai 20 hours ago
    I'm a little confused on the ToS here. From what I gathered, running `claude -p <prompt>` on cron is fine, but putting it in my Telegram bot is a ToS violation (unless I use per-token billing) because it's a 3rd party harness, right? (`claude -p` being a trivial workaround for the "no 3rd party stuff on the subscription" rule)

    This Routines feature notably works with the subscription, and it also has API callbacks. So if my Telegram bot calls that API... do I get my Anthropic account nuked or not?

    • joshstrange 19 hours ago
      This deserves to be the top comment on every Anthropic HN post. It's absurd that they don't clarify this better, while so many people run around online saying the exact opposite of what their (confusing) docs say.

      The Chilling Effect of this is real and it gets more and more frustrating that they can't or won't clarify.

      • throwup238 19 hours ago
        It’s also absurd that they’re doing their communication on a bunch of separate platforms like HN, Reddit, and Github with no coherent strategy or consistency as far as I can tell. Can’t I just get policy clarifications in my email like a normal business?

        I downgraded my $200/mo sub to $20 this past week and I’m going to try out Codex’s Pro plans. Between the cache TTL (does it even affect me? No idea), changes in the rate limit, 429 rate limit HTTP status code during business hours, adaptive thinking (literally the worst decision they’ve ever made, as far as my line of work is concerned), dumb agent behavior silently creating batshit insane fallthroughs, clearly vibe coded harness/infrastructure, and their total lack of transparency, I think I’m done. It was fun while it lasted but I’m tired of paying for their mistakes in capacity planning and I feel like the big rug pull (from all three SOTA providers) is coming like a freight train.

        • sidrag22 18 hours ago
          I was "Claude only" for well over a year. Kinda crazy how they seem to be gaining a LOT of public attention the last few months, yet i see this type of sentiment from other devs/myself. for me it started with their opencode drama, and openai's decision to embrace opencode in response.

          I didn't even know what opencode was prior to that drama, yet now here I am using opencode and a ton of crafted OpenAI agents in my projects. I'd love to have some Claude agents in that mix, but I guess I'm stuck in Claude Code if I wanna even touch their models... I'd love to go back to just Claude, as I "trust" them more in a sorta less-evil-vibe manner, but if they're gonna prevent subscription usage in tools people use to give themselves more freedom, they've gotta close that gap with their own tools, rather than pumping out stuff like this, which scares me off given the past couple months.

          I totally understand why they are cutting off 3PA access to stuff like OpenClaw, where the average user is a power user compared to the average Claude user. I haven't kept up much with their opencode issues, but I just know I can't get behind a company actively trying to make my usage of tokens less optimized in order to keep me locked into their ecosystem.

          Really just kinda hoping local models kill it all for devs after a few years, I'm not interested in perma relying on data centers for my workflow.

          • lbreakjai 15 hours ago
            The most cost-effective way I've found to use their models is through a GitHub Copilot licence. GitHub charges by request, not per token. Asking Opus with high effort to plan a feature in depth "costs" the same as asking it the number of "r"s in "strawberry".

            I've got a setup where GPT5-mini (free on GH) talks to you to refine and draw the outline of your feature, calls a single Opus subagent to plan it in depth, then calls a single Sonnet subagent to implement it.

            GitHub will only charge you for one Opus request (with a 3x multiplier) and one Sonnet request, whether they consume 50 or 500,000 tokens. I'm running this setup nine hours a day at work and I've barely consumed 40% of my monthly allowance.

        • hayd 16 hours ago
          Same. I use pi, and Anthropic's pricing changes have made it unusable with Max. Codex works pretty much the same; no need to change development practices... apart from no 429s (so far).
        • redanddead 7 hours ago
          They actively made the product worse and are trying to distract us with "oh my god we made AGI". And then released that to big corps while gaslighting users.

          That was an ethical choice. Say what you will about OpenAI, they're actually transparent about things. I'm sticking to GPT from now on; I can't see myself growing with a company that does this. Routines? Great, awesome. Is it also downgraded/fucked with every other day? Monitor tool? Awesome. Will it stop monitoring? No dude.

      • wild_egg 14 hours ago
        I hate the feeling of playing roulette with my account every time I use their tools.

        Since they refuse to actually provide definitive rules or policies, I have fully moved off their models and actively encourage all the other devs I know to do the same.

      • NewsaHackO 15 hours ago
        But it is pretty clear in their documentation. You just don't want to see it because it isn't the answer you want. The documentation clearly says that you cannot use 'claude -p' as part of a pipeline to call other tools. All tool calls have to be made by Claude Code itself. If the output of the Telegram bot is used as a proxy to call other tools, then no, it is not allowed.
    • stephbook 18 hours ago
      The ambiguity is intentional. Like Microsoft not banning volume licenses. They want to scare you, so you don't max out your subscription – which they sell at a loss.

      Another comparison would be "unlimited storage", where "unlimited" means some people will abuse it and the company will soon limit the "unlimited."

      • pixel_popping 17 hours ago
        Literally yeah, the ambiguity is just so they can ban you anytime they want. People underestimate Anthropic too much: obviously they have an insane number of scrapers, bots... no comment online is made without their awareness, and it's analyzed by a bunch of agents that then do prediction, and surely so much more. They know exactly what they are doing.
    • causal 18 hours ago
      Yeah in the span of a month or so we had:

      - SDK that allows you to use OAuth authentication!

      - Docs updated to say DO NOT USE OAUTH authentication unless authorized! [0]

      - Anthropic employee Tweeting "That's not what we meant! It's fine for personal use!" [1]

      - An email sent out to everyone saying it's NOT fine do NOT use it [2]

      Sigh.

      [0] https://code.claude.com/docs/en/agent-sdk/overview#get-start...

      [1] https://www.reddit.com/r/ClaudeAI/comments/1r8et0d/update_fr...

      [2] https://news.ycombinator.com/item?id=47633396

    • unshavedyak 20 hours ago
      Wait we can't use claude -p around other tools? What is the point of the JSON SDK then? Anthropic is confusing here, ugh.

      edit: And specifically i'm making an IDE, and trying to get ClaudeCode into it. I frankly have no clue when Claude usage is simply part of an IDE and "okay" and when it becomes a third party harness..

      • cortesoft 19 hours ago
        I was pretty sure that claude -p would always be fine, but I looked at the TOS and it is a bit unclear.

        It says in the prohibited use section:

        > Except when you are accessing our Services via an Anthropic API Key or where we otherwise explicitly permit it, to access the Services through automated or non-human means, whether through a bot, script, or otherwise.

        So it seems like using a harness or your own tools to call claude -p is fine, AS LONG AS A HUMAN TRIGGERS IT. They don’t want you using the subscription to automate things calling claude -p… unless you do it through their automation tools I guess? But what if you use their automation tool to call your harness that calls claude -p? I don’t actually know. Does it matter if your tool loops to call claude -p? Or if your automation just makes repeated calls to a routine that uses your harness to make one claude -p call?

        It is not nearly as clear as I thought 10 minutes ago.

        Edit: Well, I was just checking my usage page and noticed the new 'Daily included routine runs' section, where it says you get 15 free routine runs with your subscription (at least with my max one), and then it switches to extra usage after that. So I guess that answers some of the questions... by using their routine functionality they are able to limit your automation potential (at least somewhat) in terms of maxing out your subscription usage.

        • FuckButtons 11 hours ago
          What is the point of a cli if you aren’t allowed to script it? Nonsense.
        • ElFitz 14 hours ago
          15? Per month?

          What’s even the point?

          • adastra22 12 hours ago
            I presume it is per 5 hr window?
      • hmokiguess 19 hours ago
        • unshavedyak 19 hours ago
          Possibly, though at first i was entirely focusing (and still am) on Claude Code usage. Given that CC had an API, i figured its own SDK would update faster/better/etc to new Claude features that Anthropic introduces. I'm sure ACP is a flexible protocol, but nonetheless i was just aiming for direct Claude integration.. and you know, it's an official SDK, seemed quite logical to me.

          It would be absurd to me if the same application is somehow allowed via ACP but not via the official SDK. Though perhaps the official SDK offers data/features that they don't want you to use in certain scenarios? If that were the case, though, it would be nice if they actually published a per-SDK-API restrictions list.

          That we're having to guess at this feels painful.

          edit: Hah, hilariously you're still using the SDK even if you use ACP, since Claude doesn't have ACP support i believe? https://github.com/agentclientprotocol/claude-agent-acp

          • adastra22 12 hours ago
            ACP is the SDK. They renamed it.
            • BoorishBears 12 hours ago
              Where did you see that?

              The only rename I'm aware of is the Claude Code SDK becoming the Claude Agent SDK, but that was still separate from ACP

      • grafmax 20 hours ago
        They’re shooting themselves in the foot with these dumb restrictions.
        • taytus 19 hours ago
          They are not dumb restrictions. They just don't have the compute. That is the dumb part. Dario did not secure the compute they need so now they are obviously struggling.
          • joshstrange 19 hours ago
            The restrictions are dumb not because they're lower than any of us want them to be, but because they're unclear. Every time Claude comes up on Hacker News, someone asks this question. And every time, people chime in to agree that the rules are unclear, or someone weighs in saying no, it's totally clear, while proceeding not to point at any official resource and/or to "explain" the rules in a way that is incompatible with official documentation.

            Example: https://news.ycombinator.com/item?id=47737924

            • stavros 15 hours ago
              There's another part that's bullshit: If you've paid for an annual subscription, for a given number of tokens, welp, now you're getting fewer tokens. They've decreased the limits mid-subscription. How is it not bait-and-switch to pay for something for a year only to have something else delivered?
              • mcmcmc 14 hours ago
                It is, but good luck getting the FTC to care. Maybe the EU will do something about it.
            • taytus 18 hours ago
              You are arguing something different. My point is that they must apply these restrictions. Do I think they could have calculated their growth a little better? Yes, of course, but hindsight is 20/20.
              • joshstrange 18 hours ago
                We might be talking past each other, I promise I'm not just trying to argue.

                > My point is that they must apply these restrictions.

                I fully understand and respect that they need restrictions on how you can use your subscription (or any of their offerings). My issue is not that there _are_ restrictions but that the restrictions themselves are unclear, which leads to people being unsure where the line is (that they are trying not to cross).

                Put simply: At what point is `claude -p` usage not allowed on a subscription:

                - Running `claude -p` from the CLI?

                - Running `claude -p` on a Cron?

                - Running `claude -p` as a response to some external event? (GH action, webhook, etc?)

                - Running `claude -p` when I receive a Telegram/Discord/etc message (from myself)?

                Different people will draw the line in different places and Anthropic is not forthcoming about what is or is not allowed. Essentially, there is a spectrum between "Running claude by hand on the command line" and "OpenClaw" [0] and we don't know where they draw the line. Because of that, and because the banning process is draconian and final with no appeals, it leads to a lot of frustration.

                [0] I do not use OpenClaw nor am I arguing it should be allowed on the subscription. It would be nice if it was but I'm not saying it should be. I'm just saying that OpenClaw clearly is _not_ allowed but `claude -p` wouldn't be usable at all with a subscription if it was completely banned so what can it (safely) be used for?
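
                For concreteness, every one of those scenarios ultimately runs the same headless command; only the trigger differs, which is why the line is so hard to locate. A sketch (the -p and --output-format flags match Claude Code's documented headless usage, but treat the details as illustrative):

```python
# Hypothetical wrapper around the headless CLI. Whether this argv is launched
# by a human at a shell, a cron entry, a GH Action webhook handler, or a
# Telegram bot is invisible at this level - exactly where the ambiguity lives.
import shlex

def claude_headless(prompt: str) -> list[str]:
    # -p runs one non-interactive prompt; --output-format json makes the
    # result machine-readable (flags as documented for Claude Code).
    return ["claude", "-p", prompt, "--output-format", "json"]

cmd = claude_headless("summarize today's failing tests")
print(shlex.join(cmd))
```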

              • JyB 8 hours ago
                Restrictions don’t have to be confusing, they can be clear. You are missing the whole point.
          • dgellow 19 hours ago
            Their growth over the past months has been more than insane. It’s completely expected they don’t have the compute. You don’t have infinite data centers around
            • taytus 19 hours ago
              Like it or not, OpenAI isn't having the same compute strain, meaning this was predictable.
              • rogerrogerr 12 hours ago
                Or, Anthropic has better models and is experiencing higher demand because of that.

                Or, OpenAI was reckless in securing compute.

          • mcmcmc 14 hours ago
            It’s dumb to piss off their customers with confusing rule changes instead of just raising their prices to deal with high demand. They might even make a profit
    • mrgill 3 hours ago
      Don't use claude -p in any kind of harness at all. I used it ONCE in a local custom made one, and got my account nuked. No help from them at all. Appealed, and the appeal was denied. I had been a Max subscriber since day 1 and Claude subscriber since day 1 in 2023. They don't care.
      • causal 1 hour ago
        It's like they oversold capacity and are looking for excuses to get rid of users lol

        Opus is fine but it's not THAT much better than the alternatives. I want to support Anthropic because they seem "less shady" than OpenAI but they sure seem determined to push people away.

    • imdsm 2 hours ago
      What if claude triggers claude? Is claude automated or non-human?

      So if claude decides to trigger claude -p then claude violates the ToS on your behalf and you get your account nuked?

      • causal 1 hour ago
        Claude sets an environment variable to prevent nested invocation, but I've found it can be unset. No idea if they consider that a violation of the secret laws though.
  • comboy 18 hours ago
    Unrelated, but Claude has been performing so tragically the last few days (maybe weeks, but mostly days) that I had to reluctantly switch. Reluctantly, because I enjoy it. Even on the most basic stuff, like most Python scripts, it has to rerun because of some syntax error.

    The new reality of coding took away one of the best things for me - that the computer always just does what it is told to do. If the results are wrong it means I'm wrong, I made a bug and I can debug it. Here.. I'm not a hater, it's a powerful tool, but.. it's different.

    • scandinavian 7 hours ago
      I'm not a big user, but I have been doing some vibe-ish coding for a PoC the past few days, and I'm astonished at how bad it is at python in particular (Opus 4.6 High).

      * It likes to put inline imports everywhere, even though I specify in my CLAUDE.md that it should not.

      * We use ruff and pyright and require that all problems are addressed or at least ignored for a good reason, but it straight up #noqa ignores all issues instead.

      * For typing it used the builtin 'any' instead of typing.Any, which is nonsense.

      * I asked it to add a simple sum of a column from a related database table, but instead of using a calculated sum in SQL it did a classic n+1 where it gets every single row from the related table and calculates the sum in python.

      Just absolute beginner errors.
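
      For the n+1 point specifically, here's the shape of the mistake next to the aggregate query it should have been, using an in-memory sqlite3 stand-in (the schema and names are invented for illustration):

```python
# The n+1 pattern vs. a single aggregate query, on a throwaway sqlite schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE line_items (order_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1);
    INSERT INTO line_items VALUES (1, 10.0), (1, 2.5), (1, 7.5);
""")

# n+1 style: fetch every related row, then sum in Python
rows = conn.execute(
    "SELECT amount FROM line_items WHERE order_id = ?", (1,)
).fetchall()
total_python = sum(amount for (amount,) in rows)

# aggregate style: let the database compute it in one query
(total_sql,) = conn.execute(
    "SELECT COALESCE(SUM(amount), 0) FROM line_items WHERE order_id = ?", (1,)
).fetchone()

assert total_python == total_sql
```

      With one row it hardly matters, but the n+1 version issues a query per parent row and ships every child row over the wire, which is exactly the beginner mistake being described.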

      • Lord-Jobo 2 hours ago
        It really does have some disgusting inline behaviors. I’ve also seen it do some really bizarre stuff with SQL
      • lovlar 6 hours ago
        Clajjan is this you?
    • taspeotis 15 hours ago
      • comboy 2 hours ago
        I think the API is fine; likely only subscriptions are affected. Not to mention trivial heuristics could differentiate repeated API calls on the same data from normal CLI usage, although that would be true malice.

        It seemed to me that it was performing better through opencode using API but did not test extensively.

      • chillacy 12 hours ago
        If SWE-bench is public, then Anthropic is at a minimum probably also watching their SWE-bench scores when making changes. I'd put more trust in a tracker that runs a private benchmark not known to Anthropic.
    • bluegatty 16 hours ago
      Codex with 5.4 xhigh. It's a bad communicator but does the job.
      • elAhmo 14 hours ago
        You mean codex (client) with GPT 5.4 xhigh? I am using Codex 5.3 (model) through Cursor, waiting for Codex 5.4 model as I had great experience so far with 5.3.
        • bluegatty 14 hours ago
          yes codex. it has 5.4.
      • winrid 8 hours ago
        It's bad at long running tasks.
        • bluegatty 8 hours ago
          Yes and no. It's bad because of the shorter context, but it does have auto-compaction, which is much better than Claude's. If you provide it documentation to work from and re-reference, it works fine long-running.

          Honestly - 'every inch of IQ delta' seems to be worth it over anything else.

          I'm a long time Claude Code supporter - and I'm ashamed to admit how instantly I dropped it when discovering how much better 5.4 is.

          I don't trust Claude anymore for anything that requires heavy thinking - Codex always finds flaws in the logic.

          But this happens every few months.

    • pacha3000 18 hours ago
      I'm the first to be tired of everyone, for every model, that says "uuuh became dumber" because I didn't believe them

      ... until this week! Opus is struggling worse than Sonnet those last two weeks.

      • saghm 16 hours ago
        Forget the agent itself being dumber: right now I'm getting an "API error: usage limit exceeded" message whenever I try anything despite my usage showing as 26% for the session limit and 8% for the week (with 0/5 routines, which I guess is what this thread is about). This is with the default model and effort, and Claude Code is saying I need to turn on extra usage for it to work. Forget that, I just canceled my subscription instead.

        There's utility in LLMs for coding, but having literally the entire platform vibe-coded is too much for me. At this point, I might genuinely believe they're not intentionally watering anything down, because it's incredibly believable that they just have no clue how any of it works anymore.

      • jpcompartir 16 hours ago
        Likewise, I foolishly assumed everybody else was just doing it wrong.

        But this week I've lost count of the times I've had to say something along the lines of: "Can you check our plan/instructions, I'm pretty sure I said we need to do [this thing] but you've done [that thing]..."

        And get hit with a "You're absolutely right...", which virtually never used to happen to me. I think maybe once since Opus 4.6.

        • spoiler 7 hours ago
          Honestly, I thought it was a skill issue too, but it just turns out I wasn't using it enough.

          I started a new job recently, so I'm asking it a lot of questions about the codebase, sometimes just to confirm my understanding and often it came up with wrong conclusions that would send me down rabbit holes only to find out it was wrong.

          On a side project I gave it literally a formula and told it to run it with some other parameters. It did its usual "let me get to know the codebase" and "I have a good understanding of the codebase" speech, only to follow it up with "what you're asking is not possible". I'm like... no, I know it's possible, I implemented it already, just use it in more places. Only to get the same "o ye ur right, I missed that... blabla".

          Yeah, it's gotten pretty bad...

        • redanddead 7 hours ago
          They track our frustration, which is probably really good coding data. The reason why it's painful is because that's data annotation, it's literally a job people get paid to do, yet we're paying to do it. If they need good data, they just turn the models to shit and gaslight everyone
      • girvo 16 hours ago
        My favourite was Opus 4.6 last night (to be fair, peak IST time, late afternoon my time), on the first prompt with a small context: it jams a copy-pasted function in between a bunch of import statements, doesn't even wire up its own function, and calls it done. Wild, I've not seen failure states like that since old Sonnet 4
        • data-ottawa 12 hours ago
          Yesterday I had my biggest Opus WTF.

          I asked Opus 4.6 to help me get GPU stats in btop on nixos. Opus's first approach was to use patchelf to monkey patch the btop binary. I had to redirect it to just look the nix wiki and add `nixpkgs.config.rocmSupport = true;`.

          But the approach of modifying a compiled binary for a configuration issue is bizarre.

          • pxc 11 hours ago
            It does stuff like this all the time. It loves doing this with scripts with sed, so I'm not surprised to hear about it trying to do this with binaries. It's definitely wilder, though
            • spoiler 7 hours ago
              It frequently gets indentation wrong on projects, then tries to write sed/awk scripts. When it can't get those right, it writes a Python script that reformats the whole file to stdout, makes sure the indentation is correct, then requests an edit snippet.

              And you might be thinking. Well, you should use a code formatter! But I do!

              And then you might say, well surely you forgot to mention it in you AGENTS/CLAUDE file. Nope, it's there, multiple times even in different sections because once was apparently not enough.

              And lastly, surely if I'm watching this cursed loop unfold and am approving edits manually, like some bogan pleb, I can steer it easily... Well, let me tell ya... I tried stopping it and injecting hints about the formatter, and it sticks for a minute before it goes crazy again. Or sometimes it rereads the file and just immediately fucks up the formatting.

              I think when this shit happens, it probably uses like 3x more tokens.

              For a Rust project, it recently started analysing binaries in the target directory as a first instinct, instead of looking at the code...

              Good grief.

      • comboy 17 hours ago
        Pretty reassuring to hear that. I was skeptical too; there are a lot of variables, like some crap added to memory, a specific skill, or custom instructions interfering with the workflow and whatnot. But now it was like a toddler that consumes money when talking.
        • timacles 16 hours ago
          It's quite an interesting business model, actually: the worse it performs (to a degree), the more money it makes them, because of the token churn
      • combyn8tor 16 hours ago
        In my experience Opus and Claude have declined significantly over the past few weeks. It actually feels like dealing with an employee that has become bored and intentionally cuts corners.
        • rishabhaiover 13 hours ago
          And the worst part is the company is gaslighting people when they report it
      • qingcharles 16 hours ago
        Is it? Or is it the task you're trying to do? Opus 4.6 has been staggeringly good for me this last week, both inside Claude Code and through Antigravity until I used up my quota.
        • pacha3000 2 hours ago
          Usually, Claude Code with Opus figures out by itself the right tools to check the docs, for Svelte for example. So what it gives me is usually flawless.

          And right now, I have to remind it every time that the MCP exists, and even then it cannot manage to find a routing bug I have with Sveltekit.

          Did a lot of Sveltekit with Opus in the past, and I didn't have to think about it, Opus always got it right easily. Until now

        • SoMomentary 12 hours ago
          I think some of this comes down to undeclared A/B testing. I've had the worst week of interactions I have ever had using Claude Code. All week, whenever I have a session that isn't failing miserably I seem to get tapped for a session survey, but on any that are out-and-out shitting the bed it never asks. It has felt a little surreal. I'd love to see a product-wide stats graph for swearing; I would 100% believe that it is hitting an all-time high, but maybe I'm just a victim of a bad A/B round.
          • oefrha 11 hours ago
            Oh I’ve been getting a lot more of those too lately even though I dismiss it every time. Wonder if I should report not satisfied every time so that I get routed to something better…
    • bicepjai 11 hours ago
      Yes, totally agree. It's regurgitating crazy expansive text, like a book author who needs to publish 10 books a day
  • Eldodi 19 hours ago
    Anthropic is really good at releasing features that are almost the same but not exactly the same as other features they released the week before
    • masto 16 hours ago
      So management can cancel all of last week’s projects when they told us all we had to be using skills because the CEO read about them in the in flight magazine. Routines are the future, baby. DevOps already made a big announcement that they’re centralizing the Routines Hub. If you can’t keep up, we’ll get someone who says they can.
    • dymk 19 hours ago
      7 days is long enough for work to leave the context window, hence…
    • segmondy 12 hours ago
      They are mass copying any idea they see out there. They are not happy enough being a platform, they want to be everything. Their dream is to be the one AI that eats it all. It's so stupid that folks are using their system. I will never get on board with a platform that competes with me or try to compete with me. They are the Microsoft of AI model providers.
    • pants2 9 hours ago
      Clearly they have a weekly automation to take the backlog of feature requests and build a new feature!
    • foruhar 14 hours ago
      That's a fairly decent definition of vibecoding across multiple sessions.
    • subscribed 14 hours ago
      Just wait for the new wave of github issues about things they silently broke or degraded.

      Progress, I guess :)

      (I had the most hilariously bad session with Sonnet 4.6 today. I asked it a reasonably simple question and linked to resources, it refused to fetch the resources, didn't ask for pdf/txt I could provide, and confidently printed absolute BS, barely in the same category but completely unrelated.

      I called it out, pointed at the idiocy, asked if it wanted more data, and requested the hallucination be fixed.

      It apologised profusely and hallucinated even worse.

      Maybe I'll try Opus 4.6 tomorrow, because frankly Gemma-4-E4B was more coherent than that...)

    • tclancy 19 hours ago
      And or things I’ve spent a bunch of time building already. And naming them the same. I should have trademarked “dispatch”!
      • dbish 19 hours ago
        you're telling me dispatchagents.ai :) (open to new names if anyone has cool ones, didn't expect anthropic to start using dispatch with their agents, naming is way too hard)
    • spelunker 19 hours ago
      > In the Desktop app, click New task and choose New remote task; choosing New local task instead creates a local Desktop scheduled task, which runs on your machine and is not a routine.

      Oh uh... ok then.

    • titzer 15 hours ago
      Just wait until they get into the phase where they're big enough that they're eating all the baby startups and have to pick winners and losers amongst the myriad of overlapping features while also having the previous baby startups they acquired crank out new features.

      We're watching a speed run of growthism, folks.

  • minimaxir 20 hours ago
    Given the alleged recent extreme reduction in Claude Code usage limits (https://news.ycombinator.com/item?id=47739260), how do these more autonomous tools work within that constraint? Are they effectively only usable with a 20x Max plan?

    EDIT: This comment is apparently [dead] and idk why.

    • giancarlostoro 19 hours ago
      I've been talking to friends about this extensively, and I've read all sorts of social media posts on X where people dove deep into things (I'm at work so I don't have any links handy, though I did submit one on HN; grain of salt, unsure how valid it is, but it was interesting: https://news.ycombinator.com/item?id=47752049 ).

      I think the real issue stems from the 1-million-token context window change. They did not anticipate the amount of load it would generate. The first few days after they released the new context window, I was making amazing things in one single session, from nothing to something (a new .NET-based programming language inspired by Python, and a virtual actor framework in Rust). I think since then they've been tweaking too many things, irritating their users.

      They even added a new "Max" thinking mode, and made "High" the old medium, which is ridiculous because you think you're using "High" but really you're not. There's a hidden config file to change their terrible defaults to let Claude be smarter still, and apparently you can toggle off the 1M tokens.

      I think the real fix, and I'm surprised nobody there has done this yet, is to let the user trim down their context window.

      Think about it: you used to have what, 350k tokens or so? Now Claude will keep sending your completely irrelevant prompt from 30 minutes ago to the back-end, whereas 3 months ago it would have been compacted by now.

      Others have noted that similar prompting for some ungodly reason adds tens of thousands of extra garbage tokens (not sure why).

      Edit: it looks like someone figured out that if you downgrade your version of Claude Code and change one single setting, it unruins Claude:

      https://news.ycombinator.com/item?id=47769879

      • SkyPuncher 17 hours ago
        Yea, I've realized that if I stay under 200k tokens I basically don't have usage issues any more.

        A bit annoying, but not the end of the world.

        • consumer451 15 hours ago
          super-edit: Sorry, this is not a usage-related question, I have moved it to: https://news.ycombinator.com/item?id=47772971

          Here is the question for which I cannot find an answer, and cannot yet afford to answer myself:

          In Claude Code, I use Opus 4.6 1M, but stay under 250k via careful session management to avoid known NoLiMa [0] / context rot [1] crap. The question I keep wanting answered though: at ~165k tokens used, does Opus 1M actually deliver higher quality than Opus 200k?

          NoLiMa would indicate that with a ~165k request, Opus 200k would suck, and Opus 1M would be better (as a lower percentage of the context window was used)... but they are the same model. However, there are practical inference deployment differences that could change the whole paradigm, right? I am so confused.

          Anthropic says it's the same model [2]. But, Claude Code's own source treats them as distinct variants with separate routing [3]. Closest test I found [4] asserts they're identical below 200K but it never actually A/B tests, correct?

          Inside Claude Code it's probably not testable, right? According to this issue [5], the CLI is non-deterministic for identical inputs, and agent sessions branch on tool-use. Would need a clean API-level test.

          The API level test is what I really want to know for the Claude based features in my own apps. Is there a real benchmark for this?

          I have reached the limits of my understanding on this problem. If what I am trying to say makes any sense, any help would be greatly appreciated.

          If anyone could help me ask the question better, that would also be appreciated.

          [0] https://arxiv.org/abs/2502.05167

          [1] https://research.trychroma.com/context-rot

          [2] https://claude.com/blog/1m-context-ga

          [3] https://github.com/anthropics/claude-code/issues/35545

          [4] https://www.claudecodecamp.com/p/claude-code-1m-context-wind...

          [5] https://github.com/anthropics/claude-code/issues/3370
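
          For what it's worth, a clean API-level A/B would send the identical ~165k-token prompt with and without the long-context beta flag and diff the outputs. Here is a sketch of just the two request shapes, with no network calls; the model id and the beta header value are assumptions based on Anthropic's published 1M-context beta and may not apply to Opus:

```python
# Build the two request payloads for an API-level A/B. The model id and the
# "betas" value are ASSUMED placeholders, not verified identifiers.

def build_request(prompt: str, long_context: bool) -> dict:
    req = {
        "model": "claude-opus-4-6",  # hypothetical model id
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    if long_context:
        # Long-context access is gated behind a beta flag rather than a
        # separate model id, so below 200k the served weights should be the
        # same; whether the inference deployment differs is the open question.
        req["betas"] = ["context-1m-2025-08-07"]  # assumed flag name
    return req

a = build_request("same ~165k-token prompt", long_context=False)
b = build_request("same ~165k-token prompt", long_context=True)
only_diff = set(b) - set(a)
```

          If the only difference between the two payloads is the beta flag, then any quality delta at ~165k tokens has to come from deployment, not the model, which is the thing the benchmark would need to isolate.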

          • onenite 14 hours ago
            Two parent comments above say that you can use an older version of Claude Code with Opus 200k to compare. My guess is that eventually you'll be able to set it in the model settings yourself
      • dgb23 8 hours ago
        The future of harnesses cannot be "resend the whole history on every step" or whatever this terrible compaction is.

        Most of the context is unstructured fluff; much of it is distracting or even plain wrong. Especially the "thinking" tokens, which are often completely disjoint hallucinations that don't make any sense.

        I think what will have to happen is that context looks less like a long chat-and-action log and more like a structured, short, schema-validated state description, plus a short log trace that only grows until a checkpoint is reached, which produces a new state.
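
        A minimal sketch of that checkpointing idea: a small state object plus a bounded log that gets folded into a fresh state at each checkpoint, instead of replaying the whole chat. (All field names are invented for illustration.)

```python
# Toy "checkpointed context" state machine: the log grows until a checkpoint,
# then is folded into a new compact state instead of being resent forever.
from dataclasses import dataclass, field

@dataclass
class AgentState:
    goal: str
    files_touched: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)
    log: list[str] = field(default_factory=list)

    def record(self, event: str, checkpoint_every: int = 4) -> "AgentState":
        self.log.append(event)
        if len(self.log) >= checkpoint_every:
            return self.checkpoint()
        return self

    def checkpoint(self) -> "AgentState":
        # Fold the trace into a fresh state; the old log is discarded rather
        # than replayed on every subsequent model call.
        return AgentState(
            goal=self.goal,
            files_touched=sorted(set(self.files_touched)),
            open_questions=self.open_questions[-3:],  # keep only what's live
        )

state = AgentState(goal="fix routing bug")
for step in ["read router.py", "edit router.py", "run tests", "tests pass"]:
    state = state.record(step)
```

        The model would then be prompted with the compact state plus the short live log, rather than the entire transcript.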

      • dacox 19 hours ago
        Yeah, I have been seeing lots of comments, tweets, etc., but given everything I have learned about these models, I do not think the change to 1M was innocuous. I'm not sure what they've claimed publicly, but I'm fairly certain they must be doing additional quantization, or at minimum additional quantization of the KV cache. Plus, sequence length can change things even when not fully utilized. I had to manually re-enable the "clear context and continue" feature as well.
        • Jimpulse 34 minutes ago
          How do you re-enable that feature?
        • giancarlostoro 18 hours ago
          I used the heck out of it when it was announced, and it felt like I was using one of the best models I've ever used, but then so were all of their other customers. I don't think they accounted for such heavy load, or maybe follow-up changes goofed something up, not sure. Like I said, for the first few days the 1M token window allowed me to bust out some interesting projects in one session, from nothing to "oh my" in no time.

          I'm thinking they should go back to all their old settings and as a user cap you at their old token limit, and ask you if you want to compact at your "soft" limit or burst for a little longer, to finish a task.

    • imhoguy 18 hours ago
      The AI race to the bottom is a debt game now. Once the party is over, somebody will have to pay the bill.
      • timacles 16 hours ago
        It's going to be crazy, the explanation they come up with for why the US public has to pay to bail out AI for national security.

        In a way, it's true that if China has superior AI, then its dominance over the US will materialize. But it's not hard to see how this scenario is being used to essentially lie and scam the way into trillions of debt.

        It's interesting how the cutthroat space of big tech has manifested into an insidious hyper-capitalist system where disrupting a system is its primary function. The system in this case is world order and Western governments

        • joquarky 15 hours ago
          "Move fast and break things" has broken containment from the tech industry. Now you can see it everywhere.
    • breakingcups 19 hours ago
      You seem to be vouched for now, no longer dead for me.
      • minimaxir 19 hours ago
        Hmm, I can't edit the original comment to retract that edit either. Either my account is flagged for something or HN is being weird.
    • stavros 15 hours ago
      • minimaxir 14 hours ago
        That's a separate change to what the linked HN post described.
  • ctoth 20 hours ago
    You'd think that if they were compute-limited ... trying to get people to use it less ... the rational thing to do would be to not ship features that will automatically use more compute? Or does this count as extra usage?
    • whicks 20 hours ago
      I would imagine that this sort of scheduling allows them to have more predictable loads, and they may be hoping that people will schedule some of their tasks in “off hours” to reduce daytime load.
      • andai 20 hours ago
        It also beats OC's heartbeat where it auto-runs every 30 minutes and runs a bunch of prompts to see if it actually needed to run or not.
        • stingraycharles 3 hours ago
          It’s really, really ridiculous just how many tokens OpenClaw burns when it’s not doing anything.
        • pkulak 20 hours ago
          Man, this just bit me too. I started playing with OC over the weekend (in a VM), and the spend was INSANE even though I wasn't doing anything. I don't see this as very useful as an "assistant" that wanders around and anticipates my needs. But I do like the job system, and the ability to make skills, then run them on a schedule or in response to events. But when I looked into what it was doing behind my back, 48 times a day it was packaging up 20K tokens of silly context ("Be a good agent, be helpful, etc, for 30 paragraphs"), shipping it off to the model, and then responding with a single HEARTBEAT_OK.

          Luckily you can turn it off pretty easily, but I don't know why it's on by default to begin with. I guess it's a holdover from when people used it with a $20 subscription and didn't care.
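
          For scale, a quick back-of-envelope using the figures above (the 48 runs a day and ~20K tokens per run are the estimates from this comment; the math is just multiplication):

```shell
# Idle-heartbeat token burn per day, using the figures quoted above.
runs_per_day=48
tokens_per_run=20000
echo $(( runs_per_day * tokens_per_run ))   # prints 960000 — ~1M tokens/day doing nothing
```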

          • stingraycharles 3 hours ago
            I can recommend Hermes Agent as an alternative to OpenClaw which actually works well, is properly architected, and doesn’t break three times a week.
          • stavros 15 hours ago
            If you want something more lightweight, I made one that has no heartbeat by default: https://stavrobot.stavros.io.

            It's very light on token usage in general, as well.

      • pletnes 19 hours ago
        Also, you can schedule it a bit off. Every hour? Delay it a few seconds. You can't do that with a chat message. Also, batch up a bunch of them; maybe save some compute that way? Latency is not an issue.
      • ctoth 20 hours ago
        I thought about that, but I'm pretty sure that if the backlog is automatically cleaned and I don't need to run my skill for it when I start up in the morning, that just means I can move on to the next task I would have done, which will probably use Claude Code.

        Your own, personal, Jevons.

    • dpark 18 hours ago
      They are more worried about building a moat than anything else. They want people building integrations that are difficult to undo, so that they're locked into the platform.
      • stavros 15 hours ago
        So basically what happened:

        1. Anthropic realized their models weren't enough of a moat.

        2. They built tools so they could expand their moat.

        3. People don't want to use their tools; they want their models, and use other, better tools.

        4. Anthropic bans the use of better tools, taking advantage of their model superiority to try to lock people into subpar tools.

        "I don't have enough of a moat so I'll use my little moat and pretend it's a big one" doesn't sound like a great strategy. All they're doing with this anticonsumer behaviour is making sure that I'll leave the moment another model works for me as well as Claude does.

        • goosejuice 9 hours ago
          Kind of. Yes, they want you to use their tools.

          I doubt Anthropic ever thought they would have a big moat just based on the models. The platform is just as important.

          Claude code and Cowork would still be extremely valuable to Anthropic even if they didn't release them to the public.

          Owning the harness gives them a ton of data they can use to tune the models.

          This is a perfectly sane strategy even if it's a bit unsavoury to some technical folk.

        • sumedh 5 hours ago
          > I'll leave the moment another model works for me as well as Claude does.

          Isn't that a large moat in itself? But you're claiming it's not enough?

          • stavros 5 hours ago
            Apparently not for Anthropic, since they've been wanting to build a larger one by force. My point is you can't build a moat by forcing people, it defeats the purpose.
        • hyraki 12 hours ago
          What are other better tools? I would like to use them
          • stavros 12 hours ago
            OpenCode, Pi, whatever Anthropic doesn't let you use with their subscription because they want to lock you in to their stuff.
      • lostmsu 17 hours ago
        > They want people building integrations that are difficult to undo so that they lock into the platform.

        Ironically, they are now playing against their own models that can relatively easily build wrappers around any API shape into any other API shape.

    • iBelieve 20 hours ago
      Max accounts get 15 daily runs included, any runs above that will get billed as extra usage.
    • AlexCoventry 19 hours ago
      I don't think "usage" is exactly the metric they're going for, more like "usage in line with our developmental strategy." Transcripts of people using Claude to write code are probably far more valuable to them than transcripts of OpenClaw trying to set up a calendar invite.
      • fgkramer 19 hours ago
        I mean, they don’t train on your data unless you have the setting enabled. Do you really think they are reading your prompts at all? Free inference providers sure, but Anthropic?
    • dockerd 20 hours ago
      It's how they can lock more users into their eco-system.
  • mellosouls 20 hours ago
    Put Claude Code on autopilot. Define routines that run on a schedule, trigger on API calls, or react to GitHub events...

    We ought to come up with a term for this new discipline, eg "software engineering" or "programming"

    • avaer 20 hours ago
      Setting up your agent. This part doesn't deserve a name; there is no programming or engineering or really much thinking involved.
    • raincole 19 hours ago
      Sounds more like openclawing.
    • oxag3n 18 hours ago
      It's "promptramming".
    • jnpnj 19 hours ago
      gramming
      • realo 18 hours ago
        Ah! Totally... We have:

        airgramming plusgramming programming maxgramming studiogramming

        and recently the brand new way of working: Neogramming !

        Personally I stick for now with the "Programming" tier. Maybe I'll upgrade to "Maxgramming" later this year...

    • baq 20 hours ago
      Does ‘vibe coding’ work?
  • brandensilva 11 hours ago
    Anthropic is burning their good will faster than the tokens we use these days. It is hard to be excited about these new features when the core product has been neutered into oblivion.
  • oxag3n 18 hours ago
    Are they going to mirror every tool software engineers were used to for decades, but in a mangled/proprietary form?

    I think to become really efficient they'll have to invent a new programming language to eliminate all the ambiguity and non-determinism. Call it a "prompt language", with ai-subroutines, ai-labels and ai-goto.

  • eranation 19 hours ago
    I've been using it for a while (it was just called "Scheduled", so I assume this is an attempt to rebrand it?)

    It was a bit buggy, but it seems to work better now. Some use cases that worked for me:

    1. Go over a Slack channel used for feedback on an internal tool, triage, open issues, fix obvious ones, reply with the PR link. Some devs liked it, some freaked out. I kept it.

    2. Surprisingly, a non-code use: give me a daily rundown (GitHub activity, Slack messages, emails). I tried it with non-Claude-Code scheduled tasks (Cowork); not as good, as it seems the GitHub connector only works in Claude Code. Really good correlation between threads that start on Slack and related email (Outlook), or even my personal Gmail.

    I can share the markdowns if anyone is interested, but it's pretty basic.

    Very useful, (when it works).

    • wazHFsRy 8 hours ago
      Same here. It just feels like something so simple that I’d rather have it under my own control; that way I can keep it independent of Claude as well. I use it for all kinds of routine tasks, like updating the summary of projects I am working on or tracking some personal activities. My setup looks like this: https://www.dev-log.me/click_recurring_tasks_for_claude/
    • FrenchTouch42 10 hours ago
      I'd be curious to take a look if you can share
    • panavm 9 hours ago
      [dead]
  • cedws 17 hours ago
    This is the beginning of AI clouds, in my estimation. Cloud services provide needed lock-in and support the push to provide higher-level services on top of the models. It just makes sense; they'll never recoup the costs on inference alone.
  • summarity 20 hours ago
    If you’re trying this for automating things on GitHub, also take a look at Agentic Workflows: https://github.github.com/gh-aw/

    They support much of the same triggers and come with many additional security controls out of the box

    • eranation 18 hours ago
      +1 for that. That said, because GH agentic workflows require a bit more handholding and testing to work (and have way more guardrails, which is great but limiting), and lack some basic connectors (for example, last time I tried it, it had no easy Slack connector; I had to do it on my own), I'm moving some of the less critical gh-aw routines (all the read-only ones) to Claude Routines.
    • deadfall23 18 hours ago
    • gavinray 20 hours ago
      Why have I not heard of this? Was looking for a way to integrate LLM CLI's to do automated feature development + PR submission triggered by Github issues, seems like this would solve it.
      • eranation 16 hours ago
        Built-in Copilot, I believe, can do this better than gh-aw (or is a click away).

        Cursor has that too by the way (issue -> remote coding session -> PR -> update slack)

  • Chrisszz 2 hours ago
    I don't get all the hype around these kinds of things; it's not as if something that crazy gets released and suddenly solves a problem you had. These are things that can easily be replicated, and that many of us already built for ourselves months ago, before they were published and became mainstream.
  • richardw 15 hours ago
    I’m moving away from Claude for anything complicated. It’s got such nice DX, but I can’t take the confident-but-flaky results. I’m finding Codex on the high plan more thorough, and for any complicated project that’s what I need.

    Still using Claude for UX (playgrounds) and language. OpenAI has always been a little more cerebral and stern, which doesn’t suit those areas. When it tries to be friendly it comes off as someone my age trying to be a 20-something.

  • airstrike 20 hours ago
    Still no moat.

    The reason someone would use this vs. third-party alternatives is still the fact that the $200/mo subscription is markedly cheaper than per-token API billing.

    Not sure how this works out in the long term when switching costs are virtually zero.

    • petesergeant 20 hours ago
      I think at this point the aim is less about moat, and more about getting an advantage that self-sustains: https://www.rand.org/pubs/research_reports/RRA4444-1.html
    • TacticalCoder 19 hours ago
      > Not sure how this works out in the long term when switching costs are virtually zero.

      All these not-really-helpful but vendor-specific "bonuses" sound like a way to try to lock people in, to raise the switching cost.

      I'm using, on purpose, a simple process so that at any time I can switch AI provider.

  • haukem 14 hours ago
    I used the claude-code-action GitHub Action to review PRs before, but it is pretty buggy e.g. PRs from forked repositories do not work, and I had to fix it myself. This should work better with Claude Code Routines. claude-code-action only works with the API and is therefore pretty expensive compared to the subscription.

    I think LLM reviews on PRs are helpful and will reduce the load on maintainers. I am working on OpenWrt and was approved for the Claude Code Max Open Source Program today. The cap of 15 automatic Claude Code Routines runs per day is a bit low. We get 5 to 20 new PRs per day and I would like to run it on all of them. I would also like to re-run it when authors make changes, in that case it should be sufficient to just check if the problems were addressed.

    Is it possible to get more runs per day, or to carry over unused ones from the last 7 days? Maybe 30 on Sonnet and 15 on Opus?

    When I was editing a routine, the window closed and showed an error message twice. Looks like there are still some bugs.

  • robeym 2 hours ago
    I'm not interested in new features when the main ones are mysteriously getting worse and worse. I need to have a sense of stability before I get excited about any other features.
    • causal 58 minutes ago
      100%. Every time I see a Claude Code announcement I sigh when I still can't get the desktop app to fullscreen without fritzing or make a plan and display it consistently. Many papercuts.
  • bryanhogan 9 hours ago
    So do I understand correctly that this is a competitor to something like n8n, but instead entirely vibe-coded?

    n8n: https://n8n.io/

  • matthieu_bl 20 hours ago
  • holografix 7 hours ago
    Anthropic is putting a lot of eggs into the same Claude Code basket.

    If the Lovable clone is real that’s going to piss off many model consumers out there.

    Is Sierra next?

  • lherron 10 hours ago
    It’s interesting to watch Ant try to ship every value-add product feature they can while they still have the SOTA model for agentic work. When an open-weights equivalent to Opus 4.5’s agentic capabilities comes out, I expect massive shifts of workloads away from Claude.

    Don’t get me wrong, I think their business model is still solid and they will be able to sell every token they can generate for the next couple years. They just won’t be critical path for AI diffusion anymore, which will be good for all sides.

  • twobitshifter 16 hours ago
    It seems OpenClaw is just Pi with cron and hooks, and this is just Claude Code with cron and hooks. Given the superiority of Pi, I would not expect this to attract anyone from OpenClaw, but it will increase token usage in Claude Code.
  • netdur 20 hours ago
    Didn’t we have several antitrust cases where a vendor used its monopoly to disadvantage rivals? Didn’t Anthropic block OpenClaw?
    • Someone1234 19 hours ago
      They did not.

      You can still use OpenClaw on their API pricing tier as much as you want. What they did is not allow subscriptions to be used to power automated third-party workloads, including OpenClaw.

      Now, is their messaging around this confusing? Absolutely. The whole thing has been handled shambolically. Everyone knows that they lack the compute to keep up, and likely have lower margins on subscriptions than API; but they cannot just say that because investors may be skittish.

    • dmix 20 hours ago
      How is Anthropic a monopoly? The market is barely even fully developed and has multiple large and small competitors
    • andai 20 hours ago
      It's not blocked, you just can't use the Claude-only subscription endpoint with unauthorized 3rd party software. (You can use it via the regular API (7x more expensive) and pay per token just fine.)

      ...Except now you sorta-kinda can: now they auto-detect 3rd party stuff and bill you per-token for it?

      If I'm reading it right:

      https://news.ycombinator.com/item?id=47633568

  • sminchev 18 hours ago
    Everything is a big race! Each company is trying to do as much as possible, to provide as many tools as possible, to catch the wave and beat the competition. I remember how Anthropic and OpenAI made releases just 10-15 minutes apart, trying to compete and gain momentum.

    And because they use AI heavily, they produce a new product every week. So fast that I have no time to check whether it's worth it or not.

    This one looks interesting. I have some custom commands that I execute manually every week, for monitoring, audits, summaries, reports. If it can send reports by email, or generate something that I can read in the morning with my coffee, or after I finish it ;) it might be a good tool.

    The question is, do I really want to be that much more productive? I am already performing much better with AI, compared to the 'old school' way...

    Everything is just getting too much for me.

  • mercurialsolo 8 hours ago
    Why not just do event-based triggers, e.g. register (web)hooks, instead of scheduled time-based triggers? Have a mechanism to listen for an event and then run some flow: analyze, plan, execute, feedback.
  • whh 7 hours ago
    I don’t think LLMs should be trying to replace what essentially should be well tested heuristics.

    It’s fine if it’s a stop gap. But, it’s too inconsistent to ever be reliable.

  • yohamta 11 hours ago
    Claude and OpenAI seem to be trying not to be 'just a model', but this is intrinsically problematic because models can be degraded and prices only go up once customers are locked in. It is increasingly important for anyone responsible for managing 'AI workflows' to keep sovereignty over how they use AI models. This is why I'm super excited to be building the local-first workflow orchestration software called "Dagu", which lets you own your harness yourself. It's not only more cost-effective; the outcome is better as well, because you have 100% full control. I think it's only a matter of time before people notice they need to own their workflow orchestration themselves, not rely on Anthropic, OpenAI, or Google.
  • kylegalbraith 7 hours ago
    Having used the Cowork version of this (scheduled automations), I have very little confidence in this from Anthropic. 90% of the time the automation never even runs.
  • mkagenius 7 hours ago
    I also felt the need for a cloud based cron like automations, so decided to build it myself https://cronbox.sh with:

      1. an ephemeral linux sandbox for each task 
      2. capability to fetch any url
      3. can use tools like ffmpeg to fulfill your scheduled task
  • sublimefire 9 hours ago
    Did this sort of a thing in my own macos app which can have routines with a cron, custom configs and chains of prompts. There is also more like custom VMs and models to be used for different tasks. Interesting to see larger providers trying to do the same.

    But their own failure is the fact that there is a limited way to configure it with other models, think 3d modelling and integrating 3d apps on a VM to work with. I believe an OSS solution is needed here, which is not too hard to do either.

  • vfalbor 5 hours ago
    Two things from my experience: first, you can only have one at a time per subscription; when I needed to run two at the same time, I wasn't able to. Second, you can do the same thing with a well-configured cron.
  • kennywinker 9 hours ago
    Am I crazy in thinking an LLM doing any kind of serious workload is risky as hell?

    Like, say it works today, but tomorrow they update the model and instead of emailing you an update it emails your api keys to all your contacts? Or if it works 999 times out of 1000 but then commits code to master that makes all your products free?

    Idk man… call me Adama, but i do not trust long-running networked ai one bit

  • vessenes 20 hours ago
    This is one of the best features of OpenClaw - makes sense to swipe it into Claude Code directly. I wonder if Anthropic wants to just make claude a full stand-in replacement for openclaw, or just chip away at what they think the best features are, now that oAI has acquired.
    • mkw5053 20 hours ago
      What are some of the best use cases you've found? I have some gh actions set up to call claude code, but those have already been possible.
  • manishfp 3 hours ago
    So basically OpenClaw but better and safer I presume haha!
  • dispencer 19 hours ago
    This is wild, one of the pieces I was lacking for a very openclaw-esque future. Now I think I have all the MCP tools I need (GitHub, Linear, Slack, Gmail, querybear), all the skills I need, and I can run these on a loop.

    Am I needed anymore?

  • tills13 18 hours ago
    > react to GitHub events from Anthropic-managed cloud infrastructure

    Oh cool! vendor lock-in.

  • rahimnathwani 14 hours ago
    The docs list the GitHub events that can be used as triggers. This is included in the list:

      Push Commits are pushed to a branch
    
    But when I try to create a routine, the only GitHub events available in the dropdown relate to pull requests and releases. Nothing is available related to pushes/commits or issues. Am I holding it wrong?
  • tallesborges92 15 hours ago
    Anthropic I don’t care for your tools, just ship good and stable models so we can build the tools we need.
  • srid 19 hours ago
    I just used this to summarize HN posts in last 24 hours, including AI summaries.

    This PR was created by the Claude Code Routine:

    https://github.com/srid/claude-dump/pull/5

    The original prompt: https://i.imgur.com/mWmkw5e.png

  • taw1285 19 hours ago
    I have a small team of 4 engineers; each of us is on the personal Max subscription plan, and we prefer to stay this way to save cost. Does anyone know how I can overcome the challenge of setting up Routines or Scheduled Tasks on Anthropic infra in a collaborative manner, i.e. so all teammates can contribute to these nightly jobs of cleaning up the docs and cleaning up vibe-coding slop?
    • hallway_monitor 18 hours ago
      My team was doing this until recently but I think in February, Anthropic made team accounts available for subscription instead of API billing. Assuming that is the cost you mentioned.
  • watermelon0 20 hours ago
    Seems like it only supports x86_64. It would be nice if they offered a way to bring your own compute, to be able to work on projects targeting arm64.
  • hackermeows 10 hours ago
    Is Claude at AWS's old "throw sh*t at the wall and see what sticks" phase of the business already? That did not take very long.
  • causal 18 hours ago
    Haven't Github-triggered LLMs already been the source of multiple prompt injection attacks? Seems bad.
  • yalogin 12 hours ago
    I am beginning to fear Claude is going to massively raise prices, or at the very least severely restrict its $20/month plan. I hope it doesn’t happen, but it feels inevitable.
  • thegdsks 10 hours ago
    Looks like they are slowly getting the OpenClaw features here in Cowork. Already seeing the 5-per-day limit in the usage bar now.
  • theodorewiles 20 hours ago
    How does this deal with stop hooks? Can it run https://github.com/anthropics/claude-code/blob/main/plugins/...
  • woeirua 17 hours ago
    I don't get the use case for these... Their primary customers are enterprises. Are most enterprises happy with running daily tasks on a third party cloud outside of their ecosystem? I think not.

    So who are they building these for?

    • chickensong 15 hours ago
      Nobody likes infra. PMs don't know wtf a crontab is.
    • dbbk 16 hours ago
      Not really any different to GitHub Actions
  • egamirorrim 19 hours ago
    I wish they'd release more stuff that didn't rely on me routing all my data through their cloud to work. Obviously the LLM is cloud based but I don't want any more lock-in than that. Plus not everyone has their repositories in GitHub.
  • eranation 16 hours ago
    If anyone from anthropic reads it. I love this feature very much, when it works. And it mostly doesn't.

    The main bugs / missing features are

    1. It loses connection to its connectors, mostly the Slack connector. It does all the work, then says it can't connect to Slack. Then, when you show it a screenshot of itself with the Slack connector, it will say, oh yeah, the tools are now loaded, and does the rest of the routine.

    2. The ability to connect it to GitHub Packages / Artifactory (private packages), or the dangerous route of allowing access to some sort of vault (with non-critical, dev-only secrets... although it's always a risk. But Cursor has it...)

    3. The GitHub MCP not being able to do simple things such as updating release markdown (a super simple use case: creating automated release notes, for example)

    You are so close, yet so far...

    • Terretta 15 hours ago
      It's remarkable how often it refuses to introspect, but show it a SCREENSHOT of itself and suddenly it's "yeah, this works fine".

      This happens in all their UIs, including, say, Claude in Excel, as well.

  • cryptonector 8 hours ago
    Oof, running Claude Code automatically on PRs is scary.
  • desireco42 20 hours ago
    I think they are using Claude to come up with these, and they will be bringing out one every second day... In fact, this is probably a routine they set up.
  • amebahead 9 hours ago
    Anthropic's update cycle is too fast..
  • drumttocs8 12 hours ago
    Can someone tell me what this does that n8n doesn't?
  • jcims 19 hours ago
    Is there a consensus on whether or not we've reached Zawinski's Law?
    • senko 19 hours ago
      I've had an AI assistant send me email digests with local news, and another watching a cron job, analyzing the logs and sending me reports if there's any problem.

      I'd say that counts as yes.

      (For clarity: neither are powered by Claude Code Routines. Rather, Claude Code coded them and they're simple cron jobs themselves.)
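
      In crontab terms, jobs like those are just entries along these lines (a hypothetical sketch, not the actual setup; the schedules and script paths are placeholders):

```shell
# Hypothetical crontab entries for the two jobs described above.
# Paths, schedules, and script names are placeholders.
0 7 * * *   /usr/local/bin/news-digest.sh      # mail the morning local-news digest
0 * * * *   /usr/local/bin/check-cron-logs.sh  # analyze logs, report only on problems
```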

    • verdverm 19 hours ago
      TIL email is what I'm missing in my personal development (swiss army) tool
  • nico 20 hours ago
    Nice, could this enable n8n-style workflows that run fully automatically then?
    • outofpaper 20 hours ago
      Yes, but much less efficiently. Having LLMs handle automation is like using a steam engine to heat your bath water: it will work most of the time, but it's super inefficient, not really designed for that use, and it can go horribly wrong from time to time.
      • meetingthrower 20 hours ago
        Correct. But the LLM can also program the exact automation you want for you! Much more efficiently than the GUI madness of n8n. And if you want observability, just program that too!
    • meetingthrower 20 hours ago
      Already very possible and super easy if you do a little vibecoding, although it will hit the API. I have a stack polling my email every five minutes, classifying emails, and taking action based on their types. A 30-minute coding session.
  • gegtik 12 hours ago
    how did they not call this OpenClaude?
  • teucris 19 hours ago
    My only real disappointment with Claude is its flakiness with scheduling tasks. I have several Slack related tasks that I’ve pretty much given up trying to automate - I’ve tried Cowork and Claude Code remote agents, only to find various bugs with working with plugins and connectors. I guess I’ll give this a try, but I don’t have high hopes.
    • dbmikus 11 hours ago
      If you don't get it working with Claude Code Routines, we'd love to connect and see if we can help! We're building an open-core product that can spin up sandboxed coding agents and control them from Slack (and also from a web UI, TUI, and HTTP APIs + CLIs).

      We work with any coding model / harness.

      website: https://www.amika.dev/

      OSS repo: https://github.com/gofixpoint/amika

      And my email is dylan@amika.dev (I'm one of the founders)

  • compounding_it 6 hours ago
    A year ago everyone was so hyped on LLMs, even on HN. A year later I see frustration and disappointment on HN. It’s very interesting, because this is the case with every new technology and the ‘next thing’.
  • heartleo 10 hours ago
    I couldn’t agree more.
  • dispencer 19 hours ago
    This is massive. Arguably it will be the start of the move to openclaw-style AI.

    I bet anthropic wants to be there already but doesn't have the compute to support it yet.

    • dpark 18 hours ago
      What’s massive about cron jobs and webhooks? I feel like I’m missing something. This is useful functionality but also seems very straightforward.
  • Razengan 8 hours ago
    Claude has been hilariously broken for me: https://i.imgur.com/HF198nl.png

    So many gaffes like that.

    Codex is much more consistent but still has to be verified. AI is still not quite at the point where you can blindly trust it.

  • bofia 12 hours ago
    Could be a start
  • lofaszvanitt 9 hours ago
    AI companies act like pelicans. They want to gobble everything.
  • ale 20 hours ago
    So MCP servers all over again? I mean at the end of the day this is yet another way of injecting data into a prompt that’s fed to a model and returned back to you.
  • i_love_retros 6 hours ago
    Hehe, imagine releasing a library 10 years ago where the functions may or may not do what you expect 100 percent of the time. And paying lots of money to use it.

    What a time to be alive.

  • blcknight 15 hours ago
    For the love of god fix bugs and write some fricken tests instead of dropping new shiny things

    It is absolutely wild to me you guys broke `--continue` from `-p` TWO WEEKS AGO and it is still not fixed.

    • weird-eye-issue 13 hours ago
      --resume works fine?
      • blcknight 12 hours ago
        That's why I mentioned `-p`.

        `--continue` and `--resume` are broken from `-p` sessions for the last 2 weeks. The use case is:

        1. Do autonomous claudey thing (claude -p 'hey do this thing')

        2. Do a deterministic thing

        3. Reinvoke claude with `--continue`

        This no longer works. I've had this workflow in GitHub actions for months and all of a sudden they broke it.

        They constantly break stuff I rely on.

        Skill script loading was broken for weeks a couple months ago. Hooks have been broken numerous times.

        So tired of their lack of testing.

  • xuxu298 7 hours ago
    Can we use it for free, or is there a fee? If free, how many free routines per day?
  • dbg31415 14 hours ago
    Seems like more vendor lock-in tactics.

    Not saying it doesn’t look useful, but it’s something that keeps you from ever switching off Claude.

    Next year, if Claude raises rates after getting bought by Google… what then?

    And what happens when Claude goes down and misses events that were supposed to trigger Routines? I’m not at the point where I trust them to have business-dependable uptime.

  • jwpapi 15 hours ago
    All these new offerings try to fight fire with fire. You don’t make the codebase better with more agents; you introduce more complicated issues.

    It’s a trap.

  • verdverm 19 hours ago
    One gripe I have with Claude Code is that the CLI, desktop app, and apparently the web app have a Venn diagram of features. Plugins (sets of skills and more) are supported in the Code CLI, maybe in Cowork (custom ones fail to import), but not in Code Desktop. Now this?

    The report that their code is 90% AI-generated seems more likely the more I attempt to use their products.

    • bottlepalm 17 hours ago
      Their source code leak showed how badly vibe coded Claude Code is, despite it being one of the best AI assistants.

      But yeah, there's some annoying overlap here with Cowork, which also has scheduled tasks; in Cowork the tasks can use your desktop, browser, and accounts, which is pretty useful and a big difference from these Claude Code Routines.

  • hamuraijack 17 hours ago
    please, no more features. just fix context bloat.
  • varispeed 20 hours ago
    Why would you use it if you don't know whether the model will be nerfed at that run?
  • crooked-v 20 hours ago
    The obvious functionality that seems to be missing here is any way to organize and control these at an organization rather than individual level.
  • bpodgursky 20 hours ago
    OpenClawd had about a two-week moat...

    Feature delivery rate by Anthropic is basically a fast takeoff in miniature. Pushing out multiple features each week that used to take enterprises quarters to deliver.

    • nightpool 20 hours ago
      Do you mean a 3 months moat? Moltbot started going viral in January. That seems to be about a quarter to deliver to me : )
    • jcims 19 hours ago
      >Feature delivery rate by Anthropic is basically a fast takeoff in miniature.

      I like to just check the release notes from time to time:

      https://github.com/anthropics/claude-code/releases

      and the equally frenetic openclaw:

      https://github.com/openclaw/openclaw/releases

      GPT-4.1 was released a year ago today. Sonnet 4 is ~11 months old. The claude-code cli was released last Feb. Gas Town is 3 months old.

      This is a chart that simply counts the bullet points in the release notes of claude code since inception:

      https://imgur.com/a/tky9Pkz

      This is as bad and as slow as it's going to be.
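      A minimal sketch of how a count like that could be reproduced (the repo is the one linked above; the bullet-matching regex and the idea of fetching bodies from the GitHub releases API are assumptions, not the chart author's actual method):

      ```python
      import re

      def count_bullets(body: str) -> int:
          """Count markdown bullet lines ("- " or "* ", optionally indented)
          in one release-notes body."""
          return sum(
              1
              for line in body.splitlines()
              if re.match(r"\s*[-*]\s+\S", line)  # bullet marker + some content
          )

      # Release bodies could be pulled from the GitHub REST API, e.g.
      #   GET https://api.github.com/repos/anthropics/claude-code/releases
      # then tallied per release or per month to build a chart like the one above.
      sample = """\
      ## 1.0.1
      - Fixed hook execution order
      - Added /routines command
      * Minor CLI polish
      Not a bullet line.
      """
      print(count_bullets(sample))  # 3 bullet lines in this sample
      ```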

    • irthomasthomas 18 hours ago
      The velocity of shipping is wild. Though I cannot recall a novel feature they shipped first. Can you?
    • whalesalad 20 hours ago
      Hard to wanna go all-in on the Anthropic ecosystem with how inconsistent model output from their top-tier has been recently. I pay $$$ for api-level opus 4.6 to avoid any low-tier binning or throttling or subversive "its peak rn so we're gonna serve up sonnet in place of opus for the next few hours" but I still find that the quality has been really hit or miss lately.

      The bell curve up and then back down has been so jarring that I am pivoting to fully diversifying my use of all models to ensure that no one org has me by the horns.

      • bpodgursky 20 hours ago
        yeah i mean nobody uses Claude anymore, the utilization is too high
        • chrisweekly 20 hours ago
          right, like the bar nobody goes to anymore bc it's always too crowded
    • renticulous 20 hours ago
      Anthropic is trying to be the AI version of AWS.
      • twoodfin 19 hours ago
        That is a really tough business if you can't match AWS' efficiency & reliability at scale. Presumably AWS also wants to be the AI version of AWS.

        (Amazon + Anthropic does seem like a much more compelling enterprise collaboration / acquisition than Microsoft + OpenAI ever did.)

    • dbbk 20 hours ago
      And yet none of them work properly, and they're unstable.
    • slopinthebag 20 hours ago
      You're delusional if you think these features would take competent programmers quarters to deliver.
      • buster 19 hours ago
        He said "enterprises" not "competent programmers".
      • unshavedyak 20 hours ago
        Maybe they were accounting for huge layers of red tape in large orgs. God knows those are far slower than "competent programmers" lol
        • slopinthebag 17 hours ago
          That red tape doesn't disappear when you start vibe coding tho.
  • nojvek 4 hours ago
    I could understand the portal before. Now it’s a gazillion things bolted on.

    Enshittification is well in force.

    I’d trust the hyperscalers a lot more with their Workers/Lambda-like infra to run routine jobs calling LLM APIs or deterministic code instead of Anthropic.

    Anthropic makes a phenomenal paid model, but they have a poor reliability record.

    I don’t care much if Claude code hiccups when generating code. But after the code is generated I want it to run with multiple 9s under certain latencies every single time.

  • consumer451 20 hours ago
    meta:

    Sorry, but I just have to ask. Why is u/minimaxir's comment dead? Is this somehow an error, an attack, or what?

    This is a respected user, with a sane question, no?

    I vouched, but not enough.

    edit: His comment has arisen now. Leaving this up for reference.

  • qwertyuiop_ 16 hours ago
    “Scheduled tasks and actions invoked by callback urls”