The Unbearable Cheapness of Open Weight Models

(jamesoclaire.com)

71 points | by ddxv 7 hours ago

11 comments

  • Jackobrien 2 hours ago
    The giants knew this was coming, and soon 95% of AI tasks will be able to be done by open models (coding, research, cowork style work). So why pay a premium? Why use them at all? This leaves the labs with two options:

    1) push the frontier in a way only massive scale can, and cash in on it (mythos level cyber security, recursive training, frontier science work). There’s big money for never before possible capabilities.

    2) own the app layer with their edge in reputation and powered by their infrastructure. Be apple where everyone else is Linux. Do design, coding, research, SMBs, legal, finance, healthcare and more (they are doing all of this).

    Will it be enough to justify a Google level valuation? We’ll see how fast they can push it.

    • fredley 1 hour ago
      3) Buy all the RAM, increasing the barrier to entry to push back the tide a bit, in time for a juicy IPO.
    • orwin 24 minutes ago
      Mythos was outperformed by small, specific local models in multiple OSS projects.
      • RugnirViking 5 minutes ago
        i'd love to hear about this! do you have examples?
    • ed_elliott_asc 2 hours ago
      Don't they just need to say “best in class, latest models, fastest” and wine and dine a few execs for those enterprise deals to be signed?

      In this case the people tasked with using the product won’t actually mind.

      • NitpickLawyer 1 hour ago
        No one is getting fired for using SotA.
        • spwa4 1 hour ago
          If the price difference is 2x? Sure.

          If the price difference is 50x? No way.

          • brainwad 1 hour ago
            So long as the benefit:cost ratio is still sufficiently high, I don't think anyone gets fired for not scrimping. Better to encourage positive EV behaviour by your employees than to scare them away by firing them for not being perfectly optimal.
            • ThunderSizzle 40 minutes ago
              The CEO won't get in trouble, but the employee who can't justify a bad result/prompt?
          • RobotToaster 1 hour ago
            Tell that to Oracle
          • watwut 18 minutes ago
            Accenture says "yeah totally CEOs will pay a lot for literal nothing"
      • actionfromafar 1 hour ago
        Yes, exactly that. Be Azure and Office 365 and Sharepoint and AWS where everyone else is Debian Stable on a USB thumbdrive.
        • fragmede 1 hour ago
          Office 365? Ew, Google docs, please.
    • ForHackernews 1 hour ago
      Google already owns the app layer, and hardware, and they are a frontier-level AI research firm.

      I don't see how Anthropic or OpenAI survives being eaten by DeepSeek et al from the bottom of the stack and Google from the top.

      • dubbie99 1 hour ago
        The only reason people use google apps is because they are cheap and reliable. The user experience is awful. Have you ever tried to find a document you had open yesterday in drive?
        • hobo_mark 15 minutes ago
          Uh? Recently and frequently opened documents always show up on the first screen as soon as I open the app or website.
    • sofixa 1 hour ago
      > own the app layer with their edge in reputation and powered by their infrastructure. Be apple where everyone else is Linux. Do design, coding, research, SMBs, legal, finance, healthcare and more (they are doing all of this).

      The problem with this is that there are incumbents in all those spaces building their own AI agents / platforms, and they're the ones choosing the models they use internally and sell to their own customers. The margins and the possibility of fine-tuning open weight models, as well as the guarantee they'll keep running at predictable costs (no US orders yanking access), make them a very appealing option.

      And if you're a company that needs an AI powered legal software, would you buy it from OpenAI/Anthropic, or from someone who you've already bought legal software from before and has the domain knowledge?

  • arthurofbabylon 2 hours ago
    Let's imagine that Anthropic/OpenAI fail to manufacture scarcity by villainizing Open Weight models (a real possibility). What is left for these corporations to prop up their prices, or any margin at all? I expect scaffolding around tool use, supporting bespoke implementation and driving risk down for institutional adoption. (They might even build an insurance tool to protect accountants/lawyers from errors in compounded probabilism!)

    A question for economists... It seems plainly clear to me that information and information processing are being commoditized (for the first time in human history?). Without the age-old bottlenecks at the top of the value chain, capital will surely flow downwards, right?

    • ddxv 2 hours ago
      OpenAI, though they seem to have backtracked on it lately, has been slowly pushing forward with its launch of ads, which would be a supplemental way to support cheaper use of their models. This is currently not as great a fit as modern-day banner ads, but it will be interesting to see where they go with that.
  • linzhangrun 4 hours ago
    It would not be surprising if GPT and Claude get cheaper too as inference gets cheaper. Two years ago, o1 was the strongest model and cost much more than Fable, while being nowhere near as smart as a Qwen 3.6 35B that you can now run on a DGX Spark without much trouble.
    • ddxv 3 hours ago
      True, outside of the dark tactics I imagined in the article, they will have to compete at lower costs. It's just that the current iteration does not feel cost competitive yet.
    • tsss 1 hour ago
      Probably they will, unless Claude and GPT become luxury brands like Gucci. Currently it makes no sense for them to invest into efficiency. They need to put everything into competing for the top spot as long as they still have a shot.
  • arikrahman 3 hours ago
    With cache hits being effectively free, harnesses like Reasonix have let me do a month of work for less than 2 dollars. It's not even the subsidies making it cheap: American providers like Digital Ocean or Cloudflare host the same model with similar pricing.
    • pjc50 9 minutes ago
      How does caching help here? How much repetition is there in queries?
    • Scaevolus 46 minutes ago
      Cloudflare's DeepSeek V4 Pro prices are 4x DeepSeek's for input and output tokens, and 100x for cached input tokens, which are crucial for agent tool use because it produces multi-turn conversations.
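
      To make the caching point concrete, here is a rough sketch of why cached input tokens dominate an agent session. It assumes every turn resends the full conversation so far and that everything sent on an earlier turn is a cache hit. The function names, token counts, and price ratios are all hypothetical illustrations, not any provider's actual numbers.

```python
def agent_token_totals(turns: int, system_tokens: int, tokens_per_turn: int) -> tuple[int, int]:
    """Return (cached_input_tokens, fresh_input_tokens) summed over a session.

    Assumption: each turn resends the whole history, and the part of the
    prompt that was already sent on a previous turn is billed as cached.
    """
    cached = 0   # tokens re-read from the prompt cache
    fresh = 0    # tokens seen for the first time
    history = 0  # size of the conversation before the current turn
    for turn in range(turns):
        # The system prompt is only new on the first turn.
        new = tokens_per_turn + (system_tokens if turn == 0 else 0)
        cached += history
        fresh += new
        history += new
    return cached, fresh


def relative_cost(cached: int, fresh: int, fresh_price: float, cached_price: float) -> float:
    """Input cost in arbitrary units, given per-token prices for each kind."""
    return cached * cached_price + fresh * fresh_price


if __name__ == "__main__":
    cached, fresh = agent_token_totals(turns=50, system_tokens=2000, tokens_per_turn=1000)
    share = cached / (cached + fresh)
    print(f"cached share of input tokens: {share:.1%}")

    # Hypothetical price ratios: host B charges 4x for fresh input
    # and 100x for cached input compared with host A.
    cost_a = relative_cost(cached, fresh, fresh_price=1.0, cached_price=0.01)
    cost_b = relative_cost(cached, fresh, fresh_price=4.0, cached_price=1.0)
    print(f"host B / host A input cost: {cost_b / cost_a:.1f}x")
```

      Under these assumptions, the longer the session, the larger the share of input that is cached, so the cached-token price matters more than the headline input price.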
    • ForHackernews 1 hour ago
      I think this is very likely and something that everyone seems to be missing when valuing these AI firms. AI is not the new industrial revolution, it's the new cloud VM: a very useful commodity software offering.
  • leroman 30 minutes ago
    The token economics for closed-source models are different: they are optimizing for 200 USD worth of tokens per software engineer per month, and they will increase the per-token price as models or harnesses become more optimized.
  • odie5533 3 hours ago
    This is what concerns me about how AI giants are planning to make money. Their product has already been commoditized at prices which for them are still subsidized to grab market share. Unless the giants invent a technological leap, their prices are going to be dragged down by open weight models and I don't see how they'll turn a profit.
    • Jimega36 2 hours ago
      Reach AGI to leapfrog whoever is behind. Burn everything to get there faster.
      • odie5533 2 hours ago
        If Anthropic announced AGI tomorrow, how much better would that model be than Fable 5? It's looking like the road to AGI is gradual and moat-less. Models seem capable of improving other models, and even without illegal distillations many are nipping at the heels of Anthropic.
        • InsideOutSanta 1 hour ago
          Yeah, I think we're learning that we overestimated the relevance of recursive self-improvement in a singularity/intelligence takeoff scenario. We thought that once an AI could start improving itself, it would cause an exponential, self-reinforcing intelligence explosion.

          Turns out that scaling up compute is much more important and also limits the upper end of intelligence.

        • ForHackernews 1 hour ago
          What is an "illegal" distillation? Terms of service are not laws, and clearly copyright laws are no barriers to developing AI models.
        • IncreasePosts 1 hour ago
          Why would the creator of AGI sell it to anyone, when they could keep it to themselves and corner dozens of markets?
      • jorisw 2 hours ago
        'Reach AGI', the same way SpaceX will put data centers in orbit. A pipe dream.
        • ben_w 1 hour ago
          I'm currently writing a blog post about data centres in orbit, and my current conclusion is that even though they can build one, they definitely can't put 1 million up there and would have better things to do if they could.

          AGI? Too loosely defined. They lack a lot of competences which humans recognise when we see them but find it hard to put into words; on the other hand what they can do they already do faster than any human (and have greater breadth than any single human, but this usually doesn't matter because "coder" and "economist" and "translator" gets solved in human teams by hiring three people).

          I do not think current ML has the tools to solve for quality. But we know it's possible for a really mediocre intelligence to make human level intelligence, because evolution made us, so for me the question of AGI is more a practical one: is it affordable?

          (I also think not at the present time, but that's an "I think" not "I am analyzing it carefully").

          • trick-or-treat 1 hour ago
            Maybe you missed the part where starlink / orbiting datacenters don't really have to even make money as long as they partially fund rocket launch tests.

            Or maybe you don't take Elon seriously when he talks about Mars.

        • NitpickLawyer 1 hour ago
          > will put data centers in orbit. A pipe dream.

          Cheap access to space was once a pipe dream.

          Reusable boosters were once a pipe dream.

          A new player beating Boeing to the ISS was once a pipe dream.

          LEO constellations were once a pipe dream.

          Launching thousands of satellites was once a pipe dream.

          You should know that a) they are already running "AI" chips on their current sats and b) they are already producing kW of power on orbit and have ~10k sats on orbit. You can watch Scott Manley's video on it, where he does some rough calculations and explains the overall architecture. There is nothing stopping them from doing this, from an engineering perspective. Whether it makes commercial sense is another question, but 5-10-20 years in the future things might change there as well.

          • InsideOutSanta 1 hour ago
            I don't think people's argument is that it's impossible to put data centers into space. The argument is that the downsides (radiation, cooling, maintenance, power) are so severe that it is pointless to do it at scale.
            • NitpickLawyer 1 hour ago
              Go back to the megathreads when this came up. Even here on HN. Plenty of people used the argument that it can't be done, for various reasons.

              And my point was that at one point or another there were many "downsides" for all the tech that SpaceX already has. Reusable boosters were seen as "uneconomical" and "pointless unless they can fly 10 times" by industry experts. They're now flying 30+ times per booster.

              LEO constellations were similarly "full of downsides" plus "all the companies that tried it went bankrupt in the 90s", so "it's pointless". And so on.

              • InsideOutSanta 40 minutes ago
                Reusable boosters have clear upsides, though.

                Pretty much everything about data centers in space is worse than having them on Earth. Apart from niche use cases, the only reason you'd talk about data centers in space is if you had a company with rocket ships and needed a story to tie your rocket ships to the current AI craze.

          • general1465 1 hour ago
            Microsoft tried to put datacenters into the ocean [1] and then shelved the idea, because even though you have fewer failures, you still have failures and somebody has to go there and fix them. Which turns out to be a problem.

            And in the ocean you don't have to solve for radiation or cooling.

            [1] https://www.tomshardware.com/desktops/servers/microsoft-shel...

          • IncreasePosts 1 hour ago
            If just Elon was talking about data centers in space, you could take it with a grain of salt. But there are enough other serious players talking about it, like Google and Blue Origin, that it should be pretty clear it can't just be dismissed with "you didn't think about cooling!"
            • NitpickLawyer 1 hour ago
              Yeah, and there have already been tech demonstrators for this. Starcloud-1 launched in '25 (on a F9) and demoed a CotS H100 in a ~60kg bus w/ 1kW of power. They ran inference on a "gemini" model (probably something small) and trained a GPT2 version LLM as a tech demonstrator.
            • ForHackernews 1 hour ago
              Google also wanted to deliver internet from balloons and put everyone's real name on their YouTube comments. Not all their ideas are winners.
        • chpatrick 1 hour ago
          I think it's such a vague term. If you showed someone in 2010 what we have now they would say it's science fiction.
  • anax32 1 hour ago
    Open weight and local hosting is far, far cheaper. In every respect. Even support is cheaper, over time.

    However, it's difficult to sell this to businesses who want contracts and KPIs, not staff and commitments.

    Regulated industries will favour the closed sources, either by choice or mandate. The interesting question is whether they will have better models, or worse models. History says they will receive a worse service, but continue anyway.

    • general1465 1 hour ago
      > Regulated industries will favour the closed sources, either by choice or mandate

      Until your country appears on the US administration's naughty list because your local politician did something that mildly inconvenienced a US oligarch

  • my-next-account 1 hour ago
    I wonder whether Oracle is going to go bankrupt because of this
    • worldsayshi 1 hour ago
      Why Oracle?
      • InsideOutSanta 1 hour ago
        They're extremely exposed to a market crash due to their huge debt-funded compute contracts.

        Having said that, while one can always hope, I would assume that Oracle is one of these companies that will be bailed out or find a way to survive.

        • cyanydeez 1 hour ago
          Oracle is licking so much boot, you'd need to also have the republican fascist party completely fall apart.
  • surgical_fire 1 hour ago
    One thing it doesn't even mention is how good those models are. Ever since I moved to DeepSeek I've had zero regrets. It performs exceptionally well. I honestly prefer it to ChatGPT (or Claude that I use at work).

    I never used Fable, maybe it is that much better. DeepSeek has no problems with the workloads I give it though - if it only keeps marginally improving with each interaction I don't see myself needing to come back.

  • dist-epoch 1 hour ago
    It's so refreshing to read a short, to-the-point article that is not extruded into 10 pages with LLMs.