AI outperforms law professors in Stanford Law study

(law.stanford.edu)

77 points | by berlianta 2 hours ago

19 comments

  • causal 58 minutes ago
    As a software engineer I have some intuition for what the risks are of letting agents do some tasks vs others.

    I don't have a similar intuition calibrated for what could go wrong when asking AI to draft a legal document. Some things seem harmless, e.g. drafting a will, but I don't really know; our legal system is notoriously rife with footguns.

    • thewebguyd 52 minutes ago
      I think this is probably true for most skilled professions. AI is best used in the hands of folks already knowledgeable in the skills/professions they are using it for.

      I liken it to me googling things as a sysadmin vs. Jane from accounting doing it. The non-tech end user is far more likely to make the problem worse, or install something sketchy from the ad-riddled results, than I am or one of my help desk employees is.

      I wouldn't trust myself to draft an important legal document using AI without the advice of a lawyer, much like I wouldn't really want to rely on my lawyer to use AI to write code for me.

      • ChrisMarshallNY 24 minutes ago
        > I wouldn't really want to rely on my lawyer to use AI to write code for me.

        Yet that is exactly what a lot of C-Suiters (many of whom are lawyers) are doing.

        • xiaoyu2006 9 minutes ago
          Vice versa, there are also a lot of irresponsible programmers doing stupid things with AI. Irresponsible people stay irresponsible; AI just makes them more productive at being irresponsible.
      • zuzululu 20 minutes ago
        im not so sure

        i think devs overestimate their own role and underestimate others

        i am seeing lawyers and doctors roll out their own software with AI

        but we dont have their training and experience

      • stackghost 19 minutes ago
        It's like that in engineering, for sure. My background is in aerospace and there are lots of things that a reasonably technically-inclined random can probably do passably. It takes an engineer to know which tasks those are, though.

        I would imagine it's similar in law, in that it takes a lawyer or judge to know where the foot guns lie.

    • _heimdall 4 minutes ago
      I wouldn't consider drafting a will to be harmless. If it's done poorly the next of kin could have to deal with a huge headache and potentially months or years of probate proceedings.
    • rayiner 46 minutes ago
      I would think that LLMs would be better at avoiding foot-guns. That’s a situation where you have a list of well-known rules and potential pitfalls, and the work of the lawyer is to apply those to a fact pattern. That’s something that has been hard to automate programmatically, because the fact patterns are similar but different. LLMs, however, seem to excel at applying general principles to differing fact patterns.
      • HappMacDonald 16 minutes ago
        I would categorize this in the "expertise that people internalize but never figure out how to verbalize" department, and that is a department we have no way to teach an LLM because if nobody is writing out those unspoken, subconscious rules then the LLM has nothing to read about them in its training data.
      • dylan604 17 minutes ago
        But can an LLM come up with questions like what the definition of "is" is? Seems to me there's a lot of "depends on how you read it" type of stuff where lawyers excel at finding novel interpretations. So what coders think of as rules is much less straightforward to understand when it comes to laws.
    • knollimar 40 minutes ago
      I'm afraid: since Claude cheats on benchmarks, what will it do with law?
    • prpl 44 minutes ago
      there’s really no limit to how many times and ways you can review something with AI, except dollars.
    • Boss0565 41 minutes ago
      cannot IMAGINE letting ai write my will rn.
    • jay_kyburz 36 minutes ago
      I imagine it's really hard to spot a comma in the wrong place, or a missing sentence in a 10-page contract, unless you wrote it yourself or assembled it from some battle-tested templates.
  • chewbacha 36 minutes ago
    My best guess is that Gemini was trained on the textbooks that the questions are meant to test against, so it is probably better at explicit recall of those questions or related questions.

    This is a pretty limited introductory course based on what it says in the methods of the paper itself.

  • Esophagus4 45 minutes ago
    Yeah this could be interesting. A lot of the spotlight has been on “law firm stuff” like demand letters and writing contracts…

    But imagine if a dev team didn’t have to go engineer -> product manager -> legal team to get a question answered on local data retention requirements. You could ship that much faster.

    • ares623 35 minutes ago
      Would you take responsibility for missing details about local data retention requirements?
      • Esophagus4 14 minutes ago
        Yes.

        If the only purpose of asking a lawyer is transferring risk (aka cover your ass) while getting the same advice as an LLM, that’s slowing down delivery for purely bureaucratic reasons.

        I’ve seen that mentality at big companies where everyone is scared to stick their neck out and be accountable for a decision. And nothing gets done. Drives me crazy.

        But the people who move up are the people who take ownership and get shit done (and are right a lot).

        (BTW, I have been at companies that were sued by regulators. They never really punish the individual(s) who were in the room when the decision is made. So your worry is kind of misplaced.)

      • zuzululu 33 minutes ago
        honestly if you just avoid EU and China

        you can get away with anything

        • jedberg 21 minutes ago
          California too.
          • applfanboysbgon 12 minutes ago
            And with those three places listed you've ruled out literally 40% of the world economy. Great, you can ship your product in bumfuck Nebraska.
  • throw7 33 minutes ago
    Oh, a "Human-Centered" study by an AI lover:

    Julian Nyarko

        Professor of Law
        Co-Chair Stanford Law AI Initiative
        Senior Fellow, Stanford Institute for Human-Centered AI (HAI)
    
    LOL!
  • gaiagraphia 31 minutes ago
    Incredible that the common people will be able to wrestle the right to rule of law away from the bloated legal caste, who have built themselves quite the moat.

    The inaccessibility of justice is a huge driver of inequality. Any tools which bridge this gap will help make a more just society.

  • t0lo 7 minutes ago
    More great news from the prestigious university where 40% of students claim they are disabled

    https://fortune.com/article/rise-in-elite-students-seeking-a...

    and where they wanted to ban words such as "chief", "stupid", "karen" and "American"

    https://reason.com/2022/12/21/stanford-elimination-harmful-l...

  • airstrike 26 minutes ago
    Yes, LLMs are great at search. That's not news.
  • king_zee 1 hour ago
    I think there will be a market for firms that aggressively market themselves as non-AI, and then as more people turn towards that human connection we'll go full circle
    • rayiner 44 minutes ago
      Nobody wants to pay their lawyers more than they have to. There will be a huge market for firms that can use AI to avoid charging clients for $1,000/hour junior associates.
    • zuzululu 31 minutes ago
      that worked out for artists and translators right ?
    • citizenpaul 59 minutes ago
      If you want human connection the legal system is not where you are going to find it, period.

      I don't think there will be any such market for "non ai" law. If I'm involved with the legal system I just want out as quick as possible as cheap as possible.

      • applfanboysbgon 46 minutes ago
        Bad legal advice will keep you dealing with the legal system for much longer and at much greater cost. Something being cheap and quick upfront doesn't mean it will be cheap and quick by the end of the process.
        • Esophagus4 44 minutes ago
          But isn’t this study saying that the legal advice could actually be better with AI?

          A bit of extrapolation from the study, but not a crazy stretch.

          • applfanboysbgon 38 minutes ago
            Maybe, although I would be extremely hesitant to extrapolate from this one study and trust my legal life to an LLM. One thing that's worth noting, though, is that regardless of the quality of objective legal advice in the abstract, for a lot of smaller scale stuff the human connection actually is literally what is important. There are ambiguities in the law, which are not resolved deterministically but rather at the individual discretion of judges. Your lawyer, if they're any good at their job, knows the local judges and how they're likely to rule for given circumstances, which can influence their legal advice to you specifically.
            • Esophagus4 12 minutes ago
              Fair.

              But I could also see a world where that, too, is fed to models for hyper-local results.

              Could be a way off, but I could see it.

  • wilg 58 minutes ago
    > In a blind evaluation of nearly 3,000 anonymized comparisons, professors rated AI responses significantly higher than answers written by other professors, with AI winning 75% of head-to-head matchups.

    75% win rate seems pretty good!

    Paper link: https://law.stanford.edu/wp-content/uploads/2026/06/salinas_...

    • causal 57 minutes ago
      I wonder to what degree the AI was just better at communicating. My experience with attorneys is that they are often some of the worst writers.
    • jshier 55 minutes ago
      I do wish they'd used some more objective criteria. Simply being preferable is one of the things LLMs have been trained for since the beginning, hence their sycophantic nature.
      • wilg 54 minutes ago
        What criteria would you use for judging legal arguments?
        • mitkebes 37 minutes ago
          The arguments need to be based on actual law, and any cited reference cases need to be real.

          There's been a lot of news stories about lawyers using AI, and then getting in trouble for citing hallucinated laws or cases. It doesn't matter if the AI response is "preferred" over the human one if it gets thrown out when put under the scrutiny of a real case.

          • wilg 35 minutes ago
            Who's gonna determine that? A bunch of law professors?
        • mylifeandtimes 46 minutes ago
          maybe seeing if the case law it cited was real or imagined? Just one idea, IANAL
          • gamerDude 41 minutes ago
            Well, they had data on whether an answer would be harmful to the students' learning. 3.5% of AI answers were scored as harmful, versus 12% of law professor answers.
    • falcor84 52 minutes ago
      Yeah, a 75% win rate is a ~200-point Elo difference, which is quite massive.
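
      For reference, the conversion can be sketched under the standard logistic Elo model, where the expected score is 1 / (1 + 10^(-gap/400)). This is only an illustration: the function names are hypothetical, and treating the study's pairwise ratings as tie-free "games" is an assumption, not something the paper states.

```python
import math


def elo_gap_from_win_rate(win_rate: float, scale: float = 400.0) -> float:
    """Rating gap implied by an expected win rate (logistic Elo model)."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    # Invert E = 1 / (1 + 10^(-gap/scale))  =>  gap = scale * log10(E / (1 - E))
    return scale * math.log10(win_rate / (1.0 - win_rate))


def win_rate_from_elo_gap(gap: float, scale: float = 400.0) -> float:
    """Expected win rate of the higher-rated side for a given rating gap."""
    return 1.0 / (1.0 + 10.0 ** (-gap / scale))


gap = elo_gap_from_win_rate(0.75)
print(round(gap, 1))  # 190.8
```

      So 75% works out to about 191 points under this model, in the same ballpark as the ~200 figure.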
  • t0lo 12 minutes ago
    Library outperforms student... more news at 9
  • Thaxll 32 minutes ago
    AI will never convince a jury though.
    • jojobas 26 minutes ago
      A couple of acting classes might be cheaper than a lawyer, then you can go all out representing yourself.
  • bko 42 minutes ago
    Marc Andreessen argued that we've already reached AGI. He says that the top AI models give better answers than 99% of people he has access to, and he has access to some of the best people in their field.

    I'm getting more convinced. I mean, sure it makes dumb mistakes sometimes, but it's a particular set of self-serving mistakes, like commenting out tests in order to pass. We obv don't want this behavior but I wouldn't say it's dumb.

    It'll be like the Turing test, which we just blew past years ago and no one cared. After all the hand-wringing about sentience and rights of the AI if it passes the Turing test, and now we just have AI bots running 24/7 writing slop.

    How does everyone else feel?

    • acdha 24 minutes ago
      > Marc Andreessen argued that we've already reached AGI. He says that the top AI models give better answers than 99% of people he has access to, and he has access to some of the best people in their field.

      He stands to make billions if enough people believe him — unless you also do, consider that you’re the mark. For example, if that was true, it would have to mean that AI companies either aren’t letting customers use the good models or are instructing them to frequently make errors which reveal a fundamental lack of reasoning ability.

      Consider also that his wealth means he hasn’t had to defend an idea stringently since the 90s. I wouldn’t be surprised if he does think LLMs give deep answers because it often looks that way until you critically review the response and ask questions like what’s missing which require you to have a decent understanding of the problem domain.

    • paulmist 32 minutes ago
      Knowing the question is half of the answer. LLMs are great at scoping your context and answering precisely what you asked; it's also why they go off the rails when they misunderstand a part of your question. Incidentally, they're great at "knowing" and reaching for knowledge.

      Humans have the advantage of perspective. We always lack some knowledge and answer broadly. This is bad if you have a particular goal in mind, but better if you're just generally learning, because you see more and learn to discriminate the correct from the wrong. And most importantly, being wrong is part of human ingenuity - because sometimes we turn something "obviously" wrong into something right.

    • moregrist 27 minutes ago
      Marc Andreessen has a strong financial incentive to feel this way and to convince others to feel this way.

      I also think it’s easy to think that AI gives good answers if you don’t know the field well. In fields where I know the material, the answers are pretty variable and can be quite bad.

    • scottfalconer 17 minutes ago
      Getting the right answers is just half of it, you need to know the right questions to ask. I haven't yet seen AI crack that one.
    • foolserrandboy 27 minutes ago
      He would tell you NFTs were AGIs if it might get you to buy them.
    • rvz 31 minutes ago
      > Marc Andreessen argued that we've already reached AGI. He says that the top AI models give better answers than 99% of people he has access to, and he has access to some of the best people in their field.

      Investor with vested interest in AI companies makes claim of reaching "AGI".

      He is one of the last people to listen to about AGI. Unless the term "AGI" means something entirely different to him vs to independent researchers vs to CEOs, since the term has become entirely meaningless.

    • 12AHg 37 minutes ago
      [flagged]
      • futuraperdita 29 minutes ago
        I’m not an AI stan by any means and certainly no fan of Andreessen, but using the term “clanker” immediately biases your statement and can discredit what is a well-referenced or well-meaning comment.
  • homeonthemtn 45 minutes ago
    Personally I think this is very good. One of the hardest things out there is maintaining a society in the face of changing times and it's because law is dense and slow.

    I think, in the right hands, this could be huge.

    • wholinator2 31 minutes ago
      It turns out everybody has at least one right hand, even the people we trust the least.
  • 34981t 1 hour ago
    He is basically an AI professor for law. This study just confirms his existence:

    https://juliannyarko.com/

    Stanford and its donors of course want to replace anyone but its administrators, so they cheer on such anti-intellectual nonsense.

    • signatoremo 17 minutes ago
      This is the state of HN. Created new account. Accused without evidence. Emotional clickbait.
  • steele 49 minutes ago
    in mice
  • jimbokun 48 minutes ago
    [flagged]
    • jatora 38 minutes ago
      definitely not needed if you're in the middle-man slime trades (law)
      • jimbokun 36 minutes ago
        In an advanced economy everyone’s the middle man for something. We’re not self sustaining agrarian farmers anymore.
      • zuzululu 32 minutes ago
        what do you think software devs do all day
    • Waterluvian 37 minutes ago
      the memes were nice tho
  • fgh_ask 59 minutes ago
    [flagged]
    • maxbond 56 minutes ago
      Just so you know, I have nothing to do with Stanford, but I am flagging this as conspiratorial nonsense. So when your comment is flagged, I just want you to know that it doesn't confirm your belief, it's just that this comment harms discussion and so must be removed.
      • hoppyhoppy2 55 minutes ago
        >Don't feed egregious comments by replying; flag them instead. If you flag, please don't also comment that you did.

        https://news.ycombinator.com/newsguidelines.html

      • thin_carapace 38 minutes ago
        for what it's worth I have no idea why it would be nonsense to question institutional motivations especially in the context of an academic article that could easily be corporate propaganda, I also think that shutting conversations down is much more harmful than discussing topics that are potentially harmful
    • 19skitsch 55 minutes ago
      uh alright buddy
  • aetq51 44 minutes ago
    [flagged]
    • dang 23 minutes ago
      Would you please stop creating accounts to post this?
    • rfw300 32 minutes ago
      A law professor studying AI has an affiliation with the center at their university that studies applications of AI? Scandalous!
    • wilg 32 minutes ago
      You're suspicious that the person doing academic research on how AI applies to law has a job related to research on law and AI?
      • runarberg 14 minutes ago
        You are not? It is at least worth investigating how much this professor benefits from AI companies. In fact this is HN. Let me come back to you in about 10 minutes.

        EDIT: 10 min later. I give up. I tried to find who is funding HAI and came up empty-handed. Usually you can see that in their yearly reports, but no such luck for me. I know Google and Bill Gates are big donors, so take that as you will.

    • ares623 33 minutes ago
      Running out of IPO juice. Each bump is less effective and lasts shorter.