The title is very misleading. This has almost nothing to do with coffee. I was expecting the input to be the parameters of a coffee recipe (quantities of coffee and water, grind size, etc. for a given type of preparation), and the output to have something to do with coffee too (extraction time, rate, etc.). It's actually just about water cooling down. Also, it doesn't actually ask the LLM for a prediction of the experiment's result, only to generate a more-or-less textbook formula for the situation (which is a fair point, since LLMs aren't made for that at all, but it contributes to making the title misleading).
Yeah, I was thinking the same. Surface tension, convection currents? Maybe the author is overthinking it a bit, giving too much weight to small contributors.
But that has presumably always been a pitfall for humans: trying to second guess the physical world and sometimes being "non-intuitively" wrong.
Latitude may affect the eddy currents and resulting convective shear on the film surface imparted by angular momentum from the earth’s spin.
This is what amuses me about analytically minded, computation-focused people versus dumb simple engineering and physics (practical) people: imagining all the infinite things that might change a physical result, versus knowing from experience or education which ones actually matter.
ANOVA (analysis of variance) on linear-fit parameters will show you the contributing factors, whether in experimental data or in simulation. Or you can read a chapter of an undergraduate heat transfer book.
Decay rate of (T(t) - T_inf)/(T(0) - T_inf) is probably dominated by the wind speed in your room. For an 8-12oz cup a sphere or cylinder will get you pretty close.
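A lumped-capacitance sketch of that estimate, in Python; the convection coefficient and cup dimensions here are guesses, not measurements:

    import math

    m, c = 0.35, 4186.0            # kg of water, J/(kg*K)
    h = 10.0                       # W/(m^2*K): free convection in still air; several times higher with a draft
    r, depth = 0.04, 0.09          # cup radius and water depth in metres (guessed)
    A = math.pi * r**2 + 2 * math.pi * r * depth   # open top plus side wall

    k = h * A / (m * c)            # decay rate of (T - T_inf), in 1/s
    print(f"time constant ~ {1 / k / 60:.0f} min")

The answer swings by a factor of a few depending on h, which is exactly the "wind speed in your room" sensitivity.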
To me the neat bit isn't that it got the exponential decay right - that's pretty standard - it's that it realised there were two different timescales for the decay and got ballpark numbers for them pretty well.
This is the kind of behaviour you would expect from a simple cylindrical model of the coffee cup with some built-in heat capacity of its own.
However, those decay coefficients are going to be very dependent on the physical parameters of your coffee cup - in particular the geometry and thermal properties of the porcelain. There are a lot of assumptions and variability to account for that the models will have to deal with.
I would think that the starting temperature and ambient temperature are controlling. Boiling water is three times ambient temperature. 150°F is twice ambient temperature.
The exponential decay is obvious because he started the readings at boiling. If he had started at 150°F, it might not have been as obvious that the readings were on an exponential curve.
Is that right? I didn't get much sleep and don't drink coffee. Lol
On a related note, I have been working on an app that helps determine the correct grinder setting when dialing in espresso:
https://apps.apple.com/ph/app/grind-finer-app/id6760079211
After logging two shots with the same setup (grinder, coffee machine, basket etc.), it then uses machine learning (and some other stuff that I am still improving) to predict the correct setting for your grinder based on the machine temperature, the weight of the shot etc.
It's far from perfect when it comes to predictions right now, but I expect to have massive improvements over the coming weeks. For now it works OK as an espresso log at least.
I'm hoping after a few tweaks I can save people a lot of wasted coffee!
Funnily enough I have built essentially the exact same thing in HomeAssistant. Shot collection is completely automated as I have a LM Linea Micra and Acaia Lunar scales (both have integrations that use Bluetooth). You should consider support for Bluetooth scales etc!
Me and the wife (en_GB - draw your own conclusions!) love a decent coffee but can't be arsed with too much wankery over it. We have owned a few kitchen built in units and I've messed with a couple of grinders and espresso pots in the past.
Wifey found a kitchen built-in unit a few years ago and it is still doing the job, very nicely.
https://i.imgur.com/a5ztsco.jpeg
Let's face it, what you want is a decent coffee and you have to start from that point, not what sort of bump or grind (that's grindr).
I want a cup of coffee with:
- Correct volume - sometimes a shot, mostly an "Americano" - I'm British don't you know
- Correct temperature - it'll go really bitter if too hot. Too cold - ... it'll be cold.
- Crema - A soft top is non negotiable
- Flavour - Ingredients and temperature (mostly)
The unit we have now manages bean to cup quite reasonably, without any mensuration facilities. I have made coffee for several Italians and they were quite happy with the results.
Yeah, I had intended to allow that but it just felt messy when I considered strategies for syncing the espresso logs from non-account users to when they do finally sign in. I'll probably eventually get around to it as I do agree it is a little annoying to be forced to log in, but for now it's just a "magic link" sign-in and there is a button in settings where you can delete your account pretty easily.
There's a simple differential equation often taught in intro calc courses, "Newton's Law of Cooling/Heating," which basically says that the rate of heat loss is proportional to the difference in temperature between a substance and its environment. I'm curious what that'd look like here. It's a very simple model, of course, not taking into account all the variables that Dynomight points out, but if a simple model can be nearly as predictive as more complex models...
I'm also curious to see the details of the models that Dynomight's LLMs produced!
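For the curious, Newton's law of cooling here is essentially a one-liner. A minimal sketch; the rate constant k is illustrative, roughly what the dual-exponential fit further down the thread gives for the slow mode:

    import numpy as np

    T_room, T0 = 20.0, 95.0                       # degrees C, assumed
    k = 0.034                                     # 1/min, illustrative rate constant
    t = np.arange(0, 121, 10)                     # minutes
    T = T_room + (T0 - T_room) * np.exp(-k * t)   # single-exponential solution
    print(np.round(T, 1))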
It looks like a lot of them are missing something big. I'd think the two big ones are the evaporative cooling as you pour into the cup, and heating up the cup itself (by convection). The convective cooling to the air is tertiary, but important (and conduction from the mug to the table probably isn't completely negligible). If there's only one exponential, they're definitely doing something wrong.
I'd like to see a sensitivity study to see how much those terms would need to be changed to match within a few %. Exponentials are really tweaky!
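A crude way to do that sensitivity check, using the dual-exponential form quoted further down the thread (all coefficients are borrowed from that fit; the 10% perturbation is arbitrary):

    import numpy as np

    def model(t, k_fast=2.3, k_slow=0.034):    # t in minutes; coefficients from the fit elsewhere in the thread
        return 20 + 25 * np.exp(-k_fast * t) + 54 * np.exp(-k_slow * t)

    t = np.array([5.0, 30.0, 60.0])            # sample times in minutes
    base = model(t)
    tweaked = model(t, k_slow=1.1 * 0.034)     # 10% change in the slow decay constant
    print(np.round(100 * (tweaked - base) / base, 1))   # percent change in predicted temperature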
It's a mix of course, but I think it should be mainly that and evaporative cooling. Evap is _very_ effective but will fall off rapidly as you get away from boiling. The conduction into the mug will depend a lot on the mug material but will slow down a lot as the mug approaches the water temperature.
I'd be very interested in seeing separate graphs for each major component and how they add up to the total. Even asking the LLMs to separate it out might improve some of their results, would be interesting to try that too.
Yes, since they didn't explicitly list the evaporative cooling when the coffee was poured into the cup, I suspect it was not included (as if the coffee started in the cup). That means that the starting temperature is off and screws up all the other calculations.
The evaporative cooling as you pour into the cup is when the coffee is at the highest temperature and has the most surface area even though it only takes a few seconds. One could test this either by including it explicitly in the requested calculation, or by putting the fill spout directly at the bottom of the cup when filling.
That will be the dominating term eventually. But the initial sharp temperature drop is mostly due to the coffee mug being at room temperature and having a ~significant mass.
Apparently the act of pouring has a huge effect on temperature because of the surface-area-to-volume ratio of the fluid as it streams (and the turbulence after it strikes the bottom). The site above claims a single pour can drop it 20-30 degrees. There may be a similar effect here.
It does? There is a fast drop followed by a long decay, exponential in fact. The cooling rate is proportional to the temperature difference, so the drop is sharpest at the very beginning when the object is hottest.
Ha. My university professor used this in a lab to catch people who slack off.
There is another factor here: convection. Its speed depends on the viscosity of the fluid and the temperature difference both. And viscosity itself depends on the temperature, so you get this very sharp dropoff.
Probably dominated by the cup acting as the ambient temperature initially, and then by the air/countertop as the ambient temperature on the longer timescale, once the cup and the liquid near equilibrium.
The fact that near-boiling water cools down quicker than warm water used to be a well-known bit of kitchen knowledge. My grandma, who wasn't a physicist at all, knew it. I guess in some places (particularly those where people microwave water) that part of the culture is lost, because there's at least a whole generation which hasn't done cooking.
It's because that bit of kitchen knowledge was wrong (or at least not quite right). The Mpemba effect hasn't been scientifically proven, and can be explained away by measurement error. https://www.youtube.com/watch?v=SkH2iX0rx8U
You don't need a full model of every atomic interaction because all of those chaotic interactions end up averaging out. Given enough coin flips you will end up close to a 50/50 split even if the individual flips are unpredictable. Given enough atomic interactions the heat will transfer in the same way every time.
I transcribed the data and fitted dual exponentials to it. When time t is in minutes, the data seem to follow
T(t) = 20 + 25e^(-2.3*t) + 54e^(-0.034*t)
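(If anyone wants to reproduce the fit, here's a rough scipy sketch. The data below are synthesised placeholders, only so the snippet runs on its own; substitute the transcribed readings.)

    import numpy as np
    from scipy.optimize import curve_fit

    def dual_exp(t, T_inf, a, k1, b, k2):
        return T_inf + a * np.exp(-k1 * t) + b * np.exp(-k2 * t)

    # Placeholder data generated from the quoted curve plus noise -- replace
    # with the real transcribed readings.
    rng = np.random.default_rng(0)
    times = np.r_[np.linspace(0, 3, 13), np.linspace(5, 120, 24)]    # minutes
    temps = dual_exp(times, 20, 25, 2.3, 54, 0.034) + rng.normal(0, 0.3, times.size)

    p0 = [20, 20, 1.0, 60, 0.05]               # rough starting guesses help convergence
    params, _ = curve_fit(dual_exp, times, temps, p0=p0)
    print(np.round(params, 3))                 # should land near [20, 25, 2.3, 54, 0.034]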
This is very close to what the LLMs suggested. If I wanted to make an initial guess at this as accurate as the LLMs, what would I need to know? My interpretation of the coefficients is:
(a) 20 ℃ represents the room temperature this will eventually reach.
(b) 25 ℃ is how much temperature the water gives up to the mug as the mug heats up.
(c) The decay -2.3 represents how fast heat is transferred to the mug. (It will be halfway after 20 seconds.)
(d) 54 ℃ is the differential between room temperature and starting temperature once we've accounted for the loss of 25 ℃ to heat the mug.
(e) The decay -0.034 is how fast heat is transferred out of the mug to the room. (It will be halfway to room temperature after 20 minutes.)
I'm okay with (a), and I could probably have guessed (d) once I know the other parameters.
I can also sort of see myself figuring out (b): I would guess the heat capacity of the mug at maybe 0.5 kg × 600 J/(kg·K) = 300 J/K, and do the same for the water (0.2 kg × 4000 J/(kg·K) = 800 J/K). Some work later this comes out to a temperature loss of about 20 degrees. Close enough.
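(The "some work" is just a heat balance between mug and water; spelled out with those same guessed numbers:)

    C_mug, C_water = 300.0, 800.0   # J/K, the guesses above
    T_mug, T_water = 20.0, 100.0    # degrees C
    T_eq = (C_mug * T_mug + C_water * T_water) / (C_mug + C_water)
    print(T_eq, 100 - T_eq)         # ~78 C equilibrium, i.e. the water drops by roughly 22 degrees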
But even if I tried to use my intuition for how hot the mug feels as these processes go on, I would have ended up nowhere near -2.3 and -0.034 for the decay coefficients. What would I need to know about convection, mug materials, and air properties to guess that more accurately?
Is it a neat coincidence or a good, very approximate rule of thumb that heat transfer to air is about 60× slower than that to ceramic-like solids?
This is all known physics. It's almost exactly Newton's law of cooling -- see https://en.wikipedia.org/wiki/Newton%27s_law_of_cooling#Stan... -- which has been known for a long time and is a single exponential. The biexponential behaviour arises because the _actual_ relevant differential equation, the thermal diffusion equation, has a Fourier-mode solution and you've picked out two of its terms; an infinite sum of them _is_ the answer. How you pick out the coefficients is the fun game of spectral methods: https://en.wikipedia.org/wiki/Spectral_methods
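Spelled out, the mode expansion from separation of variables looks like this (phi_n and lambda_n are the spatial eigenmodes and eigenvalues set by the cup geometry and boundary conditions, alpha is the thermal diffusivity, and the c_n come from projecting the initial temperature profile onto the modes):

    T(x,t) - T_\infty = \sum_n c_n \, \phi_n(x) \, e^{-\alpha \lambda_n^2 t}

Keeping only the two slowest-decaying terms gives exactly the biexponential people are fitting.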
It is really no bother when [1] exists. It took a couple of minutes at most. These are the differences between the transcribed data and the CSV values: https://i.xkqr.org/transcracc.png
[1]: https://web.eecs.utk.edu/~dcostine/personal/PowerDeviceLib/D...
This is like someone with no background in physics or engineering wondering "can an LLM predict the trajectory of my golf ball". They then pontificate about how absolutely complex all of the interacting phenomena must be! What if there was wind? I didn't tell it what elevation I was at! How could it know the air density!? What if the golf ball wasn't a perfect sphere!!? O M G
And then being amazed when it gets the generic shape of a ballistic curve subject to air resistance.
This speaks far more to the ignorance of the author than something mind boggling about the LLM.
Most amazement focused at LLMs comes from technical ignorance. Someone getting 100 lines of html that roughly conforms to their prompt is astounding to a muggle. To a web developer it’s a mild convenience.
To be fair, a golf ball's trajectory is hardly ballistic given its relatively large surface-to-volume ratio. Never mind that the dimples are there to trip a turbulent boundary layer and lower drag.
Slightly related, I was using an LLM to help me understand whether I should add milk to my coffee before walking to my table or when I get to my table (objective: maximise coffee temperature at the point of drinking). Turns out it's best to add the milk immediately when the coffee is made, because the rate of cooling is higher at higher temperatures.
This article is somewhat baffling in that it presents the graphs but not the equations the LLMs provided. Kind of implying they provided some unique models (maybe they did but I seriously doubt it).
If equations were included you'd probably see a standard equation (the integral form of Newton's law of cooling), with the time parameters known from the input and reasonable guesses for the heat transfer parameters (cup opening area, mass of water).
It isn't that surprising that it works well, this problem is fairly well known and some simple heat equations would lead to the result, about which there is a lot of training data online.
That initial drop reminds me of one of the things that stuck with me from my thermodynamics lectures/tests: if you want to drink coffee at a drinkable temperature at t = 15 min, will it be colder if you add the milk first, or if you wait 15 min and then add the milk? (Answer: waiting 15 min, because the temperature differential is greater in the meantime and causes a larger drop.) Almost useless fact, but it always comes up when making coffee.
This is true if the milk is in the fridge the whole time. With the milk out the whole time, it's nearly the same, exact answer depends on the geometry of both containers.
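A toy comparison under plain Newton cooling; the mixing ratio, temperatures and rate constant are all made-up round numbers, and the milk is assumed fridge-cold:

    import numpy as np

    T_room, T_coffee, T_milk = 20.0, 90.0, 5.0   # degrees C, assumed
    frac_milk, k, t_wait = 0.15, 0.034, 15.0     # milk fraction, cooling rate (1/min), minutes

    def cool(T0, t):
        return T_room + (T0 - T_room) * np.exp(-k * t)   # Newton's law of cooling

    # Milk first: mix immediately, then the blend cools for 15 minutes.
    milk_first = cool((1 - frac_milk) * T_coffee + frac_milk * T_milk, t_wait)
    # Milk last: black coffee cools for 15 minutes, then the cold milk goes in.
    milk_last = (1 - frac_milk) * cool(T_coffee, t_wait) + frac_milk * T_milk

    print(round(milk_first, 1), round(milk_last, 1))   # milk-first ends up about a degree warmer

With room-temperature milk the second term barely changes between the two strategies, which is why the answer then comes down to the geometry details.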
The interesting bit about this physical experiment is that the water in the cup never starts at 100 celsius. That the act of pouring significantly reduces temperature is well-documented, so in some sense the LLM output is surprising.
It looks like the author forgot to insert the joke in the third last paragraph — the author left the placeholder right there in the text! But wait... is the joke forgetting to insert the joke?
The problem is both highly complex and fairly easy to model. Engineers have been doing this for over a century.
Of all the cooling modes identified by the author, one will dominate. And it is almost certainly going to have an exponential relationship with time.
Once this mode decays below the next fastest, that next mode will dominate in turn.
All the LLM has to do, then, is give a reasonable estimate for the Q for:
T(t) - T_inf = (T_0 - T_inf) exp(-Qt)
This is not too hard to fit if your training set contains the internet.
I would have been more interested to see the equations than the plots, but I would have been most interested to see the plots in log space. There, each cooling mode is a straight line.
The data collected, btw, appears to have at least two exponential modes within it.
[The author did not list the temperature dependence of heat capacity, which for pure water is fairly constant]
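The log-space view is cheap to make; something like this (the series here is a placeholder generated from the dual-exponential fit quoted earlier in the thread, substitute the measurements):

    import numpy as np
    import matplotlib.pyplot as plt

    T_room = 20.0                       # degrees C, assumed ambient
    t = np.linspace(0, 120, 200)        # minutes, placeholder grid
    T = 20 + 25 * np.exp(-2.3 * t) + 54 * np.exp(-0.034 * t)   # stand-in for the measured series

    plt.semilogy(t, T - T_room)         # each exponential mode appears as a straight segment
    plt.xlabel("minutes"); plt.ylabel("T - T_room (degrees C)")
    plt.show()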
The water temperature drops quickly because the room-temperature ceramic mug is getting heated to near equilibrium with the water. If you used a vacuum-sealed mug (thermos), the water temp would still drop a bit initially, but not by much.
Wouldn't the successive measurements contain some information of that uncertainty, if we assume the cooling rate is relatively smooth, locally in time?
A logarithmic fit to their data indicates a standard deviation of 1 ℃ in the residuals. This includes both model error (the logarithmic fit is not that tight) and errors in my transcription from the plot, so the actual uncertainty of the measurements is probably even less.
(The logarithmic fit was lazy. I tried a dual-exponential fit and the standard deviation of the residuals dropped to 0.45 ℃. It appears that the measurement error is very small.)
There could be a consistent bias due to the placement of the thermometer. You can't expect the LLM to assume that the temperature of the water means the temperature in a bottom corner of the cup which I guess is where the thermometer's sensor was. If he had told it how he would place the thermometer, then it could have known that, otherwise, what if it's being clever and finding an average temperature or one that would be measured at some other location? This seems qualitatively consistent with the fact that most models predicted higher temperature than what he measured because of hot water rising and cooling being greatest near the walls of the cup.
This isn't really uncertainty so much as not defining the meaning of "temperature of the water".
This is one of those 'you can just look at it' sorts of datasets; it's really not plausible that the uncertainty in the measurement is affecting anything.
dT/dt = -k(T - T_room)
so T(t) = T_room + (T_0 - T_room) exp(-kt)
exp(-x) has a fast drop-off, then levels off.
Scroll down; these graphs just don't look similar.
Will near-boiling water drop 10 temperature points in a shorter time than the warm water? Yes.
Will it reach 10C faster than the warm water? No.
Today's your lucky day, you get to learn about the Mpemba effect.
(Although the why of the effect is disputed, the trivial counter to your point is that boiling water loses mass quickly so there's less mass to cool)
A reliable way of reproducing the effect was found in 2021. [0] Though the precise cause is still unknown.
[0] https://doi.org/10.1038/s42254-021-00349-8
Imo no, this seems like something that would be in multiple scientific papers, so an LLM would be able to generate the answer based on predictive text.
Impossible, since it is chaotic.
But a T(t) model should not be too hard for an LLM with a basic heat transfer book in its training set.