The motivation was practical rather than theoretical:
Which of these geometric structures, if any, actually survive discretization, noise, and SGD-style training in modern machine learning?
In physics, global and coordinate-free formulations were not aesthetic choices; they were forced when local reasoning stopped working. A recurring structural pattern was:
structure -> symmetry -> invariance -> dynamics -> observables
In modern ML we increasingly see analogous issues:
* parameter symmetries and large quotient spaces
* non-Euclidean data (graphs, meshes, manifolds)
* highly structured hypothesis classes
* training dynamics that are not well described by flat Euclidean optimization
Some geometric ideas clearly paid off (e.g. equivariance via group actions). Many others did not. I’m trying to understand where future leverage might still lie, and where geometry collapses to interpretation or preconditioning.
Below is my current (incomplete) map of where modern geometry already shows up in ML, or plausibly could.
1. Geometry of data (base spaces)
Manifolds, stratified spaces, graphs and meshes; discrete differential operators (graph Laplacians, discrete Hodge theory); topological summaries (persistent homology).
This seems strongest for representation, spectral methods, and diagnostics. The open question is how much of this geometry can couple dynamically to training, rather than remain preprocessing or analysis.
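To make "discrete differential operator" concrete, here is a minimal numpy sketch of the unnormalized graph Laplacian and its spectrum, the kind of object I have in mind for spectral methods and diagnostics. The toy 4-cycle and the function name are just illustrative, not tied to any particular library.

```python
import numpy as np

def graph_laplacian_spectrum(adj):
    """Unnormalized graph Laplacian L = D - A and its eigendecomposition.

    adj: symmetric (n, n) adjacency matrix of an undirected graph.
    The low-frequency eigenvectors are the usual basis for spectral
    embeddings and for analyzing message passing on the graph.
    """
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj
    eigvals, eigvecs = np.linalg.eigh(lap)  # symmetric matrix, so eigh is appropriate
    return eigvals, eigvecs

# Toy example: a 4-cycle. The multiplicity of eigenvalue 0 counts connected
# components; the next eigenvector gives a 1-D spectral embedding.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
vals, vecs = graph_laplacian_spectrum(A)
print(vals)  # approximately [0, 2, 2, 4] for the 4-cycle
```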
2. Geometry of hypothesis spaces (architectures)
So far the most successful direction:
* symmetry and equivariance via group actions
* quotienting hypothesis spaces
* convolution as representation theory
* SE(3)- / gauge-equivariant models
* architectures encoding invariants or conservation laws
Here geometry restricts the hypothesis class before optimization. I suspect there is still room beyond global groups, toward local gauge structure, fiber bundle–valued representations, and architectures defined by connections rather than coordinates.
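As a concrete instance of "equivariance via group actions", here is a minimal numpy sketch of a permutation-equivariant linear layer (in the Deep Sets spirit) together with a numerical check that the group action commutes with the layer. The layer form and the random weights are illustrative assumptions, not any specific published architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 3

# Permutation-equivariant linear layer: each row is transformed identically,
# plus a term depending only on the row mean, so permuting the rows of X
# permutes the rows of the output in the same way.
W = rng.normal(size=(d, d))
V = rng.normal(size=(d, d))

def equivariant_layer(X):
    return X @ W + X.mean(axis=0, keepdims=True) @ V

X = rng.normal(size=(n, d))
perm = rng.permutation(n)

lhs = equivariant_layer(X[perm])   # act on the input first
rhs = equivariant_layer(X)[perm]   # act on the output
print(np.allclose(lhs, rhs))       # True: the group action commutes with the layer
```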
3. Geometry of parameters and optimization
Optimization on manifolds (Stiefel, Grassmann, SPD cones), structured or low-rank parameterizations, information geometry and natural gradients.
This seems most effective when constraints are hard and geometric. In looser settings, much of this reduces to preconditioning. It’s unclear where deeper geometric structure still matters at scale.
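For "optimization on manifolds", here is a minimal sketch (under arbitrary toy sizes and step size) of one Riemannian gradient step on the Stiefel manifold: project the Euclidean gradient onto the tangent space, step, and retract back with a QR decomposition. The gradient here is a random stand-in for a real loss gradient.

```python
import numpy as np

def stiefel_step(X, egrad, lr=0.1):
    """One Riemannian gradient step on the Stiefel manifold {X : X^T X = I}."""
    # Tangent-space projection of the Euclidean gradient: G - X sym(X^T G)
    sym = 0.5 * (X.T @ egrad + egrad.T @ X)
    rgrad = egrad - X @ sym
    # QR retraction: orthonormalize the perturbed point
    Q, R = np.linalg.qr(X - lr * rgrad)
    # Fix the sign convention (positive diagonal of R) so the retraction is well defined
    Q = Q * np.sign(np.sign(np.diag(R)) + 0.5)
    return Q

# Toy check: the iterate stays (numerically) on the manifold.
rng = np.random.default_rng(0)
X, _ = np.linalg.qr(rng.normal(size=(6, 3)))
G = rng.normal(size=(6, 3))                       # stand-in for a loss gradient
X_next = stiefel_step(X, G)
print(np.allclose(X_next.T @ X_next, np.eye(3)))  # True
```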
4. Geometry of training dynamics
Viewing training as a stochastic dynamical system:
* gradient descent as a discretized flow
* SGD as an SDE
* trajectories on manifolds
* attractors and metastability
This connects to dynamical systems, stochastic analysis, and geometric mechanics, but remains underdeveloped relative to its apparent relevance.
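To make "SGD as an SDE" concrete, here is a toy Euler–Maruyama discretization on a quadratic loss, with an isotropic noise term standing in for minibatch gradient noise. This is a deliberate simplification (real SGD noise is anisotropic and state-dependent); the constants are arbitrary.

```python
import numpy as np

# Quadratic loss L(w) = 0.5 * w^T A w. Compare plain gradient descent
# (a discretized gradient flow) with an Euler-Maruyama step of the SDE
#   dw = -A w dt + sigma dB_t,
# where sigma is an assumed stand-in for minibatch gradient noise.
rng = np.random.default_rng(0)
A = np.array([[3.0, 0.0], [0.0, 0.5]])   # anisotropic curvature
lr, sigma, steps = 0.1, 0.3, 500

w_gd = np.array([2.0, 2.0])
w_sde = np.array([2.0, 2.0])
for _ in range(steps):
    w_gd = w_gd - lr * (A @ w_gd)                                   # gradient flow, discretized
    w_sde = w_sde - lr * (A @ w_sde) \
            + np.sqrt(lr) * sigma * rng.normal(size=2)              # Euler-Maruyama step

print(w_gd)    # converges to the minimizer at the origin
print(w_sde)   # fluctuates around the origin; the spread depends on lr, sigma, A
```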
5. Discrete vs smooth geometry
Modern ML is deeply discrete: finite precision, quantization, sparse activations, graph-based computation. Smooth differential geometry may be the wrong limit in some regimes. Discrete differential geometry or combinatorial curvature might be more appropriate.
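As one example of what I mean by combinatorial curvature, here is a minimal sketch of Forman-Ricci curvature on an unweighted graph, using the simplified edge formula 4 - deg(u) - deg(v) (which ignores higher cells and edge weights); the graph and function name are illustrative.

```python
import numpy as np

def forman_curvature(adj):
    """Forman-Ricci curvature of each edge in an unweighted graph.

    With unit weights and no 2-cells the formula reduces to
        F(u, v) = 4 - deg(u) - deg(v),
    so densely connected regions get strongly negative curvature.
    Returns a dict mapping edges (u, v), u < v, to their curvature.
    """
    deg = adj.sum(axis=1)
    n = adj.shape[0]
    curv = {}
    for u in range(n):
        for v in range(u + 1, n):
            if adj[u, v]:
                curv[(u, v)] = int(4 - deg[u] - deg[v])
    return curv

# Path graph 0-1-2-3: interior edge has curvature 0, end edges have curvature 1.
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]], dtype=int)
print(forman_curvature(path))  # {(0, 1): 1, (1, 2): 0, (2, 3): 1}
```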
Some failures of “geometric ML” may simply be failures of choosing the wrong geometric category.
What I’m looking for:
* geometric structures that have actually influenced model or optimizer design beyond equivariance
* where Riemannian or information-geometric ideas help in large-scale training
* which geometric frameworks seem promising but currently mismatched with SGD
* directions I'm missing
Perspectives from theory groups, applied math, and industry research labs would be very welcome.