
ai-ml · 9 min read
The Mathematics of Self-Attention: Deconstructing the Transformer
Inside the equation that powers modern AI
The Transformer architecture revolutionized AI with a single mechanism: self-attention. We break down the linear...
Akhilesh Yadav
Category: The deep mathematical foundations of artificial intelligence (3 articles)
Why optimization in deep learning is a geometric problem
Gradient descent navigates high-dimensional loss landscapes shaped by curvature and saddle points. We explore the...

The Riemannian manifold of probability distributions
Information geometry equips the space of probability distributions with a Riemannian metric — the Fisher information...