From Derivatives to Machine Learning — A Reference

Notebook excerpts


01
What a derivative is
A derivative answers one question: how fast is something changing right now?
02
The limit definition formula
The whole expression says: take the slope between two nearby points, then shrink the gap to nothing. What's left is the slope at a single point — the derivative.
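As a rough numerical illustration of "shrink the gap to nothing," the sketch below computes the slope between two points on f(x) = x^2 for smaller and smaller gaps h. The function, the point x = 3, and the gap sizes are illustrative choices, not part of the note; the limit it approaches is the standard f'(x) = lim (f(x + h) - f(x)) / h as h goes to 0.

```python
def f(x):
    """Example curve used throughout this sketch: f(x) = x^2."""
    return x ** 2


def slope_between(func, x, h):
    """Slope of the straight line through (x, func(x)) and (x + h, func(x + h))."""
    rise = func(x + h) - func(x)  # change in height between the two points
    run = h                       # horizontal gap between the two points
    return rise / run


x = 3.0
gaps = [1.0, 0.1, 0.001, 1e-6]  # shrinking gaps (illustrative values)
slopes = [slope_between(f, x, h) for h in gaps]

for h, s in zip(gaps, slopes):
    print(f"h = {h:g}: slope = {s:.6f}")
```

The printed slopes settle toward 6, which matches the exact derivative 2x at x = 3. The gap never actually reaches zero here; the code only shows the trend that the limit makes precise.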
03
Heights, rise, and run
"Height" in the formula means exactly what you'd think: the vertical distance from the x-axis up to the point on the curve. For f(x) = x^2, the height at x = 3 is f(3) = 9.
04
How the formula is the derivative
The connection between the formula and the concept "derivative" is not "they're related" — the formula is the literal definition.
05
Slope in disguise
You're always calculating a slope when you compute a derivative, but the slope isn't always called "slope." It gets renamed depending on what the two axes represent. The pattern is invariant: the change in the vertical-axis quantity divided by the change in the horizontal-axis quantity.
06
Partial differentiation
A regular derivative is the slope of a curve. A partial derivative is the slope of a surface — but only in one direction at a time.
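To make "one direction at a time" concrete, here is a minimal sketch that estimates both partial derivatives of a surface numerically. The surface x^2 + 3xy, the point (2, 1), and the helper names are hypothetical, chosen only for illustration; each helper nudges one input while holding the other fixed.

```python
def surface(x, y):
    """Hypothetical example surface: height above the point (x, y)."""
    return x ** 2 + 3 * x * y


def partial_x(func, x, y, h=1e-6):
    """Slope in the x direction only: y is held fixed while x is nudged."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)


def partial_y(func, x, y, h=1e-6):
    """Slope in the y direction only: x is held fixed while y is nudged."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)


slope_in_x = partial_x(surface, 2.0, 1.0)  # exact value: 2x + 3y = 7
slope_in_y = partial_y(surface, 2.0, 1.0)  # exact value: 3x = 6

print(f"slope in x direction: {slope_in_x:.4f}")
print(f"slope in y direction: {slope_in_y:.4f}")
```

The same point on the surface has two different slopes, one per direction, which is why a partial derivative always names the variable it is taken with respect to.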
07
The ML connection: weights & loss
Partial derivatives are arguably the most important math idea in modern machine learning. Here's why.
08
Why both w and b?
If ∂L/∂w tells you how to change w to reduce loss, why have b at all? The answer reveals why ML works the way it does.
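One way to see what b contributes is to fit data that does not pass through the origin. The sketch below assumes a linear model ŷ = wx + b, a mean-squared-error loss, and four made-up points on the line y = 2x + 1; all of these are illustrative assumptions rather than details taken from the note.

```python
# Made-up data lying exactly on y = 2x + 1 (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2 * x + 1 for x in xs]


def mse(w, b):
    """Mean squared error of the model y_hat = w * x + b on the data above."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Best possible w when b is forced to 0 (least squares through the origin).
best_w_only = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

loss_without_b = mse(best_w_only, 0.0)  # stays above zero
loss_with_b = mse(2.0, 1.0)             # w = 2, b = 1 fits every point

print(f"best w with b fixed at 0: {best_w_only:.4f}")
print(f"loss without b: {loss_without_b:.4f}")
print(f"loss with b:    {loss_with_b:.4f}")
```

With only w, the line can tilt but must pass through the origin, so some error always remains on this data. Adding b lets the line shift up or down as well, and the loss can reach zero.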
09
Vocabulary & the bigger picture
Repeat for billions of training examples until the loss bottoms out. When you hear "the model is learning," what's literally happening is: the partial derivatives are telling each parameter which way to move, and the parameters are sliding downhill on the loss surface.
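The "sliding downhill" picture can be sketched as a short gradient descent loop. This is a toy version under stated assumptions: a linear model ŷ = wx + b, a mean-squared-error loss, four made-up data points instead of billions, and a hand-picked learning rate and step count.

```python
# Made-up data lying on y = 2x + 1 (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]


def loss(w, b):
    """Mean squared error of y_hat = w * x + b over the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(w, b):
    """Partial derivatives of the loss with respect to w and to b."""
    n = len(xs)
    dL_dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    dL_db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dL_dw, dL_db


w, b = 0.0, 0.0        # arbitrary starting parameters
learning_rate = 0.05   # hand-picked step size (assumption)

for step in range(2000):
    dL_dw, dL_db = gradients(w, b)
    # Each parameter moves against its own partial derivative: downhill.
    w -= learning_rate * dL_dw
    b -= learning_rate * dL_db

print(f"w = {w:.4f}, b = {b:.4f}, loss = {loss(w, b):.2e}")
```

Each pass reads the two partial derivatives and nudges w and b in the direction that lowers the loss. On this data the parameters settle near w = 2 and b = 1, the line the points were generated from.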
10
Is w derived from x?
Imagine a stereo. The music signal coming in is like x: whatever the radio is broadcasting, you can't change it. The knob position is like w: how much you're amplifying the signal. The sound coming out is like ŷ = wx. Different songs (different x's) come and go through the radio. The knob (w) stays where you set it until you decide to turn it. The knob isn't "derived from" the music; it's a separate thing you tune to make the output sound right across all the music that comes through.