Notebook excerpts
01
What a derivative is
A derivative answers one question: how fast is something changing right now?
02
The limit definition formula
The whole expression says: take the slope between two nearby points, then shrink the gap to nothing. What's left is the slope at a single point — the derivative.
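For reference, the standard form of that expression is shown here, using $h$ as an assumed name for the gap between the two points (the note's own symbols are not visible in this excerpt):

$$
f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}
$$

The fraction is the slope between the two nearby points, and the limit is the instruction to shrink the gap.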
03
Heights, rise, and run
"Height" in the formula means exactly what you'd think: the vertical distance from the x-axis up to the point on the curve. For $f(x) = x^2$:
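As a minimal sketch of heights, rise, and run for this curve, the Python below measures the slope between two points and then shrinks the gap. The point $x = 3$, the gap sizes, and the helper name `rise_over_run` are illustrative choices, not taken from the note.

```python
def f(x):
    """The curve from the text: f(x) = x^2. Its value is the height at x."""
    return x ** 2


def rise_over_run(x, h):
    """Slope between the points on the curve at x and x + h."""
    rise = f(x + h) - f(x)  # difference between the two heights
    run = h                 # horizontal gap between the two points
    return rise / run


x = 3.0                          # illustrative point on the curve
gaps = (1.0, 0.1, 0.01, 0.001)   # progressively smaller runs
slopes = [rise_over_run(x, h) for h in gaps]

for h, slope in zip(gaps, slopes):
    print(f"h = {h:<6} slope = {slope:.4f}")
```

As the gap shrinks, the printed slopes move toward 6, which is the slope of $x^2$ at $x = 3$.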
04
How the formula is the derivative
The formula and the concept "derivative" are not merely related: the formula is the literal definition.
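To see the definition doing the work, here is a short worked example for $f(x) = x^2$, with $h$ as an assumed name for the gap between the two points:

$$
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} \\
      &= \lim_{h \to 0} \frac{2xh + h^2}{h} \\
      &= \lim_{h \to 0} \,(2x + h) \\
      &= 2x
\end{aligned}
$$

Nothing beyond the definition was needed: expand, cancel $h$, then let the gap go to zero.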
05
Slope in disguise
You're always calculating a slope when you compute a derivative — but the slope isn't always called "slope." It gets renamed depending on what the two axes represent. The pattern is invariant:
06
Partial differentiation
A regular derivative is the slope of a curve. A partial derivative is the slope of a surface — but only in one direction at a time.
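A rough numerical sketch of "one direction at a time": nudge one input while holding the other fixed, then measure the slope. The surface $f(x, y) = x^2 + 3y$, the point $(2, 5)$, and the helper names are hypothetical choices for illustration only.

```python
def f(x, y):
    """Hypothetical surface for illustration: f(x, y) = x^2 + 3y."""
    return x ** 2 + 3 * y


def partial_x(func, x, y, h=1e-6):
    """Approximate slope in the x direction: nudge x, hold y fixed."""
    return (func(x + h, y) - func(x, y)) / h


def partial_y(func, x, y, h=1e-6):
    """Approximate slope in the y direction: nudge y, hold x fixed."""
    return (func(x, y + h) - func(x, y)) / h


dx = partial_x(f, 2.0, 5.0)  # exact value is 2x = 4 at x = 2
dy = partial_y(f, 2.0, 5.0)  # exact value is 3 everywhere

print(f"slope in x direction: {dx:.4f}")
print(f"slope in y direction: {dy:.4f}")
```

The same point on the surface has two different slopes, one per direction, which is why each input gets its own partial derivative.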
07
The ML connection: weights & loss
Partial derivatives are arguably the most important math idea in modern machine learning. Here's why.
08
Why both $w$ and $b$?
If $\partial L / \partial w$ tells you how to change $w$ to reduce loss, why have $b$ at all? The answer reveals why ML works the way it does.
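The excerpt stops before giving the answer, so the sketch below illustrates only one common reading, stated as an assumption: with the prediction $\hat{y} = wx + b$, $w$ alone can tilt a line through the origin, while $b$ lets the line shift up or down. The data (generated from $y = 2x + 1$), the mean-squared-error loss, and all names are made up for illustration.

```python
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # made-up data generated by y = 2x + 1


def mse(w, b):
    """Mean squared error of the line w*x + b on the toy data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Best w when b is forced to 0 (least squares for a line through the origin).
w_only = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Best w and b together (ordinary least squares for a line).
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
w_fit = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
b_fit = mean_y - w_fit * mean_x

print(f"w only : w = {w_only:.4f}, loss = {mse(w_only, 0.0):.4f}")
print(f"w and b: w = {w_fit:.4f}, b = {b_fit:.4f}, loss = {mse(w_fit, b_fit):.4f}")
```

On this toy data no value of $w$ by itself drives the loss to zero, while $w = 2$, $b = 1$ does.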
09
Vocabulary & the bigger picture
Repeat for billions of training examples until the loss bottoms out. When you hear "the model is learning," what's literally happening is: the partial derivatives are telling each parameter which way to move, and the parameters are sliding downhill on the loss surface.
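The "sliding downhill" loop can be sketched in a few lines. This is a toy version under stated assumptions: four made-up data points instead of billions, a model $\hat{y} = wx + b$, a mean-squared-error loss, and an arbitrarily chosen learning rate and step count. None of these specifics come from the note.

```python
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # made-up data generated by y = 2x + 1


def loss(w, b):
    """Mean squared error of the predictions w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(w, b):
    """Partial derivatives of the loss with respect to w and b."""
    n = len(xs)
    dL_dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    dL_db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dL_dw, dL_db


w, b = 0.0, 0.0        # arbitrary starting point on the loss surface
learning_rate = 0.05   # assumed step size
history = [loss(w, b)]

for step in range(2000):
    dL_dw, dL_db = gradients(w, b)
    # Each partial derivative says which way is uphill, so step the other way.
    w -= learning_rate * dL_dw
    b -= learning_rate * dL_db
    history.append(loss(w, b))

print(f"w = {w:.4f}, b = {b:.4f}, final loss = {history[-1]:.6f}")
```

Here the parameters settle near $w = 2$ and $b = 1$, the values used to generate the data. Real training differs in scale and in many details, but the update rule has the same shape.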
10
Is $w$ derived from $x$?
Imagine a stereo. The music signal coming in is like $x$ — whatever the radio is broadcasting, you can't change it. The knob position is like $w$ — how much you're amplifying the signal. The sound coming out is like $\hat{y} = wx$. Different songs (different $x$'s) come and go through the radio. The knob ($w$) stays where you set it until you decide to turn it. The knob isn't "derived from" the music — it's a separate thing you tune to make the output sound right across all the music that comes through.