Algorithms · Companion notes

Big O complexity,
visually explained.

The hidden cost of every algorithm. What it really means when we say "this runs in O(n log n)" — explained with race tracks, search games, and real numbers you can compare side by side.

21 sections, 14 interactive demos Approx. 35 min read Drag any slider · click any demo
Part I
Foundations
Before the formulas — what we're actually measuring, why it matters, and the one rule of thumb that makes Big O click.

§ 01Why complexity matters

Imagine you have two phonebooks. One has 10 names. The other has 10 million names. You're looking for "Sarah Khan".

If you flip page by page, the small phonebook takes maybe a few seconds. The big one would take weeks. But if you instead open the book in the middle and figure out which half "Sarah" is in, then halve again, and again — the small phonebook still takes about a second, and the big one takes about 24 page turns.

Big O is the answer to one question: as your input grows, how fast does the work grow with it? It's the difference between a program that scales to a billion users and one that grinds to a halt at ten thousand.

Demo 01 · The phonebook race
Drag the slider to change the size of the phonebook. Watch how long each strategy takes.
Phonebook size (n)100
log-scaled from 10 to 100 million names.
Flip page by page
100 turns
O(n)
~ 1 minute
Open in the middle (binary search)
7 turns
O(log n)
~ 4 seconds
Even at 100 names, the binary-search approach is already 14× faster. At a million names, it's 50,000× faster.

That gap between linear and logarithmic is what Big O measures. It tells you, in one notation, which class of growth your algorithm belongs to — and therefore whether it'll still be working when your data is 10× bigger, or 1000× bigger, or 10 million times bigger.

§ 02What Big O actually counts

Big O counts operations — the basic steps a computer takes to do its work. Comparisons, lookups, arithmetic, assignments. It expresses how that count grows as a function of the input size, usually called n.

Here's the cleanest way to think about it. For each algorithm, ask: "If the input has n items, roughly how many operations will it take?" That answer, simplified, is the Big O.

Demo 02 · Operation counter
Drag the slider to change n. The table shows how many operations each complexity class takes — and which ones explode.
Input size (n)10
try n=10, n=20, n=50 — watch the right-side columns explode.
Complexity Operations at this n Real-world time
(if 1 op = 1 µs)
Example
O(1) 1 instant look up by key, peek top of stack
O(log n) 3 instant binary search, balanced tree
O(n) 10 instant scan, sum, find max
O(n log n) 33 instant good sorts (merge, quick, heap)
O(n²) 100 instant nested loops, bubble sort
O(n³) 1,000 instant triple-nested loops, matrix multiply
O(2ⁿ) 1,024 instant try every subset, naive recursion
O(n!) 3,628,800 3.6 seconds try every arrangement (TSP brute force)
At n=10, everything except O(n!) is essentially instant. At n=20, O(2ⁿ) is already a million operations and O(n!) is two quintillion. The takeaway: it's not the speed at small n that matters — it's the direction the curve is going.

§ 03The shape, not the speed

Big O ignores constants and lower-order terms. If your algorithm does 3n² + 5n + 100 operations, the Big O is just O(n²). We drop the 3, drop the +5n, drop the +100. Why?

Because for large enough n, only the highest-power term matters. At n=1000, 3n² = 3,000,000 while 5n + 100 = 5,100 — the latter is irrelevant. And the constant 3 doesn't change which "class" the algorithm belongs to.

f(n) = 3n² + 5n + 100  ⟹  O(n²) Keep the fastest-growing term. Drop the multiplier in front. Drop everything smaller.
Demo 03 · What dominates?
Sliding n shows how only the dominant term matters at scale. Watch the bars.
n10
3n²
300
5n
50
100
100
total
450
At small n, the constant 100 matters. At large n, the 3n² term dwarfs everything else and the algorithm's behaviour is essentially "quadratic" — which is exactly what O(n²) means.

This is the one rule of thumb that makes Big O click: only the fastest-growing piece of the work matters when n is large enough. Everything else is noise.

Part II
The complexity zoo
Eight species of growth. Each one has its own habitat, its own examples, and its own threshold where it stops being practical.

§ 04O(1) Constant time

The same amount of work, no matter how big the input is. One operation. The phonebook could have ten names or ten billion — looking up Sarah's number when the book is sorted by name lookup table takes the same blink.

ops(n) = 1   or some other fixed number The graph is a flat horizontal line. n grows; work doesn't.

Real examples

Demo 04 · The constant-time lookup
Add as many items as you want to the array. Press "look up index 5" — always the same blink.
Array size10
However many cells we add to the array, the lookup is always one step. 1 operation = O(1).
# O(1) examples def first_element(arr): return arr[0] # single index access def get_age(user_id, db): return db.users[user_id].age # hash lookup def is_even(n): return n % 2 == 0 # one modulo, one compare

§ 05O(log n) Logarithmic

Every step halves the problem. Doubling the input size adds just one more step. This is the magic of "divide and conquer".

To search a million entries with binary search, you halve repeatedly: 1M → 500K → 250K → ... → 1. That's only log₂(1,000,000) ≈ 20 halving steps. To search a billion: just 30 steps. To search the number of atoms in the universe: about 266 steps. That's why this complexity class feels almost magical.

ops(n) = log₂(n)   (base usually doesn't matter for Big O) For n=8 → 3 ops. For n=1024 → 10 ops. For n=1,048,576 → 20 ops.
Demo 05 · Binary search, animated
Press play to watch binary search find a target by halving. Each step eliminates half the remaining items.
Target: 42 · Steps: 0
Each step throws away half the remaining array. After k steps, only n/2^k items remain. We're done when only 1 is left → k = log₂(n).

Real examples

§ 06O(n) Linear time

Look at every item once. Double the input → double the work. This is the natural-feeling complexity, the one humans intuit when they say "I'll check each one".

ops(n) = n For n=10 → 10 ops. For n=1M → 1M ops. For n=1B → 1B ops.
Demo 06 · Linear scan
A pointer walks left to right looking for the target. In the worst case (target at the end, or absent), it touches every cell.
Target: 7 · Steps: 0
Linear scan is dead simple — no sorting required, no extra memory. But for large data, it's slow compared to O(log n).

Real examples

# O(n) — touch every element once def find_max(arr): best = arr[0] for x in arr: # n iterations if x > best: best = x return best

§ 07O(n log n) Linearithmic

The "sorting sweet spot". Every good sort runs in this class: merge sort, heap sort, the typical quicksort. The intuition: you do log n rounds of work, each touching all n elements. n × log n total.

It's only slightly worse than linear — for a million items, n log n ≈ 20 million, which is the difference between "instant" and "a tenth of a second". This is the best we can do for comparison-based sorting (a mathematical lower bound).

ops(n) = n · log₂(n) For n=8 → 24 ops. For n=1024 → ~10,000 ops. For n=1M → ~20M ops.
Demo 07 · Merge sort, animated
Bars get split into halves recursively, then merged back in sorted order. Each level merges n items, and there are log₂(n) levels.
Comparisons: 0
For n=8 items, we make ~24 comparisons (n log n = 24). For n=1000 items, ~10,000 comparisons. The total cost grows just barely faster than linear.

Real examples

§ 08O(n²) Quadratic

Look at every pair of items. A nested loop where both loops run n times. Doubling the input quadruples the work.

This is the threshold where things start to hurt. At n=10, quadratic does 100 operations — fine. At n=1000, it does 1 million — still ok. At n=100,000, it does 10 billion — your computer just stopped responding.

ops(n) = For n=10 → 100 ops. For n=1000 → 1M ops. For n=100K → 10B ops (~3 hours).
Demo 08 · The nested loop grid
A nested loop visits every (i, j) pair. Watch cells light up — that's n × n work.
n8
try doubling n — watch the number of cells quadruple.
Ops: 0 / 64
The grid has n × n = n² cells. Each cell is one operation. Doubling n produces a 2× wider grid AND a 2× taller grid → 4× more work.

Real examples

# O(n²) — the classic nested loop def has_duplicate_slow(arr): for i in range(len(arr)): # n iterations for j in range(i + 1, len(arr)): # up to n iterations if arr[i] == arr[j]: # compare every pair return True return False

§ 09O(n³) Cubic

Three nested loops. Look at every triple. Doubling n now means 8× the work. This is the regime where small problems take seconds and medium problems take hours.

ops(n) = For n=100 → 1M ops. For n=1000 → 1 billion ops (~1 sec). For n=10000 → 1 trillion (~17 min).
Demo 09 · Cube grows as n³
Slide n to grow the cube. Each cube of size n contains n³ unit cells.
n4
n³ = 64 operations
A doubling from n=4 to n=8 takes the count from 64 to 512 — eight times more work. Beware any triple-nested loop over the same data.

Real examples

§ 10O(2ⁿ) Exponential

Every additional input doubles the work. The growth is terrifying. At n=30, you're at a billion operations. At n=60, you're at the age of the universe.

This is what naive recursion gets you when each call spawns two more — like the textbook recursive Fibonacci function.

ops(n) = 2ⁿ For n=10 → 1,024 ops. For n=20 → 1M ops. For n=30 → 1B ops. For n=50 → 1 quadrillion.
Demo 10 · Subsets of n items
A set of n items has 2ⁿ subsets. Slide n up — watch the number of subsets double every step.
n items4
2n = 16 subsets
Each new item doubles the subsets. n=4 → 16 subsets. n=5 → 32. n=20 → over a million. This is why brute-forcing combinatorial problems gets stuck above n≈30.

Real examples

# O(2ⁿ) — naive recursive Fibonacci. Each call spawns two more. def fib(n): if n <= 1: return n return fib(n - 1) + fib(n - 2) # two recursive calls # fib(40) takes seconds. fib(50) takes minutes. fib(60) never finishes.

§ 11O(n!) Factorial

The slowest practical complexity. n! = n × (n−1) × (n−2) × ... × 1. Beyond n=15 you've already hit a trillion operations. Beyond n=20 it's a quintillion. This is the brute-force "try every arrangement" complexity.

The travelling salesman problem solved by checking every route: O(n!). Generating all permutations of a string: O(n!). At n=12 your laptop chokes. At n=20 the heat death of the universe arrives first.

ops(n) = n! = n × (n−1) × (n−2) × … × 1 For n=5 → 120. For n=10 → 3.6M. For n=15 → 1.3T. For n=20 → 2.4 quintillion.
Demo 11 · Permutations grow as n!
Each letter can appear in any position. The number of unique arrangements is n!. Try increasing n.
n letters3
n! = 6 permutations
For n=3 letters there are 3! = 6 arrangements: ABC, ACB, BAC, BCA, CAB, CBA. For n=10, there are 3,628,800. For n=15, the list won't fit on this page.

Real examples

If you ever find yourself looking at O(n!) or O(2ⁿ), the answer is almost never "make it faster" — it's "find a smarter algorithm". Dynamic programming, branch-and-bound, heuristics, approximations: the field of algorithm design exists largely to escape these growth rates.

Part III
Side by side
The whole zoo on the same chart, racing the same n. The numbers and the curves tell the same story — but the numbers hit harder.

§ 12The race

The fairest way to compare complexity classes is to give them all the same input size and count operations. Below, drag the slider to set n. Each bar shows how many operations that complexity class needs to finish.

Demo 12 · The complexity race
Bars sized by operation count (clamped for readability). Numbers on the right are exact. Try n=10, n=20, n=50 — see who's still standing.
Input size (n)10
O(1)
1
O(log n)
3
O(n)
10
O(n log n)
33
O(n²)
100
O(n³)
1,000
O(2ⁿ)
1,024
O(n!)
3,628,800
At n=10, everything is still tractable. Try sliding to n=20 to watch O(2ⁿ) and O(n!) explode. By n=30, only the first four are practical.

§ 13Growth curves

The race tells you who wins at one fixed n. The curves tell you the whole story. Here are all eight complexities plotted on the same axes — first on a regular scale, then on a log scale so you can actually see the slower ones.

Demo 13 · All eight curves
Toggle log scale to compress the y-axis and see the slow-growing curves. Hover any curve in the legend to highlight it.
x-axis: n from 1 to 32 · y-axis: operations
On the linear y-axis, only the fast-growing curves are visible — the rest hug the bottom. On the log scale, every curve becomes a straight line (logarithmic axes turn polynomials into lines) and you can read off their slopes. Steeper slope = faster growth.

§ 14Where each one breaks

Every complexity class has a practical ceiling — the size of n above which the algorithm no longer finishes in reasonable time. The table below uses a rule of thumb: a modern CPU does about 10⁸ operations per second.

Complexity n = 100 n = 1,000 n = 1M n = 1B Practical ceiling
O(1)instantinstantinstantinstantno ceiling — any n
O(log n)instantinstantinstantinstantno ceiling — any n
O(n)instantinstant10 ms10 sec~10 billion is fine
O(n log n)instantinstant200 ms5 min~1 billion is fine
O(n²)instant10 ms3 hours317 years~100,000 starts to hurt
O(n³)instant10 sec317 millennia~5,000 is the limit
O(2ⁿ)~30 is the limit
O(n!)~12 is the limit
Mental rule of thumb
A modern CPU does about 10⁸ (one hundred million) simple operations per second. So a 1-second budget gives you about 10⁸ ops. Work backwards from there: at n=10⁴ you can afford O(n²); at n=10⁶ you need O(n) or O(n log n); at n=10⁹ you really want O(log n) or O(1).
Part IV
Real algorithms
The same eight classes, but now applied to the algorithms you actually use — searching, sorting, data structures, and graphs.

§ 15Searching

Two algorithms, one task: find a target in an array of n items. The difference is whether the array is sorted.

AlgorithmBestAverageWorstNotes
Linear searchO(1)O(n)O(n)works on any array, no sorting required
Binary searchO(1)O(log n)O(log n)requires sorted array; halves each step
Hash lookupO(1)O(1)O(n)O(n) worst case from collisions; in practice O(1)
Demo 14 · Linear vs binary search, side by side
Both algorithms search for the same target. Watch the binary version's bounds halve while the linear version walks one step at a time.
Linear search may get lucky and find the target on step 1 — or unlucky and take n steps. Binary search is consistent: ~log₂(n) regardless of where the target sits.

§ 16Sorting

The classic battleground for complexity. Every sorting algorithm sits in one of three speed classes — and which one matters enormously once n gets above a thousand.

AlgorithmBestAverageWorstSpaceNotes
Bubble sortO(n)O(n²)O(n²)O(1)teaching tool; never use in production
Selection sortO(n²)O(n²)O(n²)O(1)always n² — even on sorted input
Insertion sortO(n)O(n²)O(n²)O(1)very fast on small or nearly-sorted arrays
Merge sortO(n log n)O(n log n)O(n log n)O(n)stable, predictable, extra memory needed
QuicksortO(n log n)O(n log n)O(n²)O(log n)fastest in practice; O(n²) on pathological input
Heap sortO(n log n)O(n log n)O(n log n)O(1)guaranteed n log n, but slower constant
TimsortO(n)O(n log n)O(n log n)O(n)Python's sorted(), Java's Arrays.sort
Counting sortO(n + k)O(n + k)O(n + k)O(k)linear when value range k is small
Radix sortO(d · n)O(d · n)O(d · n)O(n + k)linear when digit count d is small
Demo 15 · Bubble vs Merge sort race
Both sort the same shuffled array. Watch the comparison counts diverge dramatically.
Bubble sort comparisons
Merge sort comparisons
For 16 items, bubble sort needs up to 120 comparisons and merge sort needs 64. At 1000 items, bubble takes ~500,000 and merge takes ~10,000 — a 50× gap. At 1 million items, the gap is 50,000×.

§ 17Data structures

Choosing the right data structure is half the battle. Here's how the basic operations cost across the common ones.

StructureAccessSearchInsertDeleteNotes
Array (indexed)O(1)O(n)O(n)O(n)insert/delete at the end is O(1) amortized
Linked listO(n)O(n)O(1)O(1)O(1) insert/delete only when you already have the node
Hash tableO(1)O(1)O(1)O(n) worst case from collisions
Balanced BSTO(log n)O(log n)O(log n)O(log n)AVL, red-black, B-tree
Heap (binary)O(n)O(log n)O(log n)find-min/max is O(1); search anywhere else is O(n)
TrieO(k)O(k)O(k)O(k)k = length of the key string (independent of n)
Stack / QueueO(1)O(1)only the ends are accessible
Disjoint set (union-find)α(n)α(n)α(n) is the inverse Ackermann — effectively O(1)
Choosing well
Need to lookup by key fast? Hash table. Need sorted iteration too? Balanced BST. Need the smallest item fast? Heap. Need to add/remove from both ends? Deque. Just looping in order? Plain array. The structure isn't the smart part — picking it is.

§ 18Graph algorithms

For graphs, complexity is usually expressed in terms of both vertices (V) and edges (E). A graph with n nodes can have up to n² edges, so an algorithm that's O(V + E) is essentially linear in the graph's size, while O(V·E) starts to look quadratic on dense graphs.

AlgorithmComplexityWhat it does
BFS / DFSO(V + E)visit every node and edge once
Topological sortO(V + E)linear ordering of a DAG
Dijkstra (binary heap)O((V + E) log V)shortest path from source, non-negative weights
Bellman-FordO(V · E)shortest path with negative weights; slower
Floyd-WarshallO(V³)all-pairs shortest paths
Prim / Kruskal (MST)O(E log V)minimum spanning tree
Strongly connected componentsO(V + E)Tarjan's or Kosaraju's
Max flow (Edmonds-Karp)O(V · E²)network flow
Travelling salesman (exact)O(n!)brute force; with DP, O(2ⁿ · n²)
Part V
Practice & reference
Try to spot the complexity yourself. Then check the common traps. End with the cheat sheet.

§ 19Identify the Big O

Read each snippet and pick the complexity. Click an option to check your answer — explanation appears below.

§ 20Common mistakes

1. Counting only the outer loop

If you have for i in n: for j in n: ..., the complexity is O(n²), not O(n). Each iteration of the outer loop runs the entire inner loop. Always count the total number of operations, which means multiplying nested loop counts.

2. Forgetting the hidden cost of operations

This looks like O(n) but it's actually O(n²):

for i in range(n): if x in my_list: # `in` on a list scans the whole list — O(n) ...

Each in check is O(n) for a list. So you're doing n iterations × n work per iteration = O(n²). The fix: use a set instead of a list — x in my_set is O(1).

3. Confusing "looks fast" with "is fast"

The code below has no nested loops, looks very innocent, and is O(n²):

def build_string(words): s = "" for w in words: # n iterations s = s + w # in many languages, string concat copies the whole string return s

String concatenation in many languages copies the existing string each time. Use "".join(words) instead — that's O(n) total.

4. Ignoring "amortized" complexity

Appending to a dynamic array (like a Python list) is "O(1) amortized" — most appends are O(1), but occasionally one is O(n) when the array needs to grow. Averaged over many appends, the cost per operation is O(1).

5. Confusing time and space

Big O is usually about time (operations), but algorithms also have space complexity (memory). A recursive function uses stack space proportional to its recursion depth — that's space, not time. Merge sort is O(n log n) time and O(n) space; heap sort is O(n log n) time and O(1) extra space.

6. Worst case vs average case vs best case

Quicksort's average case is O(n log n) but its worst case is O(n²) (when the pivot keeps picking the smallest or largest element). Hash table operations are O(1) amortized average but O(n) worst case (collisions). The standard "Big O" usually refers to worst case unless stated otherwise.

§ 21Reference table

A single sheet to keep nearby. The first column is the complexity; the rest is everything you'd want to remember about it.

Class Name Shape Examples When you'll see it
O(1) constant flat line hash lookup, array index, stack push/pop "jump straight to the answer" — no loops, no recursion that depends on n
O(log n) logarithmic slow curve, flattens binary search, balanced BST, heap insert "halve and conquer" — each step throws away half the remaining work
O(n) linear straight diagonal scan, sum, max, copy "look at each item once" — a single loop over the data
O(n log n) linearithmic nearly linear, slight curve up merge sort, heap sort, FFT "sort, or divide & conquer with linear merging" — best possible comparison sort
O(n²) quadratic parabola bubble sort, nested loops, all pairs "compare every pair" — two nested loops over the same data
O(n³) cubic steep parabola matrix multiply, Floyd-Warshall "every triple" — three nested loops, or graph algorithms on dense graphs
O(2ⁿ) exponential vertical wall naive recursion (Fibonacci), subset enumeration, SAT "try every binary choice" — recursion that branches twice without memoization
O(n!) factorial vertical wall (worse) brute-force TSP, all permutations "try every arrangement" — n choices, then n-1, then n-2, ...

Rule of thumb cheat sheet

Input size nTractable complexityComment
n ≤ 10anything, including O(n!)brute force fine
n ≤ 25up to O(2ⁿ)exponential still OK
n ≤ 500up to O(n³)cubic at the edge
n ≤ 5,000up to O(n²)quadratic still OK
n ≤ 10⁶up to O(n log n)the sweet spot for sorting/processing
n ≤ 10⁸O(n)a single linear pass
n > 10⁸O(log n) or O(1)you can't even read the input — need indexing, hashing, or sublinear algos

Big O is not about being precise. It's about knowing which class your algorithm lives in, and what that class implies for the inputs you actually have. Get that right, and the rest is engineering.