The hidden cost of every algorithm. What it really means when we say "this runs in O(n log n)" — explained with race tracks, search games, and real numbers you can compare side by side.
Imagine you have two phonebooks. One has 10 names. The other has 10 million names. You're looking for "Sarah Khan".
If you flip page by page, the small phonebook takes maybe a few seconds. The big one would take weeks. But if you instead open the book in the middle and figure out which half "Sarah" is in, then halve again, and again — the small phonebook still takes about a second, and the big one takes about 24 page turns.
Big O is the answer to one question: as your input grows, how fast does the work grow with it? It's the difference between a program that scales to a billion users and one that grinds to a halt at ten thousand.
That gap between linear and logarithmic is what Big O measures. It tells you, in one notation, which class of growth your algorithm belongs to — and therefore whether it'll still be working when your data is 10× bigger, or 1000× bigger, or 10 million times bigger.
Big O counts operations — the basic steps a computer takes to do its work. Comparisons, lookups, arithmetic, assignments. It expresses how that count grows as a function of the input size, usually called n.
Here's the cleanest way to think about it. For each algorithm, ask: "If the input has n items, roughly how many operations will it take?" That answer, simplified, is the Big O.
Big O ignores constants and lower-order terms. If your algorithm does 3n² + 5n + 100 operations, the Big O is just O(n²). We drop the 3, drop the +5n, drop the +100. Why?
Because for large enough n, only the highest-power term matters. At n=1000, 3n² = 3,000,000 while 5n + 100 = 5,100 — the latter is irrelevant. And the constant 3 doesn't change which "class" the algorithm belongs to.
This is the one rule of thumb that makes Big O click: only the fastest-growing piece of the work matters when n is large enough. Everything else is noise.
The same amount of work, no matter how big the input is. One operation. The phonebook could have ten names or ten billion — looking up Sarah's number when the book is sorted by name lookup table takes the same blink.
Every step halves the problem. Doubling the input size adds just one more step. This is the magic of "divide and conquer".
To search a million entries with binary search, you halve repeatedly: 1M → 500K → 250K → ... → 1. That's only log₂(1,000,000) ≈ 20 halving steps. To search a billion: just 30 steps. To search the number of atoms in the universe: about 266 steps. That's why this complexity class feels almost magical.
Look at every item once. Double the input → double the work. This is the natural-feeling complexity, the one humans intuit when they say "I'll check each one".
The "sorting sweet spot". Every good sort runs in this class: merge sort, heap sort, the typical quicksort. The intuition: you do log n rounds of work, each touching all n elements. n × log n total.
It's only slightly worse than linear — for a million items, n log n ≈ 20 million, which is the difference between "instant" and "a tenth of a second". This is the best we can do for comparison-based sorting (a mathematical lower bound).
Look at every pair of items. A nested loop where both loops run n times. Doubling the input quadruples the work.
This is the threshold where things start to hurt. At n=10, quadratic does 100 operations — fine. At n=1000, it does 1 million — still ok. At n=100,000, it does 10 billion — your computer just stopped responding.
Three nested loops. Look at every triple. Doubling n now means 8× the work. This is the regime where small problems take seconds and medium problems take hours.
Every additional input doubles the work. The growth is terrifying. At n=30, you're at a billion operations. At n=60, you're at the age of the universe.
This is what naive recursion gets you when each call spawns two more — like the textbook recursive Fibonacci function.
The slowest practical complexity. n! = n × (n−1) × (n−2) × ... × 1. Beyond n=15 you've already hit a trillion operations. Beyond n=20 it's a quintillion. This is the brute-force "try every arrangement" complexity.
The travelling salesman problem solved by checking every route: O(n!). Generating all permutations of a string: O(n!). At n=12 your laptop chokes. At n=20 the heat death of the universe arrives first.
If you ever find yourself looking at O(n!) or O(2ⁿ), the answer is almost never "make it faster" — it's "find a smarter algorithm". Dynamic programming, branch-and-bound, heuristics, approximations: the field of algorithm design exists largely to escape these growth rates.
The fairest way to compare complexity classes is to give them all the same input size and count operations. Below, drag the slider to set n. Each bar shows how many operations that complexity class needs to finish.
The race tells you who wins at one fixed n. The curves tell you the whole story. Here are all eight complexities plotted on the same axes — first on a regular scale, then on a log scale so you can actually see the slower ones.
Every complexity class has a practical ceiling — the size of n above which the algorithm no longer finishes in reasonable time. The table below uses a rule of thumb: a modern CPU does about 10⁸ operations per second.
| Complexity | n = 100 | n = 1,000 | n = 1M | n = 1B | Practical ceiling |
|---|---|---|---|---|---|
| O(1) | instant | instant | instant | instant | no ceiling — any n |
| O(log n) | instant | instant | instant | instant | no ceiling — any n |
| O(n) | instant | instant | 10 ms | 10 sec | ~10 billion is fine |
| O(n log n) | instant | instant | 200 ms | 5 min | ~1 billion is fine |
| O(n²) | instant | 10 ms | 3 hours | 317 years | ~100,000 starts to hurt |
| O(n³) | instant | 10 sec | 317 millennia | — | ~5,000 is the limit |
| O(2ⁿ) | — | — | — | — | ~30 is the limit |
| O(n!) | — | — | — | — | ~12 is the limit |
Two algorithms, one task: find a target in an array of n items. The difference is whether the array is sorted.
| Algorithm | Best | Average | Worst | Notes |
|---|---|---|---|---|
| Linear search | O(1) | O(n) | O(n) | works on any array, no sorting required |
| Binary search | O(1) | O(log n) | O(log n) | requires sorted array; halves each step |
| Hash lookup | O(1) | O(1) | O(n) | O(n) worst case from collisions; in practice O(1) |
The classic battleground for complexity. Every sorting algorithm sits in one of three speed classes — and which one matters enormously once n gets above a thousand.
| Algorithm | Best | Average | Worst | Space | Notes |
|---|---|---|---|---|---|
| Bubble sort | O(n) | O(n²) | O(n²) | O(1) | teaching tool; never use in production |
| Selection sort | O(n²) | O(n²) | O(n²) | O(1) | always n² — even on sorted input |
| Insertion sort | O(n) | O(n²) | O(n²) | O(1) | very fast on small or nearly-sorted arrays |
| Merge sort | O(n log n) | O(n log n) | O(n log n) | O(n) | stable, predictable, extra memory needed |
| Quicksort | O(n log n) | O(n log n) | O(n²) | O(log n) | fastest in practice; O(n²) on pathological input |
| Heap sort | O(n log n) | O(n log n) | O(n log n) | O(1) | guaranteed n log n, but slower constant |
| Timsort | O(n) | O(n log n) | O(n log n) | O(n) | Python's sorted(), Java's Arrays.sort |
| Counting sort | O(n + k) | O(n + k) | O(n + k) | O(k) | linear when value range k is small |
| Radix sort | O(d · n) | O(d · n) | O(d · n) | O(n + k) | linear when digit count d is small |
Choosing the right data structure is half the battle. Here's how the basic operations cost across the common ones.
| Structure | Access | Search | Insert | Delete | Notes |
|---|---|---|---|---|---|
| Array (indexed) | O(1) | O(n) | O(n) | O(n) | insert/delete at the end is O(1) amortized |
| Linked list | O(n) | O(n) | O(1) | O(1) | O(1) insert/delete only when you already have the node |
| Hash table | — | O(1) | O(1) | O(1) | O(n) worst case from collisions |
| Balanced BST | O(log n) | O(log n) | O(log n) | O(log n) | AVL, red-black, B-tree |
| Heap (binary) | — | O(n) | O(log n) | O(log n) | find-min/max is O(1); search anywhere else is O(n) |
| Trie | O(k) | O(k) | O(k) | O(k) | k = length of the key string (independent of n) |
| Stack / Queue | — | — | O(1) | O(1) | only the ends are accessible |
| Disjoint set (union-find) | — | α(n) | α(n) | — | α(n) is the inverse Ackermann — effectively O(1) |
For graphs, complexity is usually expressed in terms of both vertices (V) and edges (E). A graph with n nodes can have up to n² edges, so an algorithm that's O(V + E) is essentially linear in the graph's size, while O(V·E) starts to look quadratic on dense graphs.
| Algorithm | Complexity | What it does |
|---|---|---|
| BFS / DFS | O(V + E) | visit every node and edge once |
| Topological sort | O(V + E) | linear ordering of a DAG |
| Dijkstra (binary heap) | O((V + E) log V) | shortest path from source, non-negative weights |
| Bellman-Ford | O(V · E) | shortest path with negative weights; slower |
| Floyd-Warshall | O(V³) | all-pairs shortest paths |
| Prim / Kruskal (MST) | O(E log V) | minimum spanning tree |
| Strongly connected components | O(V + E) | Tarjan's or Kosaraju's |
| Max flow (Edmonds-Karp) | O(V · E²) | network flow |
| Travelling salesman (exact) | O(n!) | brute force; with DP, O(2ⁿ · n²) |
Read each snippet and pick the complexity. Click an option to check your answer — explanation appears below.
If you have for i in n: for j in n: ..., the complexity is O(n²), not O(n). Each iteration of the outer loop runs the entire inner loop. Always count the total number of operations, which means multiplying nested loop counts.
This looks like O(n) but it's actually O(n²):
Each in check is O(n) for a list. So you're doing n iterations × n work per iteration = O(n²). The fix: use a set instead of a list — x in my_set is O(1).
The code below has no nested loops, looks very innocent, and is O(n²):
String concatenation in many languages copies the existing string each time. Use "".join(words) instead — that's O(n) total.
Appending to a dynamic array (like a Python list) is "O(1) amortized" — most appends are O(1), but occasionally one is O(n) when the array needs to grow. Averaged over many appends, the cost per operation is O(1).
Big O is usually about time (operations), but algorithms also have space complexity (memory). A recursive function uses stack space proportional to its recursion depth — that's space, not time. Merge sort is O(n log n) time and O(n) space; heap sort is O(n log n) time and O(1) extra space.
Quicksort's average case is O(n log n) but its worst case is O(n²) (when the pivot keeps picking the smallest or largest element). Hash table operations are O(1) amortized average but O(n) worst case (collisions). The standard "Big O" usually refers to worst case unless stated otherwise.
A single sheet to keep nearby. The first column is the complexity; the rest is everything you'd want to remember about it.
| Class | Name | Shape | Examples | When you'll see it |
|---|---|---|---|---|
| O(1) | constant | flat line | hash lookup, array index, stack push/pop | "jump straight to the answer" — no loops, no recursion that depends on n |
| O(log n) | logarithmic | slow curve, flattens | binary search, balanced BST, heap insert | "halve and conquer" — each step throws away half the remaining work |
| O(n) | linear | straight diagonal | scan, sum, max, copy | "look at each item once" — a single loop over the data |
| O(n log n) | linearithmic | nearly linear, slight curve up | merge sort, heap sort, FFT | "sort, or divide & conquer with linear merging" — best possible comparison sort |
| O(n²) | quadratic | parabola | bubble sort, nested loops, all pairs | "compare every pair" — two nested loops over the same data |
| O(n³) | cubic | steep parabola | matrix multiply, Floyd-Warshall | "every triple" — three nested loops, or graph algorithms on dense graphs |
| O(2ⁿ) | exponential | vertical wall | naive recursion (Fibonacci), subset enumeration, SAT | "try every binary choice" — recursion that branches twice without memoization |
| O(n!) | factorial | vertical wall (worse) | brute-force TSP, all permutations | "try every arrangement" — n choices, then n-1, then n-2, ... |
| Input size n | Tractable complexity | Comment |
|---|---|---|
| n ≤ 10 | anything, including O(n!) | brute force fine |
| n ≤ 25 | up to O(2ⁿ) | exponential still OK |
| n ≤ 500 | up to O(n³) | cubic at the edge |
| n ≤ 5,000 | up to O(n²) | quadratic still OK |
| n ≤ 10⁶ | up to O(n log n) | the sweet spot for sorting/processing |
| n ≤ 10⁸ | O(n) | a single linear pass |
| n > 10⁸ | O(log n) or O(1) | you can't even read the input — need indexing, hashing, or sublinear algos |
Big O is not about being precise. It's about knowing which class your algorithm lives in, and what that class implies for the inputs you actually have. Get that right, and the rest is engineering.