Asymptotic Notations

- Big-O, $O(g(n))$: upper bound. $f(n) \le c \cdot g(n)$ for all $n \ge n_0$. Example: $3n+2 = O(n)$.
- Big-Omega, $\Omega(g(n))$: lower bound. $f(n) \ge c \cdot g(n)$ for all $n \ge n_0$. Example: $3n+2 = \Omega(n)$.
- Big-Theta, $\Theta(g(n))$: tight bound. $c_1 \cdot g(n) \le f(n) \le c_2 \cdot g(n)$ for all $n \ge n_0$. Example: $3n+2 = \Theta(n)$.
- Little-o, $o(g(n))$: upper bound that is not tight. $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$. Example: $3n+2 = o(n^2)$.
- Little-omega, $\omega(g(n))$: lower bound that is not tight. $\lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty$. Example: $n^2 = \omega(n)$.

Recurrence Relations

1. Substitution Method
Guess the form of the solution, then use mathematical induction to find the constants and show that the solution works. Example: for $T(n) = 2T(\lfloor n/2 \rfloor) + n$, guess $T(n) = O(n \log n)$.

2. Recursion-Tree Method
Draw a tree representing the recursive calls and the cost at each level, then sum the costs over all levels to get the total cost. Example: $T(n) = 3T(n/4) + cn^2$.
- Root: $cn^2$
- Level 1: $3c(n/4)^2 = 3cn^2/16$
- Level 2: $9c(n/16)^2 = 9cn^2/256$
- Total: $cn^2(1 + 3/16 + 9/256 + \dots) = cn^2 \sum_i (3/16)^i = O(n^2)$

3. Master Theorem
For recurrences of the form $T(n) = aT(n/b) + f(n)$, where $a \ge 1$, $b > 1$, and $f(n)$ is asymptotically positive:
- Case 1: If $f(n) = O(n^{\log_b a - \epsilon})$ for some $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$.
- Case 2: If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \log n)$.
- Case 3: If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some $\epsilon > 0$, and $a f(n/b) \le c f(n)$ for some constant $c < 1$ and all sufficiently large $n$ (the regularity condition), then $T(n) = \Theta(f(n))$.

Examples:
- $T(n) = 9T(n/3) + n$: $a=9$, $b=3$, $f(n)=n$. $\log_b a = \log_3 9 = 2$ and $n = O(n^{2-\epsilon})$, so Case 1 gives $T(n) = \Theta(n^2)$.
- $T(n) = 2T(n/2) + n$: $a=2$, $b=2$, $f(n)=n$. $\log_b a = \log_2 2 = 1$ and $n = \Theta(n^1)$, so Case 2 gives $T(n) = \Theta(n \log n)$.
- $T(n) = 3T(n/4) + n \log n$: $a=3$, $b=4$, $f(n)=n \log n$. $\log_b a = \log_4 3 \approx 0.792$ and $n \log n = \Omega(n^{0.792 + \epsilon})$. Regularity holds: $3(n/4)\log(n/4) \le (3/4)\, n \log n$, so Case 3 gives $T(n) = \Theta(n \log n)$.

Sorting Algorithms

| Algorithm      | Worst-case Time | Average-case Time | Space              | Stable |
|----------------|-----------------|-------------------|--------------------|--------|
| Insertion Sort | $O(n^2)$        | $O(n^2)$          | $O(1)$             | Yes    |
| Merge Sort     | $O(n \log n)$   | $O(n \log n)$     | $O(n)$             | Yes    |
| Heapsort       | $O(n \log n)$   | $O(n \log n)$     | $O(1)$             | No     |
| Quicksort      | $O(n^2)$        | $O(n \log n)$     | $O(\log n)$ (avg.) | No     |
| Counting Sort  | $O(n+k)$        | $O(n+k)$          | $O(k)$             | Yes    |
| Radix Sort     | $O(d(n+k))$     | $O(d(n+k))$       | $O(n+k)$           | Yes    |

Note: $n$ is the number of elements, $k$ is the range of input values, and $d$ is the number of digits.

Data Structures

1. Hash Tables
Direct addressing: $O(1)$ time for all operations (lookup, insert, delete) if keys are small integers. Requires $O(\text{max key})$ space.
Collision resolution:
- Chaining: a linked list at each slot (a sketch follows the BST subsection below). Average $O(1+\alpha)$, where $\alpha = n/m$ is the load factor; worst case $O(n)$ if all keys hash to the same slot.
- Open addressing: all elements are stored in the table itself.
  - Linear probing: $h(k, i) = (h'(k) + i) \bmod m$. Suffers from primary clustering.
  - Quadratic probing: $h(k, i) = (h'(k) + c_1 i + c_2 i^2) \bmod m$. Suffers from secondary clustering.
  - Double hashing: $h(k, i) = (h_1(k) + i h_2(k)) \bmod m$. Best performance.
  - Under the uniform hashing assumption, an unsuccessful search makes at most $1/(1-\alpha)$ probes on average, and a successful search at most $\frac{1}{\alpha} \ln \frac{1}{1-\alpha}$.

2. Binary Search Trees (BST)
Operations: Search, Minimum, Maximum, Successor, Predecessor, Insert, Delete.
Time complexity: $O(h)$, where $h$ is the height of the tree; worst case $O(n)$ for a skewed tree.
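The chaining scheme above is simple enough to sketch directly. A minimal Python version follows; the class name, the fixed initial slot count, and the use of Python's built-in hash are illustrative assumptions, not part of the original notes.

```python
class ChainedHashTable:
    """Hash table with chaining: each slot holds a list of (key, value) pairs."""

    def __init__(self, num_slots=8):
        self.slots = [[] for _ in range(num_slots)]
        self.n = 0  # number of stored keys; load factor alpha = n / num_slots

    def _bucket(self, key):
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:              # key already present: overwrite the value
                bucket[i] = (key, value)
                return
        bucket.append((key, value))   # average O(1 + alpha)
        self.n += 1

    def search(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None                   # worst case O(n): all keys in one bucket

    def delete(self, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                self.n -= 1
                return
```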
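Likewise, a minimal BST sketch showing why each operation is $O(h)$: every call walks a single root-to-leaf path. The node layout (no parent pointers, duplicates sent right) is an assumed simplification.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_insert(root, key):
    """Insert a key and return the (possibly new) root. Walks one path: O(h)."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:                             # duplicates go right, by assumption
        root.right = bst_insert(root.right, key)
    return root

def bst_search(root, key):
    """Return the node holding key, or None. Also O(h)."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root
```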
3. Red-Black Trees (Self-Balancing BST)
Properties:
- Every node is red or black.
- The root is black.
- Every leaf (NIL) is black.
- If a node is red, both its children are black.
- All simple paths from a node to its descendant leaves contain the same number of black nodes (the black-height).
Height: $O(\log n)$. Search, Min, Max, Successor, Predecessor, Insert, and Delete are all $O(\log n)$.

4. Heaps
Max-heap property: each parent's key is $\ge$ its children's keys. Min-heap property: each parent's key is $\le$ its children's keys. A binary heap is a complete binary tree.
Operations:
- HEAPIFY(A, i): $O(\log n)$
- BUILD-MAX-HEAP(A): $O(n)$
- HEAP-MAXIMUM(): $O(1)$
- HEAP-EXTRACT-MAX(): $O(\log n)$
- HEAP-INCREASE-KEY(): $O(\log n)$
- HEAP-INSERT(): $O(\log n)$

5. B-Trees
A self-balancing search tree optimized for disk access (it reduces disk I/O). Each node can have many children; $t$ is the minimum degree.
Properties:
- Every node has at most $2t-1$ keys.
- Every internal node (except the root) has at least $t$ children.
- The root has at least 2 children if it is not a leaf.
- All leaves have the same depth.
Height: $O(\log_t n)$. Search, Insert, and Delete take $O(t \log_t n)$ CPU time, or $O(\log_t n)$ disk accesses.

Graph Algorithms

1. Graph Representations
- Adjacency list: $O(V+E)$ space; good for sparse graphs. With it, DFS and BFS each run in $O(V+E)$.
- Adjacency matrix: $O(V^2)$ space; good for dense graphs.

2. Breadth-First Search (BFS)
Finds shortest paths in terms of number of edges in unweighted graphs. Explores the graph layer by layer, using a queue. Time: $O(V+E)$.

3. Depth-First Search (DFS)
Explores as far as possible along each branch before backtracking. Uses a stack (or recursion). Time: $O(V+E)$. Useful for topological sorting, finding connected components, and detecting cycles.

4. Minimum Spanning Trees (MST)
- Prim's algorithm: grows the MST from a starting vertex, using a min-priority queue. Time: $O(E \log V)$ with a binary heap, $O(E + V \log V)$ with a Fibonacci heap.
- Kruskal's algorithm: adds edges in increasing order of weight, skipping edges that would create a cycle; uses a disjoint-set data structure. Time: $O(E \log E)$, which is $O(E \log V)$ because $E \le V^2$ implies $\log E \le 2 \log V$.

5. Single-Source Shortest Paths
- Bellman-Ford algorithm: handles negative edge weights and detects negative cycles. Time: $O(VE)$.
- Dijkstra's algorithm: requires non-negative edge weights; uses a min-priority queue. Time: $O(E \log V)$ with a binary heap, $O(E + V \log V)$ with a Fibonacci heap.

6. All-Pairs Shortest Paths
- Floyd-Warshall algorithm: dynamic programming; handles negative edge weights (but not negative cycles). Time: $O(V^3)$.
- Johnson's algorithm: for sparse graphs; reweights edges (via Bellman-Ford) to eliminate negative weights, then runs Dijkstra from each vertex. Time: $O(VE \log V)$ with a binary heap, $O(V^2 \log V + VE)$ with a Fibonacci heap.

7. Maximum Flow
Ford-Fulkerson method: iteratively finds augmenting paths in the residual graph. Time: $O(E \cdot |f^*|)$, where $|f^*|$ is the value of a maximum flow (can be slow for large capacities). With Edmonds-Karp (BFS to find augmenting paths): $O(VE^2)$.

Dynamic Programming
- Optimal substructure: an optimal solution to the problem contains optimal solutions to its subproblems.
- Overlapping subproblems: a recursive algorithm revisits the same subproblems repeatedly.
Solve top-down with memoization or bottom-up with tabulation.
Examples:
- Longest Common Subsequence: $O(mn)$ (see the sketch below)
- Matrix-Chain Multiplication: $O(n^3)$
- Rod Cutting: $O(n^2)$
- Knapsack: $O(nW)$ for both the 0/1 and unbounded variants
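As an illustration of bottom-up tabulation, here is a sketch of the standard LCS length computation; the function name is an illustrative choice.

```python
def lcs_length(x, y):
    """dp[i][j] = length of an LCS of x[:i] and y[:j]. O(mn) time and space."""
    m, n = len(x), len(y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1             # extend a match
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])  # drop one character
    return dp[m][n]

assert lcs_length("ABCBDAB", "BDCABA") == 4  # e.g. "BCBA"
```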
Greedy Algorithms
A greedy algorithm makes locally optimal choices in the hope of finding a global optimum. Two properties underpin correctness:
- Greedy-choice property: a globally optimal solution can be arrived at by making a locally optimal (greedy) choice.
- Optimal substructure: same as in dynamic programming.
Examples:
- Activity Selection Problem: $O(n)$ if the activities are already sorted by finish time, $O(n \log n)$ otherwise
- Huffman Coding: $O(n \log n)$
- Fractional Knapsack Problem: $O(n \log n)$
- Prim's and Kruskal's MST algorithms
- Dijkstra's shortest-path algorithm

Amortized Analysis
Averages the time required to perform a sequence of operations over all the operations. Methods:
- Aggregate analysis: total cost of $n$ operations, divided by $n$.
- Accounting method: assign "credits" to operations, overcharging cheap operations to pay for expensive ones later.
- Potential method: define a potential function $\Phi$; amortized cost $=$ actual cost $+ \Delta\Phi$.
Examples:
- Dynamic array: amortized $O(1)$ per append (see the sketch at the end of these notes)
- Multipop stack: amortized $O(1)$ per operation
- Binary counter: amortized $O(1)$ per increment

Complexity Classes
- P: problems solvable in polynomial time ($O(n^k)$) by a deterministic Turing machine.
- NP: problems verifiable in polynomial time by a deterministic Turing machine (equivalently, solvable in polynomial time by a non-deterministic Turing machine).
- NP-Complete (NPC): problems in NP to which every problem in NP can be reduced in polynomial time. If any NPC problem can be solved in polynomial time, then P = NP.
- NP-Hard: problems to which all NP problems can be reduced in polynomial time; they may or may not be in NP.
Examples:
- P: sorting, shortest paths, MST, matrix multiplication
- NPC: Traveling Salesperson Problem, Satisfiability (SAT), Vertex Cover, Hamiltonian Cycle, Knapsack (all in their decision versions)
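The dynamic-array example from the amortized-analysis section above can be made concrete with a doubling array. The class name and the growth factor of 2 are assumptions (any constant factor greater than 1 gives amortized $O(1)$).

```python
class DynamicArray:
    """Append with doubling. Aggregate analysis: n appends write n elements and
    copy at most 1 + 2 + 4 + ... < 2n elements, so the total cost is < 3n,
    i.e. amortized O(1) per append."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None] * self.capacity

    def append(self, x):
        if self.size == self.capacity:        # full: the rare O(size) step
            self.capacity *= 2
            new_data = [None] * self.capacity
            new_data[:self.size] = self.data
            self.data = new_data
        self.data[self.size] = x              # the common O(1) step
        self.size += 1
```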
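Polynomial-time verifiability, the defining feature of NP above, is easy to demonstrate for SAT: given a candidate assignment (the certificate), checking it takes time linear in the formula size. The signed-integer clause encoding below (as in the DIMACS convention) is an assumption.

```python
def verify_sat(clauses, assignment):
    """Check a claimed satisfying assignment for a CNF formula.

    clauses: list of clauses; each clause is a list of ints, where literal i
             means variable i and literal -i means its negation.
    assignment: dict mapping variable -> bool (the certificate).
    Runs in O(total number of literals): polynomial-time verification.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# (x1 OR NOT x2) AND (x2 OR x3) is satisfied by x1=T, x2=T, x3=F
assert verify_sat([[1, -2], [2, 3]], {1: True, 2: True, 3: False})
```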