CLRS Algorithms
Cheatsheet Content
### Asymptotic Notation - **Big-O ($O$):** Upper bound. $f(n) \in O(g(n))$ if $\exists c, n_0 > 0$ s.t. $0 \le f(n) \le c g(n)$ for all $n \ge n_0$. - **Omega ($\Omega$):** Lower bound. $f(n) \in \Omega(g(n))$ if $\exists c, n_0 > 0$ s.t. $0 \le c g(n) \le f(n)$ for all $n \ge n_0$. - **Theta ($\Theta$):** Tight bound. $f(n) \in \Theta(g(n))$ if $f(n) \in O(g(n))$ and $f(n) \in \Omega(g(n))$. - **Little-o ($o$):** Strict upper bound. $f(n) \in o(g(n))$ if $\forall c > 0, \exists n_0 > 0$ s.t. $0 \le f(n) 0, \exists n_0 > 0$ s.t. $0 \le c g(n) ### Recurrences - **Substitution Method:** Guess a solution, then prove by induction. - **Recursion-Tree Method:** Draw a tree, sum costs per level. Good for generating guesses. - **Master Method:** For recurrences of the form $T(n) = aT(n/b) + f(n)$. - **Case 1:** If $f(n) = O(n^{\log_b a - \epsilon})$ for some $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$. - **Case 2:** If $f(n) = \Theta(n^{\log_b a} \log^k n)$ for some $k \ge 0$, then $T(n) = \Theta(n^{\log_b a} \log^{k+1} n)$. (Often $k=0$, so $f(n) = \Theta(n^{\log_b a}) \implies T(n) = \Theta(n^{\log_b a} \log n)$). - **Case 3:** If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some $\epsilon > 0$, AND $a f(n/b) \le c f(n)$ for some $c ### Sorting Algorithms - **Insertion Sort:** $O(n^2)$ average/worst, $O(n)$ best. In-place, stable. Good for small $n$ or nearly sorted arrays. - **Merge Sort:** $O(n \log n)$ always. Not in-place, stable. Divide and conquer. - Merge step: $O(n)$. - **Heap Sort:** $O(n \log n)$ always. In-place, not stable. Uses a max-heap. - Build-Heap: $O(n)$. - Extract-Max: $O(\log n)$. - **Quick Sort:** $O(n \log n)$ average, $O(n^2)$ worst. In-place, not stable. Divide and conquer. - Partition step: $O(n)$. - Worst case: Already sorted or reverse sorted array (if pivot is always first/last). - Randomized Quick Sort: $O(n \log n)$ expected time. - **Counting Sort:** $O(n+k)$ where $k$ is range of input numbers. Not comparison sort. Stable. - **Radix Sort:** $O(d(n+k))$ where $d$ is number of digits. Uses stable sort (e.g., Counting Sort) as subroutine. ### Binary Search Trees (BSTs) - **Properties:** Left subtree keys ### Hash Tables - **Direct-Addressing:** If keys are small integers. Array $T[0 \dots m-1]$. $O(1)$ for all ops. - **Hash Function:** $h: U \to \{0, \dots, m-1\}$. - **Collision Resolution:** - **Chaining:** List of elements at each slot. $O(1+\alpha)$ average for search/insert/delete, where $\alpha = n/m$ (load factor). - **Open Addressing:** All elements in hash table itself. - **Linear Probing:** $h(k, i) = (h'(k) + i) \pmod m$. - **Quadratic Probing:** $h(k, i) = (h'(k) + c_1 i + c_2 i^2) \pmod m$. - **Double Hashing:** $h(k, i) = (h_1(k) + i h_2(k)) \pmod m$. - **Universal Hashing:** Choose hash function randomly from a family of functions to minimize collisions in expectation. ### Dynamic Programming - **Characteristics:** 1. **Optimal Substructure:** Optimal solution to problem contains optimal solutions to subproblems. 2. **Overlapping Subproblems:** Recursive algorithm revisits same subproblems repeatedly. - **Steps:** 1. Characterize optimal substructure. 2. Recursively define the value of an optimal solution. 3. Compute the value of an optimal solution (usually bottom-up). 4. Construct an optimal solution from computed information. - **Examples:** - **Rod Cutting:** $P_i$ = price for length $i$. Find max value for length $n$. $r_n = \max_{1 \le i \le n} (p_i + r_{n-i})$. - **Matrix Chain Multiplication:** Find optimal parenthesization. $m[i,j] = \min_{i \le k ### Greedy Algorithms - **Characteristics:** 1. **Optimal Substructure:** Same as DP. 2. **Greedy Choice Property:** A globally optimal solution can be reached by making a locally optimal (greedy) choice. - **Proof Techniques:** - **Exchange Argument:** Show that if there's an optimal solution that doesn't make the greedy choice, you can transform it into one that does, without worsening the solution. - **Examples:** - **Activity Selection Problem:** Select max non-overlapping activities. Sort by finish time, pick first, then next non-overlapping. - **Huffman Coding:** Build optimal prefix codes. Greedily merge two lowest frequency nodes. - **Fractional Knapsack:** Take items with highest value-to-weight ratio. ### Graph Algorithms - **Representations:** Adjacency List ($O(V+E)$), Adjacency Matrix ($O(V^2)$). - **Breadth-First Search (BFS):** Explores layer by layer. Finds shortest path in unweighted graphs. - Time: $O(V+E)$. Uses a queue. - **Depth-First Search (DFS):** Explores as far as possible along each branch. Used for topological sort, cycle detection, strongly connected components. - Time: $O(V+E)$. Uses a stack (or recursion). - **Topological Sort:** Linear ordering of vertices such that for every directed edge $(u,v)$, $u$ comes before $v$. Only for Directed Acyclic Graphs (DAGs). - Using DFS: $O(V+E)$. - **Minimum Spanning Trees (MST):** - **Prim's Algorithm:** Grows a single tree. Uses a min-priority queue. $O(E \log V)$ or $O(E+V \log V)$. - **Kruskal's Algorithm:** Grows a forest of trees. Uses a min-priority queue for edges and Disjoint-Set data structure. $O(E \log E)$ or $O(E \log V)$. - **Single-Source Shortest Paths:** - **Bellman-Ford:** Handles negative edge weights, detects negative cycles. $O(VE)$. - **Dijkstra's Algorithm:** Non-negative edge weights. Uses a min-priority queue. $O(E \log V)$ or $O(E+V \log V)$. - **All-Pairs Shortest Paths:** - **Floyd-Warshall:** $O(V^3)$. Handles negative edge weights, detects negative cycles. - **Johnson's Algorithm:** Uses Bellman-Ford for reweighting, then Dijkstra's for all vertices. $O(VE + V^2 \log V)$. - **Maximum Flow:** - **Ford-Fulkerson Method:** Generic method. Augmenting paths. - **Edmonds-Karp Algorithm:** BFS for augmenting paths. $O(VE^2)$. ### String Matching - **Naive Algorithm:** $O((n-m+1)m)$. - **Rabin-Karp Algorithm:** Uses hashing. $O(n+m)$ average, $O(nm)$ worst case. - **Knuth-Morris-Pratt (KMP) Algorithm:** Uses a prefix function (failure function) to avoid re-matching characters. $O(n+m)$. - Prefix function computation: $O(m)$. - Matching: $O(n)$. - **Finite Automata:** $O(n+m)$ where $m$ is pattern length. - Build automaton: $O(m |\Sigma|)$. ### NP-Completeness - **P (Polynomial Time):** Problems solvable in polynomial time. - **NP (Nondeterministic Polynomial Time):** Problems whose solutions can be verified in polynomial time. - **NP-Hard:** A problem $H$ is NP-hard if every problem $L \in \text{NP}$ can be reduced to $H$ in polynomial time ($L \le_P H$). - **NP-Complete (NPC):** A problem $L$ is NP-complete if $L \in \text{NP}$ and $L$ is NP-hard. - **Polynomial-Time Reductions:** $A \le_P B$ means problem $A$ can be transformed into problem $B$ in polynomial time. If $A \le_P B$ and $B \in P$, then $A \in P$. If $A \le_P B$ and $A \notin P$, then $B \notin P$. - **Cook-Levin Theorem:** SAT (Satisfiability) is NP-Complete. - **Common NP-Complete Problems:** - **3-CNF SAT:** SAT where each clause has exactly 3 literals. - **Clique:** Find a clique of size $k$. - **Vertex Cover:** Find a vertex cover of size $k$. - **Hamiltonian Cycle:** Find a Hamiltonian cycle. - **Traveling Salesperson Problem (TSP):** Find shortest tour visiting all cities. - **Subset Sum:** Is there a subset that sums to $k$? - **Knapsack Problem (0/1):** Maximize value with weight constraint.