Algorithm Cheatsheet
Cheatsheet Content
### Minimum Spanning Tree Algorithms An MST connects all vertices in a graph with the minimum possible total edge weight, without forming cycles. #### Prim's Algorithm * **What is:** A greedy algorithm that builds an MST by iteratively adding the cheapest edge from the MST being built to a vertex not yet in the MST. It starts from an arbitrary vertex and grows the tree. * **Algorithm:** 1. Initialize an empty MST. 2. Choose an arbitrary starting vertex, add it to the MST. 3. While the MST doesn't include all vertices: a. Find the edge $(u, v)$ with the minimum weight such that $u$ is in the MST and $v$ is not. b. Add vertex $v$ and edge $(u, v)$ to the MST. * **Dry Run Example:** Graph: A --(2)-- B | \ / | (3) (6) (1) (5) | \ / | D --(4)-- C | Step | Current MST Vertices | Candidate Edges (u in MST, v not) | Selected Edge | Total Weight | |------|----------------------|-----------------------------------|---------------|--------------| | 1 | {A} | (A,B:2), (A,D:3), (A,C:6) | (A,B:2) | 2 | | 2 | {A,B} | (A,D:3), (A,C:6), (B,C:1) | (B,C:1) | 2+1=3 | | 3 | {A,B,C} | (A,D:3), (C,D:4) | (A,D:3) | 3+3=6 | | 4 | {A,B,C,D} | - (All vertices included) | - | 6 | Final MST Edges: (A,B), (B,C), (A,D) * **Complexity:** * **Time:** * Adjacency Matrix: $O(V^2)$ * Adjacency List with Binary Heap (Priority Queue): $O(E \log V)$ or $O(E + V \log V)$ * Adjacency List with Fibonacci Heap: $O(E + V \log V)$ * **Space:** $O(V + E)$ or $O(V^2)$ depending on graph representation and priority queue. #### Kruskal's Algorithm * **What is:** A greedy algorithm that builds an MST by sorting all edges by weight in ascending order and adding them to the MST if they don't form a cycle with already added edges. It uses a Disjoint Set Union (DSU) data structure to detect cycles efficiently. * **Algorithm:** 1. Create a forest where each vertex is a separate tree (using DSU). 2. Sort all edges in the graph by their weights in non-decreasing order. 3. Iterate through the sorted edges: a. For each edge $(u, v)$ with weight $w$: b. If $u$ and $v$ are not already in the same set (i.e., adding this edge doesn't form a cycle), add $(u, v)$ to the MST and union the sets of $u$ and $v$. 4. Stop when $V-1$ edges have been added (or all edges processed). * **Dry Run Example:** Graph: (Same as Prim's example) A --(2)-- B | \ / | (3) (6) (1) (5) | \ / | D --(4)-- C Edges sorted by weight: (B,C:1), (A,B:2), (A,D:3), (C,D:4), (B,D:5), (A,C:6) | Step | Edge (u,v:w) | Find(u) | Find(v) | Action | MST Edges | DSU Sets (e.g., {A,B,C,D}) | Total Weight | |------|--------------|---------|---------|--------------|-----------------------|----------------------------|--------------| | 1 | (B,C:1) | B | C | Add, Union(B,C) | (B,C) | {A}, {B,C}, {D} | 1 | | 2 | (A,B:2) | A | B | Add, Union(A,B) | (B,C), (A,B) | {A,B,C}, {D} | 1+2=3 | | 3 | (A,D:3) | A | D | Add, Union(A,D) | (B,C), (A,B), (A,D) | {A,B,C,D} | 3+3=6 | | 4 | (C,D:4) | C | D | Skip (cycle) | (B,C), (A,B), (A,D) | {A,B,C,D} | 6 | | 5 | (B,D:5) | B | D | Skip (cycle) | (B,C), (A,B), (A,D) | {A,B,C,D} | 6 | | 6 | (A,C:6) | A | C | Skip (cycle) | (B,C), (A,B), (A,D) | {A,B,C,D} | 6 | Final MST Edges: (B,C), (A,B), (A,D) * **Complexity:** * **Time:** $O(E \log E)$ or $O(E \log V)$ (dominated by sorting edges), plus $O(E \alpha(V))$ for DSU operations, where $\alpha$ is the inverse Ackermann function, which is nearly constant. * **Space:** $O(V + E)$ for storing edges and DSU structure. ### Greedy Algorithms Greedy algorithms make the locally optimal choice at each step with the hope of finding a global optimum. #### Longest Common Subsequence (LCS) + Backtracking * **What is:** The LCS problem finds the longest subsequence common to two sequences. A subsequence does not require consecutive elements. While the decision to extend the common subsequence is greedy (if characters match, extend), the optimal substructure property leads to a dynamic programming solution. Backtracking is then used to reconstruct the actual sequence from the DP table. * **Algorithm:** 1. **Build DP Table:** Create a 2D array `LCS_DP[m+1][n+1]` where `m` and `n` are lengths of strings `X` and `Y`. * `LCS_DP[i][j]` stores the length of the LCS of `X[0...i-1]` and `Y[0...j-1]`. * Initialize first row and column to 0. * For `i` from 1 to `m`, `j` from 1 to `n`: * If `X[i-1] == Y[j-1]`: `LCS_DP[i][j] = 1 + LCS_DP[i-1][j-1]` * Else: `LCS_DP[i][j] = max(LCS_DP[i-1][j], LCS_DP[i][j-1])` 2. **Backtracking:** Start from `LCS_DP[m][n]`. * If `X[i-1] == Y[j-1]`: The character `X[i-1]` is part of LCS. Add it to result and move diagonally up-left (`i--, j--`). * Else (`X[i-1] != Y[j-1]`): Move to the cell with the larger value: if `LCS_DP[i-1][j] > LCS_DP[i][j-1]`, move up (`i--`); else move left (`j--`). * Repeat until `i` or `j` becomes 0. Reverse the collected characters for the final LCS. * **Dry Run Example:** `X = "ABCBDAB"`, `Y = "BDCABA"` DP Table `LCS_DP[i][j]`: | | | B | D | C | A | B | A | |-----|---|---|---|---|---|---|---| | | 0 | 0 | 0 | 0 | 0 | 0 | 0 | | **A** | 0 | 0 | 0 | 0 | 1 | 1 | 1 | | **B** | 0 | 1 | 1 | 1 | 1 | 2 | 2 | | **C** | 0 | 1 | 1 | 2 | 2 | 2 | 2 | | **B** | 0 | 1 | 1 | 2 | 2 | 3 | 3 | | **D** | 0 | 1 | 2 | 2 | 2 | 3 | 3 | | **A** | 0 | 1 | 2 | 2 | 3 | 3 | 4 | | **B** | 0 | 1 | 2 | 2 | 3 | 4 | 4 | `LCS_DP[7][6] = 4` (length of LCS) Backtracking from `(7,6)`: | i | j | X[i-1] | Y[j-1] | LCS_DP[i][j] | Action | LCS (reversed) | |---|---|--------|--------|--------------|-------------------------------------------|----------------| | 7 | 6 | B | A | 4 | `LCS_DP[6][6]` (4) > `LCS_DP[7][5]` (3), move up (i=6) | | | 6 | 6 | A | A | 4 | Match! Add 'A', move diag (i=5, j=5) | A | | 5 | 5 | D | B | 3 | `LCS_DP[4][5]` (3) > `LCS_DP[5][4]` (2), move up (i=4) | A | | 4 | 5 | B | B | 3 | Match! Add 'B', move diag (i=3, j=4) | BA | | 3 | 4 | C | A | 2 | `LCS_DP[2][4]` (1) w`): (Current item cannot fit) `DP[i][w] = DP[i-1][w]` (Cannot take the item, value is same as without this item) * **Dry Run Example:** Items: 1. {weight: 2, value: 12} 2. {weight: 1, value: 10} 3. {weight: 3, value: 20} 4. {weight: 2, value: 15} Knapsack Capacity `W = 5` DP Table `DP[i][w]` (rows = items, columns = capacity): | Capacity `w` | 0 | 1 | 2 | 3 | 4 | 5 | |--------------|---|---|---|---|---|---| | **No items** | 0 | 0 | 0 | 0 | 0 | 0 | | **Item 1 (w:2,v:12)** | 0 | 0 | 12 | 12 | 12 | 12 | | **Item 2 (w:1,v:10)** | 0 | 10 | 12 | 22 | 22 | 22 | | **Item 3 (w:3,v:20)** | 0 | 10 | 12 | 22 | 30 | 32 | | **Item 4 (w:2,v:15)** | 0 | 10 | 15 | 25 | 30 | 37 | Max value for capacity 5: `DP[4][5] = 37` * **Complexity:** * **Time:** $O(n \cdot W)$ * **Space:** $O(n \cdot W)$ (can be optimized to $O(W)$ using only two rows or one row) #### Fractional Knapsack Problem * **What is:** Similar to 0/1 Knapsack, but items can be taken fractionally (e.g., half an item). This problem *can* be solved using a greedy approach because taking a fraction of an item doesn't prevent taking fractions of other items. * **Algorithm (Greedy):** 1. For each item, calculate its value-to-weight ratio (`value/weight`). 2. Sort the items in descending order based on their value-to-weight ratios. 3. Initialize `current_weight = 0` and `total_value = 0`. 4. Iterate through the sorted items: a. If the entire item `i` can fit in the remaining capacity (`current_weight + weight[i] ### Traversing Algorithms Traversing algorithms visit all nodes in a graph or tree in a systematic way. #### Breadth-First Search (BFS) * **What is:** An algorithm for traversing or searching tree or graph data structures. It starts at the tree root (or some arbitrary node of a graph) and explores all of the neighbor nodes at the present depth prior to moving on to the nodes at the next depth level. It uses a queue. * **Algorithm:** 1. Create a queue and add the starting node to it. 2. Mark the starting node as visited. 3. While the queue is not empty: a. Dequeue a node `u`. b. Process `u` (e.g., print it). c. For each unvisited neighbor `v` of `u`: i. Mark `v` as visited. ii. Enqueue `v`. * **Dry Run Example:** Graph: A --- B | | C --- D --- E Start node: A | Step | Queue (front to back) | Visited Nodes | Traversal Order | |------|-----------------------|---------------|-----------------| | 1 | [A] | {A} | | | 2 | [] (Dequeue A) | {A} | A | | | [B, C] (Enqueue B, C) | {A,B,C} | | | 3 | [C] (Dequeue B) | {A,B,C} | A, B | | | [C, D] (Enqueue D) | {A,B,C,D} | | | 4 | [D] (Dequeue C) | {A,B,C,D} | A, B, C | | | [D] (No unvisited neighbors for C other than B, which is visited) | {A,B,C,D} | | | 5 | [] (Dequeue D) | {A,B,C,D} | A, B, C, D | | | [E] (Enqueue E) | {A,B,C,D,E} | | | 6 | [] (Dequeue E) | {A,B,C,D,E} | A, B, C, D, E | | | [] (No unvisited neighbors) | {A,B,C,D,E} | | | 7 | Queue empty. Stop. | | | Traversal Order: A, B, C, D, E * **Complexity:** * **Time:** $O(V + E)$ (for adjacency list) or $O(V^2)$ (for adjacency matrix). Each vertex and edge is processed at most once. * **Space:** $O(V)$ for the queue and visited array. #### Depth-First Search (DFS) * **What is:** An algorithm for traversing or searching tree or graph data structures. It starts at the root (or an arbitrary node) and explores as far as possible along each branch before backtracking. It uses a stack (or recursion, which uses the call stack). * **Algorithm (Recursive):** 1. `DFS(node u)`: a. Mark `u` as visited. b. Process `u`. c. For each unvisited neighbor `v` of `u`: i. Call `DFS(v)`. * **Algorithm (Iterative with Stack):** 1. Create a stack and push the starting node onto it. 2. Mark the starting node as visited. 3. While the stack is not empty: a. Pop a node `u`. b. Process `u`. c. For each unvisited neighbor `v` of `u`: i. Mark `v` as visited. ii. Push `v` onto the stack. * **Dry Run Example:** Graph: (Same as BFS example) A --- B | | C --- D --- E Start node: A | Step | Stack (top) | Visited Nodes | Traversal Order | |------|-------------|---------------|-----------------| | 1 | [A] | {A} | | | 2 | [] (Pop A) | {A} | A | | | [C, B] (Push C, then B - order depends on adjacency list/iteration) | {A,B,C} | | | 3 | [C] (Pop B) | {A,B,C} | A, B | | | [C, D] (Push D) | {A,B,C,D} | | | 4 | [D] (Pop C) | {A,B,C,D} | A, B, C | | | [D] (No unvisited neighbors for C other than B, which is visited) | {A,B,C,D} | | | 5 | [] (Pop D) | {A,B,C,D} | A, B, C, D | | | [E] (Push E)| {A,B,C,D,E} | | | 6 | [] (Pop E) | {A,B,C,D,E} | A, B, C, D, E | | | [] (No unvisited neighbors) | {A,B,C,D,E} | | | 7 | Stack empty. Stop. | | | Traversal Order: A, B, D, E, C (or A, C, B, D, E depending on neighbor processing order) * **Complexity:** * **Time:** $O(V + E)$ (for adjacency list) or $O(V^2)$ (for adjacency matrix). Each vertex and edge is processed at most once. * **Space:** $O(V)$ for the recursion stack or explicit stack and visited array. ### Searching Algorithms Searching algorithms are used to find an item or retrieve information from any data structure. #### Linear Search * **What is:** The simplest searching algorithm. It sequentially checks each element of the list until a match is found or the whole list has been searched. * **Algorithm:** 1. Start from the first element of the list. 2. Compare the current element with the target value. 3. If they match, return the index of the current element. 4. If not, move to the next element. 5. Repeat until the end of the list is reached. 6. If the target is not found, return an indication (e.g., -1). * **Dry Run Example:** Array: `[10, 20, 30, 40, 50]` Target: `30` | Step | Index | Element | Comparison `(Element == Target)` | Result | |------|-------|---------|----------------------------------|--------| | 1 | 0 | 10 | `10 == 30` (False) | | | 2 | 1 | 20 | `20 == 30` (False) | | | 3 | 2 | 30 | `30 == 30` (True) | Found! | | 4 | | | | Return index 2 | Target: `60` | Step | Index | Element | Comparison `(Element == Target)` | Result | |------|-------|---------|----------------------------------|--------| | 1 | 0 | 10 | `10 == 60` (False) | | | 2 | 1 | 20 | `20 == 60` (False) | | | 3 | 2 | 30 | `30 == 60` (False) | | | 4 | 3 | 40 | `40 == 60` (False) | | | 5 | 4 | 50 | `50 == 60` (False) | | | 6 | | | End of list reached | Not Found | | 7 | | | | Return -1 | * **Complexity:** * **Time:** * Best Case: $O(1)$ (target is the first element) * Worst Case: $O(n)$ (target is the last element or not present) * Average Case: $O(n)$ * **Space:** $O(1)$ #### Binary Search * **What is:** An efficient algorithm for finding an item from a *sorted* list of items. It works by repeatedly dividing the search interval in half. * **Algorithm:** 1. Initialize `low = 0` and `high = length - 1`. 2. While `low target`: The target is in the left half. Set `high = mid - 1`. 3. If the loop finishes without finding the target, return an indication (e.g., -1). * **Dry Run Example:** Sorted Array: `[10, 20, 30, 40, 50, 60, 70, 80, 90]` Target: `70` Length `n = 9` | Step | `low` | `high` | `mid` | `array[mid]` | Condition (`array[mid]` vs Target) | Action (`low`, `high` update) | |------|-------|--------|-------|--------------|------------------------------------|-------------------------------| | 1 | 0 | 8 | 4 | 50 | `50 35` (Target is smaller) | `high = 4 - 1 = 3` | | 2 | 0 | 3 | 1 | 20 | `20 35` (Target is smaller) | `high = 3 - 1 = 2` | | 5 | 3 | 2 | | | `low > high` | Loop terminates | | 6 | | | | | | Return -1 | * **Complexity:** * **Time:** $O(\log n)$ (each step halves the search space). * **Space:** $O(1)$ (iterative) or $O(\log n)$ (recursive due to call stack).