### Asymptotic Notation - **O-notation (Big-O):** $O(g(n))$ is the set of functions $f(n)$ such that there exist positive constants $c$ and $n_0$ with $0 \le f(n) \le cg(n)$ for all $n \ge n_0$. (Upper bound) - **$\Omega$-notation (Big-Omega):** $\Omega(g(n))$ is the set of functions $f(n)$ such that there exist positive constants $c$ and $n_0$ with $0 \le cg(n) \le f(n)$ for all $n \ge n_0$. (Lower bound) - **$\Theta$-notation (Big-Theta):** $\Theta(g(n))$ is the set of functions $f(n)$ such that there exist positive constants $c_1, c_2$ and $n_0$ with $0 \le c_1g(n) \le f(n) \le c_2g(n)$ for all $n \ge n_0$. (Tight bound) - **o-notation (Little-o):** $o(g(n))$ is the set of functions $f(n)$ such that for all positive constants $c > 0$, there exists a constant $n_0$ with $0 \le f(n) 0$, there exists a constant $n_0$ with $0 \le cg(n) ### Sorting Algorithms #### Insertion Sort - **Idea:** Build final sorted array one item at a time. - **Time Complexity:** $O(n^2)$ worst-case, $O(n)$ best-case. - **Space Complexity:** $O(1)$ - **Stable:** Yes - **Pseudocode:** ``` INSERTION-SORT(A) for j = 2 to A.length key = A[j] i = j - 1 while i > 0 and A[i] > key A[i+1] = A[i] i = i - 1 A[i+1] = key ``` #### Merge Sort - **Idea:** Divide, Conquer, Combine. Recursively divides array into halves, sorts them, and merges them. - **Time Complexity:** $\Theta(n \log n)$ always. - **Space Complexity:** $O(n)$ - **Stable:** Yes - **Recurrence:** $T(n) = 2T(n/2) + \Theta(n)$ #### Heap Sort - **Idea:** Uses a max-heap data structure. Build max-heap, then repeatedly extract max element and rebuild heap. - **Time Complexity:** $\Theta(n \log n)$ always. - **Space Complexity:** $O(1)$ (in-place) - **Stable:** No #### Quick Sort - **Idea:** Divide and Conquer. Pick a pivot, partition array around pivot (elements smaller than pivot to left, larger to right), recursively sort sub-arrays. - **Time Complexity:** $O(n \log n)$ average-case, $O(n^2)$ worst-case. - **Space Complexity:** $O(\log n)$ average-case (due to recursion stack), $O(n)$ worst-case. - **Stable:** No - **Recurrence:** $T(n) = T(k) + T(n-k-1) + \Theta(n)$ for a partition of size $k$. #### Counting Sort - **Idea:** Sorts integers in a specific range by counting occurrences of each number. - **Time Complexity:** $\Theta(n+k)$ where $k$ is range of input numbers. - **Space Complexity:** $\Theta(n+k)$ - **Stable:** Yes - **Limitations:** Only for integers with a small range $k$. #### Radix Sort - **Idea:** Sorts numbers by processing individual digits. Starts from least significant digit (LSD) or most significant digit (MSD). - **Time Complexity:** $\Theta(d(n+k))$ where $d$ is number of digits, $k$ is range of digit values. - **Space Complexity:** $\Theta(n+k)$ - **Stable:** Yes (if underlying sort is stable) #### Comparison Sort Lower Bound - Any comparison sort requires $\Omega(n \log n)$ comparisons in the worst case. ### Data Structures #### Hash Tables - **Idea:** Map keys to array indices using a hash function. - **Collision Resolution:** - **Chaining:** Store elements with same hash value in a linked list. Average $O(1)$ for search/insert/delete. - **Open Addressing:** Probe for next empty slot. - **Linear Probing:** $h(k, i) = (h'(k) + i) \pmod m$ - **Quadratic Probing:** $h(k, i) = (h'(k) + c_1 i + c_2 i^2) \pmod m$ - **Double Hashing:** $h(k, i) = (h_1(k) + i h_2(k)) \pmod m$ - **Load Factor ($\alpha$):** $n/m$ (number of elements / number of slots). Average $O(1+\alpha)$ for operations. #### Red-Black Trees - **Idea:** Self-balancing binary search trees. - **Properties:** 1. Every node is either red or black. 2. Root is black. 3. Every leaf (NIL) is black. 4. If a node is red, both its children are black. 5. All simple paths from a node to any descendant leaf contain the same number of black nodes (black-height). - **Operations:** Search, Insert, Delete in $O(\log n)$ worst-case time. #### B-Trees - **Idea:** Self-balancing search trees optimized for disk I/O. Each node can have many children. - **Properties (t is minimum degree):** 1. Every node has at most $2t-1$ keys and $2t$ children. 2. Every node (except root) has at least $t-1$ keys and $t$ children. 3. Root has at least 1 key if not a leaf. 4. All leaves have the same depth. - **Operations:** Search, Insert, Delete in $O(t \log_t n)$ worst-case time. Minimizes disk accesses. ### Dynamic Programming - **Idea:** Solve problems by breaking them into overlapping subproblems and storing the results of subproblems to avoid recomputation. - **Steps:** 1. Characterize optimal substructure. 2. Recursively define the value of an optimal solution. 3. Compute the value of an optimal solution (bottom-up or memoized). 4. Construct an optimal solution from computed information. #### Examples: - **Rod Cutting:** Maximize revenue by cutting a rod into pieces. - **Matrix Chain Multiplication:** Find optimal parenthesization to minimize scalar multiplications. - **Longest Common Subsequence (LCS):** Find the longest subsequence common to two sequences. - **0/1 Knapsack:** Maximize value of items in a knapsack with weight capacity (each item taken once or not at all). ### Greedy Algorithms - **Idea:** Make locally optimal choices in the hope of finding a globally optimal solution. - **Properties:** 1. **Greedy-choice property:** A globally optimal solution can be arrived at by making a locally optimal (greedy) choice. 2. **Optimal substructure:** An optimal solution to the problem contains optimal solutions to subproblems. #### Examples: - **Activity Selection Problem:** Select maximum number of non-overlapping activities. - **Huffman Coding:** Construct optimal prefix codes for data compression. - **Fractional Knapsack:** Maximize value of items in a knapsack where fractions of items can be taken. - **Minimum Spanning Tree (MST):** Prim's and Kruskal's algorithms. ### Graph Algorithms #### Graph Representations - **Adjacency List:** For sparse graphs. $O(V+E)$ space. - **Adjacency Matrix:** For dense graphs. $O(V^2)$ space. #### Breadth-First Search (BFS) - **Idea:** Explores neighbor nodes first, then their neighbors, level by level. - **Uses:** Shortest path in unweighted graphs, finding connected components. - **Time Complexity:** $O(V+E)$ #### Depth-First Search (DFS) - **Idea:** Explores as far as possible along each branch before backtracking. - **Uses:** Topological sort, finding strongly connected components, cycle detection. - **Time Complexity:** $O(V+E)$ #### Topological Sort - **Idea:** Linear ordering of vertices such that for every directed edge $u \to v$, $u$ comes before $v$ in the ordering. - **Applicable to:** Directed Acyclic Graphs (DAGs) only. - **Time Complexity:** $O(V+E)$ (using DFS) #### Minimum Spanning Trees (MST) - **Goal:** Find a subset of edges that connects all vertices, has no cycles, and has minimum total edge weight. - **Properties:** - **Cut Property:** If a cut crosses an MST edge, that edge must be a light edge crossing the cut. - **Cycle Property:** The heaviest edge in any cycle is not in any MST. - **Kruskal's Algorithm:** - **Idea:** Sort all edges by weight, then add edges in increasing order if they don't form a cycle (using Union-Find). - **Time Complexity:** $O(E \log E)$ or $O(E \log V)$ - **Prim's Algorithm:** - **Idea:** Start from an arbitrary vertex, greedily grow MST by adding the cheapest edge from a vertex in the MST to a vertex outside the MST. - **Time Complexity:** $O(E \log V)$ with binary heap, $O(E + V \log V)$ with Fibonacci heap. #### Single-Source Shortest Paths - **Bellman-Ford Algorithm:** - **Idea:** Relaxes all edges $V-1$ times. - **Uses:** Graphs with negative edge weights, detects negative cycles. - **Time Complexity:** $O(VE)$ - **Dijkstra's Algorithm:** - **Idea:** Greedily selects the unvisited vertex with the smallest distance from the source. - **Uses:** Graphs with non-negative edge weights. - **Time Complexity:** $O(E \log V)$ with binary heap, $O(E + V \log V)$ with Fibonacci heap. #### All-Pairs Shortest Paths - **Floyd-Warshall Algorithm:** - **Idea:** Dynamic programming. Considers intermediate vertices $k$ in increasing order. - **Uses:** Graphs with negative edge weights, detects negative cycles. - **Time Complexity:** $O(V^3)$ - **Johnson's Algorithm:** - **Idea:** Reweights graph to eliminate negative weights, then runs Dijkstra from each vertex. - **Uses:** Sparse graphs with negative edge weights. - **Time Complexity:** $O(VE + V^2 \log V)$ (with Fibonacci heap for Dijkstra) #### Maximum Flow - **Ford-Fulkerson Method:** - **Idea:** Iteratively find augmenting paths in the residual graph and push flow. - **Time Complexity:** Depends on augmenting path finding strategy. With Edmonds-Karp (BFS for paths): $O(VE^2)$. - **Max-Flow Min-Cut Theorem:** The value of a maximum flow in a network is equal to the capacity of a minimum cut in that network. ### String Matching #### Naive Algorithm - **Idea:** Slide pattern across text, checking for match at each position. - **Time Complexity:** $O((n-m+1)m)$ worst-case. #### Rabin-Karp Algorithm - **Idea:** Uses hashing to quickly compare pattern with text substrings. - **Time Complexity:** $O(n+m)$ average-case, $O((n-m+1)m)$ worst-case (if many spurious hits). - **Hash Function:** $h(P) = ( \sum_{i=0}^{m-1} P[i] d^{m-1-i} ) \pmod q$ #### Knuth-Morris-Pratt (KMP) Algorithm - **Idea:** Preprocesses pattern to build a "prefix function" ($\pi$) which tells how many characters to shift when a mismatch occurs, avoiding re-matching characters. - **Time Complexity:** $O(n+m)$ worst-case. - **Prefix Function ($\pi$):** $\pi[q]$ is the length of the longest proper prefix of $P_q$ that is also a suffix of $P_q$. ### Computational Geometry #### Convex Hull - **Goal:** Find the smallest convex polygon that encloses a set of points. - **Algorithms:** - **Graham Scan:** Sorts points by polar angle, then uses a stack to maintain candidate vertices. $O(n \log n)$. - **Jarvis March (Gift Wrapping):** Finds extreme points one by one. $O(nh)$ where $h$ is number of vertices on hull. #### Closest Pair of Points - **Idea:** Divide and Conquer. Divide points into two halves, recursively find closest pair in each, then combine. - **Time Complexity:** $O(n \log n)$. ### NP-Completeness - **P (Polynomial time):** Class of decision problems solvable in polynomial time. - **NP (Nondeterministic Polynomial time):** Class of decision problems where a 'yes' instance has a polynomial-time verifiable certificate. - **NP-Hard:** A problem $H$ is NP-hard if for every problem $L \in NP$, $L \le_p H$ (L is polynomial-time reducible to H). - **NP-Complete:** A problem $C$ is NP-complete if $C \in NP$ and $C$ is NP-hard. - **Common NP-Complete Problems:** - **SAT (Satisfiability):** Given a boolean formula, is there an assignment that makes it true? - **3-SAT:** SAT where each clause has exactly 3 literals. - **Clique:** Given a graph $G$ and integer $k$, does $G$ contain a clique of size at least $k$? - **Vertex Cover:** Given a graph $G$ and integer $k$, is there a vertex cover of size at most $k$? - **Hamiltonian Cycle:** Given a graph $G$, does it contain a Hamiltonian cycle? - **Traveling Salesperson Problem (TSP):** Given a list of cities and distances, find shortest possible route visiting each city exactly once. (Optimization problem, decision version is NP-Complete). - **Subset Sum:** Given a set of integers and a target sum, is there a subset that sums to the target? - **Knapsack:** Given items with weights and values, and a capacity, can a subset of items be chosen with total weight at most capacity and total value at least target value? - **Polynomial-time reduction ($L \le_p H$):** An algorithm that transforms any instance of problem $L$ into an instance of problem $H$ such that the solution to $H$'s instance can be used to solve $L$'s instance, and the transformation takes polynomial time.