### Introduction to NumPy
NumPy (Numerical Python) is a fundamental package for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays.

- **Import Convention:** `import numpy as np`

### Array Creation
Creating `ndarray` objects.

- **From List:**
  ```python
  arr1 = np.array([1, 2, 3])          # 1D array
  arr2 = np.array([[1, 2], [3, 4]])   # 2D array
  ```
- **Zeros & Ones:**
  ```python
  zeros_arr = np.zeros((2, 3))  # 2x3 array of zeros
  ones_arr = np.ones((3, 2))    # 3x2 array of ones
  ```
- **Empty:** (Uninitialized, faster)
  ```python
  empty_arr = np.empty((2, 2))
  ```
- **Range:**
  ```python
  range_arr = np.arange(0, 10, 2)  # [0, 2, 4, 6, 8]
  ```
- **Linspace:** (Evenly spaced numbers)
  ```python
  linspace_arr = np.linspace(0, 1, 5)  # 5 numbers from 0 to 1
  ```
- **Identity Matrix:**
  ```python
  identity_mat = np.eye(3)  # 3x3 identity matrix
  ```
- **Random:**
  ```python
  rand_arr = np.random.rand(2, 2)                       # Uniform [0, 1)
  randn_arr = np.random.randn(2, 2)                     # Standard Normal
  randint_arr = np.random.randint(0, 10, size=(2, 2))   # Integers [low, high)
  ```

### Array Attributes
Properties of `ndarray` objects.

- **Shape:** `arr.shape` (Tuple of array dimensions)
- **Dimensions:** `arr.ndim` (Number of array dimensions)
- **Size:** `arr.size` (Total number of elements)
- **Data Type:** `arr.dtype` (Data type of elements, e.g., `int64`, `float64`)
- **Item Size:** `arr.itemsize` (Size in bytes of each element)

### Array Manipulation
Reshaping, joining, splitting.
- **Reshape:**
  ```python
  arr = np.arange(6)                  # [0, 1, 2, 3, 4, 5]
  reshaped_arr = arr.reshape((2, 3))  # [[0, 1, 2], [3, 4, 5]]
  ```
- **Flatten:**
  ```python
  flat_arr = reshaped_arr.flatten()  # [0, 1, 2, 3, 4, 5]
  ```
- **Transpose:**
  ```python
  transposed_arr = reshaped_arr.T  # [[0, 3], [1, 4], [2, 5]]
  ```
- **Concatenate:**
  ```python
  a = np.array([[1, 2], [3, 4]])
  b = np.array([[5, 6]])
  concat_v = np.concatenate((a, b), axis=0)    # Vertical stack
  # [[1, 2], [3, 4], [5, 6]]
  concat_h = np.concatenate((a, b.T), axis=1)  # Horizontal stack
  # [[1, 2, 5], [3, 4, 6]]
  ```
- **Stack:**
  ```python
  v_stack = np.vstack((a, b))
  h_stack = np.hstack((a, b.T))
  ```
- **Split:**
  ```python
  arr = np.arange(9).reshape(3, 3)
  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
  h_split = np.hsplit(arr, 3)  # Split horizontally into 3 arrays (one per column)
  v_split = np.vsplit(arr, 3)  # Split vertically into 3 arrays (one per row)
  ```

### Indexing & Slicing
Accessing array elements.

- **Basic Indexing:**
  ```python
  arr = np.array([10, 20, 30, 40, 50])
  print(arr[0])   # 10
  print(arr[-1])  # 50
  ```
- **2D Indexing:**
  ```python
  mat = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
  print(mat[0, 1])  # 2
  print(mat[1][2])  # 6 (less efficient)
  ```
- **Slicing (1D):**
  ```python
  print(arr[1:4])  # [20, 30, 40]
  print(arr[:3])   # [10, 20, 30]
  print(arr[2:])   # [30, 40, 50]
  print(arr[:])    # [10, 20, 30, 40, 50]
  ```
- **Slicing (2D):**
  ```python
  print(mat[0:2, 1:3])  # Rows 0-1, Cols 1-2
  # [[2, 3], [5, 6]]
  print(mat[:, 0])      # First column
  # [1, 4, 7]
  ```
- **Boolean Indexing:**
  ```python
  arr = np.array([1, 2, 3, 4, 5])
  mask = (arr > 2)  # [False, False, True, True, True]
  print(arr[mask])  # [3, 4, 5]
  ```
- **Fancy Indexing:**
  ```python
  arr = np.array([10, 20, 30, 40, 50])
  idx = np.array([0, 2, 4])
  print(arr[idx])  # [10, 30, 50]
  ```

### Array Operations
Element-wise and aggregate operations.
- **Arithmetic Operations (Element-wise):**
  ```python
  a = np.array([1, 2, 3])
  b = np.array([4, 5, 6])
  print(a + b)   # [5, 7, 9]
  print(a * b)   # [4, 10, 18]
  print(a / b)   # [0.25, 0.4, 0.5]
  print(a ** 2)  # [1, 4, 9]
  ```
- **Matrix Multiplication (Dot Product):**
  ```python
  a = np.array([[1, 2], [3, 4]])
  b = np.array([[5, 6], [7, 8]])
  print(a @ b)  # Or np.dot(a, b)
  # [[19, 22], [43, 50]]
  ```
- **Universal Functions (ufuncs):**
  ```python
  arr = np.array([0, np.pi/2, np.pi])
  print(np.sin(arr))   # approximately [0., 1., 0.] (sin(pi) prints as ~1.2e-16)
  print(np.sqrt(arr))
  print(np.exp(arr))
  ```
- **Aggregate Functions:**
  ```python
  arr = np.array([[1, 2], [3, 4]])
  print(np.sum(arr))          # 10 (total sum)
  print(np.sum(arr, axis=0))  # [4, 6] (column sums)
  print(np.sum(arr, axis=1))  # [3, 7] (row sums)
  print(np.mean(arr))
  print(np.min(arr))
  print(np.max(arr))
  print(np.std(arr))     # Standard deviation
  print(np.var(arr))     # Variance
  print(np.argmin(arr))  # Index of min value (flattened)
  ```

### Broadcasting
Rules for operating on arrays of different sizes.

- **Rule 1:** If arrays don't have the same number of dimensions, prepend 1s to the smaller array's shape until dimensions match.
- **Rule 2:** Arrays are compatible if, for each dimension, they have the same size, or one of them has size 1. Otherwise NumPy raises a `ValueError`.
- **Rule 3:** A dimension of size 1 is stretched (virtually repeated, without copying data) to match the other array's size in that dimension.

```python
a = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
b = np.array([10, 20, 30])            # shape (3,) -> becomes (1, 3)
print(a + b)
# [[11, 22, 33], [14, 25, 36]]
c = np.array([[10], [20]])            # shape (2, 1)
print(a + c)
# [[11, 12, 13], [24, 25, 26]]
```

### Linear Algebra
Common linear algebra operations.

- **Dot Product:** `np.dot(a, b)` or `a @ b`
- **Determinant:** `np.linalg.det(matrix)`
- **Inverse:** `np.linalg.inv(matrix)`
- **Eigenvalues/Eigenvectors:** `np.linalg.eig(matrix)`
- **Solve Linear System:** `np.linalg.solve(A, b)` (for $Ax=b$)

### Saving & Loading
Storing and retrieving NumPy arrays.
- **Save to .npy:**
  ```python
  arr = np.array([1, 2, 3])
  np.save('my_array.npy', arr)
  ```
- **Load from .npy:**
  ```python
  loaded_arr = np.load('my_array.npy')
  ```
- **Save to .npz (multiple arrays):**
  ```python
  a = np.array([1, 2])
  b = np.array([3, 4])
  np.savez('multiple_arrays.npz', a=a, b=b)
  ```
- **Load from .npz:**
  ```python
  data = np.load('multiple_arrays.npz')
  print(data['a'])
  print(data['b'])
  ```
- **Save/Load as Text:**
  ```python
  np.savetxt('my_array.txt', arr, delimiter=',')
  loaded_txt = np.loadtxt('my_array.txt', delimiter=',')
  ```
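
The save and load calls above can be combined into one round trip to check what survives each format. The following is a minimal sketch, not a recommended workflow: the temporary directory and the `*_path` variable names are assumptions made for illustration, chosen so the example leaves no files behind.

```python
import os
import tempfile

import numpy as np

arr = np.array([1, 2, 3])
a = np.array([1, 2])
b = np.array([3, 4])

# Work in a throwaway directory (an assumption for this sketch) so nothing persists.
with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = os.path.join(tmp_dir, "my_array.npy")
    npz_path = os.path.join(tmp_dir, "multiple_arrays.npz")
    txt_path = os.path.join(tmp_dir, "my_array.txt")

    # Binary .npy stores dtype and shape alongside the data.
    np.save(npy_path, arr)
    loaded_arr = np.load(npy_path)

    # .npz archives load lazily; the context manager closes the file handle.
    np.savez(npz_path, a=a, b=b)
    with np.load(npz_path) as data:
        loaded_a = data["a"]
        loaded_b = data["b"]

    # Text output is read back as floats by default, so the integer dtype is lost.
    np.savetxt(txt_path, arr, delimiter=",")
    loaded_txt = np.loadtxt(txt_path, delimiter=",")

print(np.array_equal(arr, loaded_arr), loaded_arr.dtype == arr.dtype)
print(np.array_equal(a, loaded_a), np.array_equal(b, loaded_b))
print(loaded_txt.dtype)  # float64
```

If the original dtype matters, the binary formats are the safer choice. With text files, passing `dtype=int` to `np.loadtxt` is one way to convert the values back on load.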