Memory Hierarchy
Definition: A system of storage devices organized by speed, cost, and capacity.
Components:
- Auxiliary Memory: Slowest, highest capacity, lowest cost per bit (e.g., magnetic disks, tapes).
- Main Memory: Faster and smaller than auxiliary memory, higher cost (e.g., RAM). Communicates directly with the CPU, and with auxiliary memory through an I/O processor.
- Cache Memory: Fastest, smallest capacity, highest cost (e.g., SRAM). Compensates for the speed differential between the CPU and main memory.
Purpose: To achieve the highest possible average access speed while minimizing the total cost of the memory system.
Typical access time ratios: Cache : Main : Auxiliary $\approx$ 1 : 7 : 7000 (e.g., 100 ns : 700 ns : 700 $\mu$s).
Typical block sizes: Cache: 1-16 words; Auxiliary: 256-2048 words.

Main Memory
Function: Central storage for programs and data during computer operation.
Technology: Semiconductor integrated circuits.
Types:
- Random-Access Memory (RAM): Read/write memory.
  - Static RAM (SRAM): Built from internal flip-flops; holds data as long as power is applied. Easier to use, with shorter read/write cycles.
  - Dynamic RAM (DRAM): Stores binary information as electric charges on capacitors and therefore requires periodic refreshing. Offers lower power consumption and larger storage capacity.
- Read-Only Memory (ROM): Stores permanent programs and constants. Non-volatile (retains data without power).
Bootstrap Loader: A program stored in ROM that initiates the computer's software operation when power is turned on.

RAM and ROM Chips
Control inputs: Chip select (CS) lines enable the chip.
Bidirectional data bus: Allows data transfer to and from the CPU (read/write operations), implemented with three-state buffers (logic 1, logic 0, high impedance).
ROM vs. RAM capacity: A ROM chip can hold more bits than a RAM chip of the same physical size, because each ROM cell occupies less internal space.

Memory Address Map
Definition: A pictorial representation of the address space assigned to each memory chip in a system.
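The chip-select and three-state bus behavior can be sketched in Python (a toy model; the `RamChip` class and the use of `None` to stand in for the high-impedance state are illustrative, not from the text):

```python
class RamChip:
    """Toy model of a RAM chip with a chip-select (CS) input.

    When CS is 0 the chip ignores the bus: reads return None,
    standing in for the high-impedance state of the three-state
    buffers, and writes have no effect.
    """
    def __init__(self, size):
        self.cells = [0] * size

    def read(self, cs, addr):
        # High-impedance (None) when the chip is deselected
        return self.cells[addr] if cs else None

    def write(self, cs, addr, data):
        if cs:                       # write takes effect only when selected
            self.cells[addr] = data

chip = RamChip(128)
chip.write(1, 5, 0xAB)   # selected: write stored
chip.write(0, 6, 0xCD)   # deselected: ignored
print(chip.read(1, 5))   # 171 (0xAB)
print(chip.read(0, 5))   # None (bus floats)
print(chip.read(1, 6))   # 0 (the deselected write never happened)
```

Because a deselected chip floats its outputs, several chips can share one bidirectional data bus without conflict.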
Purpose: To establish the interconnection between memory chips and the CPU, based on the total memory size and the chip types available.
Example (four 128 x 8 RAM chips, one 512 x 8 ROM chip):

  Component   Hex address   Address bus lines
                            10  9  8   7 ... 1
  RAM 1       0000-007F      0  0  0   x x x x x x x
  RAM 2       0080-00FF      0  0  1   x x x x x x x
  RAM 3       0100-017F      0  1  0   x x x x x x x
  RAM 4       0180-01FF      0  1  1   x x x x x x x
  ROM         0200-03FF      1  x  x   x x x x x x x

Chip selection: The higher-order address lines select a specific chip: line 10 distinguishes RAM (0) from ROM (1), and lines 9-8 select one of the four RAM chips.

Auxiliary Memory
Characteristics: High capacity, low cost, non-volatile, relatively slow access.
Key parameters: Access mode, access time, transfer rate, capacity, and cost.
- Access time: Time to reach a storage location and obtain its contents. For electromechanical devices it includes seek time and transfer time.
- Transfer rate: Number of characters or words transferred per second once the head is positioned.
Devices:
- Magnetic disks: Circular plates coated with magnetic material, rotating at high speed. Data are stored in concentric circles called tracks, each divided into sectors; read/write heads position over the tracks, and transfer is fast once the sector is reached.
  - Hard disks: Permanently attached.
  - Floppy disks: Small, removable plastic disks.
- Magnetic tapes: A strip of plastic coated with a magnetic recording medium. Data are recorded in blocks (records) separated by gaps and addressed by record number and character count.
- Others: Magnetic drums, magnetic bubble memory, optical disks.

Associative Memory (Content-Addressable Memory, CAM)
Definition: Memory accessed by content rather than by address.
Function: Reduces search time by performing parallel searches based on data content.
Hardware organization:
- Memory array: Stores $m$ words of $n$ bits each.
- Argument register (A): $n$ bits; holds the data to be searched for.
- Key register (K): $n$ bits; acts as a mask. Only the bits of A whose corresponding K bits are 1 are compared.
- Match register (M): $m$ bits.
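The chip-selection logic of this address map can be sketched as a Python decoder (illustrative; address line $n$ of the text corresponds to bit $n-1$ here):

```python
def decode(addr):
    """Decode a 10-bit CPU address per the memory map above.

    Line 10 (bit 9) selects ROM when 1.
    Lines 9-8 (bits 8-7) pick one of the four 128 x 8 RAM chips.
    Returns (chip_name, offset_within_chip).
    """
    if (addr >> 9) & 1:                      # line 10 = 1 -> ROM
        return ("ROM", addr & 0x1FF)         # lines 9-1 address 512 bytes
    chip = (addr >> 7) & 0b11                # lines 9-8 select the RAM chip
    return ("RAM " + str(chip + 1), addr & 0x7F)  # lines 7-1 address 128 bytes

print(decode(0x000))  # ('RAM 1', 0)
print(decode(0x0FF))  # ('RAM 2', 127)
print(decode(0x200))  # ('ROM', 0)
print(decode(0x3FF))  # ('ROM', 511)
```

In hardware the same function is realized with chip-select inputs: line 10 feeds the ROM's CS directly, and lines 9-8 drive a 2 x 4 decoder whose outputs enable the individual RAM chips.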
A bit $M_i$ is set to 1 if word $i$ in memory matches the unmasked argument field.
Match logic: Compares each argument bit $A_j$ with the stored bit $F_{ij}$ (cell $j$ of word $i$). Equality of two bits: $x_j = A_j F_{ij} + A_j' F_{ij}'$. Word $i$ matches when every unmasked $x_j$ is 1:
$M_i = \prod_{j=1}^{n} (A_j F_{ij} + A_j' F_{ij}' + K_j')$
CAM is more expensive than RAM because of the comparison logic required in every cell.
Read operation: After matching, scan the match register for set bits ($M_i = 1$) and read the corresponding words sequentially.
Write operation: Either address locations sequentially (RAM-like writing) or find empty locations (CAM-like writing).
Tag register: Distinguishes active from inactive words; active words have tag bit 1, deleted words have tag bit 0.

Cache Memory
Locality of reference: Programs tend to access a few localized areas of memory repeatedly over short intervals.
- Temporal locality: Recently accessed items are likely to be accessed again soon.
- Spatial locality: Items near recently accessed items are likely to be accessed soon.
Operation: The CPU requests a word and the cache is checked first.
- Hit: The word is found in the cache and read from the fast cache.
- Miss: The word is not in the cache; it is read from main memory, and a block of words (including the requested word) is transferred from main memory into the cache.
Performance: Measured by the hit ratio = (number of hits) / (total CPU references). High hit ratios (e.g., 0.9) significantly improve the average memory access time.
Mapping procedures: How main-memory data are placed into the cache.
1. Associative mapping: Any main memory block can be stored in any cache block. The cache stores both the address (tag) and the data, and the CPU address is compared with all stored tags in parallel (using associative memory). Most flexible, but most expensive. Replacement algorithms (FIFO, LRU) are needed when the cache is full.
2. Direct mapping: Each main memory block maps to one specific cache block, determined by the index field of the address. The CPU address is divided into tag and index fields.
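The masked parallel match above can be simulated in Python with bitwise operators (a software sketch of the per-cell logic; the sample words, argument, and key values are illustrative):

```python
def cam_search(words, argument, key):
    """Simulated CAM match: bit M_i = 1 iff word i agrees with the
    argument in every bit position where the key mask is 1."""
    # (w ^ argument) has 1s exactly where word and argument differ;
    # ANDing with key keeps only the unmasked (compared) positions.
    return [1 if (w ^ argument) & key == 0 else 0 for w in words]

words = [0b100111100, 0b101000001, 0b101111100]  # 9-bit memory contents
argument = 0b101111100                           # register A
key = 0b111000000                                # register K: compare only
                                                 # the three leftmost bits
print(cam_search(words, argument, key))          # [0, 1, 1]
```

Word 0 differs from the argument in an unmasked bit, so $M_0 = 0$; words 1 and 2 agree in all three compared positions, so their match bits are set even though their masked-out bits differ.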
The index selects a cache block; the tag stored there is compared with the CPU's tag. Simple to implement, but the hit ratio can suffer when frequently accessed blocks map to the same cache location. The block size can be greater than one word (e.g., 8 words), which improves the hit ratio for sequential access.
3. Set-associative mapping: A compromise between associative and direct mapping. The cache is divided into sets; each main memory block maps to a specific set but can be stored in any block within that set. The CPU address is divided into tag, set index, and word fields: the set index selects a set, and the tag is compared associatively within that set. This improves the hit ratio over direct mapping by allowing blocks with the same index but different tags to coexist.

Writing into Cache
- Write-through: Main memory is updated on every write operation, and the cache is updated in parallel if it contains the word. Ensures that main memory always holds valid data, which is important for DMA.
- Write-back: Only the cache location is updated during a write, and the block is marked "dirty" with a flag. Main memory is updated only when the modified block is displaced from the cache. More efficient when the same cache block is written multiple times.

Cache Initialization
When power is applied or main memory is loaded, the cache is considered empty.
Valid bit: Included with each cache word; cleared to 0 on initialization and set to 1 when the word is first loaded from main memory. A new word replaces the old data if the valid bit is 0, or if the valid bit is 1 and the tags mismatch.

Virtual Memory
Concept: Gives programmers the illusion of a very large memory space, even with a small physical main memory.
Address mapping: Translates program-generated virtual addresses (in the address space) into physical addresses (in the memory space) of main memory; done dynamically by hardware.
Pages and blocks: The address space is divided into fixed-size pages, and physical memory is divided into fixed-size blocks (also called page frames).
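Direct mapping with valid bits can be simulated as follows (a toy sketch; the class name and the one-word block size are assumptions for brevity):

```python
class DirectMappedCache:
    """Toy direct-mapped cache with one-word blocks.

    The address splits into tag and index fields; each slot holds
    (valid_bit, tag, data), with valid bits cleared at initialization.
    """
    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.slots = [(0, None, None)] * num_slots

    def access(self, addr, memory):
        index = addr % self.num_slots    # index field selects the slot
        tag = addr // self.num_slots     # tag field identifies the block
        valid, stored_tag, data = self.slots[index]
        if valid and stored_tag == tag:
            return data, True            # hit: read from the cache
        data = memory[addr]              # miss: fetch from main memory
        self.slots[index] = (1, tag, data)
        return data, False

memory = {a: a * 10 for a in range(32)}
cache = DirectMappedCache(8)
print(cache.access(3, memory))   # (30, False): cold miss, valid bit was 0
print(cache.access(3, memory))   # (30, True): hit
print(cache.access(11, memory))  # (110, False): same index 3, tags differ
print(cache.access(3, memory))   # (30, False): evicted by address 11
```

The last two accesses show the direct-mapping weakness the notes mention: addresses 3 and 11 share index 3, so alternating between them misses every time even though other slots are free.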
Programs are split into pages and moved between auxiliary and main memory in page-sized chunks.
Address mapping using pages: The virtual address is split into a page number and a line (offset). The line address is the same in the virtual and physical addresses; the mapping translates the page number into a block number.
Memory page table: Stores the block numbers of the pages currently in main memory. Each entry includes a presence bit indicating whether the page is in main memory (1) or in auxiliary memory (0).
Associative memory page table (TLB, translation look-aside buffer): More efficient than a RAM-based page table, especially for large address spaces with few active pages. It stores the most recently referenced page-number/block-number pairs, and the virtual page number is searched associatively. On a match (hit) the block number is retrieved quickly; on a miss the slower memory page table is accessed, and the entry is added to the TLB for future reference.
Page replacement: Needed when main memory is full and a new page must be brought in.
- Page fault: Occurs when the CPU references a page not in main memory. Program execution is suspended while the required page is fetched from auxiliary memory.
- Replacement algorithms decide which page to remove from main memory:
  - FIFO (first-in, first-out): Replaces the page that has been in memory the longest. Simple to implement.
  - LRU (least recently used): Replaces the page that has not been used for the longest time. More complex (e.g., aging registers or counters), but often more effective.

Memory Management Hardware (MMU)
Purpose: Manages memory in multiprogramming environments (dynamic relocation, sharing, protection).
Functions:
- Dynamic storage relocation: Maps logical addresses to physical addresses (similar to paging).
- Sharing of common programs: Allows multiple users to share a single copy of a program.
- Protection: Prevents unauthorized access to or modification of memory areas.
Logical address: The address generated by a segmented program; associated with variable-length segments.
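The two replacement policies can be compared with a short simulation (illustrative; the reference string is arbitrary):

```python
from collections import OrderedDict

def simulate(refs, frames, policy):
    """Count page faults for FIFO or LRU with a given number of frames."""
    resident = OrderedDict()   # insertion order = age; reordered for LRU
    faults = 0
    for page in refs:
        if page in resident:
            if policy == "LRU":
                resident.move_to_end(page)   # mark as most recently used
            continue
        faults += 1                          # page fault: page not resident
        if len(resident) == frames:
            resident.popitem(last=False)     # evict oldest / least recent
        resident[page] = True
    return faults

refs = [1, 2, 3, 1, 4, 1, 2, 5]
print(simulate(refs, 3, "FIFO"))   # 7 faults
print(simulate(refs, 3, "LRU"))    # 6 faults
```

For this reference string LRU saves one fault: after the hit on page 1, FIFO still evicts page 1 first (it is the oldest arrival), while LRU keeps it and evicts page 2 instead.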
Segment: A set of logically related instructions or data elements (e.g., a subroutine or an array).
Segmented-page mapping: The logical address is partitioned into segment number, page number, and word fields. Two tables perform the mapping:
- Segment table: The segment number selects an entry containing a pointer to a page table.
- Page table: The page number (used as an offset from the segment-table entry) selects the block number in physical memory.
Concatenating the block number with the word field gives the physical address. An associative memory (TLB) is often used to speed up the mapping.
Memory protection: Applied to the logical address space (more effective than protecting the physical address space).
Segment descriptor: An entry in the segment table or a segment register, containing:
- Base address field: Pointer to the page table or to the block base address.
- Length field: Specifies the segment size (maximum number of pages); used for boundary checks.
- Protection field: Specifies access rights, e.g., full read/write, read only, execute only, system only.
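The two-table translation, including the length-field boundary check, can be sketched as follows (hypothetical table layout; a 256-word page is assumed so that multiplying the block number by the page size models the concatenation):

```python
def translate(logical, segment_table, page_tables, page_size=256):
    """Translate a (segment, page, word) logical address to a physical one.

    segment_table: segment number -> (page_table_name, length, rights),
        i.e. the base, length, and protection fields of the descriptor.
    page_tables: page_table_name -> list mapping page number to block number.
    """
    seg, page, word = logical
    pt_name, length, rights = segment_table[seg]
    if page >= length:                 # length field: boundary check
        raise MemoryError("segment overflow")
    block = page_tables[pt_name][page] # page table gives the block number
    return block * page_size + word    # concatenate block number and word

segment_table = {2: ("pt2", 4, "read/write")}  # segment 2: at most 4 pages
page_tables = {"pt2": [12, 13, 7, 9]}          # page number -> block number
print(translate((2, 1, 0x20), segment_table, page_tables))
# block 13 * 256 + 0x20 = 3360
```

A reference past the segment's length (e.g., page 4 here) fails the boundary check, which is exactly the protection the length field provides; the rights field is carried in the descriptor but not enforced in this sketch.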