1
Memory Management
2
Overview
• Basic memory “management” • Address Spaces • Virtual memory • Page replacement algorithms • Design issues for paging systems • Implementation issues • Segmentation
3
Memory Management
• Ideally, programmers want memory that is
  – large
  – fast
  – non-volatile
• Memory hierarchy
  – a small amount of fast, expensive memory (cache)
  – some medium-speed, medium-priced main memory
  – gigabytes of slow, cheap disk storage
• Memory manager
  – handles the memory hierarchy
  – protects processes from each other
4
Approaches
• Single Process, Contiguous Memory
• Multiple Processes, Contiguous Memory
• Multiple Processes, “Discontiguous” Memory
• Multiple Processes, Only Partially in Memory
5
Basic Memory Management: Single Process without Swapping or Paging
Three simple ways of organizing memory - an operating system with one user process
6
Binding
• If a program has the line
      int x;
  when is the address of x determined?
• What are the choices?
  – At compile time
  – At load time
  – At run time
7
Base / Limit Registers
• Binding is done at run time.
• Each address the program issues is added to the base value to produce the physical address.
• An address larger than the limit value is an error.
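The base/limit check described above can be sketched in a few lines. This is only an illustration: the names `BaseLimit` and `translate` are made up here, and the sketch assumes the limit register holds the size of the process's memory, so an address equal to the limit is already out of range.

```cpp
#include <cstdint>

// Hypothetical register pair; the OS would reload it at each context switch.
struct BaseLimit {
    std::uint32_t base;   // physical address where the process's memory starts
    std::uint32_t limit;  // size of the process's memory in bytes (assumed meaning)
};

// Translate an address issued by the program into a physical address.
// Returns false where real hardware would raise a protection fault.
bool translate(const BaseLimit& regs, std::uint32_t addr, std::uint32_t& phys) {
    if (addr >= regs.limit) {
        return false;         // outside the process's memory: error
    }
    phys = regs.base + addr;  // relocation happens on every access
    return true;
}
```

Because the addition happens on every access, the OS can move a process simply by copying its memory and changing the base value.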
8
Swapping
Memory allocation changes as processes
  – come into memory
  – leave memory
Shaded regions are unused memory
9
Managing Free Memory
• Assume a process being loaded can ask for any size “chunk” of memory needed.
• We need to be able to find a chunk the right size.
• How can we keep track of free chunks efficiently?
10
Memory Management with Bit Maps
• Part of memory with 5 processes, 3 holes – tick marks show allocation units – shaded regions are free
• Corresponding bit map • Same information as a list
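One way to picture the cost of the bit map approach: finding room for a process that needs k allocation units means scanning for k consecutive free bits. The sketch below assumes a bit value of true means "in use"; the names `BitMap`, `find_free_run`, and `set_range` are illustrative only.

```cpp
#include <cstddef>
#include <vector>

// One entry per allocation unit: true = in use, false = free (assumed polarity).
using BitMap = std::vector<bool>;

// Find the first run of `units` consecutive free allocation units.
// Returns the index of the first unit in the run, or -1 if there is none.
long find_free_run(const BitMap& map, std::size_t units) {
    std::size_t run = 0;  // length of the free run ending at the current unit
    for (std::size_t i = 0; i < map.size(); ++i) {
        run = map[i] ? 0 : run + 1;
        if (units > 0 && run == units) {
            return static_cast<long>(i + 1 - units);
        }
    }
    return -1;
}

// Mark units [start, start + units) as in use (true) or free (false).
void set_range(BitMap& map, std::size_t start, std::size_t units, bool in_use) {
    for (std::size_t i = start; i < start + units && i < map.size(); ++i) {
        map[i] = in_use;
    }
}
```

The search is linear in the size of the map, which is the main argument against bit maps when allocation units are small.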
11
Memory Management with Linked Lists
Four neighbor combinations for the terminating process X
12
Order of Search for Free Memory
• We can search for a large enough block of free memory starting from the beginning. That’s called first fit.
• If we already skipped over the first N holes because they were too small, maybe it’s a waste of time to look there again. Try next fit.
• Should we be more selective in our choice? After all we’re just grabbing the first thing that works…
• How does first (or next) fit affect fragmentation? Could we do better?
• Can we find a hole that fits faster? What are the downsides?
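The search orders discussed above can be compared side by side in a small sketch. It assumes the free list is a vector of holes in address order; `Hole`, `first_fit`, `next_fit`, and `best_fit` are illustrative names, and the sketch only picks a hole, it does not split it or update the list. Best fit is included as one possible answer to "should we be more selective?".

```cpp
#include <cstddef>
#include <vector>

struct Hole {
    std::size_t start;  // first address of the free region
    std::size_t size;   // length of the free region
};

// First fit: scan from the beginning and take the first hole that is big enough.
int first_fit(const std::vector<Hole>& holes, std::size_t need) {
    for (std::size_t i = 0; i < holes.size(); ++i) {
        if (holes[i].size >= need) return static_cast<int>(i);
    }
    return -1;
}

// Next fit: like first fit, but start at `cursor`, where the last search ended.
// On success `cursor` is updated to the chosen hole.
int next_fit(const std::vector<Hole>& holes, std::size_t need, std::size_t& cursor) {
    for (std::size_t n = 0; n < holes.size(); ++n) {
        const std::size_t i = (cursor + n) % holes.size();
        if (holes[i].size >= need) {
            cursor = i;
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Best fit: look at every hole and take the smallest one that is big enough.
int best_fit(const std::vector<Hole>& holes, std::size_t need) {
    int best = -1;
    for (std::size_t i = 0; i < holes.size(); ++i) {
        if (holes[i].size < need) continue;
        if (best < 0 || holes[i].size < holes[best].size) best = static_cast<int>(i);
    }
    return best;
}
```

Best fit always walks the whole list and tends to leave tiny, unusable holes behind, which is one of the downsides the slide asks about.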
13
The Contiguous Constraint
• So far a process’s memory has been contiguous.
• What if it didn’t have to be?
• What problems would that help solve?
• How would the hardware need to change?
• What additional work would the OS have to do?
14
It’s All Gotta Be in Memory (or does it?)
• We have assumed that the entire process has to be in memory whenever it is running.
• What if it didn’t have to be?
• What problems would that help solve?
• How would the hardware need to change?
• What additional work would the OS have to do?
15
Virtual Memory
The position and function of the MMU
16
Paging
The relation between virtual addresses and physical memory addresses given by the page table
17
Page Tables
Internal operation of MMU with 16 4 KB pages
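With 16 pages of 4 KB each, a virtual address is 16 bits wide: the top 4 bits select the page and the low 12 bits are the offset within it. The sketch below models that lookup; the struct layout, the names, and the table contents used in any example are assumptions for illustration, not the contents of the slide's figure.

```cpp
#include <array>
#include <cstdint>

struct PageTableEntry {
    bool present;        // present/absent bit
    std::uint8_t frame;  // page frame number
};

constexpr unsigned kOffsetBits = 12;           // 4 KB pages
constexpr std::uint16_t kOffsetMask = 0x0FFF;  // low 12 bits

// Model of the MMU lookup: returns true and writes the physical address,
// or returns false where the hardware would raise a page fault.
bool mmu_translate(const std::array<PageTableEntry, 16>& table,
                   std::uint16_t vaddr, std::uint32_t& paddr) {
    const unsigned page = vaddr >> kOffsetBits;  // top 4 bits index the table
    const PageTableEntry& pte = table[page];
    if (!pte.present) {
        return false;                            // page fault
    }
    // The offset is copied unchanged; only the page number is replaced.
    paddr = (static_cast<std::uint32_t>(pte.frame) << kOffsetBits) |
            (vaddr & kOffsetMask);
    return true;
}
```

For instance, if page 2 is mapped to frame 6, virtual address 8196 (page 2, offset 4) becomes physical address 24580 (6 × 4096 + 4).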
18
Page Tables 2-level
• 32 bit address with 2 page table fields • Two-level page tables
Second-level page tables
Top-level page table
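A sketch of how a 32-bit virtual address might be cut into the two page table fields plus an offset. The 10/10/12 split used here is an assumption (it is the common choice with 4 KB pages); the slide itself only says there are two page table fields. `SplitAddress` and `split` are illustrative names.

```cpp
#include <cstdint>

struct SplitAddress {
    std::uint32_t pt1;     // index into the top-level page table (assumed 10 bits)
    std::uint32_t pt2;     // index into the selected second-level table (assumed 10 bits)
    std::uint32_t offset;  // byte offset within the page (assumed 12 bits)
};

SplitAddress split(std::uint32_t vaddr) {
    SplitAddress s;
    s.pt1 = vaddr >> 22;            // bits 31..22
    s.pt2 = (vaddr >> 12) & 0x3FF;  // bits 21..12
    s.offset = vaddr & 0xFFF;       // bits 11..0
    return s;
}
```

Under this split, address 0x00403004 gives pt1 = 1, pt2 = 3, offset = 4. The benefit is that second-level tables for unused 4 MB regions never need to exist.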
19
Page Table Entry
• Present/absent is also called Valid • Modified is also called Dirty • Referenced is also called Accessed • Why would caching be disabled?
20
Pentium PTE
21
TLBs – Translation Lookaside Buffers
A TLB to speed up paging
22
Inverted Page Tables
Comparison of a traditional page table with an inverted page table
23
Page Replacement Algorithms
• Page fault forces choice
  – If there are no free page frames, we have to make room for the incoming page
  – Which page should be removed?
• A modified page frame must first be saved before being evicted
  – An unmodified page frame can just be overwritten
• Better not to choose an often-used page
  – Likely to be brought back in again soon
24
Optimal Page Replacement Algorithm
• Replace page needed at the farthest point in future – Optimal but unrealizable
• Estimate by … – logging page use on previous runs of process – although this is impractical
25
Not Recently Used Page Replacement Algorithm
• Each page has Reference bit, Modified bit – bits are set when page is referenced, modified
• Pages are classified 1. not referenced, not modified 2. not referenced, modified 3. referenced, not modified 4. referenced, modified
• NRU removes page at random – from lowest numbered non empty class
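The classification above maps directly onto the two bits. In this sketch the names `Page`, `nru_class`, and `nru_candidates` are illustrative; the function returns every page in the lowest-numbered non-empty class, and the random pick among them is left to the caller.

```cpp
#include <cstddef>
#include <vector>

struct Page {
    bool referenced;  // R bit
    bool modified;    // M bit
};

// Class number as on the slide (1..4); a lower class is a better victim.
int nru_class(const Page& p) {
    return 1 + (p.referenced ? 2 : 0) + (p.modified ? 1 : 0);
}

// Indices of all pages in the lowest-numbered non-empty class.
std::vector<std::size_t> nru_candidates(const std::vector<Page>& pages) {
    std::vector<std::size_t> candidates;
    int lowest = 5;  // one past the highest class
    for (std::size_t i = 0; i < pages.size(); ++i) {
        const int c = nru_class(pages[i]);
        if (c < lowest) {  // found a better class: start over
            lowest = c;
            candidates.clear();
        }
        if (c == lowest) candidates.push_back(i);
    }
    return candidates;
}
```

Class 2 (not referenced but modified) looks odd, but it arises when the R bits are cleared periodically while the M bits are kept.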
26
FIFO Page Replacement Algorithm
• Maintain a linked list of all pages – in order they came into memory
• Page at beginning of list replaced
• Disadvantage – page in memory the longest may be often used
27
Second Chance Page Replacement Algorithm
• Operation of second chance
  – pages sorted in FIFO order
  – page list if a fault occurs at time 20 and A has its R bit set (numbers above pages are loading times)
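A minimal sketch of the second chance eviction step, assuming the FIFO list is a `std::deque` with the oldest page at the front. `Frame` and `second_chance_evict` are illustrative names.

```cpp
#include <cassert>
#include <deque>

struct Frame {
    char page;        // page label, e.g. 'A'
    bool referenced;  // R bit
};

// Evict one page. The front of the deque is the oldest page.
// An old page whose R bit is set has the bit cleared and is moved to the
// back, as if it had just been loaded. Returns the label of the victim.
char second_chance_evict(std::deque<Frame>& fifo) {
    assert(!fifo.empty());
    while (fifo.front().referenced) {
        Frame f = fifo.front();
        fifo.pop_front();
        f.referenced = false;  // the second chance is used up
        fifo.push_back(f);
    }
    const char victim = fifo.front().page;
    fifo.pop_front();
    return victim;
}
```

If every page has its R bit set, the loop clears them all and the algorithm degenerates to plain FIFO. The clock algorithm does the same work with a circular list and a moving hand instead of moving list elements.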
28
The Clock Page Replacement Algorithm
29
Least Recently Used (LRU)
• Principle: assume a page used recently will be used again soon
  – throw out the page that has been unused for the longest time
• Implementation
  – Keep a linked list of pages
    • most recently used at front, least recently used at rear
    • this list must be updated on every memory reference
  – Or keep a time stamp with each PTE
    • choose the page with the oldest time stamp
    • again, this must be updated on every memory reference
30
LRU (Another Hardware Solution)
LRU using a matrix – pages referenced in order 0,1,2,3,2,1,0,3,2,3
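The matrix method can be tried out in software. On a reference to page k, row k is set to all 1s and then column k is cleared; the row with the smallest value belongs to the least recently used page. The sketch assumes 4 pages, and `Matrix`, `reference`, and `lru_page` are illustrative names.

```cpp
#include <array>
#include <bitset>
#include <cstddef>

constexpr std::size_t kPages = 4;
using Matrix = std::array<std::bitset<kPages>, kPages>;

// Record a reference to page k: set row k to all 1s, then clear column k.
void reference(Matrix& m, std::size_t k) {
    m[k].set();
    for (auto& row : m) {
        row.reset(k);
    }
}

// The row with the lowest binary value marks the least recently used page.
std::size_t lru_page(const Matrix& m) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < kPages; ++i) {
        if (m[i].to_ulong() < m[best].to_ulong()) best = i;
    }
    return best;
}
```

Feeding in the reference order from the slide (0,1,2,3,2,1,0,3,2,3) leaves page 1 as the least recently used page, since its last reference is the oldest.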
31
LRU Approximation?
• How could we approximate LRU?
• We can’t track every time a page is referenced, but we can sample the data.
• How often? Once per clock tick, perhaps.
• Update a counter for each page that has been referenced in the last clock tick.
• Take the page with the lowest count, i.e. “Not Frequently Used” (NFU).
• How well does this work?
32
Simulating LRU in Software: Aging
• The aging algorithm simulates LRU in software • Note 6 pages for 5 clock ticks, (a) – (e)
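A sketch of the aging update, assuming 8-bit counters: at every clock tick each counter is shifted right by one and the page's R bit is placed in the leftmost position, so recent references weigh more than old ones. `AgedPage`, `clock_tick`, and `victim` are illustrative names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct AgedPage {
    std::uint8_t counter;  // aging counter (8 bits assumed)
    bool referenced;       // R bit, set by the hardware between ticks
};

// Run once per clock tick: shift each counter right, insert R on the left,
// then clear R so the next interval starts fresh.
void clock_tick(std::vector<AgedPage>& pages) {
    for (auto& p : pages) {
        p.counter = static_cast<std::uint8_t>((p.counter >> 1) |
                                              (p.referenced ? 0x80 : 0x00));
        p.referenced = false;
    }
}

// The page with the lowest counter is the eviction candidate.
std::size_t victim(const std::vector<AgedPage>& pages) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < pages.size(); ++i) {
        if (pages[i].counter < pages[best].counter) best = i;
    }
    return best;
}
```

Unlike NFU, a page that was busy long ago does not keep a high count forever: each tick halves the weight of past references, and after 8 idle ticks the counter is back to zero.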
33
Thrashing
• A program causing page faults every few instructions is said to be thrashing.
• What causes thrashing?
• If a process keeps accessing random new pages, then it is hard to anticipate what it will use next.
• Most programs exhibit temporal and spatial locality.
  – Temporal locality: if the process accessed a particular address, it is likely to do so again soon.
  – Spatial locality: if the process accessed a particular address, it is likely to access nearby addresses soon.
• Processes that follow this principle tend not to thrash unless they have to fight for memory.
34
Causing Thrashing within a single process
• The first nested loop demonstrates spatial locality • The second thrashes.
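const int ROWS = 10000;
const int COLS = 1024;
int arr[ROWS][COLS];

int main() {
    // Row by row: consecutive accesses touch adjacent addresses.
    for (int row = 0; row < ROWS; ++row)
        for (int col = 0; col < COLS; ++col)
            arr[row][col] = row * col;

    // Column by column: each access is COLS ints away from the previous one,
    // i.e. on a different page if ints are 4 bytes and pages are 4 KB.
    for (int col = 0; col < COLS; ++col)
        for (int row = 0; row < ROWS; ++row)
            arr[row][col] = row * col;
}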
35
The Working Set Page Replacement Algorithm (1)
• The working set is the set of pages used by the k most recent memory references
• w(k,t) is the size of the working set at time t
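The definition can be made concrete with a short sketch. It assumes time t is measured in memory references, so the working set at time t is built from the last k entries of the reference string up to position t. The function names `working_set` and `w` and the parameter order are illustrative.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Pages touched by the k most recent references, looking at refs[0 .. t-1].
std::set<int> working_set(const std::vector<int>& refs, std::size_t k, std::size_t t) {
    if (t > refs.size()) t = refs.size();
    const std::size_t first = (t > k) ? t - k : 0;
    return std::set<int>(refs.begin() + static_cast<std::ptrdiff_t>(first),
                         refs.begin() + static_cast<std::ptrdiff_t>(t));
}

// w(k, t): the size of the working set at time t.
std::size_t w(const std::vector<int>& refs, std::size_t k, std::size_t t) {
    return working_set(refs, k, t).size();
}
```

For a fixed t, w(k, t) never decreases as k grows and is bounded by the number of distinct pages the program uses, which is why the curve flattens for large k.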
36
The Working Set Page Replacement Algorithm (2)
The working set algorithm
37
The WSClock Page Replacement Algorithm
Operation of the WSClock algorithm
38
Review of Page Replacement Algorithms
39
Modeling Page Replacement Algorithms: Belady's Anomaly
• FIFO with 3 page frames • FIFO with 4 page frames • P's mark the page references that cause page faults
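Belady's anomaly is easy to reproduce with a small FIFO simulator. The sketch below counts page faults for a given reference string and number of frames; `fifo_faults` is an illustrative name, and the reference string mentioned afterwards is the classic textbook example, assumed here rather than read off the slide's figure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Count page faults under FIFO replacement with `frames` page frames.
std::size_t fifo_faults(const std::vector<int>& refs, std::size_t frames) {
    assert(frames > 0);
    std::deque<int> memory;  // front = page that has been resident the longest
    std::size_t faults = 0;
    for (int page : refs) {
        if (std::find(memory.begin(), memory.end(), page) != memory.end()) {
            continue;  // hit: FIFO order is not changed by a reference
        }
        ++faults;
        if (memory.size() == frames) {
            memory.pop_front();  // evict the oldest page
        }
        memory.push_back(page);
    }
    return faults;
}
```

For the reference string 0 1 2 3 0 1 4 0 1 2 3 4 this gives 9 faults with 3 frames but 10 faults with 4 frames: more memory, more faults.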
40
“Stack” Algorithms
State of memory array, M, after each item in reference string is processed
41
The Distance String
Probability density functions for two hypothetical distance strings
42
The Distance String
• Computation of page fault rate from distance string – the C vector – the F vector
43
Design Issues for Paging Systems: Local versus Global Allocation Policies
• Original configuration • Local page replacement • Global page replacement
44
Page Fault Rate
• Page fault rate as a function of the number of page frames assigned • Used to determine whether a process should be granted additional page frames.
45
Load Control
• Despite good designs, the system may still thrash
• When the page fault frequency (PFF) algorithm indicates that
  – some processes need more memory
  – but no processes need less
• Solution: reduce the number of processes competing for memory
  – swap one or more to disk and divide up the frames they held
  – reconsider the degree of multiprogramming
46
Page Size (1)
Small page size • Advantages
– less internal fragmentation – better fit for various data structures, code sections – less unused program in memory
• Disadvantages – programs need many pages, larger page tables
47
Page Size (2)
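• Overhead due to page table and internal fragmentation:
  $$\text{overhead} = \underbrace{\frac{s \cdot e}{p}}_{\text{page table space}} + \underbrace{\frac{p}{2}}_{\text{internal fragmentation}}$$
• Where
  – s = average process size in bytes
  – p = page size in bytes
  – e = size of a page table entry in bytes
• Optimized when
  $$p = \sqrt{2 s e}$$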
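The optimum follows from one differentiation. The derivation below uses s, p, and e as defined on this slide and takes the internal fragmentation term to be p/2, on the reasoning that on average half of a process's last page is wasted. The numbers in the last line are illustrative assumptions, not values from the slides.

```latex
\begin{aligned}
\text{overhead}(p) &= \frac{s e}{p} + \frac{p}{2} \\
\frac{d}{dp}\,\text{overhead}(p) &= -\frac{s e}{p^{2}} + \frac{1}{2} = 0
  \quad\Longrightarrow\quad p = \sqrt{2 s e} \\
s = 2^{20}\ \text{bytes},\ e = 8\ \text{bytes}
  &\quad\Longrightarrow\quad p = \sqrt{2 \cdot 2^{20} \cdot 2^{3}} = \sqrt{2^{24}} = 2^{12} = 4096\ \text{bytes}
\end{aligned}
```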
48
Separate Instruction and Data Spaces
• One address space • Separate I and D spaces
49
Shared Pages
Two processes sharing the same program, sharing its page table
50
Cleaning Policy
• Need for a background process, the paging daemon
  – periodically inspects the state of memory
• When too few frames are free
  – selects pages to “evict” using a replacement algorithm
  – evicted pages are kept in a pool in case they are wanted again
• Dirty pages are scheduled to be written out.
51
Implementation Issues: Operating System vs. Hardware for Paging
1. Process creation
  – Create and initialize the page table.
  – Pre-fetch pages.
2. Context switch
  – Point the MMU to the page table for the new process.
  – Flush the TLB.
3. Memory reference
  – Map virtual address to physical address.
  – Determine whether there is a page fault.
  – If so, determine whose fault it is and resolve it.
4. Page replacement
  – Record page accesses and modifications.
  – Determine the page to replace.
5. Process termination
  – Release the page table and pages.
52
Instruction Backup
An instruction causing a page fault
53
Global Flag
• Some pages are used by every process in the system.
• Which?
• What special treatment would we like those pages to get during a context switch?
• How can the hardware know?
54
Locking Pages in Memory
• Virtual memory and I/O occasionally interact
• A process issues a call to read from a device into a buffer
  – while it waits for I/O, another process starts up
  – that process has a page fault
  – the buffer of the first process may be chosen to be paged out
• Need to specify that some pages are locked
  – exempted from being target pages
55
Backing Store
(a) Paging to static swap area (b) Backing up pages dynamically
56
Separation of Policy and Mechanism
Page fault handling with an external pager
57
Memory Management NT & Unix
• Exact details not as easy as in process scheduling.
• Attempt to keep a set of “free pages” using a “page daemon” (e.g. Linux: kswapd, NT: Working Set Manager).
• Essentially “demand paging”.
  – NT brings in “clusters”, typically 8 pages for code, 4 for data.
  – Unix’s swapper used to bring in all referenced pages when swapping a process back in. Now uses demand paging.
• OS present in every process’s virtual address space.
  – Reduces changes needed in TLB and cache.
  – No changes needed for system call.
• Load control: swap out entire processes “as needed”. This is currently something that one hopes to avoid, but the feature is still present.
• Some structure to describe where pages are currently, e.g. location in paging file. Unix: Core Map. NT: Page Frame Database.
• A portion (e.g., 64 KB) of the top/bottom of the address space is left unmapped.
58
MM in Unix
• Early versions relied entirely on swapping.
• Current versions are based on paging.
• Page daemon runs every ¼ sec.
• Refill free list when less than some value.
  – BSD: lotsfree = ¼ memory.
  – SysV: has some min/max. If less than min, fill until max. Goal is to reduce frequency of page daemon, thereby reducing thrashing.
• Originally used Global Clock.
• As memory sizes increased, BSD went to a 2-handed clock.
  – Front hand clears referenced bit
  – Back hand picks victim
  – Result: a referenced page is really recent.
• SysV instead keeps a frequency count of pages not referenced.
  – Only boots page out if count greater than some value
  – Seems they had the opposite problem of the BSD folk.
59
MM in Unix
• Load control.
  – If the page daemon has to work too often, then swap someone out.
  – Every few seconds look for someone to bring back.
    • Decision based on swapped-out time, size, niceness, whether it was sleeping…
    • Only bring back the “user structure” portion of PCB and the page tables. (I’m surprised page tables are removed. Must record swapped out page for process somewhere…)
• “Core Map” describes frames, free list, location to swap to.
60
MM in NT
• Demand paging of “clusters”.
  – (code: 8, data: 4). XP does some prepaging after application’s first run.
• Maintains a “working set” per process, not thread (remember scheduling is per thread)
  – WS is based on size, not time window.
  – Keeps a min/max per process
  – If the process is above its max and has a page fault, steal one of the process’s own pages. Otherwise add a page.
• ~1 sec.: Balance Set Manager checks for sufficient free pages. Calls Working Set Manager to free pages.
  – Sort processes by “desirability”. Large idle first. Foreground last.
  – If < min and “enough page faults” since last check, then leave alone.
  – Determine #pages to remove from process. Depends on need, size of WS relative to min/max.
  – Examine all pages in a process. Uses an unreferenced count, like AT&T Unix.
    • Pages with “high” unreferenced count removed.
  – Continue examining additional processes till sufficient pages free. More aggressive passes as needed.
61
MM in NT
• Load control.
  – Swapper runs every 4 sec.
  – If a thread has been asleep for 7 sec., mark the thread’s kernel stack “in transition”.
  – When all threads in a process are marked, swap the process out.
62
MM in NT
63
MM in FreeBSD
• 1 GB for kernel, unless you want lots of small processes using lots of kernel services; then configure for 2 GB.
• First few virtual pages reserved to catch bad pointers.
• Shared libraries placed by default just below default stack limit.
• Memory lists:
  – Wired
  – Active
  – Inactive. Dirty (min: 0%, max: 4.5%)
    • Pageout daemon moves pages from Inactive to “cache”
      – attempts to balance I/O load by not starting too many concurrent writes.
      – Runs as a kernel process. This way it has access to kernel data structures, etc., but can be scheduled to run as a process so it can sleep.
  – Cache. Clean. (min: 3%, max: 6%)
  – Free. (min: 0.7%, max: 3%)
    • One end is zeroed, the other is not.
    • Idle process tries to keep 75% of frames on Free list zeroed.
64
MM in FreeBSD
• Page coloring. Attempt to avoid cache conflicts.
  – Cache holds pages. A 1 MB cache with 4 KB pages has 256 cache pages. Each physical page on a 1 MB boundary maps to the same cache page. Thus each cache page represents as many physical pages as there are MBs of physical memory.
  – When allocating frames, color coding attempts to avoid such potential conflicts by making contiguous virtual pages get frames that map to contiguous cache pages.
• Pure demand paging when swapping back in.
  – Back in BSD4.3, count of resident pages was used. DIoF says might return.
• Page replacement
  – Least actively used algorithm.
  – Page has count of three when first brought in
  – Incremented each time reference bit found set, to a max of 64
  – Decremented each time reference bit found not set.
    • At zero, page moved from Active to Inactive list.
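The counting rule for the "least actively used" policy described above can be sketched as follows. The constants 3 and 64 come from the slide; the struct, the function names, and the assumption that each scan clears the reference bit after reading it are illustrative and not taken from FreeBSD source.

```cpp
#include <algorithm>

constexpr int kInitialCount = 3;  // count when a page is first brought in
constexpr int kMaxCount = 64;     // upper bound on the count

struct PageState {
    int act_count;    // activity count (illustrative field)
    bool referenced;  // reference bit, set by the hardware
    bool active;      // true = on the Active list, false = on the Inactive list
};

PageState page_in() {
    return PageState{kInitialCount, false, true};
}

// One pass of the scan over a single page on the Active list.
void scan(PageState& p) {
    if (p.referenced) {
        p.act_count = std::min(p.act_count + 1, kMaxCount);
        p.referenced = false;  // assumption: the scan clears the bit
    } else if (p.act_count > 0) {
        --p.act_count;
    }
    if (p.act_count == 0) {
        p.active = false;      // move from Active to Inactive
    }
}
```

Under these assumptions a freshly loaded page that is never touched again survives two scans and is moved to the Inactive list on the third.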
65
Segmentation (1)
• One-dimensional address space with growing tables • One table may bump into another
66
Segmentation (2)
Allows each table to grow or shrink, independently
67
Segmentation (3)
Comparison of paging and segmentation
68
Implementation of Pure Segmentation
69
Segmentation with Paging: MULTICS (1)
• Descriptor segment points to page tables • Segment descriptor – numbers are field lengths
70
Segmentation with Paging: MULTICS (2)
A 34-bit MULTICS virtual address
71
Segmentation with Paging: MULTICS (3)
Conversion of a 2-part MULTICS address into a main memory address
72
Segmentation with Paging: MULTICS (4)
• Simplified version of the MULTICS TLB • Existence of 2 page sizes makes actual TLB more complicated
73
Segmentation with Paging: Pentium (1)
A Pentium selector
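A sketch of how the 16 bits of a selector break down, following the x86 layout: a 13-bit descriptor table index, one bit choosing between the GDT and the LDT, and a 2-bit privilege level. The struct and function names are illustrative.

```cpp
#include <cstdint>

struct Selector {
    std::uint16_t index;  // bits 15..3: index into the descriptor table
    bool local;           // bit 2: false = GDT, true = LDT
    std::uint8_t rpl;     // bits 1..0: requested privilege level (0 to 3)
};

Selector decode_selector(std::uint16_t raw) {
    Selector s;
    s.index = static_cast<std::uint16_t>(raw >> 3);
    s.local = ((raw >> 2) & 1u) != 0;
    s.rpl = static_cast<std::uint8_t>(raw & 3u);
    return s;
}
```

For example, the selector value 0x002B decodes to index 5 in the GDT at privilege level 3. Because each descriptor is 8 bytes, clearing the low 3 bits of the selector gives the byte offset of the descriptor within its table.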
74
Segmentation with Paging: Pentium (2)
• Pentium code segment descriptor • Data segments differ slightly
75
Segmentation with Paging: Pentium (3)
Conversion of a (selector, offset) pair to a linear address
76
Segmentation with Paging: Pentium (4)
Mapping of a linear address onto a physical address
77
Segmentation with Paging: Pentium (5)
Protection on the Pentium