Tag: page fault

  • Global memory systems (part 1 of 2) notes

    Introduction

    Summary

    Lessons will be broken down into three modules: GMS (can we use peer memory for paging across a LAN?), DSM (can we make the cluster appear as a shared memory machine?), and DFS (can we use cluster memory for cooperative caching of files?).

    Context for Global Memory Systems

    Summary

    Key Words: working set

    The core idea behind GMS: when there’s a page fault, instead of going to disk, check the cluster memory first. Most importantly, there are no dirty pages (which would complicate the behavior): the disk always has copies of the pages.

    GMS Basics

    Lesson Outline

    GMS – how can we use peer memory for paging across a LAN?

    Summary

    With GMS, a physical node carves its physical memory into two areas: local (for its working set) and global (for servicing other nodes as part of being in the “community”). This global cache is analogous to a virtual memory manager in the sense that GMS does not need to worry about data coherence: that’s the job of the applications above it.

    Handling Page Faults (Case 1)

    Summary

    The most common case is a page fault on the local node. When the page fault occurs, the node gets the data from some other node’s global cache, increases its local memory footprint by one page, and correspondingly decreases its global memory footprint by one page.
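    The bookkeeping for this case can be sketched in a few lines. This is a minimal sketch with hypothetical names (the real GMS protocol involves network messages between nodes, which are elided here):

```python
# Hypothetical sketch of GMS Case 1 bookkeeping: a page fault on node P is
# serviced from some other node's global cache, so P grows its local part
# by one page at the expense of its own global part.

class Node:
    def __init__(self, local_pages, global_pages):
        self.local_pages = local_pages    # pages holding this node's working set
        self.global_pages = global_pages  # pages held for the "community"

    def fault_case_1(self):
        """Faulted page was found in a peer's global cache (Case 1)."""
        self.local_pages += 1   # faulted page joins the working set
        self.global_pages -= 1  # one community page is given up in exchange

p = Node(local_pages=90, global_pages=10)
p.fault_case_1()
print(p.local_pages, p.global_pages)  # 91 9
```

    Note that the node’s total page count stays constant; only the local/global boundary moves.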

    Handling Page Faults (Case 2)

    Summary

    In the situation in which a node has no global memory (i.e. its local memory is entirely consumed by its working set), the host handles a page fault by evicting a local page (i.e. finding a victim, usually via an LRU page replacement policy) and then requesting the faulted page from another node’s global memory (i.e. the community service). What’s important to call out here is that there is no change in the number of local pages and no change in the number of global pages for the node.
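    A rough sketch of this case, assuming the local part is tracked oldest-first (names and the OrderedDict representation are mine, not from the paper):

```python
# Hypothetical sketch of GMS Case 2: the faulting node has no global part
# left, so it evicts a local LRU victim to make room, then pulls the faulted
# page from a peer's global memory. Its local page count does not change.

from collections import OrderedDict

def fault_case_2(local_lru, faulted_page):
    """local_lru: OrderedDict mapping page id -> contents, oldest first."""
    victim, contents = local_lru.popitem(last=False)  # evict the LRU victim
    # (the victim would be sent to a peer's global memory over the network)
    local_lru[faulted_page] = "contents fetched from peer"
    return victim  # one page out, one page in: local count unchanged

lru = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
evicted = fault_case_2(lru, "d")
print(evicted, list(lru))  # a ['b', 'c', 'd']
```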

    Handling Page Faults (Case 3)

    Summary

    Key Words: LRU, eviction, page fault

    The key takeaway here is that if the globally oldest page lives in local memory, then we can free it up and expand the memory available for community service. Also, something this cleared up for me: all pages sitting in global memory are considered clean, since global memory is just a facility for the page-faulting process, so we can assume all global pages are clean. That’s not true for local pages, meaning that if we evict a page from local memory and it’s dirty, we must first write it out to disk.
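    The clean/dirty distinction above boils down to one branch at eviction time. A minimal sketch, with an illustrative dict-based page representation of my own:

```python
# Hypothetical sketch of the Case 3 observation: every global page is clean
# (disk already holds a copy), so only a dirty *local* victim needs a
# write-back before it can be discarded.

def evict(page, disk):
    """page: dict with keys 'id', 'data', 'is_local', 'dirty'."""
    if page["is_local"] and page["dirty"]:
        disk[page["id"]] = page["data"]  # flush the dirty local page first
        return "written_back"
    # clean pages (which includes all global pages) can simply be dropped
    return "dropped"

disk = {}
print(evict({"id": 7, "data": "x", "is_local": True, "dirty": True}, disk))   # written_back
print(evict({"id": 8, "data": "y", "is_local": False, "dirty": False}, disk)) # dropped
```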

    Handling Page Faults (Case 4)

    Summary

    This is a tricky scenario, the only scenario in which a page lives in the working set on two nodes simultaneously.

    Local and Global Boundary Quiz

    Summary

    A difficult quiz, actually. “On disk” is not applicable for node Q with page X, and the answer depends on what happens with the globally LRU (least recently used) page, i.e. whether it is stored in the local or the global part.

    Behavior of Algorithm

    Summary

    Basically, if a node is idle, its main responsibility will be accommodating peer pages. But once the node becomes busy, it will shift pages back into its working set.

  • How to go beyond physical memory ?

    How can the OS make use of a larger, slower device to transparently provide the illusion of a large virtual address space?

    Overview

    Why try to create a virtual address space that is larger than the amount of physical memory installed on the system? Well, the illusion of a larger virtual address space serves as a useful abstraction for application developers: they just request more memory and simply get it, thanks to the operating system.

    Swap Space
    (Arpaci-Dusseau and Arpaci-Dusseau, 2018) – Swap Space

    To this end, the OS creates a memory hierarchy. At the top sits the fastest cache (L1, L2, L3) and at the bottom sit hard disks (as much as 10,000× slower). And speed comes at a price: fast cache is expensive and slow storage is cheap.

    Essentially, the OS reserves portions of the disk as swap, the location in which memory is paged in (i.e. from disk to memory) and out of (from memory to disk). How does the OS determine which pages currently live in memory and which pages live on disk? By looking at the page table entry metadata. More specifically, if the present bit is set then the physical frame lives in memory. Otherwise, it lives on disk.

    In the scenario where the page lives on disk, the page fault handler must be invoked. With the present bit set to 0, the page fault handler will find the page in swap (the disk address is also stored in the metadata of the page table entry), page the memory into a physical frame (one that was free, or freed up by the page replacement policy), set the present bit to 1, and finally re-run the instruction that generated the page fault (i.e. the memory access that found the present bit set to 0).
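    The steps above can be sketched as pseudocode-style Python. This is a hedged sketch under simplifying assumptions (a dict-based page table, a tiny frame allocator, and no TLB), not the real control flow of any OS:

```python
# Sketch of the page-fault path: if the present bit is clear, look up the
# disk address stored in the PTE, swap the page into a free frame, set
# present = 1, and retry the faulting access.

def access(page_table, vpn, memory, swap):
    pte = page_table[vpn]
    if not pte["present"]:                       # page fault
        frame = allocate_frame(memory)           # free frame (or a victim from the replacement policy)
        memory[frame] = swap[pte["disk_addr"]]   # page the contents in from swap
        pte["pfn"], pte["present"] = frame, True
        return access(page_table, vpn, memory, swap)  # retry the instruction
    return memory[pte["pfn"]]

def allocate_frame(memory):
    # hypothetical: return the first free slot (a real OS may have to evict)
    return next(i for i, v in enumerate(memory) if v is None)

memory = [None, None]
swap = {100: "page contents"}
pt = {0: {"present": False, "disk_addr": 100, "pfn": None}}
print(access(pt, 0, memory, swap))  # page contents
```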

    21.1 Swap Space

    Summary

    The swap space is orders of magnitude larger than physical memory, allowing the OS to present a large amount of memory to processes and program developers.

    21.2 The Present Bit

    Summary

    The present bit is a flag that lets the translation lookaside buffer (TLB), whether hardware- or OS-software-based, know whether the physical frame itself lives in memory or whether it has been swapped out to disk. If the bit is set to 0, then we must invoke the page fault handler. Also, the TLB locates the page table via the page table register, but this raises another question for me: whose responsibility is it to set the page table register? Probably the OS’s, since the OS is responsible for the per-process page table. The OS is a beast!

    21.3 The Page Fault

    Summary

    The TLB checks the PTE and, if the PTE’s present bit is set to 0, the page fault handler runs. The page fault handler checks the PTE’s metadata for the location on disk (or whatever the underlying swap mechanism is) where the page lives. Then the page fault handler retries the faulting instruction, and the TLB (hardware or software) updates its cache. During this page-fault workflow, the process is put into a blocked state so other processes can run: concurrency for the win!

    21.4 What if Memory is Full

    Summary

    Need to invoke the page replacement policy (like an LRU)

    21.5 Page Fault Control Flow

    Summary

    When a page fault occurs, the hardware (TLB) and software (OS) must carry out a sequence of actions, depending on the scenario. Focusing on the OS: if a page fault occurs, the OS must find a free physical frame, populate that frame with the right contents, update the PTE (setting the present bit to 1 and recording the new page frame number), and finally retry the instruction.

    21.6 When replacements really occur

    Summary

    To ensure there are always free pages available, the swap (or page) daemon runs in the background, freeing up memory for running processes and for the OS.
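    One common way such a daemon is described is with low/high watermarks on free frames. A minimal sketch; the names, thresholds, and eviction order are illustrative assumptions, not from the text:

```python
# Hypothetical sketch of a background swap (page) daemon: it is woken when
# the number of free frames drops below a low watermark, and it evicts
# pages until a high watermark of free frames is restored.

LOW_WATERMARK, HIGH_WATERMARK = 2, 4

def page_daemon(free_frames, candidates):
    """candidates: pages ordered by the replacement policy (e.g. LRU first)."""
    evicted = []
    while free_frames < HIGH_WATERMARK and candidates:
        evicted.append(candidates.pop(0))  # evict (writing back if dirty)
        free_frames += 1
    return free_frames, evicted

free, out = page_daemon(1, ["p1", "p2", "p3", "p4", "p5"])
print(free, out)  # 4 ['p1', 'p2', 'p3']
```

    Batching evictions in the background like this keeps the fault path itself fast, since a free frame is usually already available.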