Category: Advanced Operating Systems

  • Memory segmentation

    This blog post contains notes I took on memory segmentation, one strategy for implementing virtual memory, from Chapter 16 of “Operating Systems: Three Easy Pieces”. In the next blog post, I’ll cover a different approach: paging.

    16.1 Segmentation: Generalized Base / Bounds

    Summary

    Segmentation generalizes base and bounds: instead of a single base/bounds pair for the whole address space, the hardware keeps a base and bounds pair for each logical segment, dividing the virtual address space into three distinct segments: code, stack, and heap. Because only the used portions of each segment need physical memory, this technique lets us accommodate processes with large (and sparse) address spaces.
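
    To make the per-segment base/bounds idea concrete, here is a minimal sketch (mine, not the book’s) of what a segment table and the translation step might look like if modeled in software; the segment names, base addresses, and sizes are made-up values.

    ```c
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Illustrative segment table: one base/bounds pair per segment.
     * The base addresses and sizes below are made up for the example. */
    typedef struct {
        uint32_t base;   /* starting physical address of the segment */
        uint32_t bounds; /* size of the segment in bytes             */
    } segment_t;

    enum { SEG_CODE = 0, SEG_HEAP = 1, SEG_STACK = 2 };

    static segment_t seg_table[3] = {
        [SEG_CODE]  = { .base = 32 * 1024, .bounds = 2 * 1024 },
        [SEG_HEAP]  = { .base = 34 * 1024, .bounds = 3 * 1024 },
        [SEG_STACK] = { .base = 28 * 1024, .bounds = 2 * 1024 },
    };

    /* Translate an offset within a segment into a physical address,
     * trapping (here: exiting) on an out-of-bounds access. */
    static uint32_t translate(int seg, uint32_t offset)
    {
        if (offset >= seg_table[seg].bounds) {
            fprintf(stderr, "segmentation fault: offset %u out of bounds\n", offset);
            exit(EXIT_FAILURE);
        }
        return seg_table[seg].base + offset;
    }

    int main(void)
    {
        printf("heap offset 100 -> physical address %u\n", translate(SEG_HEAP, 100));
        return 0;
    }
    ```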

    16.2 Which segment are we referring to?

    Summary

    How does the hardware know which segment a memory request refers to? Are we requesting memory from the stack, the heap, or the code segment? There are two approaches: explicit and implicit. In the explicit approach, we reserve the top 2 bits of the virtual address to select the segment: 00 might refer to code, 01 to the heap, and 10 to the stack, with the remaining bits forming the offset into that segment. In the implicit approach, the hardware infers the segment from how the address was generated; for example, if the address is formed off of the stack or base pointer, it is treated as a stack-segment access.
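
    A rough sketch of the explicit approach, assuming a 14-bit virtual address where the top 2 bits select the segment and the bottom 12 bits form the offset (the constants and example address are illustrative):

    ```c
    #include <stdint.h>
    #include <stdio.h>

    /* Assume a 14-bit virtual address: top 2 bits select the segment,
     * the remaining 12 bits are the offset into that segment. */
    #define SEG_SHIFT 12
    #define SEG_MASK  0x3000
    #define OFF_MASK  0x0FFF

    int main(void)
    {
        uint16_t vaddr   = 0x1068;                          /* example virtual address    */
        unsigned segment = (vaddr & SEG_MASK) >> SEG_SHIFT; /* 00 code, 01 heap, 10 stack */
        unsigned offset  = vaddr & OFF_MASK;

        printf("segment %u, offset 0x%03x\n", segment, offset); /* segment 1 (heap), offset 0x068 */
        return 0;
    }
    ```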

    16.3 What about the stack?

    Summary

    Why does the stack need special consideration? Because the stack grows backward, from higher memory addresses toward lower ones. Because of this, the hardware needs an additional flag per segment that tells it whether the segment grows in the positive or negative direction, and translation for a downward-growing segment effectively uses a negative offset.
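
    Here is a small sketch of how a downward-growing segment might be translated, assuming a 4KB maximum segment size; the worked numbers in the comments (28KB stack base, 3KB offset) are illustrative, not from the notes above.

    ```c
    #include <stdint.h>

    #define MAX_SEG_SIZE (4 * 1024)   /* assumed maximum segment size (12-bit offset) */

    /* For a segment that grows downward, the hardware effectively uses a negative
     * offset: e.g. with the stack based at 28KB and a 3KB offset into the segment,
     * 3KB - 4KB = -1KB, so the physical address is 28KB - 1KB = 27KB. */
    static uint32_t translate_downward(uint32_t offset, uint32_t base, uint32_t bounds)
    {
        int32_t neg_offset = (int32_t)offset - MAX_SEG_SIZE;
        if ((uint32_t)(-neg_offset) > bounds) {
            return (uint32_t)-1;  /* out of range: hardware would raise a segmentation fault */
        }
        return base + (uint32_t)neg_offset;  /* unsigned wrap-around yields base - |neg_offset| */
    }
    ```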

    16.4 Support for sharing

    Summary

    Systems designers realized that with segmentation, processes can share memory with each other. In particular, they could share code segments. To support this feature, the hardware adds protection bits to each segment’s metadata: the protection field tells us whether the segment can be read, written, or executed. For code sharing, the segment needs to be read-execute only, so one process cannot modify code that another process is running.
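
    As a small illustration, here is how the protection check might look if the segment entry grows a protection field; the flag names and struct layout are illustrative, not the actual hardware format.

    ```c
    #include <stdint.h>

    /* Illustrative protection bits stored in each segment's metadata. */
    #define SEG_READ  0x1
    #define SEG_WRITE 0x2
    #define SEG_EXEC  0x4

    typedef struct {
        uint32_t base;
        uint32_t bounds;
        uint32_t prot;  /* e.g. a shared code segment: SEG_READ | SEG_EXEC */
    } seg_entry_t;

    /* Returns 1 if the requested access is allowed; 0 means the hardware should trap. */
    static int access_ok(const seg_entry_t *seg, uint32_t requested)
    {
        return (seg->prot & requested) == requested;
    }
    ```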

    16.5 Fine-grained vs. Coarse-grained Segmentation

    Summary

    Breaking the address space into three segments (code, heap, stack) is considered coarse-grained. Some designers preferred a more flexible address space with a large number of smaller segments, known as fine-grained segmentation. The idea was that by learning which segments were in use, the system could utilize memory more effectively. I’m not sure whether they were right, though.

    16.6 OS Support

    Summary

    There are two issues the OS must handle with segmentation. The first arises on context switches: the OS must save and restore each process’s segment registers. The second is external fragmentation. With segmentation, a process no longer occupies one contiguous block of physical memory; instead, its address space is chopped into variable-sized pieces. So what happens if the stack or heap needs 20KB and, although 24KB is free, no contiguous 20KB region exists? One approach is compaction: the OS periodically rearranges existing segments into contiguous blocks, but this is extremely inefficient because it requires stopping processes and copying memory. Another approach is to maintain a free list and use one of a long list of allocation algorithms (e.g. first-fit, worst-fit). Regardless of the algorithm, we cannot fully avoid external fragmentation.
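
    To make the free-list idea concrete, here is a toy first-fit search (illustrative only, not a real allocator). Note that even when the total free space exceeds the request, a single large-enough block may not exist; that is exactly the external fragmentation problem.

    ```c
    #include <stddef.h>

    /* Toy free list: each node describes one free chunk of physical memory. */
    struct free_block {
        size_t size;
        struct free_block *next;
    };

    /* First-fit: return the first free block large enough for the request, or NULL.
     * With 24KB free but split into, say, three 8KB chunks, a 20KB request fails
     * even though enough memory is free in total. */
    static struct free_block *first_fit(struct free_block *head, size_t request)
    {
        for (struct free_block *b = head; b != NULL; b = b->next) {
            if (b->size >= request)
                return b;
        }
        return NULL;
    }
    ```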

    16.7 Summary

    Summary

    Segmentation helps us achieve a more effective virtualization of memory, but it does not come without costs and complexities. The primary issue is external fragmentation; the second is that segmentation is not flexible enough for a sparsely used address space: if a large but sparsely used heap lives in a single segment, the entire segment must still reside in memory, wasting space.

  • My classmate’s syllabus in Excel form

    One of my virtual classmates took the poorly formatted syllabus living on Canvas and converted the document into a beautifully organized Excel sheet (above).

    I appreciate him sharing this screenshot since it saves me at least 15 minutes of copying and pasting and wrestling with inevitable formatting issues. On top of that, I now have a better sense of what to expect over the next 14 weeks of the Fall 2020 term.

    Thanks Luis Batista!

    References

    1 – Piazza Post: https://piazza.com/class/kduvuv6oqv16v0?cid=6_f7

     

  • Snapshotting my understanding before tackling homework assignment #1

    Before tackling the homework assignment, I’m going to rate myself on the questions (below) on a scale of 1 (no freaking clue) to 5 (truly understand). The point of this exercise, which I just made up, is to capture my understanding (or lack thereof) at this moment in time. Why do this? Because I know I’ve learned a ton in graduate school (and at work) over the past couple of years, and it’s easy to lose sight of those little victories. I want to look back (in a few days or a few weeks) and say, “Hey! Look at all you’ve learned!”

    questions from #1 homework assignment (Advanced Operating Systems)

     

    1. Consider a processor that supports virtual memory. It has a virtually indexed physically tagged cache, TLB, and page table in memory. Explain what happens in such a processor from the time the CPU generates a virtual address to the point where the referenced memory contents are available to the processor.


  • Done with advanced operating systems refresher course

    I’ve finished watching the lectures and taking notes for the operating systems refresher course1, which covers operating system fundamentals: the virtual memory system (e.g. paging, virtually indexed physically tagged caches), file systems (e.g. FAT, EXT, inodes), multi-threaded programming (e.g. mutexes, condition variables, synchronization), and networking2. The notes can be found in the following blog posts:

    Some of the videos really saved my bacon. Without watching the multi-threading series, I think I would’ve struggled more than necessary to complete the pre-lab homework assignment, which has us students troubleshoot a misbehaving producer/consumer multi-threaded program.

    Reference

    1 – Operating Systems Refresher Course – https://classroom.udacity.com/courses/ud098

    2 – Oh yeah, in the title I included “sort of” because I’m skipping the networking module. Not because I don’t find the topics interesting (I do), but for two reasons: in the interest of time, and because I’m already fairly comfortable with those topics (I did work at Cisco as a network engineer, previously worked on Amazon’s Route53 DNS team, and I currently build networking/packet-processing devices).

  • Advanced Operating System Notes (File Systems)

    Although I’ve covered file systems in previous courses, including graduate introduction to operating systems, I really enjoyed learning about this topic in a bit more depth. The file systems module is packed with great material: it starts with a high-level introduction to file systems and then jumps into the deep end, unveiling what the operating system does behind the scenes.

    Main Takeaways

    • At a high level, a file system provides three key abstractions: file, filename, and directory tree.
    • A developer can interface with the file system in two ways: position based (i.e. a cursor) and memory based (i.e. blocks)
    • The C function strtol converts a string to a long
    • The mmap system call maps a file on the file system into a contiguous block of memory (the second way a developer can interface with a file system; a minimal sketch appears below the figure)
    • FAT (file allocation table) glues and chains disk blocks together. It is essentially a linked list (persisted as a busy-bit table) that is great for sequential reads (i.e. traversing the linked list) but awful for random access (i.e. to reach the middle of a file, you must traverse from the head)
    • The EXT Linux file system is based on inodes and improves random access using 12 direct pointers (the 13th pointer provides the first level of indirection, the 14th the second level, and the 15th the third level)
    • Learned about the buffer cache (i.e. write-through) and how a journal can help stabilize the system while maintaining decent performance
    Linked list and busy bit table
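
    Since mmap comes up in the takeaways above, here is a minimal sketch of that second, memory-based interface; the filename is a placeholder and error handling is kept short.

    ```c
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        /* "notes.txt" is just a placeholder filename for the example. */
        int fd = open("notes.txt", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0 || st.st_size == 0) { close(fd); return 1; }

        /* Map the whole file into the process's address space as read-only memory. */
        char *data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (data == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

        printf("first byte of the file: %c\n", data[0]);

        munmap(data, (size_t)st.st_size);
        close(fd);
        return 0;
    }
    ```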


  • Daily Review – 2020/08/20

    This post reviews yesterday, Wednesday, August 19th, 2020. Should I change the title to yesterday’s date or keep today’s date? Not sure, but I should stay consistent in my posts moving forward.

    Although I’m physically exhausted and more tired than usual from waking up early (around 04:30 to 05:00) every day to crank out studying or homework assignments, I feel emotionally and mentally better, less stressed out, knowing that I’m making forward progress on my assignments instead of dealing with an avalanche of work over the weekend.

    Photo of the day

    Breakfast on the hotel floor (Suncadia Hotel) with Elliott and Jess

    Word of the Day

    Despondent – in low spirits from loss of hope or courage.


  • Advanced Operating Systems – Day 3 Recap

    The write-up below consists of notes that I took while watching the Multi-Threaded Programming module from the Advanced Operating Systems refresher course1. I watched this third module before the second module (on File Systems, the next lecture series I’ll watch) because I started tackling the homework assignment that has us students debug a buggy multi-threaded program written with the POSIX threading library.

    Although I was able to fix many of the problems by reading the man pages of various system calls (e.g. pthread_join2, pthread_exit3), I wanted to back my troubleshooting with some theoretical knowledge, and I’m glad I did because I had forgotten that if the main thread (the first thread spawned when the OS starts the process) calls exit or returns from main, then all threads exit; this can be avoided by having the main thread call pthread_join instead, causing it to wait for a specific thread to terminate.
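
    A minimal sketch of that behavior (the worker function and its argument are made up): without the pthread_join, returning from main tears down the whole process, taking the worker with it.

    ```c
    #include <pthread.h>
    #include <stdio.h>

    /* Worker thread: does some trivial work and returns. */
    static void *worker(void *arg)
    {
        printf("worker %ld running\n", (long)arg);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        if (pthread_create(&tid, NULL, worker, (void *)1L) != 0) {
            fprintf(stderr, "pthread_create failed\n");
            return 1;
        }

        /* Returning from main (or calling exit) terminates every thread in the
         * process; joining makes main wait for the worker to finish first. */
        pthread_join(tid, NULL);
        return 0;
    }
    ```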

    Recap

    Shows that Thread shares the heap, globals, constants, and code – but contains its own stack

    Writing multi-threaded code is difficult and requires attention to detail. Nonetheless, multi-threading lets us parallelize work, and even on a single core it lets us overlap computation with I/O. Threads are cheaper to context switch than processes, since threads share the same memory space (although each thread manages its own stack, which must be cleaned up after the thread exits if it was created as a joinable thread; detached threads, on the other hand, are cleaned up automatically when they exit). When using threads, there are a few different design patterns: team, dispatcher, and pipeline; selecting the right design depends on the application requirements. Finally, when writing multi-threaded programs, the programmer must keep two distinct problems in mind: mutual exclusion and synchronization. For the program to be semantically correct, it must exhibit concurrency, freedom from deadlock, and mutual exclusion around shared resources/memory.
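
    To tie the two problems together, here is a minimal producer/consumer-style sketch (the names and structure are illustrative, not the assignment’s code) with a mutex for mutual exclusion and a condition variable for synchronization.

    ```c
    #include <pthread.h>

    static int items = 0;  /* shared state protected by the mutex */
    static pthread_mutex_t lock      = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

    void produce(void)
    {
        pthread_mutex_lock(&lock);
        items++;                          /* mutual exclusion around shared state      */
        pthread_cond_signal(&not_empty);  /* synchronization: wake a waiting consumer  */
        pthread_mutex_unlock(&lock);
    }

    void consume(void)
    {
        pthread_mutex_lock(&lock);
        while (items == 0)                        /* re-check the predicate after waking          */
            pthread_cond_wait(&not_empty, &lock); /* atomically releases and re-acquires the lock */
        items--;
        pthread_mutex_unlock(&lock);
    }
    ```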

     

     


  • Advanced Operating System Notes – Memory Systems (2/2)

    The notes below are from my first study session at 05:00 AM, before the rest of my morning routine: walking the dogs at the local park, blending a delicious fruit smoothie (see my blog post on the best vegan smoothie), then jumping behind the desk for hours of work. In the evening, I detoured from watching lectures and instead focused on the pre-lab assignment (write-up on this in another post).

    Page table – the underlying data structure that maps virtual pages to physical pages

     

    Summary

    The rest of the video lectures wrap up the memory system module, discussing how virtual memory and its layer of indirection offer a larger address space, protection between processes, and sharing between processes. For this to work, we need a page table: a data structure mapping virtual page numbers to physical frame numbers. Within this table are page table entries, each containing metadata that tracks things such as whether the page is valid, readable, executable, and so on. Finally, we can speed up the whole memory lookup by using a virtually indexed, physically tagged cache, allowing us to check the TLB and the cache in parallel.
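
    A toy sketch of the page table lookup described above, assuming 4KB pages and a simple linear page table; the PTE field layout here is illustrative.

    ```c
    #include <stdint.h>

    /* Toy page table entry: physical frame number plus a few metadata bits. */
    typedef struct {
        uint32_t pfn   : 20;  /* physical frame number */
        uint32_t valid : 1;
        uint32_t read  : 1;
        uint32_t write : 1;
        uint32_t exec  : 1;
    } pte_t;

    #define PAGE_SHIFT 12     /* 4KB pages */
    #define PAGE_MASK  0xFFF

    /* Linear page table lookup: the VPN indexes the table, the offset carries through.
     * Returns (uint64_t)-1 for an invalid page (hardware would raise a page fault). */
    static uint64_t lookup(const pte_t *page_table, uint64_t vaddr)
    {
        uint64_t vpn    = vaddr >> PAGE_SHIFT;
        uint64_t offset = vaddr & PAGE_MASK;
        pte_t pte       = page_table[vpn];

        if (!pte.valid)
            return (uint64_t)-1;
        return ((uint64_t)pte.pfn << PAGE_SHIFT) | offset;
    }
    ```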


  • Daily Review – 2020/08/18

    • Seems like my mind and body know to wake me early in the morning (around 05:00 AM), a small window in time in which I can cram in a lot of work before everyone else wakes up
    • Thinking about daily reviews rolling up into weekly reviews, into monthly reviews, etc
    • Out of the corner of my eye, I witnessed a spider dashing across the room, so I caught it and temporarily placed it under a glass jar; the spider is now sitting between the table and a birthday card (I’ll free it later)
    • Jess and Elliott accompanied me this morning, the two of them joining me on my daily dog walk (a couple photos below)
    • Caught Metric red-handed: she was about to eat her poop, so I slapped the kitchen window, freaking her out, at which point she ran inside the house
    • Watched half an episode of Community (I love Abed) while eating dinner with Jess
    • Elliott kept waking up during dinner, not giving Jess a single moment to relax or even finish dinner in peace (one day this will change, they say)
    • At work, focused on reviewing team members’ code and deploying some of my new features (to us-east-1)
    • Squeezed in two major study sessions: one from 05:30 AM until work (with a dog walk in between) and one in the evening (this session focused on stepping through multi-threaded code with the graphical debugger, switching threads and inspecting the instructions and stack)
    Tired puppy from chasing the ball
  • Advanced OS – Study Notes Reflection (from day 1)

    I divided studying into two sessions: one in the morning (around 04:30 AM) and one in the evening, after work and after my daughter has gone to bed. In the morning, I completed administrative tasks and watched lectures covering new material, and in the evening I refreshed my memory by taking the operating systems review course.

    OS Fundamentals Review: Quiz on calculating tag, index, and offset for cache entries
