Category: Algorithms

  • Tired like a zombie & Daily Review – Day ending in 2020/09/15

    Tired like a zombie & Daily Review – Day ending in 2020/09/15

    Today is going to be rough. I slept horribly, waking up multiple times throughout the night. Ultimately, I rolled off my tri-folding foam mattress (a temporary bed while my daughter and wife sleep on the mattress in a separate room so as not to wake me up: that parent life) at 03:45 AM this morning. Perhaps the gods above are giving me what I deserve since I had complained yesterday that I had “slept in” and as a result didn’t get a chance to put in any meaningful work before work. So now they are punishing me. Touché. Touché.

    Yesterday

    Fell into a black hole of intense focus while hunting down a bug that was crashing my program (for project 1 of advanced operating systems). Sometimes I make the right call and distance myself from a problem before falling into a vicious mental loop, and sometimes (like this scenario) I make the right call and keep at a problem until I ultimately solve it.

    Writing

    Best parts of my day

    • A poop explosion. Elliott’s nugget rolling out of her diaper and landing on the bathroom floor, making us two parents chuckle
    • Teaching Elliott how to shake her head and signal “no”. For the past few months, I’ve tried to teach her during our daily baths, but her body and head were not cooperating with her. When she tried to say no, she was unable to independently control her head movement: her entire body would turn left and right along with her head. But yesterday, she got it and now she loves saying “no” even though she really means yes. She’s so adorable.
    • Jess yelling out for me to rush over to the bathroom to help her … pick up Elliott’s poop that rolled out of her diaper: two nuggets falling out, one landing on the tile floor while the other smashed onto the floor mat
    • Catching up over the phone with my friend Brian Frankel. He’s launched a new startup called Cocoon, his company aiming to solve the problem of slimy mushrooms and slimy strawberries in the refrigerator. I had bought one of his new inventions mainly to support his vision (it’s always nice to support friends) but also because I’m a huge fan of refrigerator organization and cleanliness. Unfortunately, the box arrived broken (it looks like something heavy in the delivery truck landed on it, crushing it to pieces)

    Mental and Physical Health

    • At the top of the hour (not every hour, unfortunately) I hit the imaginary pause button, pulling my hands off the keyboard and stepping back from my standing desk to stretch my hamstrings and strengthen my hips with Asian squats

    Graduate School

    • Watched no lectures yesterday; all the time (about 2 hours) was dumped into polishing up the virtual CPU scheduler, adding a new “convergence” feature that skips the section of code that (re)pins the virtual CPUs to physical CPUs whenever the standard deviation of utilization falls below 5.0% (an arbitrary number that I chose); see the sketch after this list
    • Wrestled with my program crashing. The crash’s backtrace was unhelpful since it pointed at code that had nothing to do with the code that I had just added.
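
    For the curious, a minimal sketch of what that convergence check might look like, assuming a usage array of per-pCPU utilization percentages (the names and structure here are mine; only the 5.0 threshold comes from the actual scheduler):

    #include <math.h>
    #include <stdbool.h>

    /* Return true when the pCPU utilizations are balanced enough that
       the scheduler should skip (re)pinning this round. */
    bool converged(const double *usage, int npcpus) {
        double mean = 0.0, variance = 0.0;
        for (int i = 0; i < npcpus; i++)
            mean += usage[i];
        mean /= npcpus;
        for (int i = 0; i < npcpus; i++)
            variance += (usage[i] - mean) * (usage[i] - mean);
        return sqrt(variance / npcpus) < 5.0; /* the arbitrary 5.0% cutoff */
    }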

    Work

    • Attended a meeting led by my manager that reviewed the results of the “Tech Survey”. The survey is released by the company every year, asking engineers to answer candidly questions such as “Is your work sustainable?” or “Is your laptop’s hardware sufficient for your work?”. Basically, it allows the company to keep a pulse on the developer experience and is a good starting point for igniting necessary changes.
    • Stepped through code written by a principal engineer, C code that promised to bound a binary search by trading off 2 bytes to serve as an index.

    Family and Friends

    • Fed Elliott during my lunch. It was extremely tiring (but at the same time, enjoyable) chasing her around the kitchen floor, requiring me to constantly squat and crawl. She’s mobile now, working her way up to taking one to two steps.
    • Bathed Elliott last night and taught her how to touch her shoulders, a body part she’s been completely unaware of. Since she loves playing with my wedding ring, I let her play with it during our nighttime routine; last night I would take the ring, place it on her infant-sized shoulder, point to it, and guide her opposite hand to reach out and grab it.
    • Caught up with one of our friends over FaceTime. Always nice to see a familiar face during COVID-19, a very isolating experience that all of society will look back on in a few years, all of us wondering if it was all just a bad dream because that’s what it feels like

    Miscellaneous

    • Got my second hair cut this year (damn you COVID-19) at a nearby hair salon. I love the hair salon for multiple reasons. First, the entire trip — from the moment I leave my house to the moment I return — takes 30 minutes, a huge time saver. Second, the stylist actually listens to what I want (you’d be surprised how many other stylists get offended when I tell them what I want and what I don’t want) and gives me a no-nonsense hair cut. And third, the hair cut runs cheap: $20.00. What a steal! I don’t mind paying more for hair cuts (I was previously paying triple that price at Steele Barber).

    Today

    Mainly I will just try to survive the day and not collapse from exhaustion. If I can somehow afford the time, I’d like to nap this afternoon for half an hour. I will gladly trade off 30 minutes of lunch if that means not being zapped of all my energy for the remainder of the day.

    Writing

    • Publish “Synchronization” notes (part 2)
    • Publish daily review (this one that I’m writing right here)

    Mental and Physical Health

    Graduate School

    • Write up the README for the two parts of my project
    • Change the directory structure of project 1 so that the submission validator passes
    • Submit the project to Canvas
    • Watch 30 minutes worth of lectures (if my tired brain can handle it today)

    Work

    • Interview a candidate “on-site” later this afternoon
    • Continue troubleshooting unreproducible fuzzing failure (will try to tweak our fuzzer for a potential out of memory issue)

    Family

    • Pack a couple more boxes with Jess. Only a couple more weeks and we move our family unit into a new home in Renton.
  • Synchronization (Notes) – Part 1

    Synchronization (Notes) – Part 1

    I broke down the synchronization topic into two parts, and this post will cover material up to and including the array-based queuing lock. I’ll follow up with part 2 tomorrow, which will include the linked-list-based queuing lock.

    There are a couple of different types of synchronization primitives: mutual exclusion and barrier synchronization. Mutual exclusion ensures that only one thread (or process) at a time can write to a shared data structure, whereas barrier synchronization ensures that all threads (or processes) reach a certain point in the code before continuing. The rest of this summary focuses on the different ways we as operating system designers can implement mutual exclusion and offer it as a primitive to system programmers.

    To implement a mutual exclusion lock, three operations need to be bundled together: reading, checking, and writing. These three operations must happen atomically, all together and at once. To this end, we need to leverage the underlying hardware’s read-modify-write instructions such as test_and_set, fetch_and_increment, or (generically) fetch_and_phi.
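
    As an illustration (my own, not the lecture’s code), C11’s atomic_flag exposes exactly this kind of bundled read-check-write, which is enough to build the simplest possible lock:

    #include <stdatomic.h>

    atomic_flag lock = ATOMIC_FLAG_INIT;

    void acquire(void) {
        /* Atomically read the old value and set the flag in one step. */
        while (atomic_flag_test_and_set(&lock))
            ; /* spin until the current holder clears the flag */
    }

    void release(void) {
        atomic_flag_clear(&lock);
    }

    This is, in fact, the naive spin lock described below.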

    Whatever way we implement the lock, we must evaluate our implementation with three performance characteristics:

    • Latency (how long to acquire the lock)
    • Waiting time (how long a single thread must wait before acquiring the lock)
    • Contention (how long it takes to acquire the lock when multiple threads compete)

    In this section, I’ll list each of the implementations, ordered from most basic to most advanced:

    • Naive Spin Lock – a simple while loop that calls the test_and_set atomic operation. Downside? Lots of bus traffic for cache invalidation or cache updating. What can we do instead?
    • Caching Spin Lock – almost identical to the previous implementation, but instead of spinning on test_and_set, we spin on a read, followed by a check with test_and_set. This reduces the noisy bus traffic of test_and_set (which, again, bypasses the cache and writes to memory, every time)
    • Spinlock with delay – adds a delay to each test_and_set. This reduces spurious checks and ensures that not all threads perform test_and_set at the same time
    • Ticket Lock – each new request that arrives obtains a ticket. This methodology is analogous to how a deli hands out tickets to its customers; it is fair but still signals all other threads to wake up
    • Array based queuing lock – an array that is N in size (where N is the number of processors). Each entry in the array can be in one of two states: has-lock and must-wait. Fair and efficient, but trades off space: each lock gets its own array with N entries. This can be expensive for machines with thousands of processors and potentially wasteful if not all of them contend for the lock

    Lesson Summary

    Summary

    There are two types of locks: exclusive and shared. An exclusive lock guarantees that one and only one thread can access/write data at a location. A shared lock allows multiple readers, while guaranteeing no other thread is writing at that time.

    Synchronization Primitives

    Summary

    Another type of synchronization primitive (besides mutual exclusion) is a barrier. A barrier basically ensures that all threads reach a certain execution point, and only then can the threads advance. It’s similar to real life: when you get to a restaurant, you cannot be seated until your whole party arrives. That’s an example of a barrier.

    Programmer’s Intent Quiz

    Summary

    We can write code that enforces a barrier with a simple while loop (but it’s inefficient) using atomic reads and writes
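
    A sketch of that simple (and inefficient) while-loop barrier using a C11 atomic counter; this is my own illustration, and this one-shot version can’t be reused without a reset:

    #include <stdatomic.h>

    atomic_int arrived = 0;

    void barrier_wait(int nthreads) {
        atomic_fetch_add(&arrived, 1); /* atomically announce arrival */
        while (atomic_load(&arrived) < nthreads)
            ; /* every thread burns CPU cycles until the last one arrives */
    }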

    Programmer’s Intent Explanation

    Summary

    Programmer’s Intent Quiz

    Because of the atomic reads and writes offered by the instruction set, we can easily use a flag to coordinate between processes. But are these instructions sufficient for creating a mutual exclusion lock primitive?

    Atomic Operations

    Summary

    The instruction set ensures that each read and write operation is atomic. But if we need to implement a lock, there are three operations (read, check, write) that need to be bundled together into a single instruction by the hardware. This bundling can be done with test_and_set, fetch_and_increment, or (generically) fetch_and_phi

    Scalability Issues with Synchronization

    Summary

    What are the three types of performance issues with implementing a lock? Which two of the three are under the control of the OS designer? The three scalability issues are as follows: latency (time to acquire the lock), waiting time (how long a thread waits until it acquires the lock), and contention (how long it takes, when all threads are competing to acquire the lock, for the lock to be given to a single thread). The first and third (i.e. latency and contention) are under the control of the OS designer. The second is not: waiting time is largely driven by the application itself because the critical section might be very long.

    Naive Spinlock (Spin on T+S)

    Summary

    Why is the spin lock considered naive? How does it work? Although the professor did not explicitly call out why the spin lock is naive, I can guess why (and my guess will be confirmed in the next slide, during the quiz). Basically, each thread is eating up CPU cycles by performing test_and_set. Why not just signal them instead? That would be more efficient. But, as usual, nothing is free, so there must be a trade off in terms of complexity or cost. Returning back to how the naive lock works, it’s basically a while loop around test_and_set (semantically guaranteeing only a single thread will receive the lock).

    Problems with Naive Spin Lock

    Problems with naive spinlock

    Summary

    The spin lock is naive for three reasons: too much contention (when the lock is released, all threads, maybe thousands, will jump at the opportunity); it does not exploit the cache (test_and_set by nature must bypass the cache and go straight to memory); and it disrupts useful work (only one process can make forward progress)

    Caching Spinlock

    Caching Spin lock (spin on read)

    Summary

    To take advantage of the cache, processors first spin on while(L == locked), reading the cached copy of the lock. Once the lock is released, all processors then proceed with test_and_set. Can we do better than this?
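
    A sketch of spin-on-read with C11 atomics (again my illustration): wait on the cached read, and only attempt the expensive atomic exchange when the lock looks free:

    #include <stdatomic.h>

    atomic_int L = 0; /* 0 = unlocked, 1 = locked */

    void acquire(void) {
        for (;;) {
            while (atomic_load(&L) == 1)
                ; /* spin on the cached copy; no bus traffic until invalidation */
            if (atomic_exchange(&L, 1) == 0)
                return; /* we won the test_and_set race */
        }
    }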

    Spinlocks with Delay

    Summary

    Any sort of delay will improve the performance of a spin lock when compared to the naive solution
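
    For example, a sketch with exponential backoff (the initial delay and the growth factor here are arbitrary choices of mine):

    #include <stdatomic.h>

    atomic_int L = 0; /* 0 = unlocked, 1 = locked */

    void acquire(void) {
        int delay = 1;
        while (atomic_exchange(&L, 1) == 1) {
            for (volatile int i = 0; i < delay; i++)
                ; /* back off so threads don't all retry at the same time */
            delay *= 2; /* wait longer after each failed attempt */
        }
    }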

    Ticket Lock

    Summary

    A fair algorithm that uses a deli or restaurant analogy: people who come before will get the lock before people who come after. This is great for fairness, but performance is still lacking in the sense that when a thread releases the lock, an update is sent across the entire bus. I wonder: can this even be avoided, or is that just the reality of the situation?
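
    A ticket lock sketch built on fetch_and_increment (my own illustration, not the lecture’s code):

    #include <stdatomic.h>

    atomic_int next_ticket = 0; /* the deli's ticket dispenser */
    atomic_int now_serving = 0; /* the "now serving" sign */

    void acquire(void) {
        int my_ticket = atomic_fetch_add(&next_ticket, 1); /* fetch_and_increment */
        while (atomic_load(&now_serving) != my_ticket)
            ; /* wait my turn, in arrival order */
    }

    void release(void) {
        atomic_fetch_add(&now_serving, 1); /* noisy: updates every waiter's cache */
    }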

    Spinlock Summary

    Array-based queuing lock

    Summary

    A recap of the options so far: 1) spin on read (no fairness), 2) test_and_set with delay (no fairness), 3) ticket lock (fair but noisy). How can we do better? How can we signal one and only one thread to wake up and attempt to grab the lock? Apparently, I’ll learn this shortly, with queueing locks. This lesson reminds me of a conversation I had with my friend Martin around avoiding a busy while() loop for lock contention, a conversation we had maybe a year or so ago (maybe even two years ago)

    Array Based Queuing Lock

    Summary

    Create an array that is N in size (where N is the number of processors). Each entry in the array can be in one of two states: has-lock and must-wait. Only one entry can ever be in the has-lock state. I’m proud that I can understand the circular queue (just by looking at the array) with my understanding of the mod operator. But even so, I’m having a hard time understanding how we’ll use this data structure in real life.

    Array Based Queuing Lock (continued)

    Summary

    Lock acquisition looks like: fetch_and_increment (on queue_last) followed by a while(flags[my_place mod N] == must_wait). Still not sure how this avoids generating the cache bus traffic described earlier…

    Array-based queue lock

    Summary

    Array-based queuing lock (mod)

    Using mod, we basically update the “next” entry in the array and set it to has-lock
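
    Putting the last few slides together, here’s my sketch of the whole array-based (Anderson) queuing lock in C11. It also answers my earlier question about bus traffic: each thread spins on its own slot, so a release writes to exactly one slot and invalidates only that one waiter’s cache line:

    #include <stdatomic.h>

    #define N 8 /* number of processors */

    /* flags[i]: 1 = has-lock, 0 = must-wait; slot 0 starts with the lock */
    atomic_int flags[N] = { 1 };
    atomic_int queue_last = 0;

    int acquire(void) {
        int my_place = atomic_fetch_add(&queue_last, 1); /* fetch_and_increment */
        while (atomic_load(&flags[my_place % N]) == 0)
            ; /* spin on my own slot only */
        return my_place;
    }

    void release(int my_place) {
        atomic_store(&flags[my_place % N], 0);       /* my slot: must-wait again */
        atomic_store(&flags[(my_place + 1) % N], 1); /* hand lock to next waiter */
    }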

  • Honoring my body’s internal alarm clock & Daily Review – Day ending in 2020/09/14

    Honoring my body’s internal alarm clock & Daily Review – Day ending in 2020/09/14

    This morning my body woke me up later than usual. After a few blinks, I squeezed the corner of my Casio G-Shock watch, the green background lighting up and shining the time: 05:55 AM. Ugh. About an hour later than I wanted to wake up.

    On one hand, I’m bummed because I won’t be able to squeeze in as much uninterrupted time before work, but on the other hand, my body and brain probably needed the extra sleep. Otherwise, why “sleep in”? I try to honor and listen to my body’s signals, another reason why over the past 5 years I’ve stopped setting an alarm clock and instead permitted my body to wake up naturally, whenever it is ready.

    Oh well. Let’s get cracking.

    Yesterday

    Writing

    Best parts of my day

    • My co-worker unintentionally making me chuckle. During my team’s daily stand-up meeting yesterday, I had asked my co-workers how they were coping with all the smoke blanketing the Seattle skies. And my co-worker’s response caught me by surprise. She said that back home in India, there’s always a thick cloud of smoke resting above their heads, and really, the wildfire smoke in Seattle reminds her of home.

    Mental and Physical Health

    • Yesterday I resumed my ritual of switching back and forth between sitting and standing (thank you Jarvis standing desk). At the top of every hour, I try to hit the imaginary pause button and stretch my hamstrings by reaching for the ground with my fingertips. Not much exercise, but every little bit counts.

    Graduate School

    • Started watching the barrier synchronization lectures. It’s so amazing how deep I am diving into computer science. I really do enjoy learning how an OS designer (maybe me someday) implements primitives such as mutual exclusion and barrier synchronization.
    • Started updating the virtual CPU scheduler code to support the notion of “convergence”. The idea is that if the underlying physical CPU utilizations deviate only ever so slightly (the exact percentage is yet to be determined), then the scheduler should leave them as is and not try to re-balance the workload.

    Work

    • Scheduled an ad-hoc meeting with a principal engineer that took place the same day, the two of us troubleshooting a fuzzing failure, the worst kind of failure: one that cannot be reproduced.
    • Read a design proposal written up by a different principal engineer who evaluated different types of data structures. Realized that the data structure that I had envisioned us using will not meet performance requirements for IPv6 traffic

    Family

    • Took the dogs on a very short walk of about 15 minutes at the local park. I skipped walking them over the weekend because of the thick smoke, but the two puppies were getting a bit restless, so I compromised, sacrificing a (hopefully) very tiny bit of our long-term health in order to get them the physical stimulation they needed for the rest of the day

    Today

    Quote of the day

    Have you ever felt out of place in your place?

    I pulled that quote from a rap line from the song Breathing (Part 2) by Hamzaa featuring Wretch 32 & Ghetts.

    Most of my life, I always felt “out of place”. One instance was during my teenage years, being one of the only (or perhaps the only) Vietnamese kids attending Calabasas High School, a school where almost everyone is white and where everyone thinks they are white.


    Writing

    • Publish “Synchronization” notes
    • Publish daily review (this one that I’m writing right here)

    Mental and Physical Health

    • Won’t be running in this smoke, so instead let’s spend 2 minutes (really, that’s all) throughout the day to get the heart pumping. Maybe some jumping jacks or some push ups. And of course, stretching the hamstrings and hip rotators while working behind my standing desk

    Graduate School

    • Fix memory leak with my recent additions to the CPU scheduler
    • Start writing up documentation that will accompany the submission for project 1
    • Watch 15-20 minutes worth of lectures from the barrier synchronization series

    Work

    • Meetings and meetings and meetings (not a really fun day): sprint planning, Asians@ planning and development meetings, an interview debrief

    Miscellaneous

    • Hair cut at 4:00 PM. This will be my second hair cut this year, the last one taking place in June. Obviously I’m trying to minimize unnecessary interactions with other people, but damn, I look like a shaggy dog with my heavy and coarse hair weighing me down.
  • Memory Virtualization (notes)

    Memory Virtualization (notes)

    The operating system maintains a per-process data structure called a page table, creating a protection domain and hardware address space: another virtualization technique. The page table maps virtual page numbers to physical frame numbers (PFNs).

    The same virtualization technique is adopted by hypervisors (e.g. VMware). They too have the responsibility of mapping the guest operating system’s “physical address” space (from the perspective of the guest VM) to the underlying “machine page numbers”. These physical-address-to-machine-page-number mappings are maintained by the guest operating system in a paravirtualized system, and by the hypervisor in a fully virtualized environment.

    In both fully virtualized and paravirtualized environments, memory mapping is done efficiently. In the former, the guest VM traps into the hypervisor, which performs the mapping translation. But in a paravirtualized environment, the hypervisor leans on the guest virtual machine much more, via a balloon driver. In a nutshell, the balloon driver pushes the responsibility of handling memory pressure from the hypervisor to the guest machine. The balloon driver, installed on the guest operating system, inflates when the hypervisor wants the guest OS to free memory. The beauty of this approach is that if the guest OS is not memory constrained, no swapping occurs: the guest VM just removes an entry from its free list. When the hypervisor wants to increase the memory for a VM, it signals (through the same private communication channel) the balloon driver to deflate, signaling the guest OS to page in and increase the size of its memory footprint.

    Finally, the hypervisor employs another memory technique known as oblivious page sharing. With this technique, the hypervisor runs a hashing algorithm in a background process, creating a hash of the memory contents of each page. If two or more virtual machines contain the same page, then the hypervisor just creates a “copy-on-write” page, reducing the memory footprint.
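
    A sketch of that flow in C. The hint table, hash function, and mark_copy_on_write() are hypothetical stand-ins I made up for the hypervisor’s internals, not VMware APIs:

    #include <stdint.h>
    #include <string.h>

    #define PAGE_SIZE 4096
    #define TABLE_SIZE 1024

    /* Toy hint table: hash bucket -> first page seen with that hash. */
    static const uint8_t *hint_table[TABLE_SIZE];

    /* Hypothetical hash (FNV-1a); a real hypervisor would pick its own. */
    static uint64_t hash_page(const uint8_t *page) {
        uint64_t h = 0xcbf29ce484222325ULL;
        for (size_t i = 0; i < PAGE_SIZE; i++)
            h = (h ^ page[i]) * 0x100000001b3ULL;
        return h;
    }

    /* Hypothetical: point both mappings at one frame, marked read-only. */
    void mark_copy_on_write(const uint8_t *dup, const uint8_t *master);

    void try_share(const uint8_t *page) {
        uint64_t h = hash_page(page);
        const uint8_t *match = hint_table[h % TABLE_SIZE];
        if (match && memcmp(page, match, PAGE_SIZE) == 0) {
            /* The hash was only a hint; the full compare confirms it. */
            mark_copy_on_write(page, match);
        } else {
            hint_table[h % TABLE_SIZE] = page; /* remember as a future hint */
        }
    }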

    Memory Hierarchy

    Memory Hierarchy

    Summary

    The main thorny issue of virtual memory is translating virtual addresses to physical addresses; caches are physically tagged, so they are not a major source of issues.

    Memory Subsystem Recall

    Summary

    A process has its own protection domain and hardware address space. This separation is made possible thanks to virtualization. To support memory virtualization, we maintain a per-process data structure called a page table, which maps virtual addresses to physical frame numbers (remember: page table entries contain the PFN)

    Memory Management and Hypervisor

    Summary

    The hypervisor has no insight into the page tables of the processes running inside the instances. That is, the operating systems (Windows and Linux in the lecture’s figure) serve as the boundary, the protection domain.

    Memory Manager Zoomed Out

    Summary

    Although each virtual machine’s operating system thinks it is allocating contiguous memory, it is not: the hypervisor must partition the physical address space among multiple instances, and not all of the operating system instances can start at the same physical memory address (i.e. 0x00).

    Zooming Back in

    Summary

    Hypervisor maintains a shadow page table that maps the physical page numbers (PPN) to the underlying hardware machine page numbers (MPN)

    Who keeps the PPN MPN Mapping

    Summary

    For full virtualization, the PPN->MPN mapping lives in the hypervisor. But with paravirtualization, it might make sense for it to live in the guest operating system.

    Shadow page table

    Zooming back in: shadow page table

    Summary

    Similar to a normal operating system, the hypervisor holds the address of the page table (stored in some register) that lives in memory. Not much different from a normal OS, I would say

    Efficient Mapping (Full Virtualization)

    Summary

    In a fully virtualized guest operating system, the hypervisor will trap calls and perform the translation from virtual page number (VPN) to machine page number (MPN), bypassing the guest OS entirely

    Efficient Mapping (Para virtualization)

    Summary

    Dynamically Increasing Memory

    Summary

    What can we do when there’s little to no physical/machine memory left and a guest OS needs more? Should we steal from one guest VM to give to another? Or can we somehow get a guest VM to voluntarily free up some of its pages?

    Ballooning

    VMWare’s technique: Ballooning

    Summary

    The hypervisor installs a driver in the guest operating system, the driver serving as a private channel that only the hypervisor can access. The hypervisor will inflate the balloon driver, signaling the guest OS to page out to disk. And it can also deflate the balloon, signaling the guest OS to page in.

    Sharing memory across virtual machines

    Summary

    One way to achieve memory sharing is to have the guest operating system cooperate with the hypervisor. The guest virtual machine signals to the hypervisor that a page residing in the guest OS should be marked as copy-on-write, allowing other guest virtual machines to share the same page. But when the page is written to, the hypervisor must copy that page. What are the trade offs?

    VM Oblivious Page Sharing

    VMWare and Oblivious Sharing

    Summary

    VMware ESX maintains a data structure that maps content hashes to pages, allowing the hypervisor to get a “hint” of whether or not a page can be shared. If the content hash matches, then the hypervisor performs a full content comparison (more on that in the next slide)

    Successful Match

    Summary

    The hypervisor runs this process in the background, since it is a fairly intensive operation, and will update guest VM pages as “copy-on-write”, only running the process when the hypervisor is lightly loaded

    Memory Allocation Policies

    Summary

    So far, the discussion has focused on mechanisms, not policies. Given that memory is such a precious resource, a fair policy might be the dynamic idle-adjusted shares approach. A certain percentage (50% in the case of ESX) of memory will be taken away if it’s idle, a tax if you will.
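
    If I recall the ESX paper (“Memory Resource Management in VMware ESX Server”) correctly, the idea is to charge idle pages more than active ones when computing a VM’s shares-per-page ratio. A sketch of the formula, where S is the VM’s shares, P its allocated pages, f the fraction of pages actively used, and τ the tax rate:

    ρ = S / (P · (f + k · (1 − f))), with k = 1 / (1 − τ)

    With the 50% tax mentioned above (τ = 0.5), k = 2: an idle page counts double against a VM when deciding whose memory to reclaim.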

  • Papers to read for designing and writing up the C memory coordinator

    Papers to read for designing and writing up the C memory coordinator

    Below are some memory management research papers that my classmate shared with the rest of us on Piazza [1]. Quickly scanning over the papers, I think the material will point me in the right direction and paint a clearer picture of how I might want to approach writing my memory coordinator. I do wonder if I should just experiment on my own for a little while and take a similar approach as I did for part 1 of the project, when I wrote a (naive) round robin scheduler for CPU scheduling. We’ll see.

    Recommended readings

    References

    [1] https://piazza.com/class/kduvuv6oqv16v0?cid=221


  • A snapshot of my understanding before tackling the memory coordinator

    A snapshot of my understanding before tackling the memory coordinator

    Now that I’ve finished writing the vCPU scheduler for project 1, I’m moving on to the second part of the project, called the “memory coordinator”. Here I’m writing a blog post similar to the snapshot of my understanding for project 1, the motivation being that I take for granted what I learn throughout graduate school and rarely celebrate these little victories.

    This post will focus on my unknowns and my knowledge gaps as they relate to the memory coordinator that we have to write in C using libvirt. According to the project requirements, the memory coordinator should:

    dynamically change the memory of each virtual machine, which will indirectly trigger balloon driver.
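
    To ground myself, here’s a minimal sketch of what one polling pass might look like. virConnectListAllDomains, virDomainMemoryStats, and virDomainSetMemory are real libvirt calls, but the policy, the thresholds, and the 50 MB step are placeholders I made up before actually designing anything:

    #include <stdlib.h>
    #include <libvirt/libvirt.h>

    void coordinate(virConnectPtr conn) {
        virDomainPtr *domains;
        int n = virConnectListAllDomains(conn, &domains,
                                         VIR_CONNECT_LIST_DOMAINS_ACTIVE);
        for (int i = 0; i < n; i++) {
            virDomainMemoryStatStruct stats[VIR_DOMAIN_MEMORY_STAT_NR];
            int m = virDomainMemoryStats(domains[i], stats,
                                         VIR_DOMAIN_MEMORY_STAT_NR, 0);
            unsigned long long unused = 0, balloon = 0;
            for (int j = 0; j < m; j++) {
                if (stats[j].tag == VIR_DOMAIN_MEMORY_STAT_UNUSED)
                    unused = stats[j].val;  /* KiB the guest isn't using */
                if (stats[j].tag == VIR_DOMAIN_MEMORY_STAT_ACTUAL_BALLOON)
                    balloon = stats[j].val; /* KiB currently given to the guest */
            }
            /* Placeholder policy: grow a starved guest, shrink a flush one.
               Setting memory indirectly triggers the balloon driver. */
            if (unused < 100 * 1024)
                virDomainSetMemory(domains[i], balloon + 50 * 1024);
            else if (unused > 300 * 1024)
                virDomainSetMemory(domains[i], balloon - 50 * 1024);
            virDomainFree(domains[i]);
        }
        free(domains);
    }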

    Questions I have

    As I chip away at the project, more questions will inevitably pop up, and when they do, I’ll capture them (in a separate blog post). So here’s a baseline snapshot of my open questions about the memory coordinator:

    • What are the relevant memory statistics that should be collected?
    • Will the resident set size (RSS) be relevant to the decision making algorithm? Or is it irrelevant?
    • What are the upper and lower bounds of memory consumption that trigger the balloon driver to page memory out or page memory in?
    • Will the balloon driver only trigger for virtual machines that are memory constrained?
    • Does the hypervisor’s memory footprint matter (I’m guessing yes, but to what extent)?


  • A naive round robin CPU scheduler

    A naive round robin CPU scheduler

    A couple days ago, I spent maybe an hour whipping together a very naive CPU scheduler for project 1 in advanced operating systems. This naive scheduler pins each of the virtual CPUs in a round robin fashion, not taking utilization (or any other factor) into consideration. For example, say we have four virtual CPUs and two physical CPUs; the scheduler will assign virtual CPU #0 to physical CPU #0, virtual CPU #1 to physical CPU #1, virtual CPU #2 to physical CPU #0, and virtual CPU #3 to physical CPU #1.

    This naive scheduler is far from fancy — really, the code just performs a mod operation to wrap around the length of the physical CPU list (avoiding an index error) and carries out a left bit shift to populate the bit map — but it performs surprisingly well based on the monitoring results (below) that measure the utilization of each physical CPU.
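
    Here’s a minimal sketch of that mod-plus-bit-shift pinning using libvirt’s virDomainPinVcpu (a real libvirt call). It assumes one vCPU per domain and at most 8 pCPUs so the cpumap fits in a single byte; the function name and signature are mine:

    #include <libvirt/libvirt.h>

    void round_robin_pin(virDomainPtr *domains, int ndomains, int npcpus) {
        for (int i = 0; i < ndomains; i++) {
            /* Wrap around the pCPU list with mod, then build the one-hot
               bitmap with a left shift, exactly as described above. */
            unsigned char cpumap = 1 << (i % npcpus);
            /* Each VM here has a single vCPU, so pin vCPU index 0. */
            virDomainPinVcpu(domains[i], 0, &cpumap, 1);
        }
    }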

    Of course, my final scheduler will pin virtual CPUs to physical CPUs more intelligently, taking the actual workload (i.e. time in nanoseconds) of the virtual CPUs into consideration. But as always, I wanted to avoid premature optimization and not jump straight to some fancy algorithm published in a research paper, and I’m glad I started with a primitive scheduler that, for the most part, evenly distributes the work, apart from the fifth test (which generates uneven workloads), the only test in which the naive scheduler creates a more uneven workload.

    With this basic prototype in place, I should be able to come up with a more sophisticated algorithm that takes the virtual CPU utilization into consideration.

    Test Case 1

    In this test case, you will run 8 virtual machines that all start pinned to pCPU0. The vCPU of each VM will process the same workload.

    Expected Outcome

    Each pCPU will exhibit an equal balance of vCPUs given the assigned workloads (e.g., if there are 4 pCPUs and 8 vCPUs, then there would be 2 vCPUs per pCPU).

    --------------------------------------------------
    0 - usage: 103.0 | mapping ['aos_vm1', 'aos_vm8', 'aos_vm4', 'aos_vm6', 'aos_vm5', 'aos_vm2', 'aos_vm3', 'aos_vm7']
    1 - usage: 0.0 | mapping []
    2 - usage: 0.0 | mapping []
    3 - usage: 0.0 | mapping []
    --------------------------------------------------
    0 - usage: 99.0 | mapping ['aos_vm1', 'aos_vm8', 'aos_vm4', 'aos_vm6', 'aos_vm5', 'aos_vm2', 'aos_vm3', 'aos_vm7']
    1 - usage: 0.0 | mapping []
    2 - usage: 0.0 | mapping []
    3 - usage: 0.0 | mapping []
    --------------------------------------------------
    0 - usage: 49.0 | mapping ['aos_vm1', 'aos_vm5']
    1 - usage: 47.0 | mapping ['aos_vm8', 'aos_vm2']
    2 - usage: 50.0 | mapping ['aos_vm4', 'aos_vm3']
    3 - usage: 49.0 | mapping ['aos_vm6', 'aos_vm7']
    --------------------------------------------------
    0 - usage: 60.0 | mapping ['aos_vm1', 'aos_vm5']
    1 - usage: 65.0 | mapping ['aos_vm8', 'aos_vm2']
    2 - usage: 61.0 | mapping ['aos_vm4', 'aos_vm3']
    3 - usage: 61.0 | mapping ['aos_vm6', 'aos_vm7']
    --------------------------------------------------

    Test 2

    In this test case, you will run 8 virtual machines that start with 4 vCPUs pinned to pCPU0 and the other 4 vCPUs pinned to pCPU3. The vCPU of each VM will process the same workload.

    Expected Outcome

    Each pCPU will exhibit an equal balance of vCPUs given the assigned workloads.

    --------------------------------------------------
    0 - usage: 102.0 | mapping ['aos_vm1', 'aos_vm4', 'aos_vm5', 'aos_vm3']
    1 - usage: 0.0 | mapping []
    2 - usage: 0.0 | mapping []
    3 - usage: 101.0 | mapping ['aos_vm8', 'aos_vm6', 'aos_vm2', 'aos_vm7']
    --------------------------------------------------
    0 - usage: 50.0 | mapping ['aos_vm1', 'aos_vm5']
    1 - usage: 53.0 | mapping ['aos_vm8', 'aos_vm2']
    2 - usage: 51.0 | mapping ['aos_vm4', 'aos_vm3']
    3 - usage: 53.0 | mapping ['aos_vm6', 'aos_vm7']
    --------------------------------------------------
    0 - usage: 102.0 | mapping ['aos_vm1', 'aos_vm5']
    1 - usage: 100.0 | mapping ['aos_vm8', 'aos_vm2']
    2 - usage: 95.0 | mapping ['aos_vm4', 'aos_vm3']
    3 - usage: 99.0 | mapping ['aos_vm6', 'aos_vm7']

    Test Case 3

    In this test case, you will run 8 virtual machines that start with an already balanced mapping of vCPU->pCPU. The vCPU of each VM will process the same workload.

    Expected Outcome

    No vCPU->pCPU mapping changes should occur since a balance state has already been achieved.

    --------------------------------------------------
    0 - usage: 63.0 | mapping ['aos_vm1', 'aos_vm5']
    1 - usage: 60.0 | mapping ['aos_vm8', 'aos_vm2']
    2 - usage: 59.0 | mapping ['aos_vm4', 'aos_vm3']
    3 - usage: 58.0 | mapping ['aos_vm6', 'aos_vm7']
    --------------------------------------------------
    0 - usage: 57.0 | mapping ['aos_vm1', 'aos_vm5']
    1 - usage: 60.0 | mapping ['aos_vm8', 'aos_vm2']
    2 - usage: 60.0 | mapping ['aos_vm4', 'aos_vm3']
    3 - usage: 61.0 | mapping ['aos_vm6', 'aos_vm7']
    --------------------------------------------------
    0 - usage: 57.0 | mapping ['aos_vm1', 'aos_vm5']
    1 - usage: 59.0 | mapping ['aos_vm8', 'aos_vm2']
    2 - usage: 59.0 | mapping ['aos_vm4', 'aos_vm3']
    3 - usage: 60.0 | mapping ['aos_vm6', 'aos_vm7']

    Test Case 4

    In this test case, you will run 8 virtual machines that start with an equal affinity to each pCPU (i.e., the vCPU of each VM is equally likely to run on any pCPU of the host). The vCPU of each VM will process the same workload.

    Expected Outcome

    Each pCPU will exhibit an equal balance of vCPUs given the assigned workloads.

    3 - usage: 60.0 | mapping ['aos_vm3', 'aos_vm7']
    --------------------------------------------------
    0 - usage: 57.0 | mapping ['aos_vm1', 'aos_vm5']
    1 - usage: 61.0 | mapping ['aos_vm8', 'aos_vm2']
    2 - usage: 58.0 | mapping ['aos_vm4', 'aos_vm3']
    3 - usage: 59.0 | mapping ['aos_vm6', 'aos_vm7']
    --------------------------------------------------
    0 - usage: 59.0 | mapping ['aos_vm1', 'aos_vm5']
    1 - usage: 60.0 | mapping ['aos_vm8', 'aos_vm2']
    2 - usage: 60.0 | mapping ['aos_vm4', 'aos_vm3']
    3 - usage: 61.0 | mapping ['aos_vm6', 'aos_vm7']
    --------------------------------------------------

    Test Case 5

    In this test case, you will run 8 virtual machines that start with an equal affinity to each pCPU (i.e., the vCPU of each VM is equally likely to run on any pCPU of the host). Four of these vCPUs will run a heavy workload and the other four vCPUs will run a light workload.

    Expected Outcome

    Each pCPU will exhibit an equal balance of vCPUs given the assigned workloads.

    --------------------------------------------------
    0 - usage: 50.0 | mapping ['aos_vm3']
    1 - usage: 70.0 | mapping ['aos_vm2', 'aos_vm7']
    2 - usage: 142.0 | mapping ['aos_vm1', 'aos_vm4', 'aos_vm6']
    3 - usage: 85.0 | mapping ['aos_vm8', 'aos_vm5']
    --------------------------------------------------
    0 - usage: 88.0 | mapping ['aos_vm1', 'aos_vm7']
    1 - usage: 87.0 | mapping ['aos_vm8', 'aos_vm4']
    2 - usage: 53.0 | mapping ['aos_vm5']
    3 - usage: 119.0 | mapping ['aos_vm6', 'aos_vm2', 'aos_vm3']
    --------------------------------------------------
    0 - usage: 182.0 | mapping ['aos_vm1', 'aos_vm5', 'aos_vm3', 'aos_vm7']
    1 - usage: 36.0 | mapping ['aos_vm8']
    2 - usage: 54.0 | mapping ['aos_vm4']
    3 - usage: 70.0 | mapping ['aos_vm6', 'aos_vm2']
    --------------------------------------------------
    0 - usage: 100.0 | mapping ['aos_vm1', 'aos_vm5']
    1 - usage: 73.0 | mapping ['aos_vm8', 'aos_vm2']
    2 - usage: 99.0 | mapping ['aos_vm4', 'aos_vm3']
    3 - usage: 74.0 | mapping ['aos_vm6', 'aos_vm7']
    --------------------------------------------------
  • A snapshot of my understanding before beginning project 1 (scheduler, memory coordinator)

    A snapshot of my understanding before beginning project 1 (scheduler, memory coordinator)

    Project 1 was released last evening at 08:59 PM PST, and this morning I decided to start on the project by reading through the overview and getting the lay of the land. For this project, we’ll need to deliver two operating system components: a scheduler and a memory coordinator (not even sure what that means exactly).

    So what I’m doing as part of this post is just taking a snapshot of the questions I have and the topics I do not understand, topics that I’ll probably understand in much more depth as the project progresses. More often than not, I’m dismissive of all the work I put in over the semester, and this post is one way to honor the time and commitment.

    Overall, this project’s difficulty sits in the right place — not too hard but not too easy. The sweet spot for Deliberate Practice.

    Questions I have

    • What algorithm should I implement for my scheduler?
    • What algorithms fit the needs of this project?
    • What the heck is a memory coordinator?
    • Why do we have a memory coordinator? What’s its purpose?
    • How do you measure the success of a memory coordinator?
    • How do I use the libvirt library?
    • What is QEMU?
    • Where does the scheduler sit in relationship to the operating system?
    • How will I get the hypervisor to invoke my scheduler versus another scheduler?

    Project Requirements

    • You need to implement two separate C programs, one for the vCPU scheduler (vcpu_scheduler.c) and another for the memory coordinator (memory_coordinator.c)

    References

    1. Introduction to QEMU