Tag: tradeoffs

  • Five tips for surviving (or thriving) in the OMSCS program as a computer science graduate student


    Overview

In this post, I’m sharing five tips that I’ve picked up over the last 2 years in the program. At the time of this writing, I’m wrapping up my 7th course (advanced operating systems) in the OMSCS program. This means that I have 3 more courses to complete until I graduate with my master’s in computer science from Georgia Tech.

    This post assumes that you are already enrolled as a student in the program. If you are still on the fence as to whether or not you should join the program, check out Adrian’s blog post on “Georgia Tech Review: yes, maybe, no”. Very shortly, I’ll post my own thoughts and recommendations on whether or not I think you should join the program.

    Five Tips

    Prerequisites

Before enrolling in a course, check the course’s homepage for recommended prerequisites. Once you are officially accepted into the program, you’ll soon discover that, unlike undergraduate, you can enroll in courses despite not meeting the prerequisites. The upside of this relaxed approach is that students who are really eager to take a course can do so (assuming seats are available). The downside is that if you are not prepared for the course, you’ll be in way over your head and will likely drop the course within the first few weeks, since many courses are front loaded. For example, thinking about taking advanced operating systems? Before jumping the gun, take the diagnostic exam. Doing so will help you gauge whether or not you are set up for success.

    Time management

Manage your time by setting a schedule and sticking to it. If you are an early bird like me, carve out 50 minutes early in the morning, take a shot of espresso, and get cracking. Set specific times in which you will sit down and study (e.g. watch lectures, review notes, work on a project) and aim to stick to that schedule (as a new parent, I understand that plans can easily be foiled).

    Trade offs

Know when and what to slip or sacrifice. This is super important. We all have 24 hours in a day. And we all have other responsibilities, such as a full-time job or children to take care of. Or something unexpected will just pop up in your life that demands your attention. That’s okay. Graduate school (and doing well in graduate school) is important. But there will be other competing priorities, and sometimes you’ll have to sacrifice your grade. That’s okay.

    Expectations

    Know what you want to get out of the program. What are you trying to get out of the program? Or what are you trying to get out of the current course you are enrolled in? Are you here to learn and improve your craft? Then focus on that. Or are you here to get a good grade? Then do that. Are you here to expand your network and potentially apply to a PhD program? Then make sure you attend the office hours and meet with the professors.

    Course reviews

Read the reviews on OMSCentral. If you haven’t read the reviews on OMSCentral.com for the courses you plan to take (or are currently taking), please go and do that. Right now. On that website, you’ll find reviews from former students. They give you a better sense of what you’ll learn, what projects you’ll tackle, how engaged the professor is, whether the grades are curved, how difficult the exams are, and maybe most importantly: how many hours you should expect to put into the course.

    Notifications

Create e-mail (or text) notifications in your calendar for upcoming due dates. It sucks when you forget that an upcoming assignment is due soon. It sucks even more when you find out that the due date has passed. Although you can rely on Canvas (Georgia Tech’s current system for managing its courses), I still recommend using your own system.

    Summary

The above tips do not guarantee a smooth journey. Chaos will somehow creep its way into our time and things will get derailed. That’s okay. Take a deep breath and revisit the tip above on prioritization. Remind yourself why you are in graduate school. Now go out there and kick some ass.

  • Synchronization notes (part 2/2) – Linked Based Queuing lock


In part 1 of synchronization, I talked about the more naive spin locks and other naive approaches that offer only marginally better performance by adding delays, spinning on cached copies (i.e. avoiding bus traffic), and so on. Of all the locks discussed thus far, the array-based queuing lock offers low latency, low contention, and low waiting time. Best of all, this lock offers fairness. However, there’s no free lunch: we trade memory for performance, since each lock contains an array with N entries, where N is the number of physical CPUs. That static allocation may be wasteful (i.e. when not all processors contend for a lock).

That’s where linked-based queuing locks come in. Instead of allocating a contiguous block of memory up front for each CPU competing for the lock, the linked-based queuing lock dynamically allocates a node on the heap and maintains a linked list. That way, we only allocate the structure as requests trickle in. Of course, we’ll need to be extremely careful with maintaining the linked list structure and ensure that we atomically update the list using instructions such as fetch_and_store during the lock operation.

Most importantly, we (the OS designers implementing this lock) need to carefully handle the edge cases, especially removing the “latest” node (i.e. setting its next pointer to NULL) while another node is being allocated. To overcome this classic race condition, we rely on a new (to me, at least) atomic operation: compare_and_swap. The instruction is called like this: compare_and_swap(latest_node, me, nil). If the latest node’s next pointer matches me, then set the next pointer to NIL. Otherwise, we know that there’s a new node in flight and we’ll have to handle the edge case.

    Anyways, check out the last paragraph on this page if you want a nice matrix that compares all the locks discussed so far.

    Linked Based Queuing Lock

    Summary

A similar approach to Anderson’s array-based queueing lock, but using a linked list (the authors are MCS: J.M. Mellor-Crummey and M.L. Scott). Basically, each lock has a qnode associated with it, a structure containing got_it and next fields. When I obtain the lock, the qnode points to me and I can proceed into the critical section.

    Link Based Queuing Lock (continued)

    Summary

The lock operation (i.e. Lock(L, me)) needs to do two things atomically, and to achieve this, we’ll use a fetch_and_store operation, an instruction that retrieves the previous tail’s address and then stores mine.

    Linked Based Queuing Lock

    Summary

When calling unlock, the code removes the current node from the list and signals the successor, achieving behavior similar to the array-based queue lock. But one edge case remains: a new request is forming while the current thread is setting the tail to NIL.

    Linked Based Queuing Lock (continued)

    Summary

Learned about a new primitive atomic operation called compare_and_swap, a useful instruction for handling the edge case of updating the dummy node’s next pointer when a new request is forming.

    Linked Based Queuing Lock (continued)

    Summary

    Hell yea. Correctly anticipated the answer when professor asked: what will the thread spin on if compare_and_swap fails?

    Linked Based Queuing Lock (continued)

    Summary

Take a look at the papers to clarify the synchronization algorithms on a multiprocessor. Lots of primitives are required, like fetch_and_store and compare_and_swap. If these hardware primitives are not available, we’ll have to implement them in software.

    Linked Based Queuing Lock (continued)

    Summary

Pros and cons of the linked-based approach. Space complexity is bounded by the number of dynamic requests, but the downside is the maintenance overhead of the linked list. Another potential downside is when the underlying primitives like compare_and_swap are not available.

    Algorithm grading Quiz

    Comparison between various spin locks

    Summary

If the amount of contention is low, then use spin with delay plus exponential backoff. If the lock is highly contended, then use a statically assigned (queue-based) lock.

  • Advanced Operating Systems: The SPIN Approach


    What did I learn? What are the main takeaways?

The concept of border crossing pops up over and over again. This is a new term I had never heard of prior to this class. The term is almost synonymous with a context switch, but it is subtly different in the sense that a context switch (switching from one process to another, or one thread to another) can occur without a border crossing, i.e. without changing the underlying hardware address space.

SPIN attempts to enforce protection at the compiler level, by using a restrictive language called Modula-3. Unlike the C language, where you can cast a pointer to whatever data structure you like, Modula-3 enforces type safety, only allowing the developer to cast a pointer to specific types of data structures that they specified earlier in the code.

    SPIN offers extensibility by allowing different types of OS services to co-exist with one another.

    But what are the trade offs with SPIN, when compared with Microkernel and Exokernel?

It appears that SPIN would be more performant than a microkernel due to no border crossings, while maintaining flexibility (multiple types of OS services that cater to application processes) and security (via logical protection domains) with Modula-3, allowing OS service library code to be co-located with kernel code.

    Introduction

    Customizing OS with SPIN

    Summary

SPIN and Exokernel take two different paths to achieving extensibility. These designs overcome the issues of the microkernel, which compromises performance due to border crossings, and the monolithic kernel, which does not lend itself to extensibility.

    What we are shooting for in OS Structure

    Summary

We want flexibility similar to a microkernel-based approach, but we also want the protection and performance of a monolithic kernel. We want the best of both worlds: performance, protection, and flexibility.

    Approaches to Extensibility

    Damn, this is a super long video (8 minutes, compared to the other one to two minute videos prior)

    Capability based

    Hydra OS (1981)

    Summary

    Hydra did not fully achieve its goal of extensibility

    Micro Kernel Based

    Summary

Performance took a back seat, since the focus was on extensibility and portability. The microkernel-based approach got bad press due to these twin goals.

    SPIN approach to extensibility

    Summary

By co-locating the kernel and extensions in the same hardware address space, extension calls are as cheap as a procedure call. SPIN does this by depending on a strongly typed language to provide safety.

    Logical Protection Domains

    Summary

Using a programming language called Modula-3 (which doesn’t appear to be popular in practice), we can enforce protection at the logical level. This programming language, unlike C, restricts casting pointers to arbitrary data structures, only allowing casts to particular data types.

    Spin mechanisms for protection domains

    Summary

The secret sauce of protection and performance is the set of mechanisms for creating (i.e. exposing entry points), resolving (i.e. leveraging other entry points), and combining logical protection domains.

    Customized OS with Spin

    Another example of SPIN os customization

    Summary

There can be multiple file systems (written in Modula-3), each catering to its callers and each using the same underlying hardware address space. They can also share modules with one another, like the networking entry point.

    Example Extensions

    Summary

Examples of Unix servers implementing their OS on SPIN, as well as a video server / display client built on top of SPIN.

    Quiz: Border Crossings

    Quiz: Least likely for border crossing

    Summary

Microkernel and SPIN offer performance since they limit border crossings. In SPIN, logical protection domains do not require border crossings.

    SPIN Mechanisms for Events

    SPIN classifies three types of event handling: one-to-one, one-to-many, many-to-one

    Summary

To handle events (like packet arrival) we can have a 1:1, 1:N, or N:1 mapping. For 1:1, an example would be an ICMP packet arriving and the single ICMP handler running. In a 1:N mapping, the “IP packet arrived” event signals several other event handlers, such as ICMP, UDP, or TCP. Finally, there is N:1: for example, Ethernet and ATM packet-arrival events both funnel into the single IP handler.

    Default Core Services in SPIN

    Summary

SPIN offers core services like memory management, CPU scheduling, etc. SPIN provides a header file that OS programmers need to implement. Remember: these implementations talk to each other through well-defined interfaces, offering protection, but are also performant because there are no border crossings.

    Default Core Services in SPIN (Continued)

    Summary

SPIN provides the primitives (the interface function definitions); the semantics are up to the extension itself. SPIN makes sure extensions get time on the CPU via its scheduler.