Author: mattchung

  • CS7210 Distributed Computing – A course review

    Distributed Computing was offered in the OMSCS program for the first time this past semester (Spring 2021), and when the course opened for registration, a storm of newly admitted and seasoned students signed up, me included. I was fully aware that I was walking into unknown territory, a bleeding-edge course, and expected a lot of rough edges, of which there were many. That said, the course is great and, with some tweaks around pacing, has the potential to be one of the best courses offered, especially for students specializing in computing systems.

    Overview

    The course quality is top-notch. The lectures are intellectually challenging, the assigned readings are seminal pieces of work, and the projects really drill the theoretical concepts. Overall, depending on your programming experience, expect to put in 20+ hours per week (with some students reporting anywhere between 30 and 50 hours).

    Recommendation: If you are a seasoned engineer (with at least a couple years of programming experience under your belt) and someone who can handle ambiguity with little hand-holding, then I highly recommend taking the course. But if you are just starting out in your computer science journey, then I would hold off for at least a couple semesters; take the recommended prerequisites (i.e. graduate introduction to operating systems, advanced operating systems, computer networks) and wait until the course’s rough edges are smoothed out. As another student on omscentral pointed out, this class is “for experienced engineers, not students.”

    What to expect from the course

    Pros

    • Lectures are easy to watch and are packed in digestible ~5 minute chunks
    • Assigned readings (from first half of semester) are seminal pieces of work by famous computer scientists like Leslie Lamport and Eric Brewer
    • Skills and knowledge acquired directly apply to my career as a software engineer and computer scientist
    • Instructors and teaching assistants are extremely professional, care about the students’ well-being, and are quite generous with the grading curve

    In this class, you’ll develop a foundation for designing and building distributed systems. You’ll understand why systems need to keep track of time and the different ways to implement clocks (e.g. scalar clocks, vector clocks, matrix clocks). In addition, you’ll appreciate how systems achieve consensus and learn to weigh trade-offs between consistency models such as strict consistency and eventual consistency. You’ll end the semester learning about the infamous CAP theorem and the FLP impossibility result and how, as a system designer, you’ll make trade-offs between consistency, availability, and the ability to withstand network partitions. Of course, you’ll eat and breathe Leslie Lamport’s PAXOS. So if any of these topics interest you, you’re in for a treat.

    Cons

    • Bleeding edge course means that there were lots of rough edges
    • Projects were very demanding, often requiring multiple hours to pass a single test worth very little towards grades
    • Triggered lots of uncertainty and desperation among students throughout the second half of the semester

    As mentioned above, this class induced a lot of unnecessary stress in students. Even I, someone who cares little about the letter grades on a transcript, felt pretty anxious (this class could have held me back another semester, since up until the grades were actually released, I had assumed I would get a C or lower).

    Impact on mental health

    One concerned student published a post on the forum, asking if students were mentally okay:

    I just wanted to check in with everyone on here in the class. I know these projects are stressful and for me it’s been something of a mental health hurdle to keep pushing despite knowing I may very well not succeed. Hope everyone is doing ok and hanging in there. Remember no assignment is worth your sanity or mental health and though we are distanced we are all in this together.

    Anonymous Calc

    Many other students chimed in, sharing the same frustrations:

    I found both of the projects very frustrating. Specially this one. I am working for last 2 weeks (spending 50+ hours in writing/rewriting) and still passing only 7/8 tests. I never had unfinished academy projects. This is the first course I am having this.

    Adam

    I couldn’t help but agree:

    Honestly, I was fairly stressed for the past two weeks. Despite loving the course — content and rigor of the project — I seriously contemplated dropping the course (never considered this avenue before, and I’m 2 courses away from graduating after surviving compilers and other difficult systems courses) as to avoid potentially receiving a non-passing grade (got an A on the midterm but its looking pretty bleak for Project 4 with only 12 tests passing). At this point, I’ve fallen behind on lectures and although there is 1 (maybe 2) days left for Project 4, I’ve decided to distance myself from the project. Like many others, I’ve poured an insane number of hours into this project, which doesn’t reflect in the points in Gradescope. I suspect both the professor and the TAs are aware of the large number of people struggling with the project and will take this all into account as part of the final grading process.

    Tips

    Programming Projects

    Here’s a list of the projects, their weight towards the final grade, and the amount of time allocated to each assignment.

    • Project 1 – Environment Setup – 5% – 2 weeks
    • Project 2 – Client/Server – 10% – 2 weeks
    • Project 3 – Primary/Backup – 15% – 3 weeks
    • Project 4 – PAXOS – 15% – 3 weeks
    • Project 5 – Sharded KV Store – 15% – 4 weeks

    Projects 1 and 2 are a walk in the park. The final 3 projects are brutal. Make sure you start early, as soon as the projects are released. I repeat: start early. Some people reported spending 100+ hours on the latter projects.

    Unless you are one of the handful of people who can pour 50+ hours per week into the class, do not expect to get an A on the programming projects. But don’t sweat it. Your final grade will be okay; you just need to have faith and ride the curve. All you can do is try to pass as many tests as possible and mentally prepare for receiving a C or D (or worse) on these assignments.

    Summary

    The course is solid but needs serious tweaking around the pacing. For future semesters, the instructors should modify the logistics for the programming assignments, stealing a couple weeks from the first couple projects and tacking them on to the final projects (i.e. Primary/Backup system, PAXOS, Sharded Key-Value Store). With these modifications, students will stress out way less and the overall experience will be much smoother.

     

  • Front-yard overseeding journey

    Our front yard needs work. About 3 weeks ago, I made my first attempt at overseeding and although a couple seeds germinated, the lawn was left with lots of bare spots.

    Several bare spots in the lawn

    So I’m taking a second stab. This time around though, I’m not going to simply chuck seeds on top of the grass. Instead, I’m going to take the following steps:

    • Dethatch
    • Add topsoil
    • Plant seeds
    • Rake seeds in
    • Plant more seeds
    • Pat seeds in
    • Lay peat moss

    Essentially, I’m following the advice of some great YouTube tutorial videos.

    Thatch removed, soil topped off, seeds planted, peat moss planted

    As a new lawn enthusiast, I have no clue whether this will work. It’s sort of an experiment. I’ve watched dozens of YouTube videos and it’s time for the rubber to meet the road. So I’ll be documenting the journey and will report back in a couple days as I continue to water the lawn. Regardless, I’m enjoying the unknown and I’m in good company:

    Metric keeping me company while I work on the lawn
  • Distributed Computing – Goodbye and thanks for the wonderful semester

    I just finished Spring 2021 at Georgia Tech OMSCS and published a farewell note on the classroom’s forum (i.e. Piazza platform) and would like to share that here:

    This was one hell of a semester! Hats off to professor Ada and our great TAs — I learned a great deal about both theoretical and practical distributing computing knowledge, experiencing first hand how tweaking a retry timer by a few hundred milliseconds can make or break your day.

    But above all else, thanks to all the other students in this class, I felt extremely supported, more supported than any of the 8 courses I had previously taken in the program.

    More than once throughout the class I contemplated just throwing in the towel. The last couple projects in particular really wore me out mentally and emotionally (e.g. 10 hours troubleshooting for a one-line fix to pass one single test) and if it wasn’t for the constant support of all my peers over Piazza and Slack, I would’ve probably not only dropped from the course but the journey itself would’ve felt a lot more isolating, especially in the midst of the pandemic.

    Now, there are definitely rough edges with this course, particularly around pacing on the last couple projects. But given that this was the first semester that distributed computing was offered as part of OMSCS, I anticipated minor bumps coming into this class and have no doubts that the logistics will get smoothed out over the next couple semesters.

    Finally, for those of you graduating this semester, congratulations! Way to go out with a bang. And for the rest of us, see you next semester!

    Thanks again for all the support and let’s stay connected (contact info below). Now, time for a much needed nap after taking the final exam:

  • My first lawn seeds germinating!

    After watching dozens of YouTube videos on lawn care, I decided about two weeks ago to overseed the front lawn and water the grass twice a day (I really used to think that the earth would just magically nourish our yard). And up until this morning, I wasn’t entirely sure whether all my effort was wasted, since it’s really difficult to spot whether or not seeds are actually germinating. On top of this daily maintenance, I’ve also been singing to them, giving them some verbal love.

    My first lawn seeds germinating 1.5 weeks later

    And this morning, about 1.5 weeks after the initial seeding, I discovered that my little seeds were starting to germinate! Proof! Finally! I was so ecstatic that I snapped a couple photos and then bolted inside, sharing the photos with Elliott and Jess.

    I suppose this is one of the silver linings of COVID-19 and being locked down at home for the last year? I’m turning into a lawn care nut.

  • iperf3 and TCP maximum segment size (MSS)

    The diagram above illustrates how setting the maximum segment size in iperf3 affects a network packet. With an MSS of 1436, the segment (i.e. TCP payload) ends up at 1424 bytes, because the 12 bytes of TCP options count against the MSS.
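
    The arithmetic above can be sketched in a few lines of Python. This is a minimal sketch, assuming IPv4 with no IP options and a fixed 20-byte TCP header; the function names tcp_payload and ip_packet_size are hypothetical helpers, not part of iperf3.

```python
IPV4_HEADER = 20  # bytes; assumes IPv4 with no IP options
TCP_HEADER = 20   # bytes; fixed TCP header, options not included


def tcp_payload(mss: int, tcp_options: int = 12) -> int:
    """Application bytes per segment once TCP options are carved out of the MSS."""
    return mss - tcp_options


def ip_packet_size(mss: int) -> int:
    """Total IPv4 packet size for a full segment: both fixed headers plus the MSS."""
    return IPV4_HEADER + TCP_HEADER + mss


print(tcp_payload(1436))     # 1424
print(ip_packet_size(1436))  # 1476
```

    Under these assumptions, the options and the payload together fill the 1436-byte MSS, and the two fixed headers add another 40 bytes on top.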

     

  • Got the COVID-19 Pfizer vaccination

    What did we do?

    • Jess and I received our first dose of the Pfizer vaccination, with our second dose scheduled for 3 weeks from now.
    • We were able to get the vaccination since Jess had heard, through a mom’s group she’s part of, that breastfeeding mothers (along with their partners) were eligible through a Kaiser clinic. So after signing up a week early, we strolled into the Renton vaccine clinic and were in and out within 20 minutes.

    What are the after effects?

    • Jess is a superhuman and had almost zero effects (she did dose herself with lots of vitamin C a week before, so that could be the reason)
    • I felt mentally foggy within the first hour and by the second hour, I was very lethargic, my energy zapped, requiring me to take a 2.5 hour nap to recover
    • Before heading to bed, I had a dead arm, my arm feeling as though someone punched me about 100 times

    Getting the vaccination wasn’t a black/white decision

    • The vaccination is not FDA approved
    • The vaccination was made publicly available at an accelerated rate
    • Jess is/was worried about the long term impact
    • For me, the vaccination is one step forward in returning to a sense of normalcy, reducing my anxiety and general nervousness about catching COVID-19
    • We can more safely travel to the U.K. and visit Jess’s family

     

     

  • PAXOS made moderately complex – slots

    In the paper “PAXOS made moderately complex”, the authors introduce unfamiliar concepts not mentioned in the original PAXOS paper, concepts such as slots, slot in, slot out, and WINDOW. I found these concepts difficult to understand despite reading both the accompanying pseudo code as well as their Python implementation.

    This post aims to shed light on these concepts and to explain how the following invariant holds:

    R5: A replica proposes commands only for slots for which it knows the configuration: slot_in < slot_out + WINDOW

    The rest of the post focuses on the replica only. We will not discuss the actual consensus algorithm, also known as the SYNOD; this topic will be covered in a separate post.

    SLOTS Overview

    • Slots are to be occupied by decisions
    • SLOT_IN points to the next “free” unoccupied slot
    • SLOT_OUT points to the next decision to be executed
    • SLOT_IN advances by 1 with every new decision received from the leaders
    • SLOT_OUT advances by 1 with every execution of a decision by the replica
    • SLOT_OUT + WINDOW is the upper bound on SLOT_IN (invariant R5), so at most WINDOW slots can be buffered ahead of execution
    • SLOT_OUT trails behind the SLOT_IN as more decisions flow into the replica
    • SLOT_OUT == SLOT_IN means the replica has processed all its commands

    Replica Initial state
    Figure 1 – Imagine a WINDOW=5. Initially, slot_in and slot_out both point to slot 1

     

    Then, the replica, as it receives requests from clients, will send proposals to the leaders. The leaders, as part of their consensus algorithm, will respond with decisions, each decided command tied to a particular slot position. We won’t discuss the consensus algorithm as part of this post (but perhaps another).

    Figure 2 – Replica sending a proposal for slot 1

     

    Then, as the replica receives decisions, it will fill the slot accordingly. Every time a slot fills, the slot in pointer advances by one.

    Figure 3 – As replica receives decisions, it inserts into its buffer, advancing slot in by 1

    Next is the perform phase. The replica will execute the decision (that is, its associated command) and will advance the slot out index by 1.

    After receiving a decision, the replica will fill in the slot associated with that particular decision.

    Figure 4 – Replica receives a decision and fills in the particular slot associated with the decision

    Then, for illustrative purposes, imagine that the replica sends out another proposal (not shown in the figure below), advances the slot in by 1, then fills in the second slot.

    Figure 5 – Slot IN points to index 3 and Slot OUT still pointing to slot 1

    Next, during the perform phase, the replica will advance the slot out pointer.

    Figure 6 – As the replica executes commands in the slot, the slot OUT index advances. The slot, previously yellow, is now green, symbolizing that the occupying command has been executed

     

    Finally, in this example, the replica executes the next command, advancing SLOT OUT which now points to the same (unoccupied) slot as SLOT IN.

    Figure 7 – Replica executes its second outstanding decision, advancing SLOT OUT by 1, which now points to the same (unoccupied) slot as SLOT IN
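
    The walkthrough above can be replayed with a toy replica. This is only a rough sketch under stated assumptions: it tracks slots and nothing else (no leaders, no network, no re-proposing of commands that lose a slot), the class and method names are hypothetical, and it bumps slot_in at proposal time, which is my reading of the paper’s pseudocode. For this simple trace, that ends at the same pointer positions as Figures 5 and 7.

```python
from collections import deque

WINDOW = 5  # same window size as in Figure 1


class Replica:
    """Toy replica that only does slot bookkeeping (assumption: no leaders, no network)."""

    def __init__(self, window: int = WINDOW) -> None:
        self.window = window
        self.slot_in = 1         # next slot with no proposal yet
        self.slot_out = 1        # next slot whose decision will be executed
        self.requests = deque()  # client commands waiting for a slot
        self.proposals = {}      # slot -> command proposed for that slot
        self.decisions = {}      # slot -> command decided for that slot
        self.executed = []       # commands executed so far, in slot order

    def propose(self) -> None:
        """Propose pending requests only while invariant R5 allows it."""
        while self.requests and self.slot_in < self.slot_out + self.window:
            if self.slot_in not in self.decisions:
                self.proposals[self.slot_in] = self.requests.popleft()
            self.slot_in += 1

    def decide(self, slot: int, command: str) -> None:
        """Fill a slot with a decision, then execute every ready decision in order."""
        self.decisions[slot] = command
        while self.slot_out in self.decisions:
            self.executed.append(self.decisions[self.slot_out])
            self.slot_out += 1


replica = Replica()
replica.requests.extend(["cmd-a", "cmd-b"])
replica.propose()
print(replica.slot_in, replica.slot_out)  # 3 1
replica.decide(1, "cmd-a")
replica.decide(2, "cmd-b")
print(replica.slot_in, replica.slot_out)  # 3 3
```

    With more pending requests than WINDOW, propose stops once slot_in reaches slot_out + WINDOW, and the remaining requests wait until executed decisions move slot_out forward.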

    Summary

    WORK IN PROGRESS

  • PAXOS – I’m coming for you!

    I’m now halfway through the Distributed Computing course at Georgia Tech and we students are now tackling the penultimate project: building a replicated state machine using PAXOS. This project will be challenging (it will probably require 40+ hours) and it’ll put my theoretical knowledge to the test. In a couple weeks, I’ll reflect back on how much I learned.

    Presently, here’s my limited understanding of the consensus algorithm:

    Current PAXOS understanding

    • Servers play different roles – Proposer, Acceptor, Learner
    • Proposers send proposals with monotonically increasing proposal numbers
    • A proposal is accepted if and only if a majority of the acceptors (a quorum) accept it
    • The 2PC (2 phased commit) protocol essentially tells us whether or not a particular transaction is committed or aborted
    • Guaranteeing linearizability means that, from the client’s perspective, real time (i.e. wall clock) should be respected and the client should view the system as if there is a single replica

    Future PAXOS understanding

    • How exactly PAXOS guarantees consensus via its 2 phased commit protocol
    • How does a server determine its role (or does it play multiple roles)
    • How to handle the edge cases (say two proposals arrive at the same time)
    • What role does a client play? Does it serve as a proposer?
    • How does leader election work in PAXOS?
    • Should I just try to mimic the Python-based code described in Paxos Made Moderately Complex?
    • How will replication work as the number of nodes in the system scales (say from 3 to 5 to 10)
    • How to detect (and perhaps avoid) split brain (i.e. multiple leaders)

    References

    1. Majority quorum can be defined as floor(n/2) + 1
    2. Python implementation described https://www.cs.cornell.edu/courses/cs7412/2011sp/paxos.pdf
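
    The majority quorum formula from reference 1 can be checked quickly. This is a minimal sketch; majority and tolerated_failures are hypothetical helper names, and the cluster sizes are the ones from the scaling question above.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a majority of n: floor(n/2) + 1."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Nodes that can be down while a majority is still reachable."""
    return n - majority(n)


for n in (3, 5, 10):
    print(n, majority(n), tolerated_failures(n))
```

    For 3, 5, and 10 nodes this gives majorities of 2, 3, and 6. Because two majorities together contain more than n nodes, any two of them share at least one node.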
  • Understanding linearizability

    I’m preparing for my Distributed Systems midterm and I was struggling to understand the differences between serializability and linearizability (why are these two words so difficult to spell right?). Apparently, these two concepts are very different. To gain some clarity, I searched online and found this awesome YouTube video posted by Martin Kleppmann; in the video, he dives deep into linearizability and I wanted to share some of the key takeaways.

    Key Takeaways

    • The happens-before relationship (coined by Leslie Lamport) deals with causality and only applies to message sends and receives; it is related to logical clocks.
    • Linearizability deals not with logical clocks, but with real time
    • Linearizability states that an operation must appear to take effect at some point after it started but before it ended. That’s not entirely clear, so let’s imagine a scenario with two clients: client 1 and client 2. Client 1 performs a write key=x value=0. And imagine client 2 performs a get key=x. And finally, suppose that the key value store currently contains key=x, value=100. With linearizability, it’s entirely possible that if the operations of client 1 and client 2 overlap, client 2 gets either value=0 or value=100.
    • Linearizability is not serializability. They are not the same. The latter is isolation between transactions, as if they are executed in some serial order. The former is multiple replicas behaving as if there is a single replica.
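
    The two-client scenario above can be written down as a small check. This is a minimal sketch, assuming a single key, exactly one write and one read, and real-time start and end timestamps for each operation; the Op class and possible_reads function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Op:
    start: float  # real time at which the client sent the request
    end: float    # real time at which the client received the response


def possible_reads(old_value, new_value, write: Op, read: Op) -> set:
    """Values a linearizable store may return for one read racing one write."""
    if read.end < write.start:     # read finished before the write began
        return {old_value}
    if read.start > write.end:     # read began after the write finished
        return {new_value}
    return {old_value, new_value}  # the operations overlap in real time


# Client 1 writes x=0 over [1, 4]; the store held x=100 beforehand.
write = Op(start=1, end=4)
print(sorted(possible_reads(100, 0, write, Op(start=2, end=3))))  # [0, 100]
print(sorted(possible_reads(100, 0, write, Op(start=5, end=6))))  # [0]
```

    Only the overlapping read may see either value; once the write has completed in real time, a later read has to return the new value.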
  • Speaking up for others

    Ever since I was a little boy, if any of my friends were bullied or picked on, and I noticed they couldn’t defend themselves, I would speak up on their behalf. Speaking up for others has always come naturally to me and it’s a habit that I still flex even as an adult. However, these days, I’m a tad more reluctant to take action; I’ve learned that sometimes it’s best to allow people the opportunity to fight their own battles. Knowing when to stay silent or speak up for others is not so black and white: it’s an art.

    I’m constantly walking a fine line.

    In fact, this blog post was sparked by another student in my OMSCS program, who posted a question on the online forum, which led to a discussion I wasn’t sure I should engage in. This particular student had asked for a one-day extension for the first programming project, admitting that they vastly underestimated the complexity of the assignment. Then, another anonymous student chimed in, complaining that it would be “unfair” to the other students who actually “budgeted” their time. As soon as I read this anonymous person’s comment, I immediately felt annoyed and wanted to send a knee-jerk response but decided to step away from my keyboard since I didn’t want to type something I would regret.

    Instead, here’s how I responded:

    Piazza post – asking for a single day extension

    And I’m glad I did respond. Because since voicing my opinion, a handful of other students started replying to the thread, taking a similar stance to mine.

    In general, I’m motivated to speak up for others because I fervently believe in the following quote:

    “The only thing necessary for the triumph of evil is for good men to do nothing.”

    ― Edmund Burke