Author: mattchung

  • Daily Review: Day Ending in 2020/10/15

    Daily Review: Day Ending in 2020/10/15

    Family

    • Difficult time comforting Jess when she’s upset. It’s insane but it’s so easy for me to gently console other people (like friends or even strangers) when they are upset but I find it incredibly difficult to do the same for Jess.  The words just don’t come out. It’s difficult to put into words why I cannot comfort her but I just feel much more vulnerable and my throat chokes when I see her upset, making it difficult for me to even utter any words of comfort.
    • Watched Elliott on my day off work and brought her over to her cousin’s house for a little play date. Jess needed time to herself to get some work done so I took Elliott over to her cousin’s house (10 minutes away, now that we live in Renton) and I totally get why parents have play dates now. It’s so much easier to watch the kids when they are engaged with one another and you don’t have to be the single source of entertainment.

    What I learned

    • The term anonymous pages. In virtual memory management (VMM), an anonymous page is a page not backed by any particular file [1] and is used for things like the stack, the heap, or copy-on-write pages. The term (anonymous pages) was brought up in the video lectures on global memory systems (see section below: graduate school).
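
To make "not backed by any particular file" concrete, here is a minimal sketch in C that asks the kernel for anonymous memory directly. It assumes a Linux-like system where mmap supports MAP_ANONYMOUS; the helper names anon_alloc and anon_free are made up for illustration.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map `len` bytes of anonymous memory: no file descriptor (-1) and no
 * file offset, so the pages are not backed by any file.
 * Returns NULL on failure. */
void *anon_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Give the pages back to the kernel. Returns 0 on success. */
int anon_free(void *p, size_t len)
{
    assert(p != NULL);
    return munmap(p, len);
}
```

On Linux the pages returned by anon_alloc read as zero until they are first written, which is the same kind of memory the heap and stack are built from.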

    Graduate School

    Project 3

    • Set up my local development environment using CMake and something called vcpkg. I’m curious why we are using CMake instead of Make: what are the advantages of using the former? And what does vcpkg do?
    • Successfully imported protobuf header in my “store.cc” code. Just like in professional development, the best way to build momentum for a new development project is by committing small pieces of code that push the project forward, even if it’s an inch.

    Lectures

    • Watched half the lectures on global memory systems. This is an interesting topic: paging out not directly to disk but to other nodes’ memory using the local area network (LAN) instead. I’m assuming here that the time to seek on a spinning hard disk exceeds the time to generate a packet and for the packet to traverse the network. Otherwise, what’s the point?

    Relearning from my own notes

    • Unified buffer cache. This term seemed eerily familiar so I searched through my previous blog posts and found the actual post where I learned this concept. It’s always a nice feeling to rediscover your own learning.

    Music

    [fvplayer id=”5″]

    • Played around with different chord options on the guitar. My guitar instructor (Jared Borkowski) shared a beta PDF guide called “Chords with colors” so I spent a few minutes noodling with my guitar and recorded a quick little clip (above).

    References

    1. https://blogs.oracle.com/cwb/so-what-the-heck-is-anonymous-memory
  • Project 3 – Snapshotting my understanding (gRPC)

    Project 3 – Snapshotting my understanding (gRPC)

    Unfamiliar Technologies

    • Never heard of or used gRPC. According to their website, it’s a high performance open source RPC framework. I’m wondering if this package is used outside of academia. It probably is, and it is probably even used within Amazon.

    What I do know

    • Google Protobuf. Fortunately, I have some experience with Google Protobufs since we use that message serialization format within our stack (and the technology itself is quite ubiquitous).
    • Multi-threading. Prior to joining the OMSCS program, I had little to no understanding of how threads worked and how they differed from processes. But since then, I have gained a much more intimate understanding of threads and have even written several multi-threaded applications, both in graduate school and in a professional work setting. So I actually feel quite comfortable and prepared for project 3.

    Questions I Have

    • What does the gRPC framework offer? What advantage does using this framework have over just a simple HTTP framework? What problem does gRPC solve? Is it similar to an OS’s RPC framework but extended so that it works across hosts?
    • What does the wire protocol look like for gRPC? Does gRPC dictate the wire format or not?

    What I’m hoping to learn

    • Best practices for thread pools. Seems pretty straightforward to create a pool of threads and design the system in such a way that each thread will grab work from a (bounded) buffer. But what else should I consider?
    • How I might use gRPC for future projects. In the previous project (i.e. Project 2), I learned about OpenMP and MPI and am convinced that if I ever need to write parallel software, I would use those libraries and runtime engines. Maybe the same will be true once I start fiddling with gRPC?
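
For the thread pool bullet above, here is a rough sketch in C of workers grabbing work from a bounded buffer, using pthreads. It is only one way to do it: the names (pool_t, pool_put, pool_sum), the capacity and the pool size are all assumptions for illustration, and the "work" is just adding integers to a running total.

```c
#include <assert.h>
#include <pthread.h>

#define QUEUE_CAP 4   /* assumed buffer capacity */
#define NUM_WORKERS 3 /* assumed pool size */

typedef struct {
    int items[QUEUE_CAP];
    int head, tail, count;
    int closed; /* set once no more work will arrive */
    long sum;   /* result accumulated by the workers */
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    pthread_cond_t not_full;
} pool_t;

/* Producer side: block while the buffer is full. */
static void pool_put(pool_t *p, int item)
{
    pthread_mutex_lock(&p->lock);
    while (p->count == QUEUE_CAP)
        pthread_cond_wait(&p->not_full, &p->lock);
    p->items[p->tail] = item;
    p->tail = (p->tail + 1) % QUEUE_CAP;
    p->count++;
    pthread_cond_signal(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
}

/* Worker: take items until the buffer is closed and drained. */
static void *worker(void *arg)
{
    pool_t *p = arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->count == 0 && !p->closed)
            pthread_cond_wait(&p->not_empty, &p->lock);
        if (p->count == 0) { /* closed and nothing left */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        int item = p->items[p->head];
        p->head = (p->head + 1) % QUEUE_CAP;
        p->count--;
        p->sum += item; /* real work would happen after unlocking */
        pthread_cond_signal(&p->not_full);
        pthread_mutex_unlock(&p->lock);
    }
}

/* Push 1..n through the pool and return the total the workers computed. */
long pool_sum(int n)
{
    pool_t p = {0};
    pthread_t tids[NUM_WORKERS];

    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.not_empty, NULL);
    pthread_cond_init(&p.not_full, NULL);

    for (int i = 0; i < NUM_WORKERS; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &p);
        assert(rc == 0);
    }
    for (int i = 1; i <= n; i++)
        pool_put(&p, i);

    /* Tell the workers that no more work is coming. */
    pthread_mutex_lock(&p.lock);
    p.closed = 1;
    pthread_cond_broadcast(&p.not_empty);
    pthread_mutex_unlock(&p.lock);

    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(tids[i], NULL);
    return p.sum;
}
```

Even in a sketch this small, two of the "what else should I consider" questions show up: how to shut the pool down cleanly (the closed flag plus a broadcast) and what the producer should do when the buffer is full (here it simply blocks).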

     

     

  • Not putting all your eggs in one basket and Daily Review: Day ending in 2020/10/14

    Not putting all your eggs in one basket and Daily Review: Day ending in 2020/10/14

    Mental Health

    Best Part(s) of My Day

    • Swinging on the swings with Elliott. She was sitting on my lap, facing me, as I swung us back and forth. The entire time she was smiling and when her cold chubby cheeks brushed up against mine, some dad love feelings ran through my body.
    • Eating a kick ass lunch. My wife whipped up a delicious Vietnamese dish with bean sprouts and pickled cucumbers and fermented tofu.

    Therapy Session

    • Vented about being on call this past week. I shared the handful of times that I was paged out of bed at 03:30 AM, the operational issues that lasted until 10:00 PM in the evenings, and being tied to the computer.
    • Shared the sense of betrayal I felt at work. How someone who I thought was in my corner, rooting for me, actually no longer advocates for me after I declined their “opportunity” to lead a project.
    • Declining the project that “sets me back for a promotion” allows me to bring my best self to work. Although one could argue that me not accepting the “opportunity” to lead a project (that would’ve easily tacked on an additional 5-10 hours per week) sets me back for my promotion, I did it to maintain a balance between my professional and personal life, which quite frankly is what I am all about: living a balanced life, not one where I am endlessly pursuing the next thing.
    • Realized that I need to build a stronger community of supporters. My takeaway from this sense of betrayal is that I need to widen my network of supporters and not put all my eggs in one basket, so to speak.

    Random

    Cool Quotes

    • “Academia is ripe for pursuing ideas on the lunacy fringe”.

    Graduate School

    • Skipping Java and Spring (for now) and jumping into Distributed System lectures. I almost always follow the curriculum in the order prescribed by the syllabus. But I’ve decided to skip (for now) some video lectures on Java and Spring… and this makes me a little bit anxious. Honestly, I find the Java and Spring lectures a bit boring (to be fair, I might actually enjoy them once I start watching the lectures). Right now, I’m interested in learning more about the fundamentals and theories of constructing distributed systems so I’ll jump to those videos.

    Music

    • Dusted off and played the ukulele for Jess and Elliott. I feel like I abandoned the ukulele after picking up the guitar. And strumming my beautiful Soprano ukulele yesterday reminded me that the instrument has its own vibe and own personality, the strings sounding much more … bright.

    Family

    • Fell asleep at 06:30 PM. I normally fall asleep between 08:30 and 09:30, on both weekdays and weekends. But yesterday I was shattered after my sleep was interrupted several times throughout the night, most notably when Jess had a coughing fit at around 03:00 AM, at which point I was pretty much unable to fall back asleep.
    • Slept in the same bed as Elliott and Jess last night. This was pretty sweet actually. Although I woke up multiple times throughout the night, I loved having Elliott and Jess in the same bed since I’ve been sleeping alone — well, with the dogs — for the last six months or so.
    • Wrapped a baby net around stair railings. Installing the net (my sister’s idea) prevents little Elliott from slipping through the rails. Although I do not think her head could possibly squeeze through, the net gives Jess peace of mind and that’s important since she’s the one watching her throughout the day.
  • Daily Review – Day Ending in 2020/10/13

    Daily Review – Day Ending in 2020/10/13

    Family

    • Took Elliott shopping at Safeway while Jess took her work call. We picked up some avocados for our morning smoothies and some blackberries and blueberries (side note: why does a palm-sized box of blueberries cost $6.00?). While paying for our goods, our cashier awkwardly pulled down her mask for Elliott to see her face. Outside the context of COVID-19, this would be totally normal and even appreciated. But given we are in the midst of the pandemic, I felt awkward, though I didn’t feel compelled to ask her to put her mask back on given there was a glass screen positioned between us.

    Graduate School

    Lectures

    • Did not read any lectures but did download all the videos for Spring Operating systems. I normally do not download the videos to my laptop; however, the wind was hitting us pretty hard and our lights were flickering, so I was preparing for the situation in which our internet disconnected.

    Project Work

    • Completed writing documentation for project 2 submission. This included an analysis of the algorithms that I wrote (i.e. centralized barrier with sense reversal, dissemination and tournament) and why some barriers performed better (or worse) than other barriers.
    • Skimmed through requirements for project 3 (i.e. build a distributed service using gRPC). This project got announced Monday evening and will be due in about 3-4 weeks and I like to get started early to avoid last minute cramming, which leads to unnecessary stress.
  • Lamport’s Clocks (notes)

    Lamport’s Clocks (notes)

    [ez-toc]

    Introduction

    Summary

    Now that we have talked about “happened before” events, we can talk about Lamport clocks.

    Lamport’s Logical Clock

    Summary

    Each process has a logical clock, and that clock monotonically increases as events unfold. For example, if event A happens before event B in the same process, then A’s clock (or counter) value must be less than B’s. The same applies between “happened before” events across processes. To achieve this, a process that receives a message sets its counter to the maximum of the timestamp it received from the other process and its own local counter. But what happens with concurrent events? Will hopefully learn soon.

    Events Quiz

    Summary

    C(a) < C(b) means one of two things. A happened before B in the same process. Or B is the receipt of a message, and its timestamp was chosen as the max between the recipient’s local clock and the timestamp carried by the message.

    Logical Clock Conditions

    Lamport’s logical clock conditions

    Summary

    Key Words: partial order, converse of condition

    Lamport clocks give us a partial order of all the events in the distributed system. Key idea: Converse of condition 1 is not true. That is, just because timestamp of x is less than timestamp of y does not necessarily mean that x happened before y. Also, is partial order sufficient for us distributed system designers?

    Need For A Total Order

    Need for total order

    Summary

    Imagine a situation in which there’s a shared resource and individual people want access to that resource. They can send each other a text message (or any message) with a timestamp. The sender with the earliest timestamp wins. But what happens if two (or more) timestamps are the same? Then each individual person (or node) must make a local decision (in this case, oldest age of person wins).

    Lamport’s Total Order

    Lamport’s total order

    Summary

    Use timestamps to order events and break the tie with some arbitrary function (e.g. oldest age wins).
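
Here is a small sketch in C of what "timestamp first, then an arbitrary tie-breaker" looks like as a comparison function. The tie-breaker is an assumption on my part: I use the lowest process ID instead of age, and the names event_t, event_cmp and total_order are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    unsigned long timestamp; /* Lamport timestamp of the event */
    int pid;                 /* process ID, used only to break ties */
} event_t;

/* qsort comparator: earlier timestamp first, then lower pid. */
static int event_cmp(const void *x, const void *y)
{
    const event_t *a = x;
    const event_t *b = y;

    if (a->timestamp != b->timestamp)
        return a->timestamp < b->timestamp ? -1 : 1;
    return (a->pid > b->pid) - (a->pid < b->pid);
}

/* Sort events in place into the total order. */
void total_order(event_t *events, size_t n)
{
    assert(events != NULL || n == 0);
    qsort(events, n, sizeof events[0], event_cmp);
}
```

Because every node applies the same comparison, every node ends up with the same order without any extra communication.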

    Total Order Quiz

    Total order quiz

    Summary

    Identify the concurrent events and then figure out how to order those concurrent events to break the ties.

    Distributed Mutual Exclusion Lock Algorithm

    Distributed mutual exclusion lock

    Summary

    Essentially, this algorithm requires each process to send its lock requests via messages, and these messages need to be acknowledged by the other processes. Each request contains a timestamp and each host makes its own local decision, queueing (and sorting) the lock requests locally and breaking ties by preferring the lowest process ID. What’s interesting to me is that this algorithm makes a ton of assumptions, like no broken network links, no split brain and no unidirectional communication: so many failure modes that haven’t been taken into consideration. Wondering if this will be discussed in the next video lectures.
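
To check my understanding of the local decision each process makes, here is a rough sketch in C of the per-process state: a sorted request queue plus a record of who has acknowledged. It leaves out the messaging entirely, and every name in it (request_t, lock_state_t, enqueue, can_enter, MAX_PROCS) is an assumption for illustration rather than anything from the lecture.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PROCS 8 /* assumed upper bound on processes */

typedef struct {
    unsigned long ts; /* Lamport timestamp of the lock request */
    int pid;          /* requesting process */
} request_t;

typedef struct {
    request_t queue[MAX_PROCS]; /* pending requests, kept sorted */
    int len;
    bool acked[MAX_PROCS];      /* acked[p]: p acknowledged our request */
} lock_state_t;

/* True if a orders before b: earlier timestamp, then lower pid. */
static bool before(request_t a, request_t b)
{
    return a.ts < b.ts || (a.ts == b.ts && a.pid < b.pid);
}

/* Insert a request, keeping the queue sorted. */
void enqueue(lock_state_t *s, request_t r)
{
    assert(s->len < MAX_PROCS);
    int i = s->len++;
    while (i > 0 && before(r, s->queue[i - 1])) {
        s->queue[i] = s->queue[i - 1];
        i--;
    }
    s->queue[i] = r;
}

/* Process `self` may take the lock once its own request is at the head
 * of its local queue and every other process has acknowledged it. */
bool can_enter(const lock_state_t *s, int self, int nprocs)
{
    assert(nprocs <= MAX_PROCS);
    if (s->len == 0 || s->queue[0].pid != self)
        return false;
    for (int p = 0; p < nprocs; p++)
        if (p != self && !s->acked[p])
            return false;
    return true;
}
```

Writing it out makes the fragility obvious: can_enter only ever becomes true if every acknowledgement actually arrives, which is exactly the "no lost messages" assumption.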

    Distributed Mutual Exclusion Lock Algorithm (continued)

    Distributed mutual exclusion lock continued

    Summary

    Lots of assumptions of distributed mutual exclusion lock algorithm including 1) All messages are received in the order they are sent and 2) no messages are lost (this is definitely not robust enough for me at least)

    Messages Quiz

    Summary

    With Lamport’s distributed mutual exclusion lock, we need to send 3(N-1) messages: a first round for the request (which includes the timestamp), a second for the acknowledgements, and a third for the release of the lock.

    Message Complexity

    Summary

    A process can defer its acknowledgement by sending it during the unlock, reducing the message complexity from 3(N-1) to 2(N-1).
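
As a tiny worked example of those two counts, here is a sketch in C. The function names are mine, and n is the total number of processes, so each round goes to the n - 1 peers.

```c
#include <assert.h>

/* Messages for one lock/unlock cycle: request, acknowledgement and
 * release, each exchanged with the n - 1 other processes. */
unsigned msgs_basic(unsigned n)
{
    assert(n >= 1);
    return 3 * (n - 1);
}

/* Same cycle when acknowledgements are deferred and folded into the
 * unlock message, which removes one full round. */
unsigned msgs_deferred_ack(unsigned n)
{
    assert(n >= 1);
    return 2 * (n - 1);
}
```

With 5 processes that is 12 messages per lock cycle in the basic scheme versus 8 with deferred acknowledgements.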

    Real World Scenario

    Real world scenario

    Summary

    Logical clocks may not be good enough in real world scenarios. What happens if a system clock (like the bank’s clock) drifts? What should we be doing instead?

    Lamport’s Physical Clock

    Lamport’s physical clock

    Summary

    Key Words: Individual Clockdrift, Mutual Clockdrift

    Two conditions must be met in order to achieve Lamport’s physical clock. First, the individual clock drift for any given node must be relatively small. Second, the mutual clock drift (i.e. between two nodes) must be negligible as well. We shall see soon, but the clock drift must be negligible when compared to the interprocess communication (IPC) time.

    IPC time and clock drift

    IPC and Clock Drift

    Summary

    Clock drift must be negligible when compared to the IPC time. Collectively, this is captured in the inequality M >= E / (1 - k), where M is the IPC time, E (epsilon) is the mutual clock drift and k is the individual clock drift.
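
As a quick sanity check of that condition, here is a minimal sketch in C. It assumes the form given in Lamport’s paper, mu >= epsilon / (1 - kappa), with all three quantities in the same time unit; the function name and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* mu:      minimum IPC (message transit) time
 * epsilon: bound on the mutual clock drift between any two nodes
 * kappa:   bound on the individual clock drift rate, 0 <= kappa < 1
 * Returns true when the IPC time is large enough to rule out anomalies. */
bool avoids_anomalies(double mu, double epsilon, double kappa)
{
    assert(kappa >= 0.0 && kappa < 1.0);
    return mu >= epsilon / (1.0 - kappa);
}
```

With made-up numbers, a 10 ms IPC time comfortably covers a 1 ms mutual drift, while a 0.5 ms IPC time does not, because k is tiny and the right-hand side is essentially E.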

    Real World Example (continued)

    Real world example

    Summary

    Mutual clock drift must be lower than IPC time. Or put differently, the IPC time (M) must be greater than the mutual clock drift (E), with k being the individual clock drift. So the key takeaway is this: we want the individual clock drift to be low and, to eliminate anomalies, we need the IPC time to be greater than the mutual clock drift.

    Conclusion

    Summary

    We can use Lamport’s clocks to stipulate conditions to ensure deterministic behavior and avoid anomalous behaviors

  • Daily Review – Day Ending in 2020/10/12

    Daily Review – Day Ending in 2020/10/12

    Graduate School

    • Wrote up my analysis on the various barrier synchronization algorithms that I implemented. I had to describe the various algorithms (e.g. dissemination barrier, tournament barrier, centralized sense reversal barrier) for the documentation that will accompany our code and experiments as part of Project 2 for advanced operating systems.
    • Finished watching lectures on Active Networks. Didn’t find the material too interesting; however, I realized there are certainly concepts (like overlay networks) applied to our (AWS) networks.

    What I learned

    • Quote from Donald Knuth: “Beware of bugs in the above code; I have only proved it correct, not tried it.”

    Work

    • Represented my team (Blackfoot Edge Applications) at the weekly org-wide operations meeting. Every week, our senior manager runs an organization (Blackfoot) wide operations meeting where each team reviews their high severity events and dashboards from the previous week. During the meeting, I shared three particularly interesting events that took place.
  • What I learned from writing synchronization barriers

    What I learned from writing synchronization barriers

    Before starting project 2 (for my advanced operating systems course), I took a snapshot of my understanding of synchronization barriers. In retrospect, I’m glad I took 10 minutes out of my day to jot down what I did (and did not) know because now I get a clearer picture of what I learned. Overall, I feel the project was worthwhile: I gained not only some theoretical knowledge of computer science but I was also able to flex my C development skills, writing about 500 lines of code.

    Discovered a subtle race condition with lecture’s pseudo code

    Just by looking at the diagram below, it’s not obvious that there’s a subtle race condition hidden. I was only able to identify it after whipping up some code (below) and analyzing the concurrent flows. I elaborate a little more on the race condition (which results in a deadlock) in this blog post.

    Centralized Barrier

     

    [code lang=”C”]

    /*
     * Race condition possible here. Say 2 threads enter, thread A and
     * thread B. Thread A scheduled first and is about to enter the while
     * (count > 0) loop. But just before then, thread B enters (count == 0)
     * and sets count = 2. At which point, we have a deadlock, thread A
     * cannot break free out of the barrier
     */

    if (count == 0) {
        count = NUM_THREADS;
    } else {
        while (count > 0) {
            printf("Spinning …. count = %d\n", count);
        }
        while (count != NUM_THREADS) {
            printf("Spinning on count\n");
        }
    }

    [/code]
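
For comparison, here is a minimal sketch of a centralized barrier with sense reversal, which is one common way to sidestep the race above: waiters spin on a separate sense flag instead of on count, so resetting count can no longer trap anyone. This is not my submitted project code; it uses C11 atomics and pthreads, and the names (barrier_t, barrier_wait, run_barrier_demo) and the NUM_ROUNDS value are assumptions for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_THREADS 4  /* assumed thread count */
#define NUM_ROUNDS 100 /* assumed number of barrier episodes */

typedef struct {
    atomic_int count;  /* threads still to arrive in this episode */
    atomic_bool sense; /* flipped by the last thread to arrive */
} barrier_t;

static barrier_t barrier;
static atomic_int arrived;    /* total arrivals across all rounds */
static atomic_int violations; /* times a thread got through too early */

static void barrier_wait(barrier_t *b, bool *local_sense)
{
    *local_sense = !*local_sense;
    if (atomic_fetch_sub(&b->count, 1) == 1) {
        /* Last arriver: reset the count first, then release everyone. */
        atomic_store(&b->count, NUM_THREADS);
        atomic_store(&b->sense, *local_sense);
    } else {
        while (atomic_load(&b->sense) != *local_sense)
            sched_yield(); /* spin, but let other threads run */
    }
}

static void *worker(void *arg)
{
    (void)arg;
    bool local_sense = false;
    for (int round = 1; round <= NUM_ROUNDS; round++) {
        atomic_fetch_add(&arrived, 1);
        barrier_wait(&barrier, &local_sense);
        /* Past the barrier, all threads must have arrived this round. */
        if (atomic_load(&arrived) < round * NUM_THREADS)
            atomic_fetch_add(&violations, 1);
    }
    return NULL;
}

/* Runs the demo and returns how many early exits were observed. */
int run_barrier_demo(void)
{
    pthread_t tids[NUM_THREADS];

    atomic_store(&barrier.count, NUM_THREADS);
    atomic_store(&barrier.sense, false);
    atomic_store(&arrived, 0);
    atomic_store(&violations, 0);

    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tids[i], NULL);
    return atomic_load(&violations);
}
```

The key design choice is the order in the last-arriver branch: count is reset before sense is flipped, so a fast thread that re-enters the barrier always sees a fresh count.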

    Data Structures and Algorithms

    How to represent a tree based algorithm using multi-dimensional arrays in C

    For both the dissemination and tournament barrier, I had to build multi-dimensional arrays in C. I initially had a difficult time envisioning the data structure described in the research papers, asking myself questions such as “what do I use to index into the first dimension?”. Initially, my intuition was that for the tournament barrier, I’d index into the first array using the round ID, but in fact you index into the array using the rank (or thread id) and that array stores the role for each round.

    [code lang=”C”]
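    #include <stdbool.h>

    /* PARITY_BIT, MAX_ROUNDS and MAX_NUM_THREADS are defined elsewhere. */
    typedef struct {
        bool myflags[PARITY_BIT][MAX_ROUNDS];
        bool *partnerflags[PARITY_BIT][MAX_ROUNDS];
    } flags_t;

    /* One flags_t per thread: the outer index is the thread rank. */
    void flags_init(flags_t flags[MAX_NUM_THREADS])
    {
        int i, j, k;

        for (i = 0; i < MAX_NUM_THREADS; i++) {
            for (j = 0; j < PARITY_BIT; j++) {
                /* The innermost dimension is rounds, not threads. */
                for (k = 0; k < MAX_ROUNDS; k++) {
                    flags[i].myflags[j][k] = false;
                }
            }
        }
    }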
    [/code]

    OpenMP and OpenMPI

    Prior to starting, I had never heard of either OpenMP or OpenMPI. Overall, they are two impressive pieces of software that make multi-threading (and message passing) way easier, much better than dealing with the Linux pthreads library.

    Summary

    Overall, the project was rewarding and my understanding of synchronization barriers (and the various flavors) was strengthened by hands-on development. And if I ever need to write concurrent software for a professional project, I’ll definitely consider using OpenMP and OpenMPI instead of low level libraries like pthreads.

  • Daily Review – Day ending in 2020/10/11

    Daily Review – Day ending in 2020/10/11

    I’m thrilled to be “off call” in about 4.5 hours, no longer tied to my pager and no longer anxious about the possibility of waking up to the sound of a nasty alarm. Really, the anxiety revolves around the randomness and the unknown of being paged. What’s also variable is the length of these engagements: sometimes the troubleshooting takes 5 minutes and sometimes 5 hours. You just never know.

    The point is this: I’m happy to return to a normal work week.

    Best parts of my day

    • Lying in bed next to my wife at night. For the past four or five months I’ve been sleeping on the floor on a foam tri-fold mattress laid out on the uncomfortable carpet. And now that we are finally moved into our new home, I’m sleeping on a real bed and last night my wife and I lay next to one another. Sure, it was only about 5-10 minutes but hey: it’s the little things, right?

    Graduate School

    • Wrote up a paper summary on “Building Reliable High Performance Communication Systems from components”
    • Drew figures of barrier synchronization on my iPad
    • Wrote up Project 2 (barrier synchronization) work log into Google Docs to share with my project partner
    My doodle of dissemination barrier

    Thoughts

    • Amazon Web Services draws inspiration from academia. For example, the techniques and principles used to build overlay networks within the EC2 network resemble the principles from the paper Active Networks (although there are probably even earlier papers to draw inspiration from).
  • Weekly Review – Week ending in 2020/10/11

    Weekly Review – Week ending in 2020/10/11

    I’m shattered. This past week really broke me, the numerous 3:30 AM wake ups and the long operational issues running until 09:30 PM (past the time I’d like to be asleep). To recover from this taxing work week, I’m taking next Thursday and Friday off.

    Despite the rough week, I’m relieved that my wife and I are totally moved into our new home — not fully unboxed — but at least I’m waking up to a warm home.

    Work

    • Work consumed my entire life this week
    • Being tied to the laptop and pager impacts not just me but my family as well
    • Sporadic wake ups and late nights threw off the entire schedule, breaking many of my rituals
    • Because of all this I’m taking 2 days off next week to recover and catch up on lost time eaten by the heavy work week
    • Over 300 people signed up for my event at Amazon
      • Hosting a panel discussion with (4) senior engineers at Amazon on career growth and promotions
    • This particular week was more difficult than others
      • Waking up at 03:30 AM multiple nights in a row
      • Operational issues lasting until 10:00 pm

    Family

    • First week living in the new house in Renton
    • Having a child underscores how fleeting time is
    • Family has changed my entire world, flipping it upside down
    • I don’t notice the changes every day but sometimes I’ll pause and take a look at her and she’s not only radically physically changing but developmentally as well
      • She’s taking her first steps
      • She’s couch surfing
      • She’s uttering her first words (surprisingly it’s “ball”)
    • Moving to a bigger home in Renton in retrospect has been the best thing that has happened
      • Neighbor was mowing their front lawn and offered to mow ours at the same time (took them up on that offer)

    Marketing

    • Chipping away at “This is marketing” book while in bed at night

     

  • Don’t break the (writing) chain … has been broken

    Don’t break the (writing) chain … has been broken

    This week, my cumulative “write every day” streak has been broken (almost 2 months of consistent writing every day), thanks to one of the roughest weeks at work. I normally start every day off with some light blogging, even if it’s for 5 or 10 minutes, but almost every day this week I was prematurely woken up by my pager alarming me out of bed. So honestly, I couldn’t be happier that it’s Friday (TGIF, for real) even though I have 2 more days (over the weekend) of being on call; I haven’t felt this physically, mentally and emotionally exhausted in a long time. Every time I get paged out of bed I’m forced to get my mental gears in motion and it’s very difficult to switch off, making it nearly impossible to go back to sleep for a nap.

    So the days have been … very long.

    Oh well.

    So now, it’s time to reset the “cumulative days” of writing counter