Swinging on the swings with Elliott. She was sitting on my lap, facing me, as I swung us back and forth. The entire time she was smiling, and when her cold chubby cheeks brushed up against mine, some dad love feelings ran through my body.
Eating a kick ass lunch. My wife whipped up a delicious Vietnamese dish with bean sprouts and pickled cucumbers and fermented tofu.
Therapy Session
Vented about being on call the past week. I shared the handful of times that I was paged out of bed at 03:30 AM, the operational issues that lasted until 10:00 PM in the evenings, and the feeling of being tied to the computer.
Shared the sense of betrayal I felt at work. How someone who I thought was in my corner, rooting for me, actually no longer advocates for me after I declined their “opportunity” to lead a project.
Declining the project that “sets me back for a promotion” allows me to bring my best self to work. Although one could argue that not accepting the “opportunity” to lead a project (that would’ve easily tacked on an additional 5-10 hours per week) sets me back for my promotion, I did it to maintain a balance between my professional and personal life, which quite frankly is what I am all about: living a balanced life, not one where I am endlessly pursuing the next thing.
Realized that I need to build a stronger community of supporters. My takeaway from this sense of betrayal is that I need to widen my network of supporters and not put all my eggs in one basket, so to speak.
Random
Cool Quotes
“Academia is ripe for pursuing ideas on the lunacy fringe”.
Graduate School
Skipping Java and Spring (for now) and jumping into Distributed System lectures. I almost always follow the curriculum in the order prescribed by the syllabus. But I’ve decided to skip (for now) some video lectures on Java and Spring… and this makes me a little bit anxious. Honestly, I find the Java and Spring lectures a bit boring (to be fair, I might actually enjoy them once I start watching the lectures). Right now, I’m interested in learning more about the fundamentals and theories of constructing distributed systems so I’ll jump to those videos.
Music
Dusted off and played the ukulele for Jess and Elliott. I feel like I abandoned the ukulele after picking up the guitar. And strumming my beautiful soprano ukulele yesterday reminded me that the instrument has its own vibe and its own personality, the strings sounding much more … bright.
Family
Fell asleep at 06:30 PM. I normally fall asleep between 08:30 and 09:30, on both weekdays and weekends. But yesterday I was shattered after my sleep was interrupted several times throughout the night, most notably when Jess had a coughing fit at around 03:00 AM, at which point I was pretty much unable to fall back asleep.
Slept in the same bed as Elliott and Jess last night. This was pretty sweet actually. Although I woke up multiple times throughout the night, I loved having Elliott and Jess in the same bed since I’ve been sleeping alone — well, with the dogs — for the last six months or so.
Wrapped a baby net around the stair railings. Installing the net (my sister’s idea) prevents little Elliott from slipping through the rails. Although I do not think her head could possibly squeeze through, the net gives Jess peace of mind, and that’s important since she’s the one watching Elliott throughout the day.
Took Elliott shopping at Safeway while Jess took her work call. We picked up some avocados for our morning smoothies and some blackberries and blueberries (side note: why does a palm sized box of blueberries cost $6.00?). While we were paying for our goods, our cashier awkwardly pulled down her mask for Elliott to see her face. Outside the context of COVID-19, this would be totally normal and even appreciated. But given we are in the midst of the pandemic, I felt awkward, though I didn’t feel compelled to ask her to put her mask back on given there was a glass screen positioned between us.
Graduate School
Lectures
Did not watch any lectures but did download all the videos for Spring Operating systems. I normally do not download the videos to my laptop; however, the wind was hitting us pretty hard and our lights were flickering, so I was preparing for the situation in which our internet disconnected.
Project Work
Completed writing documentation for project 2 submission. This included an analysis of the algorithms that I wrote (i.e. centralized barrier with sense reversal, dissemination and tournament) and why some barriers performed better (or worse) than other barriers.
Skimmed through the requirements for project 3 (i.e. build a distributed service using gRPC). This project got announced Monday evening and will be due in about 3-4 weeks, and I like to get started early to avoid last minute cramming, which leads to unnecessary stress.
Now that we have talked about “happened before” events, we can talk about Lamport clocks.
Lamport’s Logical Clock
Summary
Each process keeps a logical clock (a counter) that increases monotonically as events unfold. For example, if event A happens before event B in the same process, then A’s clock (or counter) value must be less than B’s. The same applies to “happened before” events across processes: the timestamp of a message’s receipt must be greater than the timestamp of its send. To achieve this, the receiving process sets its counter to a value greater than the maximum of the timestamp carried in the message and its own local counter. But what happens with concurrent events? Will hopefully learn soon.
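To make the update rule concrete for myself, here is a minimal sketch of a Lamport clock in C. The type and function names (`lamport_clock_t`, `lamport_tick`, `lamport_send`, `lamport_receive`) are my own inventions for illustration, not something from the lectures, and the sketch assumes the common convention of bumping the counter by one on every event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process logical clock; the names are illustrative only. */
typedef struct {
    uint64_t counter;
} lamport_clock_t;

/* Local event: advance the clock and return the event's timestamp. */
static uint64_t lamport_tick(lamport_clock_t *clk)
{
    assert(clk != NULL);
    return ++clk->counter;
}

/* Send event: sending is itself an event, so tick and attach the
 * returned timestamp to the outgoing message. */
static uint64_t lamport_send(lamport_clock_t *clk)
{
    return lamport_tick(clk);
}

/* Receive event: move past both the local counter and the timestamp
 * carried by the message, so the receipt is ordered after the send. */
static uint64_t lamport_receive(lamport_clock_t *clk, uint64_t msg_timestamp)
{
    assert(clk != NULL);
    if (msg_timestamp > clk->counter) {
        clk->counter = msg_timestamp;
    }
    return ++clk->counter;
}
```

With two clocks starting at zero, a local event followed by a send on one process and a receive on the other yields strictly increasing timestamps along that chain of events.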
Events Quiz
Summary
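C(a) < C(b) means one of two things. Either A happened before B in the same process, or B is the receipt of a message, in which case the recipient chose a timestamp greater than the max of its local clock and the timestamp carried in the message.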
Logical Clock Conditions
Lamport’s logical clock conditions
Summary
Key Words: partial order, converse of condition
Lamport clocks give us a partial order of all the events in the distributed system. Key idea: Converse of condition 1 is not true. That is, just because timestamp of x is less than timestamp of y does not necessarily mean that x happened before y. Also, is partial order sufficient for us distributed system designers?
Need For A Total Order
Need for total order
Summary
Imagine a situation in which there’s a shared resource and individual people want access to that resource. They can send each other a text message (or any message) with a timestamp. The sender with the earliest timestamp wins. But what happens if two (or more) timestamps are the same? Then each individual person (or node) must make a local decision using a rule everyone agrees on (in this case, the oldest person wins).
Lamport’s Total Order
Lamport’s total order
Summary
Use timestamps to order events and break ties with some arbitrary function (e.g. oldest age wins).
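Here is a rough sketch of what that ordering could look like in C. I am assuming the tie-break is "lowest process ID wins" rather than age, and the `event_t` type and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical event record: a Lamport timestamp plus the ID of the
 * process that produced it. */
typedef struct {
    uint64_t timestamp;
    uint32_t pid;
} event_t;

/* Total order: compare timestamps first, then break ties with the
 * process ID (assumption: the lowest ID wins). */
static int event_cmp(const void *lhs, const void *rhs)
{
    const event_t *a = lhs;
    const event_t *b = rhs;

    if (a->timestamp != b->timestamp) {
        return a->timestamp < b->timestamp ? -1 : 1;
    }
    if (a->pid != b->pid) {
        return a->pid < b->pid ? -1 : 1;
    }
    return 0;
}

/* Sort events into the total order defined by event_cmp. */
static void total_order_sort(event_t *events, size_t n)
{
    if (n == 0) {
        return;
    }
    assert(events != NULL);
    qsort(events, n, sizeof *events, event_cmp);
}
```

Any deterministic tie-break works as long as every node applies the same one, which is the whole point: each node can then reach the same decision locally.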
Total Order Quiz
Total order quiz
Summary
Identify the concurrent events and then figure out how to order those concurrent events by breaking the ties.
Distributed Mutual Exclusion Lock Algorithm
Distributed mutual exclusion lock
Summary
Essentially, this algorithm requires each process to send its lock requests via messages, and these messages need to be acknowledged by the other processes. Each request contains a timestamp and each host makes its own local decision, queueing (and sorting) the lock requests locally and breaking ties by preferring the lowest process ID. What’s interesting to me is that this algorithm makes a ton of assumptions, like no broken network links, no split brain and no unidirectional communication: so many failure modes that haven’t been taken into consideration. Wondering if this will be discussed in the next video lectures.
Lots of assumptions in the distributed mutual exclusion lock algorithm, including 1) all messages are received in the order they are sent and 2) no messages are lost (this is definitely not robust enough, for me at least).
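To check my understanding of the local decision each process makes, here is a simplified sketch in C. It assumes a process may take the lock once its own request sorts ahead of every other queued request and it has heard an acknowledgement from all N-1 peers; the names (`lock_request_t`, `can_enter`) are hypothetical, and the sketch leans on the same assumptions as the algorithm (in-order delivery, no lost messages).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lock request: Lamport timestamp plus requesting process ID. */
typedef struct {
    uint64_t timestamp;
    uint32_t pid;
} lock_request_t;

/* True if request a is ordered before request b: earlier timestamp
 * first, ties broken by the lowest process ID. */
static bool request_precedes(lock_request_t a, lock_request_t b)
{
    if (a.timestamp != b.timestamp) {
        return a.timestamp < b.timestamp;
    }
    return a.pid < b.pid;
}

/* Local decision: my request must precede every other request in my
 * queue, and every other process must have acknowledged it. */
static bool can_enter(lock_request_t mine,
                      const lock_request_t *others, size_t num_others,
                      size_t acks_received, size_t num_processes)
{
    assert(num_processes >= 1);
    assert(others != NULL || num_others == 0);

    if (acks_received < num_processes - 1) {
        return false;
    }
    for (size_t i = 0; i < num_others; i++) {
        if (!request_precedes(mine, others[i])) {
            return false;
        }
    }
    return true;
}
```

Nothing here handles a peer that never answers, which is exactly the kind of failure mode the algorithm assumes away.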
Messages Quiz
Summary
With Lamport’s distributed mutual exclusion lock, we need to send 3(N-1) messages per lock and unlock: N-1 lock requests carrying the timestamp, N-1 acknowledgements, and N-1 messages to release the lock.
Message Complexity
Summary
A process can defer its acknowledgement by sending it during the unlock, reducing the message complexity from 3(N-1) to 2(N-1).
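The arithmetic is simple, but writing it down helps me see the saving. A tiny sketch, with function names of my own choosing:

```c
#include <assert.h>

/* Messages for one lock/unlock cycle in the basic algorithm:
 * N-1 requests + N-1 acknowledgements + N-1 releases. */
static unsigned messages_basic(unsigned num_processes)
{
    assert(num_processes >= 1);
    return 3 * (num_processes - 1);
}

/* With deferred acknowledgements the ack rides along with the unlock,
 * so only two rounds of N-1 messages remain. */
static unsigned messages_deferred_ack(unsigned num_processes)
{
    assert(num_processes >= 1);
    return 2 * (num_processes - 1);
}
```

For example, with 5 processes the basic algorithm sends 12 messages per lock/unlock cycle and the deferred version sends 8.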
Real World Scenario
Real world scenario
Summary
Logical clocks may not be good enough in real world scenarios. What happens if a system clock (like the bank’s clock) drifts? What should we be doing instead?
Two conditions must be met in order to achieve Lamport’s physical clock. First, the individual clock drift for any given node must be relatively small. Second, the mutual clock drift (i.e. between two nodes) must be negligible as well. We shall see soon that the clock drift must be negligible when compared to the interprocess communication (IPC) time.
IPC time and clock drift
IPC and Clock Drift
Summary
Clock drift must be negligible when compared to the IPC time. Collectively, this is captured in the inequality M >= E / (1 - k), where M is the IPC time, E (epsilon) is the mutual clock drift and k is the individual clock drift.
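As a sanity check, here is a small C sketch of that condition, assuming it takes the form given in Lamport's paper, M >= E / (1 - k). The function and parameter names are mine, and all three quantities are assumed to be expressed in consistent units (e.g. seconds for M and E, a unitless rate for k).

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the IPC time is large enough, relative to the clock
 * drifts, that anomalies cannot occur: M >= E / (1 - k).
 *   ipc_time         M, the minimum interprocess communication time
 *   mutual_drift     E, the maximum drift between any two clocks
 *   individual_drift k, the maximum drift rate of a single clock (0 <= k < 1)
 */
static bool no_anomalies(double ipc_time, double mutual_drift,
                         double individual_drift)
{
    assert(individual_drift >= 0.0 && individual_drift < 1.0);
    return ipc_time >= mutual_drift / (1.0 - individual_drift);
}
```

With an IPC time of 10 ms, a mutual drift of 1 ms and a tiny individual drift the condition holds; shrink the IPC time to 0.5 ms and it no longer does.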
Real World Example (continued)
Real world example
Summary
Mutual clock drift must be lower than the IPC time. Or put differently, the IPC time (M) must be greater than the mutual clock drift (E). So the key takeaway is this: we want the individual clock drift (k) to be low and, to eliminate anomalies, we need the IPC time to be greater than the mutual clock drift.
Conclusion
Summary
We can use Lamport’s clocks to stipulate conditions that ensure deterministic behavior and avoid anomalous behaviors.
Wrote up my analysis on the various barrier synchronization algorithms that I implemented. I had to describe the various algorithms (e.g. dissemination barrier, tournament barrier, centralized sense reversal barrier) for the documentation that will accompany our code and experiments as part of Project 2 for advanced operating systems.
Finished watching lectures on Active Networks. Didn’t find the material too interesting; however, I realized there are certainly concepts (like overlay networks) that are applied to our (AWS) networks.
What I learned
Quote from Donald Knuth: “Beware of bugs in the above code; I have only proved it correct, not tried it.”
Work
Represented my team (Blackfoot Edge Applications) at the weekly org wide operations meeting. Every week, our senior manager runs an organization (Blackfoot) wide operations meeting where each team reviews their high severity events and dashboards from the previous week. During the meeting, I shared three particularly interesting events that took place.
Before starting project 2 (for my advanced operating systems course), I took a snapshot of my understanding of synchronization barriers. In retrospect, I’m glad I took 10 minutes out of my day to jot down what I did (and did not) know because now I get a clearer picture of what I learned. Overall, I feel the project was worthwhile: I gained not only some theoretical knowledge of computer science but was also able to flex my C development skills, writing about 500 lines of code.
Discovered a subtle race condition with lecture’s pseudo code
Just by looking at the diagram below, it’s not obvious that there’s a subtle race condition hidden in it. I was only able to identify it after whipping up some code (below) and analyzing the concurrent flows. I elaborate a little more on the race condition, which results in a deadlock, in this blog post.
Centralized Barrier
[code lang=”C”]
/*
 * Race condition possible here. Say 2 threads enter, thread A and
 * thread B. Thread A is scheduled first and is about to enter the
 * while (count > 0) loop. But just before then, thread B enters
 * (count == 0) and sets count = 2. At which point, we have a deadlock:
 * thread A cannot break free out of the barrier.
 */
if (count == 0) {
    count = NUM_THREADS;
} else {
    while (count > 0) {
        printf("Spinning ... count = %d\n", count);
    }
    while (count != NUM_THREADS) {
        printf("Spinning on count\n");
    }
}
[/code]
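For contrast, here is a minimal sketch of the sense reversal flavor of the centralized barrier, which sidesteps this deadlock because waiting threads spin on a separate sense flag instead of on count. This is not my project code: it assumes C11 atomics are available, and the names (`sr_barrier_t`, `sr_barrier_init`, `sr_barrier_wait`) are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sense-reversing centralized barrier. */
typedef struct {
    atomic_int  count;        /* threads still to arrive in this episode */
    atomic_bool sense;        /* global sense, flipped once per episode  */
    int         num_threads;  /* number of participating threads         */
} sr_barrier_t;

static void sr_barrier_init(sr_barrier_t *b, int num_threads)
{
    assert(b != NULL && num_threads > 0);
    atomic_init(&b->count, num_threads);
    atomic_init(&b->sense, false);
    b->num_threads = num_threads;
}

/* Each thread owns a local_sense variable, initialized to false. */
static void sr_barrier_wait(sr_barrier_t *b, bool *local_sense)
{
    assert(b != NULL && local_sense != NULL);

    /* Flip the private sense: this is the value the global sense will
     * take once everyone has arrived. */
    *local_sense = !*local_sense;

    if (atomic_fetch_sub(&b->count, 1) == 1) {
        /* Last thread to arrive: reset count first, then release the
         * others by publishing the new sense. */
        atomic_store(&b->count, b->num_threads);
        atomic_store(&b->sense, *local_sense);
    } else {
        /* Everyone else spins on sense only, never on count, so the
         * reset above cannot strand a waiting thread. */
        while (atomic_load(&b->sense) != *local_sense) {
            /* spin */
        }
    }
}
```

The key design choice is that count is only ever reset by the last arriver before it flips sense, so no thread can observe a half-finished episode.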
Data Structures and Algorithms
How to represent a tree based algorithm using multi-dimensional arrays in C
For both the dissemination and tournament barrier, I had to build multi-dimensional arrays in C. I initially had a difficult time envisioning the data structure described in the research papers, asking myself questions such as “what do I use to index into the first index?”. Initially, my intuition thought that for the tournament barrier, I’d index into the first array using the round ID but in fact you index into the array using the rank (or thread id) and that array stores the role for each round.
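[code lang=”C”]
/* flags is indexed by thread id (rank), so it holds one entry per thread. */
void flags_init(flags_t flags[MAX_NUM_THREADS])
{
    int i, j, k;

    /* Clear every flag, for every thread and each parity value. */
    for (i = 0; i < MAX_NUM_THREADS; i++) {
        for (j = 0; j < PARITY_BIT; j++) {
            for (k = 0; k < MAX_NUM_THREADS; k++) {
                flags[i].myflags[j][k] = false;
            }
        }
    }
}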
[/code]
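To illustrate the "index by thread id first, then by round" layout for the tournament barrier, here is a small sketch of a role table. The role assignment follows my reading of the rules in the Mellor-Crummey and Scott paper; the constants, enum and function names are hypothetical and not taken from my project code.

```c
#include <assert.h>

#define MAX_NUM_THREADS 8   /* hypothetical upper bound on threads */
#define MAX_ROUNDS      3   /* log2(MAX_NUM_THREADS) */

typedef enum {
    ROLE_DROPOUT,   /* round 0 placeholder */
    ROLE_WINNER,    /* waits for its loser, then advances */
    ROLE_LOSER,     /* signals its winner, then waits for wakeup */
    ROLE_BYE,       /* no opponent this round, advances for free */
    ROLE_CHAMPION,  /* thread 0 in the final round, starts the wakeup */
    ROLE_UNUSED     /* already lost in an earlier round */
} role_t;

/* First index: thread id (rank). Second index: round number. */
static role_t roles[MAX_NUM_THREADS][MAX_ROUNDS + 1];

static void roles_init(int num_threads, int num_rounds)
{
    assert(num_threads > 0 && num_threads <= MAX_NUM_THREADS);
    assert(num_rounds >= 0 && num_rounds <= MAX_ROUNDS);

    for (int id = 0; id < num_threads; id++) {
        roles[id][0] = ROLE_DROPOUT;
        for (int k = 1; k <= num_rounds; k++) {
            int stride = 1 << k;      /* 2^k     */
            int half   = stride >> 1; /* 2^(k-1) */
            role_t r = ROLE_UNUSED;

            if (id % stride == 0) {
                if (id == 0 && stride >= num_threads) {
                    r = ROLE_CHAMPION;
                } else if (id + half < num_threads) {
                    r = ROLE_WINNER;
                } else {
                    r = ROLE_BYE;
                }
            } else if (id % stride == half) {
                r = ROLE_LOSER;
            }
            roles[id][k] = r;
        }
    }
}
```

With 4 threads and 2 rounds, thread 0 is the winner of round 1 and the champion of round 2, while thread 2 wins round 1 and loses round 2.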
OpenMP and OpenMPI
Prior to starting, I had never heard of either OpenMP or OpenMPI. Overall, they are two impressive pieces of software that make multi-threading (and message passing) way easier, much better than dealing with the Linux pthreads library directly.
Summary
Overall, the project was rewarding and my understanding of synchronization barriers (and their various flavors) was strengthened by hands-on development. And if I ever need to write concurrent software for a professional project, I’ll definitely consider using OpenMP and OpenMPI instead of low level libraries like pthreads.
I’m thrilled to be “off call” in about 4.5 hours, no longer tied to my pager and no longer anxious about the possibility of waking up to the sound of a nasty alarm. Really, the anxiety revolves around the randomness and the unknown of being paged. What’s also variable is the length of these engagements: sometimes the troubleshooting takes 5 minutes and sometimes 5 hours. You just never know.
The point is this: I’m happy to return to a normal work week.
Best parts of my day
Lying in bed next to my wife at night. For the past four or five months I’ve been sleeping on the floor, on a tri-fold foam mattress laid out on the uncomfortable carpet. And now that we are finally moved into our new home, I’m sleeping on a real bed, and last night my wife and I lay next to one another. Sure, it was only about 5-10 minutes but hey: it’s the little things, right?
Graduate School
Wrote up a paper summary on “Building Reliable High Performance Communication Systems from Components”
Drew figures of barrier synchronization on my iPad
Wrote up Project 2 (barrier synchronization) work log into Google Docs to share with my project partner
My doodle of dissemination barrier
Thoughts
Amazon Web Services draws inspiration from academia. For example, the techniques and principles used to build overlay networks within the EC2 network resemble the principles from the Active Networks paper (although there are probably even earlier papers to draw inspiration from).
I’m shattered. This past week really broke me, the numerous 3:30 AM wake ups and the long operational issues running until 09:30 PM (past the time I’d like to be asleep). To recover from this taxing work week, I’m taking next Thursday and Friday off.
Despite the rough week, I’m relieved that my wife and I are totally moved into our new home — not fully unboxed — but at least I’m waking up to a warm home.
Work
Work consumed my entire life this week
Being tied to the laptop and pager impacts not just me but my family as well
Sporadic wake ups and late nights threw off the entire schedule, breaking many of my rituals
Because of all this I’m taking 2 days off next week to recover and catch up on lost time eaten by the heavy work week
Over 300 people signed up for my event at Amazon
Hosting a panel discussion with four senior engineers at Amazon on career growth and promotions
This particular week was more difficult than others
Waking up at 03:30 AM multiple nights in a row
Operational issues lasting until 10:00 pm
Family
First week living in the new house in Renton
Having a child underscores how fleeting time is
Family has changed my entire world, flipping it upside down
I don’t notice the changes every day but sometimes I’ll pause and take a look at her and she’s not only radically physically changing but developmentally as well
She’s taking her first steps
She’s couch surfing
She’s uttering her first words (surprisingly it’s “ball”)
Moving to a bigger home in Renton in retrospect has been the best thing that has happened
Neighbor was mowing their front lawn and offered to mow ours at the same time (took them up on that offer)
Marketing
Chipping away at “This is marketing” book while in bed at night
This week, my cumulative “write every day” streak has been broken (almost 2 months of consistent writing every day), thanks to one of the roughest weeks at work. I normally start every day off with some light blogging, even if it’s for 5 or 10 minutes, but almost every day this week I was prematurely woken up by my pager alarming me out of bed. So honestly, I couldn’t be happier that it’s Friday (TGIF, for real) even though I have 2 more days (over the weekend) of being on call; I haven’t felt this physically, mentally and emotionally exhausted in a long time. Every time I get paged out of bed I’m forced to get my mental gears in motion and it’s very difficult to switch off, making it nearly impossible to go back to sleep for a nap.
So the days have been … very long.
Oh well.
So now, it’s time to reset the “cumulative days” of writing counter
In yesterday’s post, I mentioned that I was paged out of bed at around 03:30 AM because of an operational issue. For the rest of the day, my mind was fried and I practically looked like a zombie, my words constantly slurred. On top of all that, a different issue cropped up in the evening, the event lasting several hours and robbing me of dinner with my family. Ugh.
Silver lining: my colleague Paul, a god damn saint, overrode the night time shift and took my on call overnight, allowing me to get some much needed rest. The next morning, I sent an e-mail over to my manager, praising Paul and making sure that his good deed does not go unnoticed.
Sadly I didn’t get to start the day off with writing, my morning routine, since my phone paged me out of bed at 3 AM due to an operational issue from work that lasted about three hours. Because of this, I know that I’ll feel “off” the rest of the day given that I’ve been awake for over 5 hours and the time is barely 9 AM. Oh well. Time to heat up another Chai Latte to keep me awake for the day …