Author: mattchung

  • Daily Review – Day ending in 2020/11/04

    Daily Review – Day ending in 2020/11/04

    My favorite part of the day was waking up to a video that Jess recorded while I was asleep, a video capturing a little frog dancing on our window facing the backyard. My wife: she’s super cute.

    Work

    • Hosted and moderated a tech panel on career growth and promotions. On behalf of Asians@ at Amazon, I lead a conversational fire side chat with four (Asian) senior software development engineers about some of the non technical barriers that our community often faces throughout our career. Not speaking up. Not advocating for one self. Everyone on the panel nailed it (despite how nervous they were) and I look forward to creating and hosting new events in the future.

    Health

    • Fought off a killer headache that kicked in after I ate che (a Vietnamese dessert). My body rejects processed sugars and when I consume too much of it, my body sends me a strong signal in the form of a painful headache that lasts several hours, the only way to fight it off is to gulp down as much water as possible and flush the sugars out of my system.

    Daily Photo

    Dancing frog on the window
  • Daily Review – Day ending in 2020/11/03

    Daily Review – Day ending in 2020/11/03

    Family

    • Jess and I felt nervous about the presidential election results while watching the news online. Despite Biden leading in the polls, just as Hilary four years ago back in 2016, I’m sitting at the edge of my seat as the results come in, not confident at all that Biden can pull off a victory. Again, as I mentioned in earlier posts, although I cast my vote for Biden, I’m more voting to get Trump out.

    Graduate School

    • Learned of a brilliant technique to simplify crash recovery by removing an edge case and exercising the same code as the abort transaction. Technique from RioVista papers. The crash recovery is also idempotent so it can withstand power failure during the crash recovery itself. Idempotent. First heard of this term when I was learning Ansible back in the day. Something to think about again.

    Work

    • Checked in some GRE header code that basically implements RFC 2784 and 2890. Stepping through the RFC, I learned how much unnecessary overhead the protocol carries with it. A 3 bit version field that MUST always be zero? Granted, I understand that the protocol designers were trying to future proof the protocol but now I understand why a protocol like IPIP prevails given that IPIP will encapsulate only IP.

     

  • Daily Review – Day ending in 2020/11/02

    Daily Review – Day ending in 2020/11/02

    Family

    Elliott and I drove to Lowe’s home improvement store and picked up a new faucet. Normally, Elliott will cry after sitting in the car for more than 10 minutes but not this trip. She did amazing and the two us had a blast at Lowe’s and she helped me select our new faucet since the one installed in our kitchen busted, causing water to leak into and underneath the sink. Luckily, my dad is in town and helped replace the existing broken faucet with the one I just purchased.

    Graduate School

    Submitted project 3 assignment on gRPC and multi-threading. Similar to project 2, I partnered up with another Seattle based student (who also happens to be Vietnamese as well) and we divided the work down the middle and stitched our code together.

    Watched half the lectures on RioVista. This lecture series seems to be a continuation of the last lecture series (on lightweight recoverable virtual machines), adding on top of the idea of using transactions to keep a log record of changes to the virtual memory address space. The key different with RioVista though is that they are targeting performance as their primary goal.

    Health

    Walked about 3 miles on the treadmill yesterday while working and studying. Last week, I could not find the remote that controls my treadmill but luckily Jess found it buried in the back of my office and luckily I was able to get some exercise in.

    Work

    Held a dry-run kick off meeting for a panel that I’m hosting and moderating this upcoming Wednesday. The purpose of the panel is to give people a sense of what it looks like to grow one’s career as a software engineer at Amazon.

  • Weekly Review – Week ending in 2020/11/01

    Weekly Review – Week ending in 2020/11/01

    No Halloween this year

    I used to love Halloween growing up, not so much the dressing up part but the knocking on doors and getting handed fist fulls of candy. Now, as an adult, I love returning the favor and always think about giving out larger than average candy and chocolate.

    But not this year, thanks to COVID-19.

    Hopefully 2020 will be the one and only year that we skip Halloween …

    Starting writing my first e-book

    I’m compiling all my blog posts on “advanced operating systems refresher” into a single, nicely formatted e-book. The book will provide a summary and detailed notes on Udacity’s Advanced Refresher course.

    Media Consumption

    Watched the first two episodes of “This is Us”. Again, as I mentioned in my blog posts, the writer’s (and cast and crew) deserve a huge applaud for pivoting and incorporating two major events in history — the COVID-19 pandemic and police brutality on black lives — into the story line. That’s no easy feat but they are pulling it off.

    Watched Borat subsequent film. I found the film hilarious and ingenious. Reveals how complicated people can be. For example, two trump supporters end up taking Borat into their homes and at one point, they even speak up on behalf of women.

    Home Care

    I’m super motivated keeping our new home in tip top shape. I had learned that the previous owner’s took a lot of pride in the house, the retired couple out in the front or back yard on a daily basis, the two of them maintaining the lawn and plants.

    Learning how to take care of the lawn. That includes learning the different modes of mowing (i.e. mulching, side discharge, bagging), the difference between slow release and fast release nitrogen, the importance of aerating, the importance of applying winterizer two weeks before the last historical freeze day, how to edge properly and so on.

    Family

    Still working from home and still appreciate the little gestures from Jess throughout the day. I sometimes get lost in a black hole of thoughts and troubleshooting, not drinking any water or eating snacks for hours at a time. So the little snacks that Jess drops off go a long way.

    Health

    Although taking care of mental health, not so much physical health. Only exercised once last week which was basically jogging on the treadmill.

    Graduate School

    Applied theory of lightweight recoverable virtual machine to work. In advanced operating systems, I took the concept of the abort transaction and suggested that we install a similar handler in our control plane code.

  • Daily Review – Day ending in 2020/11/01

    Daily Review – Day ending in 2020/11/01

    What a weird morning. Totally forgot the the time changed, the clock moving back an hour (i.e. we gain an hour), and found myself studying and working at 03:30 AM (instead of 04:30 AM). When I first woke up and glanced at my Casio watch, I debated whether to get out of bed or not since I was my body was telling me it needed a little more rest. However, knowing that Elliott would soon wake me up and knowing that I had a very small window of time to myself, I rolled out of bed. Parent life.

    Home care

    • Manually aerated the front yard using core aeration tool. The video advertisements for aerating make the process of aerating seem so pleasant. It’s not. It’s actually a lot of hard work and I broke a massive sweat aerating the front yard.I totally understand now why people rent aerating machines at Home Depot (or Lowe’s) for $20.00 bucks an hour. Renting a machine would make the work so much easier.
    • Fed the lawn fertilizer. After aerating the lawn, I poured about 2.2 pounds of fertilizer into the Ace branded bucket, transferred the food into the spreader, and then dispensed the it all over the front yard.

    Family

    • Family trip to Lam’s seafood located in Tequila. This Lam’s seafood location blows the other location (i.e. International District) out of the water. The location in Tequila is not only larger, but it’s cleaner and closely resembles the layout and feel of Uwajimaya.  Shopping in the store reminds me of growing up in Little Saigon, all the Vietnamese chatter and all the people bustling in the background.
    • Dropped off our ballot at the library located across the street. Just before sliding the ballot into the drop off box, I sat in the front seat of the car (with the two dogs next to me) with the Stranger’s cheat sheet opened up, filling out the last few choices that I was undecided.

     

  • Five tips for surviving (or thriving) in the OMSCS program as a computer science graduate student

    Five tips for surviving (or thriving) in the OMSCS program as a computer science graduate student

    Overview

    In this post, I’m sharing five tips that I’ve picked up over the last 2 years in the program. At the time of this writing, I’m wrapping up my 7th course (advanced operating systems) in the OMSCS program. This means that I have 3 more courses to complete until I graduate with my masters in computer science from Georgia Tech.

    This post assumes that you are already enrolled as a student in the program. If you are still on the fence as to whether or not you should join the program, check out Adrian’s blog post on “Georgia Tech Review: yes, maybe, no”. Very shortly, I’ll post my own thoughts and recommendations on whether or not I think you should join the program.

    Five Tips

    Prerequisites

    Before enrolling in a course, check the course’s homepage for recommended prerequisites. Once you are officially accepted into the program,  you’ll soon discover that unlike undergraduate, you can enroll in courses despite not meeting prerequisites. The upside of this relaxed approach is that students who are really eager to take a course can do so (assuming seats are available). The downside is that if you are not prepared for the course, you’ll be way in over your head and likely will drop the course within the first few weeks since many courses are front loaded.  For example, thinking about taking advanced operating systems? Before jumping the gun, take the diagnostic exam. Doing so will help you gauge whether or not you set up for success.

    Time management

    Manage your time by setting and schedule and sticking to it. If you are an early bird like me, carve out 50 minutes early in the morning, take a shot of your expresso, and get cracking.  Set specific times in which you will sit down and study (e.g. watch lectures, review notes, work on a project) and aim to stick to that schedule (as a new parent, I understand that plans often can easily be foiled).

    Trade offs

    Know when and what to slip or sacrifice. This is super important. We all 24 hours in a day. And we all have other responsibilities such as a full time job or children to take care of. Or something unexpectedly will just pop up in your life that demands your attention. That’s okay. Graduate school (and doing well in graduate school) is important. But there will be other competing motivators and sometimes, you’ll have to sacrifice your grade. That’s okay.

    Expectations

    Know what you want to get out of the program. What are you trying to get out of the program? Or what are you trying to get out of the current course you are enrolled in? Are you here to learn and improve your craft? Then focus on that. Or are you here to get a good grade? Then do that. Are you here to expand your network and potentially apply to a PhD program? Then make sure you attend the office hours and meet with the professors.

    Course reviews

    Read the reviews on OMSCentral.  If you haven’t read reviews on OMSCentral.com for the courses you plan to take (or are currently taking), please go and do that. Right now. On that website, you’ll find reviews from former students. They give you a better sense of what you’ll learn, what projects you’ll tackle, how engaged the professor is , whether the grades are curved, how difficulty the exams are, and maybe most important: how many hours you should expect to put into the course.

    Notifications

    Create e-mail (or text) notifications in your calendar for upcoming due dates. It sucks when you forget that an upcoming assignment is due soon. Sucks even more when you find out that the due date has passed. Although you can rely on Canvas (Georgia Tech’s current system for managing their courses), I still recommend using your own system.

    Summary

    The above tips do not guarantee a smooth journey. Chaos will somehow creep its way into our time and things will get derailed. That’s okay. Take a deep breathe and revisit the tip above on prioritization. Remind yourself why you are in graduate school. Now go out there and kick some ass.

  • Distributed Shared Memory (Part 2 of 2) Notes

    Distributed Shared Memory (Part 2 of 2) Notes

    An example

    Example of release consistency model

    Summary

    Key Words: Conditional variable, pthread_signal, pthread_wait

    in the concrete example (screenshot below), P1 instructions that update memory (e.g. flag = 1) can be run in parallel with that of P2 because of release consistency model

    Advantage of RC over SC

     

    Summary

    In a nutshell, we gain performance in a shared memory model using release consistency by overlapping computation with communication, because we no longer wait for coherence actions for every memory access

    Lazy RC (release consistency)

    Lazy Release Consistency

    Summary

    Key Words: Eager

    The main idea here is that the “release consistency” is eager, in the sense that cache coherence traffic is generated immediately after unlock occurs. But with lazy RC, we defer that cache coherence traffic until the acquisition

    Eager vs Lazy RC

    Eager vs Lazy Consistency

    Summary

    Key Words: Eager, Lazy

    Basically, eager and lazy goes boils down to a push (i.e. eager) versus pull (i.e. lazy) model. In the former, every time the lock is released, coherence traffic broadcasts to all other processes

    Pros and Cons of Lazy and Eager

    Summary

    Advantage of lazy (over eager) is that there are less messages however there will be more latency during acquisition

    Software DSM

    Software DSM

    Summary

    Address space is partitioned, meaning each processor is responsible for a certain set of pages. This model of ownership is a distributed, and each node holds metadata about the page and is responsible for sending coherence traffic (at the software level)

    Software DSM (Continued)

    Software DSM (continued)

    Summary

    Key Words: false sharing

    DSM software runs on each processor (cool idea) in a single writer multiple reader model. This model can be problematic because, coupled with false sharing, will cause significant bus traffic that ping pongs updates when multiple data structures live within the same cache line (or page)

    LRC with Mutli-Writer Coherence Protocol

    Lazy Release Consistency with multi-writer coherence

    Summary

    With lazy release consistency, a process will (during the critical section) generate a diff of the pages that have been modified, the diff later applied when another process performs updates to those same pages

    LRC with Multi-Writer Coherence Protocol (Continued)

    Summary

    Need to be able to apply multiple diffs in a row, say Xd and Xd’ (i.e. prime)

    LRC with Multi Writer Coherence Protocol (Continued)

    Summary

    Key Words: Multi-writer

    The same page can be modified at the same time by multiple threads, just so as long as a separate lock is used

    Implementation

    Implementation of LRC

    Summary

    Key Words: Run-length encoded

    During a write operation (inside of a lock), a twin page will get created, essentially a copy of the original page. Then, during release, a run-length encoded diff is computed. Following this step, the memory access is then write protected

    Implementation (continued)

    LRC Implementation (Continued)

    Summary

    Key Words: Data Race, watermark, garbage collection

    A daemon process (in every node) wakes up periodically and if the number of diffs exceed the watermark threshold, then daemon will apply diffs to original page. All in all, keep in mind that there’s overhead involved with this solution: overhead with space (for the twin page) and overhead in runtime (due to computing the run-length encoded diff)

    Non Page Based DSM

    Non-page-based DSM

    Summary

    Two types of library based that offer alternatives, both that do not require OS support. The two approaches are library-based (variable granularity) and structured DSM (API for structures that triggers coherence actions)

    Scalability

    Scalability

    Summary

    Do our (i.e. programmer’s) expectations get met as the number of processors increase: does performance increase accordingly as well? Yes, but there’s substantial overhead. To be fair, the same is true with true shared memory multiple processor

  • Distributed Shared Memory (Part 1 of 2) Notes

    Distributed Shared Memory (Part 1 of 2) Notes

    Introduction

    Summary

    Main question is this: can we make a cluster look like a shared memory machine

    Cluster as a parallel machine (sequential program)

    Cluster as a parallel machine

    Summary

    One strategy is to not write explicit parallel programs and instead use language assisted features (like pragma) that signal to the compiler that this section be optimized. But, there are limitations with this implicit approach

    Cluster as a parallel machine (message passing)

    Cluster as parallel machine (message passing)

    Summary

    Key Words: message passing

    One (of two) styles for explicitly writing parallel programs is by using message passing. Certain libraries (e.g. MPI, PVM, CLF) use this technique and true to a process’s nature, the process does not share its memory and instead, if the process needs to communicate with another entity, it does so by message passing. The downside? More effort from the perspective of the application developer

    Cluster as a parallel machine (DSM)

    Cluster as a parallel machine (parallel program)

    Summary

    The advantage of a DSM (distributed shared memory) is that an application developer can ease their way in, their style of programming need not change: they can still use locks, barriers, and pthreads styles, just the way they always have. So the DSM library provides this illusion of a large shared memory address space.

    History of shared memory systems

    History of shared memory systems

    Summary

    In a nutshell, software DSM has its roots in the 80s, when Ivy league academics wanted to scale the SMP. And now, (in the 2000s), we are looking at clusters of symmetric memory processors.

    Shared Memory Programming

    Shared Memory Programming

    Summary

    There are two types of synchronization: mutual exclusion and barrier. And two types of memory accesses: normal read/writes to shared data, and read/write to synchronization variables

    Memory consistency and cache coherence

    Memory Consistency vs Cache Coherence

    Summary

    Key Words: Cache Coherence, Memory consistency

    Memory consistency is the contract between programmer and the system and answers the question “when”: when will a change to the shared memory address space reflect in the other process’s private cache. And cache coherence answers the “how”, what mechanism will be used (cache invalidate or write update)

    Sequential Consistency

    Sequential Consistency

    Summary

    Key Words: Sequential Consistency

    With sequential consistency, program order is respected, but there’s arbitrary interleaving. Meaning, each individual read/write operations are atomic on any processor

    SC Memory Model

    Sequential Consistency Model

    Summary

    With sequential consistency, the memory model does not distinguish a read/write access to a synchronization read/write access. Why is this important? Well, we always get the coherence action, regardless of memory access type

    Typical Parallel Program

    Typical Parallel Program

    Summary

    Okay, I think I get the main gist and what the author is trying to convey. Basically, since we cannot distinguish the reads and writes for memory accesses — from normal read/write versus synchronization read/write — then that means (although we called it out as a benefit earlier) cache coherence will continue to take place all throughout the critical section. But, the downside here is that that coherence traffic, is absolutely unnecessary. And really, what we probably want is that only after we unlock the critical section should the data within that critical section, be updated across all other processor cache

    Release Consistency

    Release Consistency

    Summary

    Key Words: release consistency

    Release consistency is an alternative memory consistency model to sequential consistency. Unlike sequential consistency, which will block a process until coherence has been achieved for an instruction, release consistency will not block but will guarantee coherence when the lock has been released: this is a huge (in my opinion) performance improvement. Open question remains for me: does this coherence guarantee impact processes that are spinning on a variable, like a consumer

  • Global memory systems (part 1 of 2) notes

    Global memory systems (part 1 of 2) notes

    Introduction

    Summary

    Lessons will be broken down into three modules: GMS (i.e. can we use peer memory for paging across LAN) and DSM (i.e. can we make the cluster appear as a shared memory machine) and DFS (i.e. can we use cluster memory for cooperative caching of files)

    Context for Global Memory Systems

    Summary

    Key Words: working set

    Core idea behind GMS is when there’s a page fault, instead of checking disk, check the cluster memory instead. Most importantly, no dirty pages (does not complicate the behavior): the disk always has copies of the pages

    GMS Basics

    Lesson Outline
    Lesson Outline. GMS – how can we use peer memory for paging across LAN?

    Summary

    With GMS, a physical node carves out its physical memory into two areas: local (for working set) and global (servicing other nodes as part of being in the “community”). This global cache is analagous to a virtual memory manager in the sense that the GSM does not need to worry about data coherence: that’s the job of the upper applications.

    Handling Page Faults Case 1

    Handling Page Faults (Case 1)

    Summary

    Most common use case is a page fault on a local node. When the page fault occurs, the node will get the data from some other node’s global cache, increase its local memory footprint by 1 page, and correspondingly decrease its global memory footprint by 1 page.

    Handling Page Fault Case 2

    Handling Page Faults (Case 2)

    Summary

    In the situation which a node has no global memory (i.e. its local memory entirely consumed by working set), then the host handle a page fault by evicting a local page (i.e. finding a victim, usually through LRU page replacement policy), then requesting some page from another node’s global memory (i.e. the community service). What’s important to call out here is that there’s no changes in the number of local pages and no change in the number of global pages for the node.

    Handling Page Fault Case 3

    Handling Page Faults (Case 3)

    Summary

    Key Words: LRU, eviction, page fault

    Key Take away here is that if the globally oldest page lives in local memory, then we can free it up and expand the globally available memory for community service. Also, something that cleared up for me is this: all pages sitting in the global memory are considered clean since global memory is just a facility as part of the page faulting process so we can assume all pages are clean. But that’s not true for local pages, meaning, if we evict a page from local memory and its dirty, we must first write it out to disk

    Handling Page Faults Case 4

    Handling Page Faults (Case 4)

    Summary

    This is a tricky scenario, the only scenario in which a page lives in the working set on two nodes simultaneously.

    Local and Global Boundary Quiz

    Local and Global Boundary Quiz

    Summary

    Difficult quiz, actually. On Disk is not applicable for Node Q with Page X. And depends what happens with globally LRU (least recently used) stored in local, or global, part.

    Behavior of Algorithm

    Summary

    Basically, if there’s an idle node, its main responsibility will be accommodating peer pages. But once the node becomes busy, it will move pages into its working set

  • Daily Review: Day Ending in 2020/10/21

    Daily Review: Day Ending in 2020/10/21

    Work

    • Defined a data specification (in the form of a protobuf definition) that’ll be used to help drive the conversation between us and our customer. Our customer (AWS Networking) will be vending us data but its unclear (at the moment) what data they will be providing not only to us but to other teams. So to help steer the conversation, I documented a one pager that includes a (Google) protobuf definition.

    Mental health

    • Attended my weekly therapy session. Since I moved to Renton, this session (and the previous one) were held over Zoom. However, next week, I’ll drive to North Seattle early in the morning (around 06:30 or maybe even 06:00) to avoid traffic since that upcoming session will occur in person.
    • Sort of figuring out why I find it difficult to articulate love feelings for Jess. Her and I chatted last night and as I’ve mentioned before, I essentially freeze up when Jess feels upset and needs comforting. But during those moments I’m unable to verbally or physically comfort her: essentially I freeze up. Long story short, it’s obvious now the reason why: that’s just how I grew up. This doesn’t mean that I want to carry on this habit but it is helpful (for me at least) to understand the root of the behavioral patterns.

    Home

    • Drove home at 2 miles per hour (MPH) since the new lawn more I picked up at Ace did not fully fit inside the trunk of our Ford Escape.  I drove to our local Ace Hardware store last night since the lawn mower that I had ordered online arrived. Unfortunately, the lawn mower box was about one foot longer than length of the size of the trunk … so I made a judgement call and deciding to drive home — with the trunk open — very very slowly, given that we live less than a mile away from the store.

    Graduate School

    Grades Released

    • Somehow nailed the midterm but didn’t do as well as I expected on Project 1. I managed to store 94.5/100 for the midterm and I can thank all the people who helped out during the war room study sessions that I created. As for Project 1, I thought I hit the ball out of the park but apparently, my virtual CPU scheduler failed many of the test cases. This is ironic since I spent more time on my virtual CPU than I did with the memory scheduler and I somehow aced the memory scheduler.

    Project 3

    • Debugged a silly segfault for about 1.5 hours. Long story short, I called a function that expected a variable stored on the heap but accidentally passed in a variable declared on the stack. Such a silly mistake and the problem was that I had defined the variable twice: once on the heap and once on the stack. Surely this mistake could’ve been caught simply by enabling some flags on the compiler. Oh well.