Author: mattchung

  • 3 tips on getting eyeballs on your code review

    3 tips on getting eyeballs on your code review

    “Why is nobody reviewing my code?”

    I sometimes witness new engineers (or even seasoned engineers new to the company) submit code reviews that end up sitting idle, gaining zero traction. Often, these code reviews get published but comments never flow in, leaving the developer scratching their head, wondering why nobody seems to be taking a look. To help avoid this situation, check out the 3 tips below for more effective code reviews.

    3 tips for more effective code reviews

    Try out the three tips for more effective code reviews. In short, you should:

    1. Assume nobody cares
    2. Strive for bite sized changes
    3. Add a descriptive summary

    1. Assume nobody cares

    After you hit the publish button, don’t expect other developers to flock to your code review. In fact, it’s safe to assume that nobody cares. I know, that sounds a bit harsh but as Neil Strauss suggests,

    “Your challenge is to assume, to count on, the complete apathy of the reader. And from there, make them interested.”

    At some point in our careers, we all fall into this trap. We send out a review, one that lacks a clear description (see the section below, “Add a descriptive summary”), and then the code review sits there, patiently waiting for someone to sprinkle comments. Sometimes, those comments never come.

    Okay, it’s not necessarily that people don’t care. It has more to do with the fact that people are busy with their own tasks and deliverables. They too are writing code that they are trying to ship, so your code review essentially pulls them away from delivering their own work. So, make it as easy as possible for them to review.

    One way to gain their attention is simply by giving them a heads up.

    Before publishing your code review, send them an instant message or e-mail to let them know it is coming. Or if you are having a meeting with that person, tell them that you plan on sending out a code review and ask them if they can take a look. This puts your code review on their radar. And if you don’t see traction in an appropriate amount of time (which varies, depending on the change and its criticality), then follow up with them.

    2. Strive for bite sized code reviews

    Any change beyond 100-200 lines of code requires a significant amount of mental energy (unless the change itself is a trivial update to comments or formatting). So how can you make it easier for your reviewer?

    Aim for small, bite sized code reviews.

    In my experience, a good rule of thumb is to submit fewer than 100 lines of code. What if there’s no way your change can squeeze into double digits? Then consider breaking down the single code review into multiple, smaller code reviews. Once all those independent code reviews are approved, submit a single code review that merges all those changes in atomically.

    And if you still cannot break down a large code review into these lengths and find that it’s unavoidable to submit a large code review, then make sure you schedule a 15-30 minute meeting to discuss your large code review (I’ll create a separate blog post for this).

    3. Add a descriptive summary for the change

    I’m not suggesting you write a miniature novel when adding a description to your code review. But you’ll definitely need to write something with more substance than a one-liner: “Adds new module”. Rob Pike puts it succinctly: his criteria for a good description include “What, why, and background”.

    In addition to meeting these criteria, be sure to describe how you tested your code or, better yet, ship your code review with unit tests. Brownie points if you explicitly call out what is out of scope. Limiting your scope reduces the possibility of unnecessary back-and-forth comments for a change that falls outside your scope.

    Finally, if you want some stricter guidelines on how to write a good commit message, you might want to check out Kabir Nazir’s blog post on “How to write good commit messages.”

    Summary

    If you are having trouble with getting traction on your code reviews, try the above tips. Remember, it’s on you, the submitter of the code review, to make it as easy as possible for your reviewers to leave comments (and approve).

    Let’s chat more and connect! Follow me on Twitter: @memattchung

  • Daily Review – Day ending in 2020/11/05

    Daily Review – Day ending in 2020/11/05

    Family

    • Assembled our Berkey water filtering system. The instructions are quite complicated, actually. In addition to reading the manuals, I had to pull up a couple instructional YouTube videos to make sure that I was priming the filters correctly.
    • Took Elliott on a late night walk. She had missed her nap and we needed to stretch her and keep her awake until her 7pm bed time.

    Work

    • Spent about an hour or so chipping away at codifying our operational dashboard, as part of a hackathon.
    • Read through a ton of documentation and watched a couple videos on virtual private cloud (VPC). I should’ve read and watched all this training when I first joined the organization. Oh well. At least the material makes more sense now that I’ve had first hand exposure to the dataplane code. Anyways, all this material helped me build a better mental model needed to deliver my tech talk.

    Graduate School

    • The word coordinator kept popping up in the lectures. This coordinator often shows up in different code bases and systems within Amazon and I didn’t realize that the term has roots in the theory of distributed systems.
    • Learning more about the value of checkpoints. I can definitely see myself adopting and integrating checkpoints into the software I design and build, at least for the stateful applications that need to have robust recovery mechanisms.
    • I’d like to learn more about the two-phase commit protocol. Similar to the word coordinator, the term two-phase commit continues to pop up in the lectures and I bet it’s worth learning more about the specifics of this transaction strategy.
    • Window of vulnerability. The trade off between persisting in memory log records to disk
  • Daily Review – Day ending in 2020/11/04

    Daily Review – Day ending in 2020/11/04

    My favorite part of the day was waking up to a video that Jess recorded while I was asleep, a video capturing a little frog dancing on our window facing the backyard. My wife: she’s super cute.

    Work

    • Hosted and moderated a tech panel on career growth and promotions. On behalf of Asians@ at Amazon, I led a conversational fireside chat with four (Asian) senior software development engineers about some of the non technical barriers that our community often faces throughout our careers. Not speaking up. Not advocating for oneself. Everyone on the panel nailed it (despite how nervous they were) and I look forward to creating and hosting new events in the future.

    Health

    • Fought off a killer headache that kicked in after I ate che (a Vietnamese dessert). My body rejects processed sugars and when I consume too much of them, my body sends me a strong signal in the form of a painful headache that lasts several hours. The only way to fight it off is to gulp down as much water as possible and flush the sugars out of my system.

    Daily Photo

    Dancing frog on the window
  • Daily Review – Day ending in 2020/11/03

    Daily Review – Day ending in 2020/11/03

    Family

    • Jess and I felt nervous about the presidential election results while watching the news online. Despite Biden leading in the polls, just as Hillary did four years ago in 2016, I’m sitting at the edge of my seat as the results come in, not confident at all that Biden can pull off a victory. Again, as I mentioned in earlier posts, although I cast my vote for Biden, I’m really voting to get Trump out.

    Graduate School

    • Learned of a brilliant technique to simplify crash recovery by removing an edge case and exercising the same code as the abort transaction. Technique from RioVista papers. The crash recovery is also idempotent so it can withstand power failure during the crash recovery itself. Idempotent. First heard of this term when I was learning Ansible back in the day. Something to think about again.
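
    The idea of reusing the abort path for crash recovery can be sketched as follows. This is a minimal, hypothetical model in Python, not RioVista’s actual interface: the `Store` class and its method names are assumptions made for illustration, and ordinary Python objects stand in for memory that would survive a crash.

```python
class Store:
    """Toy transactional store (hypothetical names, not the RioVista API)."""

    def __init__(self):
        self.memory = {}    # stands in for data that survives a crash
        self.undo_log = []  # before-images: (key, old_value, key_existed)

    def set_range(self, key):
        """Save the before-image of key prior to modifying it."""
        self.undo_log.append((key, self.memory.get(key), key in self.memory))

    def write(self, key, value):
        self.memory[key] = value

    def commit(self):
        """Committing just throws the before-images away."""
        self.undo_log.clear()

    def abort(self):
        """Copy before-images back, newest first, then drop the log."""
        for key, old_value, existed in reversed(self.undo_log):
            if existed:
                self.memory[key] = old_value
            else:
                self.memory.pop(key, None)
        self.undo_log.clear()

    def recover(self):
        """Crash recovery is the abort path: no separate edge case."""
        self.abort()
```

    Because `abort` only copies saved before-images back into place and clears the log as its last step, running it a second time after an interruption restores the same values, which is the idempotence property in this sketch.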

    Work

    • Checked in some GRE header code that basically implements RFC 2784 and 2890. Stepping through the RFC, I learned how much unnecessary overhead the protocol carries with it. A 3 bit version field that MUST always be zero? Granted, I understand that the protocol designers were trying to future proof the protocol but now I understand why a protocol like IPIP prevails given that IPIP will encapsulate only IP.
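
    For reference, here is a minimal sketch of what packing a GRE header can look like in Python. It is not the code I checked in: the function name and constants are made up for illustration, the bit positions follow my reading of RFC 2784 and RFC 2890, and the optional checksum field is left out on purpose.

```python
import struct

GRE_FLAG_KEY = 0x2000       # K bit (RFC 2890)
GRE_FLAG_SEQUENCE = 0x1000  # S bit (RFC 2890)
GRE_VERSION = 0             # 3-bit version field, must be zero
ETHERTYPE_IPV4 = 0x0800     # protocol type of the encapsulated payload


def build_gre_header(protocol_type=ETHERTYPE_IPV4, key=None, sequence=None):
    """Return GRE header bytes; checksum support is intentionally omitted."""
    flags_and_version = GRE_VERSION
    optional_fields = b""
    if key is not None:
        flags_and_version |= GRE_FLAG_KEY
        optional_fields += struct.pack("!I", key)       # 32-bit key
    if sequence is not None:
        flags_and_version |= GRE_FLAG_SEQUENCE
        optional_fields += struct.pack("!I", sequence)  # 32-bit sequence number
    # First 16 bits: flags, reserved bits, version. Next 16 bits: protocol type.
    return struct.pack("!HH", flags_and_version, protocol_type) + optional_fields
```

    Even the smallest header in this sketch costs four bytes per packet, and the key and sequence number each add four more.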

     

  • Daily Review – Day ending in 2020/11/02

    Daily Review – Day ending in 2020/11/02

    Family

    Elliott and I drove to Lowe’s home improvement store and picked up a new faucet. Normally, Elliott will cry after sitting in the car for more than 10 minutes but not this trip. She did amazing and the two of us had a blast at Lowe’s. She helped me select our new faucet since the one installed in our kitchen busted, causing water to leak into and underneath the sink. Luckily, my dad is in town and helped replace the existing broken faucet with the one I just purchased.

    Graduate School

    Submitted project 3 assignment on gRPC and multi-threading. Similar to project 2, I partnered up with another Seattle based student (who also happens to be Vietnamese) and we divided the work down the middle and stitched our code together.

    Watched half the lectures on RioVista. This lecture series seems to be a continuation of the last lecture series (on lightweight recoverable virtual memory), adding on top of the idea of using transactions to keep a log record of changes to the virtual memory address space. The key difference with RioVista though is that they are targeting performance as their primary goal.

    Health

    Walked about 3 miles on the treadmill yesterday while working and studying. Last week, I could not find the remote that controls my treadmill but luckily Jess found it buried in the back of my office, so I was able to get some exercise in.

    Work

    Held a dry-run kick off meeting for a panel that I’m hosting and moderating this upcoming Wednesday. The purpose of the panel is to give people a sense of what it looks like to grow one’s career as a software engineer at Amazon.

  • Weekly Review – Week ending in 2020/11/01

    Weekly Review – Week ending in 2020/11/01

    No Halloween this year

    I used to love Halloween growing up, not so much the dressing up part but the knocking on doors and getting handed fistfuls of candy. Now, as an adult, I love returning the favor and always think about giving out larger than average candy and chocolate.

    But not this year, thanks to COVID-19.

    Hopefully 2020 will be the one and only year that we skip Halloween …

    Started writing my first e-book

    I’m compiling all my blog posts on “advanced operating systems refresher” into a single, nicely formatted e-book. The book will provide a summary and detailed notes on Udacity’s Advanced Refresher course.

    Media Consumption

    Watched the first two episodes of “This is Us”. Again, as I mentioned in my blog posts, the writers (and cast and crew) deserve huge applause for pivoting and incorporating two major events in history, the COVID-19 pandemic and police brutality against Black lives, into the story line. That’s no easy feat but they are pulling it off.

    Watched Borat Subsequent Moviefilm. I found the film hilarious and ingenious. It reveals how complicated people can be. For example, two Trump supporters end up taking Borat into their homes and at one point, they even speak up on behalf of women.

    Home Care

    I’m super motivated to keep our new home in tip top shape. I learned that the previous owners took a lot of pride in the house: the retired couple was out in the front or back yard on a daily basis, the two of them maintaining the lawn and plants.

    Learning how to take care of the lawn. That includes learning the different modes of mowing (i.e. mulching, side discharge, bagging), the difference between slow release and fast release nitrogen, the importance of aerating, the importance of applying winterizer two weeks before the last historical freeze day, how to edge properly and so on.

    Family

    Still working from home and still appreciate the little gestures from Jess throughout the day. I sometimes get lost in a black hole of thoughts and troubleshooting, not drinking any water or eating snacks for hours at a time. So the little snacks that Jess drops off go a long way.

    Health

    Although I took care of my mental health, I neglected my physical health. I only exercised once last week, which was basically jogging on the treadmill.

    Graduate School

    Applied the theory of lightweight recoverable virtual memory to work. I took the concept of the abort transaction from advanced operating systems and suggested that we install a similar handler in our control plane code.

  • Daily Review – Day ending in 2020/11/01

    Daily Review – Day ending in 2020/11/01

    What a weird morning. Totally forgot that the time changed, the clock moving back an hour (i.e. we gain an hour), and found myself studying and working at 03:30 AM (instead of 04:30 AM). When I first woke up and glanced at my Casio watch, I debated whether to get out of bed or not since my body was telling me it needed a little more rest. However, knowing that Elliott would soon wake me up and knowing that I had a very small window of time to myself, I rolled out of bed. Parent life.

    Home care

    • Manually aerated the front yard using a core aeration tool. The video advertisements for aerating make the process seem so pleasant. It’s not. It’s actually a lot of hard work and I broke a massive sweat aerating the front yard. I totally understand now why people rent aerating machines at Home Depot (or Lowe’s) for $20 an hour. Renting a machine would make the work so much easier.
    • Fed the lawn fertilizer. After aerating the lawn, I poured about 2.2 pounds of fertilizer into the Ace branded bucket, transferred the food into the spreader, and then dispensed it all over the front yard.

    Family

    • Family trip to Lam’s Seafood located in Tukwila. This Lam’s Seafood location blows the other location (i.e. International District) out of the water. The location in Tukwila is not only larger, but it’s cleaner and closely resembles the layout and feel of Uwajimaya. Shopping in the store reminds me of growing up in Little Saigon, all the Vietnamese chatter and all the people bustling in the background.
    • Dropped off our ballot at the library located across the street. Just before sliding the ballot into the drop off box, I sat in the front seat of the car (with the two dogs next to me) with The Stranger’s cheat sheet opened up, filling out the last few choices that I was undecided on.

     

  • Five tips for surviving (or thriving) in the OMSCS program as a computer science graduate student

    Five tips for surviving (or thriving) in the OMSCS program as a computer science graduate student

    Overview

    In this post, I’m sharing five tips that I’ve picked up over the last 2 years in the program. At the time of this writing, I’m wrapping up my 7th course (advanced operating systems) in the OMSCS program. This means that I have 3 more courses to complete until I graduate with my masters in computer science from Georgia Tech.

    This post assumes that you are already enrolled as a student in the program. If you are still on the fence as to whether or not you should join the program, check out Adrian’s blog post on “Georgia Tech Review: yes, maybe, no”. Very shortly, I’ll post my own thoughts and recommendations on whether or not I think you should join the program.

    Five Tips

    Prerequisites

    Before enrolling in a course, check the course’s homepage for recommended prerequisites. Once you are officially accepted into the program, you’ll soon discover that unlike undergraduate, you can enroll in courses despite not meeting prerequisites. The upside of this relaxed approach is that students who are really eager to take a course can do so (assuming seats are available). The downside is that if you are not prepared for the course, you’ll be way in over your head and likely will drop the course within the first few weeks since many courses are front loaded. For example, thinking about taking advanced operating systems? Before jumping the gun, take the diagnostic exam. Doing so will help you gauge whether or not you are set up for success.

    Time management

    Manage your time by setting a schedule and sticking to it. If you are an early bird like me, carve out 50 minutes early in the morning, take a shot of your espresso, and get cracking. Set specific times in which you will sit down and study (e.g. watch lectures, review notes, work on a project) and aim to stick to that schedule (as a new parent, I understand that plans can easily be foiled).

    Trade offs

    Know when and what to slip or sacrifice. This is super important. We all have 24 hours in a day. And we all have other responsibilities such as a full time job or children to take care of. Or something will unexpectedly pop up in your life that demands your attention. That’s okay. Graduate school (and doing well in graduate school) is important. But there will be other competing motivators and sometimes, you’ll have to sacrifice your grade. That’s okay.

    Expectations

    Know what you want to get out of the program. What are you trying to get out of the program? Or what are you trying to get out of the current course you are enrolled in? Are you here to learn and improve your craft? Then focus on that. Or are you here to get a good grade? Then do that. Are you here to expand your network and potentially apply to a PhD program? Then make sure you attend the office hours and meet with the professors.

    Course reviews

    Read the reviews on OMSCentral. If you haven’t read reviews on OMSCentral.com for the courses you plan to take (or are currently taking), please go and do that. Right now. On that website, you’ll find reviews from former students. They give you a better sense of what you’ll learn, what projects you’ll tackle, how engaged the professor is, whether the grades are curved, how difficult the exams are, and maybe most important: how many hours you should expect to put into the course.

    Notifications

    Create e-mail (or text) notifications in your calendar for upcoming due dates. It sucks when you forget that an upcoming assignment is due soon. Sucks even more when you find out that the due date has passed. Although you can rely on Canvas (Georgia Tech’s current system for managing their courses), I still recommend using your own system.

    Summary

    The above tips do not guarantee a smooth journey. Chaos will somehow creep its way into our time and things will get derailed. That’s okay. Take a deep breath and revisit the tip above on trade offs. Remind yourself why you are in graduate school. Now go out there and kick some ass.

  • Distributed Shared Memory (Part 2 of 2) Notes

    Distributed Shared Memory (Part 2 of 2) Notes

    An example

    Example of release consistency model

    Summary

    Key Words: Condition variable, pthread_cond_signal, pthread_cond_wait

    In the concrete example (screenshot below), P1’s instructions that update memory (e.g. flag = 1) can run in parallel with those of P2 because of the release consistency model
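
    Since the screenshot shows pthreads code, here is a rough equivalent of the same lock, flag, and signal pattern. It is a sketch in Python rather than C, the `Shared` class and function names are my own, and it only shows the synchronization structure: Python cannot demonstrate the overlap of coherence traffic that release consistency allows.

```python
import threading


class Shared:
    """State shared by the two threads (hypothetical names)."""

    def __init__(self):
        self.flag = 0
        self.data = None
        self.cond = threading.Condition()  # a lock plus a condition variable


def p1(shared):
    shared.data = 42          # ordinary write before the critical section
    with shared.cond:         # acquire
        shared.flag = 1
        shared.cond.notify()  # plays the role of pthread_cond_signal
    # release happens when the with-block exits


def p2(shared, results):
    with shared.cond:         # acquire
        while shared.flag == 0:
            shared.cond.wait()  # plays the role of pthread_cond_wait
    results.append(shared.data)  # safe to read after observing flag == 1
```

    Under release consistency, the only requirement is that P1’s earlier writes are visible by the time its release completes, so they do not have to block P1 one at a time.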

    Advantage of RC over SC

     

    Summary

    In a nutshell, we gain performance in a shared memory model using release consistency by overlapping computation with communication, because we no longer wait for coherence actions for every memory access

    Lazy RC (release consistency)

    Lazy Release Consistency

    Summary

    Key Words: Eager

    The main idea here is that the “release consistency” is eager, in the sense that cache coherence traffic is generated immediately after unlock occurs. But with lazy RC, we defer that cache coherence traffic until the acquisition

    Eager vs Lazy RC

    Eager vs Lazy Consistency

    Summary

    Key Words: Eager, Lazy

    Basically, eager versus lazy boils down to a push (i.e. eager) versus pull (i.e. lazy) model. In the former, every time the lock is released, coherence traffic is broadcast to all other processes. In the latter, that traffic is deferred until another process acquires the lock

    Pros and Cons of Lazy and Eager

    Summary

    The advantage of lazy (over eager) is that there are fewer messages; however, there will be more latency during acquisition
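
    To make the message counts concrete, here is a toy model of the two approaches. It is an illustration under simplifying assumptions, not how any real DSM system is implemented: there is a single lock, a single shared variable `x`, and each pull is counted as one message.

```python
def run_eager(num_nodes, writers):
    """Push model: every release broadcasts the update to all other nodes."""
    caches = [dict() for _ in range(num_nodes)]
    messages = 0
    for step, writer in enumerate(writers):
        caches[writer]["x"] = step     # write inside the critical section
        for node in range(num_nodes):  # release: push to everyone else
            if node != writer:
                caches[node]["x"] = step
                messages += 1
    return messages, caches


def run_lazy(num_nodes, writers):
    """Pull model: an acquire fetches updates from the previous lock holder."""
    caches = [dict() for _ in range(num_nodes)]
    messages = 0
    previous_holder = None
    for step, writer in enumerate(writers):
        if previous_holder is not None and previous_holder != writer:
            caches[writer].update(caches[previous_holder])  # acquire: pull
            messages += 1
        caches[writer]["x"] = step
        previous_holder = writer
    return messages, caches


# Four nodes; nodes 0, 1, and 2 take the lock in turn. Node 3 never does.
eager_count, eager_caches = run_eager(4, [0, 1, 2])
lazy_count, lazy_caches = run_lazy(4, [0, 1, 2])
```

    In this run the eager version sends 9 messages and the lazy version sends 2, and node 3, which never acquires the lock, receives nothing in the lazy version. The cost is that each pull sits on the critical path of the acquire.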

    Software DSM

    Software DSM

    Summary

    Address space is partitioned, meaning each processor is responsible for a certain set of pages. This model of ownership is distributed, and each node holds metadata about the page and is responsible for sending coherence traffic (at the software level)

    Software DSM (Continued)

    Software DSM (continued)

    Summary

    Key Words: false sharing

    DSM software runs on each processor (cool idea) in a single writer multiple reader model. This model can be problematic because, coupled with false sharing, it causes significant coherence traffic that ping pongs updates when multiple data structures live within the same cache line (or page)
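
    A small sketch of why false sharing hurts under a single writer model. The page size, addresses, and function names here are assumptions for illustration; the point is only that two unrelated variables on the same page force ownership to bounce between nodes.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def page_of(address):
    """Map a byte address to the page that contains it."""
    return address // PAGE_SIZE


def count_ownership_transfers(writes):
    """writes: list of (node, address). Count how often a page changes owner."""
    owner = {}
    transfers = 0
    for node, address in writes:
        page = page_of(address)
        if page in owner and owner[page] != node:
            transfers += 1  # single writer: the page must move to this node
        owner[page] = node
    return transfers


# Node A updates a variable at address 0 and node B updates an unrelated
# variable at address 64; both live on page 0, so every write steals the page.
same_page = [("A", 0), ("B", 64)] * 4
# Moving B's variable to the next page removes the ping pong entirely.
separate_pages = [("A", 0), ("B", PAGE_SIZE)] * 4
```

    With the two variables on the same page, the eight writes cause seven ownership transfers; with the variables on separate pages, there are none.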

    LRC with Multi-Writer Coherence Protocol

    Lazy Release Consistency with multi-writer coherence

    Summary

    With lazy release consistency, a process will (during the critical section) generate a diff of the pages that have been modified. The diff is applied later, when another process performs updates to those same pages

    LRC with Multi-Writer Coherence Protocol (Continued)

    Summary

    Need to be able to apply multiple diffs in a row, say Xd and Xd’ (i.e. prime)

    LRC with Multi Writer Coherence Protocol (Continued)

    Summary

    Key Words: Multi-writer

    The same page can be modified at the same time by multiple threads, as long as each uses a separate lock

    Implementation

    Implementation of LRC

    Summary

    Key Words: Run-length encoded

    During a write operation (inside of a lock), a twin page gets created, essentially a copy of the original page. Then, during release, a run-length encoded diff is computed. Following this step, the page is then write-protected
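
    The twin and diff steps can be sketched as below. The lecture does not spell out the diff format, so the encoding here, a list of (offset, changed bytes) runs, is just one plausible choice, and the function names are hypothetical.

```python
def make_twin(page):
    """Snapshot the page before the first write inside the critical section."""
    return bytes(page)


def compute_diff(twin, page):
    """Return runs of changed bytes as a list of (offset, new_bytes)."""
    runs = []
    i = 0
    size = len(page)
    while i < size:
        if page[i] != twin[i]:
            start = i
            while i < size and page[i] != twin[i]:
                i += 1
            runs.append((start, bytes(page[start:i])))
        else:
            i += 1
    return runs


def apply_diff(page, runs):
    """Patch another node's copy of the page with the recorded runs."""
    for offset, data in runs:
        page[offset:offset + len(data)] = data


page = bytearray(16)            # original page, all zeros
twin = make_twin(page)          # created on the first write
page[2:5] = b"abc"              # writes made while holding the lock
page[10] = 0x7F
diff = compute_diff(twin, page)  # computed at release
```

    Only the changed runs travel over the network, so a node holding a stale copy can call `apply_diff` instead of fetching the whole page.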

    Implementation (continued)

    LRC Implementation (Continued)

    Summary

    Key Words: Data Race, watermark, garbage collection

    A daemon process (in every node) wakes up periodically and if the number of diffs exceeds the watermark threshold, then the daemon will apply the diffs to the original page. All in all, keep in mind that there’s overhead involved with this solution: overhead with space (for the twin page) and overhead in runtime (due to computing the run-length encoded diff)

    Non Page Based DSM

    Non-page-based DSM

    Summary

    Two alternatives to page-based DSM exist, and neither requires OS support: library-based DSM (variable granularity) and structured DSM (an API for structures that triggers coherence actions)

    Scalability

    Scalability

    Summary

    Do our (i.e. the programmer’s) expectations get met as the number of processors increases: does performance increase accordingly as well? Yes, but there’s substantial overhead. To be fair, the same is true of a true shared-memory multiprocessor

  • Distributed Shared Memory (Part 1 of 2) Notes

    Distributed Shared Memory (Part 1 of 2) Notes

    Introduction

    Summary

    Main question is this: can we make a cluster look like a shared memory machine

    Cluster as a parallel machine (sequential program)

    Cluster as a parallel machine

    Summary

    One strategy is to not write explicit parallel programs and instead use language assisted features (like pragma) that signal to the compiler that this section be optimized. But, there are limitations with this implicit approach

    Cluster as a parallel machine (message passing)

    Cluster as parallel machine (message passing)

    Summary

    Key Words: message passing

    One (of two) styles for explicitly writing parallel programs is by using message passing. Certain libraries (e.g. MPI, PVM, CLF) use this technique and true to a process’s nature, the process does not share its memory and instead, if the process needs to communicate with another entity, it does so by message passing. The downside? More effort from the perspective of the application developer

    Cluster as a parallel machine (DSM)

    Cluster as a parallel machine (parallel program)

    Summary

    The advantage of a DSM (distributed shared memory) is that an application developer can ease their way in, their style of programming need not change: they can still use locks, barriers, and pthreads styles, just the way they always have. So the DSM library provides this illusion of a large shared memory address space.

    History of shared memory systems

    History of shared memory systems

    Summary

    In a nutshell, software DSM has its roots in the 80s, when Ivy league academics wanted to scale the SMP. And now (in the 2000s), we are looking at clusters of symmetric multiprocessors.

    Shared Memory Programming

    Shared Memory Programming

    Summary

    There are two types of synchronization: mutual exclusion and barrier. And two types of memory accesses: normal read/writes to shared data, and read/write to synchronization variables

    Memory consistency and cache coherence

    Memory Consistency vs Cache Coherence

    Summary

    Key Words: Cache Coherence, Memory consistency

    Memory consistency is the contract between programmer and the system and answers the question “when”: when will a change to the shared memory address space reflect in the other process’s private cache. And cache coherence answers the “how”, what mechanism will be used (cache invalidate or write update)

    Sequential Consistency

    Sequential Consistency

    Summary

    Key Words: Sequential Consistency

    With sequential consistency, program order is respected, but there’s arbitrary interleaving across processors. Meaning, each individual read/write operation is atomic on any processor
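
    One way to picture this is to enumerate every interleaving of two tiny programs that keeps each program’s own order, and look at which results are possible. The two programs below are a standard textbook style example that I picked, not something from the lecture, and the helper names are my own.

```python
def interleavings(a, b):
    """Yield every merge of a and b that preserves each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


# P1 writes x then reads y; P2 writes y then reads x. Both start at 0.
P1 = [("write", "x", 1), ("read", "y", "r1")]
P2 = [("write", "y", 1), ("read", "x", "r2")]


def run(schedule):
    """Execute one interleaving, one atomic operation at a time."""
    memory = {"x": 0, "y": 0}
    registers = {}
    for op, variable, arg in schedule:
        if op == "write":
            memory[variable] = arg
        else:
            registers[arg] = memory[variable]
    return registers["r1"], registers["r2"]


schedules = list(interleavings(P1, P2))
outcomes = {run(schedule) for schedule in schedules}
```

    There are six legal interleavings, and none of them produces r1 = 0 and r2 = 0: at least one write always comes before the other program’s read.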

    SC Memory Model

    Sequential Consistency Model

    Summary

    With sequential consistency, the memory model does not distinguish between a normal read/write access and a synchronization read/write access. Why is this important? Well, we always get the coherence action, regardless of memory access type

    Typical Parallel Program

    Typical Parallel Program

    Summary

    Okay, I think I get the main gist of what the author is trying to convey. Basically, since the memory model cannot distinguish normal reads and writes from synchronization reads and writes, cache coherence actions (although we called this out as a benefit earlier) will continue to take place all throughout the critical section. The downside is that this coherence traffic is absolutely unnecessary. What we probably want is for the data modified within the critical section to be updated across all other processor caches only after we unlock the critical section

    Release Consistency

    Release Consistency

    Summary

    Key Words: release consistency

    Release consistency is an alternative memory consistency model to sequential consistency. Unlike sequential consistency, which will block a process until coherence has been achieved for an instruction, release consistency will not block but will guarantee coherence when the lock has been released: this is a huge (in my opinion) performance improvement. Open question remains for me: does this coherence guarantee impact processes that are spinning on a variable, like a consumer