Tag: two phase commit protocol

  • PAXOS – I’m coming for you!

    PAXOS – I’m coming for you!

    I’m now halfway through the Distributed Computing course at Georgia Tech, and we students are tackling the penultimate project: building a replicated state machine using PAXOS. This project will be challenging (probably going to require 40+ hours), and it’ll put my theoretical knowledge to the test and reflect back, in a couple of weeks, how much I’ve learned.

    Presently, here’s my admittedly limited understanding of the consensus algorithm:

    Current PAXOS understanding

    • Servers play different roles – Proposer, Acceptor, Learner
    • Proposers send proposals with monotonically increasing proposal numbers
    • A proposal is chosen if and only if a majority of the acceptors (a quorum) accept it
    • The 2PC (two-phase commit) protocol essentially tells us whether a particular transaction is committed or aborted
    • Guaranteeing linearizability means that, from the client’s perspective, real time (i.e. wall-clock time) is respected and the client views the system as if there were a single replica
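    To make the quorum rule above concrete, here’s a minimal sketch (my own illustration, not course code) of acceptors honoring monotonically increasing proposal numbers and a majority-quorum check:

```python
def majority(n: int) -> int:
    # Majority quorum: floor(n/2) + 1 (see reference 1 below)
    return n // 2 + 1

class Acceptor:
    """Accepts a proposal only if its number is >= the highest number seen so far."""
    def __init__(self):
        self.highest_promised = -1
        self.accepted = None  # (number, value)

    def accept(self, number, value):
        if number >= self.highest_promised:
            self.highest_promised = number
            self.accepted = (number, value)
            return True
        return False  # stale proposal: reject

def proposal_chosen(acceptors, number, value):
    # A value is chosen once a majority of acceptors accept the proposal.
    votes = sum(a.accept(number, value) for a in acceptors)
    return votes >= majority(len(acceptors))

acceptors = [Acceptor() for _ in range(5)]
assert majority(5) == 3
assert proposal_chosen(acceptors, number=1, value="x")  # all 5 accept
assert not acceptors[0].accept(0, "stale")              # lower number rejected
```

    This deliberately skips the prepare phase and value discovery; it only illustrates the two bullet points above.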

    Future PAXOS understanding

    • How exactly PAXOS guarantees consensus via its two-phase protocol
    • How does a server determine its role (or does it play multiple roles)?
    • How are edge cases handled (say, two proposals arriving at the same time)?
    • What role does a client play? Does it serve as a proposer?
    • How does leader election work in PAXOS?
    • Should I just try to mimic the Python-based code described in Paxos Made Moderately Complex?
    • How will replication work as the number of nodes in the system scales (say from 3 to 5 to 10)?
    • How can I detect (and perhaps avoid) split brain (i.e. multiple leaders)?

    References

    1. A majority quorum can be defined as floor(n/2) + 1
    2. Python implementation described in Paxos Made Moderately Complex: https://www.cs.cornell.edu/courses/cs7412/2011sp/paxos.pdf
  • Spring 2021: Distributed Computing

    Spring 2021: Distributed Computing

    Yes! I’m finally registered for the distributed computing course. This course is hot off the press: brand spanking new to the OMSCS program and offered for the first time this (Spring 2021) term. I’ve been eagerly waiting over two years for a course centered on distributed systems, combining both theory and practice.

    The course is taught by professor Ada, who also teaches Graduate Introduction to Operating Systems, a course that received the highest accolades from previous students in their reviews over at OMSCentral. I bet this class will raise the bar, intellectually challenging and stimulating our minds. According to the course’s syllabus, we’ll bridge theory and practice by delivering 5 programming projects based on the University of Washington’s Distributed Systems Labs, published in this repository: https://github.com/emichael/dslabs . The projects will require us students to:

    1. Build a simple ping/pong protocol
    2. Implement an exactly-once RPC protocol (parts of lab 1 reused)
    3. Design and implement a primary-backup protocol to learn fault tolerance
    4. Implement the famous Paxos
    5. Implement a key-value store using Paxos and implement a two-phase commit protocol
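    The exactly-once guarantee in project 2 is typically built from at-most-once request handling plus client retries. Here’s a toy sketch of that idea (my assumption about the design, not the lab’s actual code), deduplicating by a (client_id, seq) pair:

```python
class AtMostOnceServer:
    """Deduplicates RPCs by (client_id, seq); replays the cached reply on retry."""
    def __init__(self):
        self.replies = {}  # (client_id, seq) -> cached result

    def handle(self, client_id, seq, op):
        key = (client_id, seq)
        if key in self.replies:   # duplicate request: don't re-execute
            return self.replies[key]
        result = op()             # execute the operation exactly once
        self.replies[key] = result
        return result

counter = {"n": 0}
def increment():
    counter["n"] += 1
    return counter["n"]

server = AtMostOnceServer()
assert server.handle("c1", 1, increment) == 1
assert server.handle("c1", 1, increment) == 1  # retried RPC is not re-executed
assert counter["n"] == 1
```

    Combined with a client that retries until it hears a reply, at-most-once execution on the server yields exactly-once semantics end to end.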

    I’m particularly interested in the last two projects: Paxos and the two-phase commit protocol. I first read the original Paxos paper about four years ago, when I joined Amazon, but I never implemented Paxos myself; and I only learned about the two-phase commit protocol a month or two ago, while taking advanced operating systems last semester, so it’s a topic I want to understand more deeply.

    I’m hoping that during (and after) this class I’ll be able to apply the principles of distributed computing to effectively analyze the reliability of existing and new services powering Amazon Web Services. Moreover, I’ll incorporate the lessons learned into my mental model and refer to them as I design and build new systems.

    Let’s get cracking.

  • Recovery management in Quicksilver  – Notes and Summary

    Recovery management in Quicksilver – Notes and Summary

    The original paper “Recovery management in QuickSilver” introduces a transaction manager that is responsible for managing servers and coordinating transactions. The notes below discuss how this operating system handles failures and how it makes recovery management a first-class citizen.

    Cleaning up stale orphan processes

    Cleaning up stale orphan processes

    Key Words: Orphaned, breadcrumbs, stateless

    In client/server systems, state gets created that may be orphaned due to a process crash or some other unexpected failure. Regardless of the cause, that state (e.g. persistent data structures, network resources) needs to be cleaned up.

    Introduction

    Key Words: first class citizen, afterthought, quicksilver

    Quicksilver asks whether we can make recovery a first-class citizen, since it’s so critical to the system.

    Quiz Introduction

    Key Words: robust, performance

    Users want to have their cake and eat it too: they want both performance and robustness to failures. But is that possible?

    Quicksilver

    Key Words: orphaned, memory leaks

    IBM identified these problems and researched this topic in the early 1980s.

    Distributed System Structure

    Distributed System Structure

    Key Words: microkernel, performance, IPC, RPC

    A structure of multiple tiers allows extensibility while maintaining high performance

    Quicksilver System Architecture

    Quicksilver: system architecture

    Key Words: transaction manager

    Quicksilver is the first network operating system to propose transactions for recovery management. To that end, there’s a “Transaction Manager” available as a system service (implemented as a server process)

    IPC Fundamental to System Services

    IPC fundamental to system services

    Key Words: upcall, unix socket, service_q data structure, rpc, asynchronous, synchronous, semantics

    IPC is fundamental to building system services, and there are two ways to communicate with a service: synchronously (via an upcall) and asynchronously. Either way, the center of this IPC communication is the service_q, which allows multiple servers to perform the body of work and multiple clients to enqueue their requests.
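    The service_q pattern (many clients enqueue, many servers dequeue) can be sketched with a plain thread-safe queue. This is my own illustration, not Quicksilver’s actual data structure; the names and the upper-casing "work" are stand-ins:

```python
import queue
import threading

service_q = queue.Queue()   # shared request queue at the center of the IPC
results = {}                # request id -> reply

def server_loop():
    # Each server thread pulls requests off the service_q and performs the work.
    while True:
        req_id, payload = service_q.get()
        if req_id is None:              # shutdown sentinel
            service_q.task_done()
            break
        results[req_id] = payload.upper()  # stand-in for the real body of work
        service_q.task_done()

servers = [threading.Thread(target=server_loop) for _ in range(2)]
for s in servers:
    s.start()

# Clients enqueue their requests...
for i, msg in enumerate(["ping", "pong"]):
    service_q.put((i, msg))
service_q.join()   # ...and a synchronous client would block here, upcall-style

for _ in servers:
    service_q.put((None, None))
for s in servers:
    s.join()
```

    An asynchronous client would skip the join and collect the reply later; the queue itself is what decouples the two styles.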

    Bundling Distributed IPC and Transactions

    Bundling Distributed IPC and Transactions

    Key Words: transaction, state, transaction link, transaction tree, IPC, atomicity, multi-site atomicity

    During a transaction, there is state that should be recoverable in the event of a failure. To this end, Quicksilver bundles IPC with transactions (provided by the OS), the secret sauce for recovery management.

    Transaction Management

    Transaction management

    Key Words: transaction, shadow graph structure, tree, failure, transaction manager

    When a client requests a file, the client’s transaction manager becomes the owner (and root) of the transaction tree; each of the other nodes is a participant. However, since the client is susceptible to failing, ownership can be transferred to another participant, allowing the remaining participants to clean up state in the event of a failure.
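    The ownership transfer above can be sketched in a few lines. This is a toy model of my own (the names are hypothetical, not Quicksilver’s API):

```python
class TransactionTree:
    """Toy model of a Quicksilver-style transaction tree with transferable ownership."""
    def __init__(self, owner):
        self.owner = owner          # root of the tree, initially the client's TM
        self.participants = []      # the other transaction managers in the tree

    def add_participant(self, name):
        self.participants.append(name)

    def on_owner_failure(self):
        # If the owner (e.g. the client) crashes, promote a surviving
        # participant so someone can still drive cleanup of the state.
        if self.participants:
            self.owner = self.participants.pop(0)
        return self.owner

tree = TransactionTree(owner="client_tm")
tree.add_participant("file_server_tm")
tree.add_participant("directory_server_tm")
assert tree.on_owner_failure() == "file_server_tm"
```

    The point is only that ownership is a transferable role, not a fixed property of the client.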

    Distributed Transaction

    Key Words: IPC, failure, checkpoint records, checkpoint, termination

    Many types of failures are possible: connection failure, client failure, subordinate transaction manager failure. To handle these failures, transaction managers must periodically store the state of the node into a checkpoint record, which can be used for potential recovery
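    A checkpoint record only helps if a crash can never leave it half-written. One common way to get that (my sketch, not Quicksilver’s actual on-disk format) is to write to a temporary file and atomically rename it into place:

```python
import os
import pickle
import tempfile

def write_checkpoint(path, state):
    # Write the node's state via temp file + atomic rename, so a crash
    # mid-write never leaves a torn checkpoint record behind.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)  # atomic on POSIX within one filesystem

def recover(path, default):
    # On restart, the transaction manager reloads the last checkpoint,
    # or falls back to a default if none was ever written.
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)

path = os.path.join(tempfile.gettempdir(), "tm_checkpoint.bin")
write_checkpoint(path, {"txn": 42, "state": "prepared"})
assert recover(path, None) == {"txn": 42, "state": "prepared"}
```

    Periodically calling write_checkpoint bounds how much work is lost when a transaction manager fails.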

    Commit Initiated by Coordinator

    Commit initiated by Coordinator

    Key Words: Coordinator, two phase commit protocol

    The coordinator can send different types of messages down the tree (i.e. vote request, abort request, end commit/abort). These messages help clean up the state of the distributed system. More complicated services, like a file system, may need to implement a two-phase commit protocol.
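    The coordinator’s message flow boils down to two phases: collect votes, then broadcast a decision. A toy sketch (my own illustration of the protocol shape, not Quicksilver’s implementation):

```python
class Participant:
    """Toy 2PC participant: votes in phase 1, applies the decision in phase 2."""
    def __init__(self, ok):
        self.ok = ok
        self.state = "working"

    def vote(self):
        self.state = "prepared" if self.ok else "aborting"
        return self.ok

    def finish(self, decision):
        self.state = decision   # clean up according to commit/abort

def two_phase_commit(participants):
    # Phase 1: coordinator sends vote requests down the tree, collects votes.
    votes = [p.vote() for p in participants]
    decision = "commit" if all(votes) else "abort"
    # Phase 2: coordinator broadcasts the decision so everyone cleans up state.
    for p in participants:
        p.finish(decision)
    return decision

assert two_phase_commit([Participant(True), Participant(True)]) == "commit"
assert two_phase_commit([Participant(True), Participant(False)]) == "abort"
```

    A single “no” vote forces a global abort, which is exactly why 2PC decides a transaction’s fate unanimously.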

    Upshot of Bundling IPC and Recovery

    Upshot of bundling IPC and Recovery

    Key Words: IPC, in memory logs, window of vulnerability, trade offs

    No extra communication is needed for recovery: it just rides on top of IPC. In other words, we have the breadcrumbs and the transaction manager data, from which state can be recovered.

    Implementation Notes

    Key Words: transaction manager, log force, persistent state, synchronous IO

    We need to be careful about which mechanisms we choose from the OS, since a log force heavily impacts performance: it requires synchronous IO.

    Conclusion

    Key Words: Storage class memories

    Ideas in Quicksilver are still present in contemporary systems today. The concepts made their way into LRVM (lightweight recoverable virtual memory) and, in the 2000s, found a resurgence in the Texas operating system.