Matt Chung's Site

Tag: concurrency

Distributed Computing – Lesson 1 Summary
Summary

Distributed systems are everywhere: social media, internet of things, single server systems — all part of larger, distributed systems. But how do you define a distributed system? A distributed system is, according to Leslie Lamport (father of distributed computing), a system in which failure of some component (or node) impacts the larger system, rendering the whole system either unstable or unusable.

A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable – Leslie Lamport

Lesie Lamport. Source: https://www.heidelberg-laureate-forum.org/laureate/leslie-lamport.html

To better understand distributed systems, we can model their behaviors using nodes and messages. Nodes send messages to one another, over unreliable communication links in a 1:1 or 1:N fashion. Because of the assumed flaky communication, messages may never arrive due to node failure. This simple model of nodes and messages, however, does not take the nodes capturing events into consideration; for this, our model must be extended into a slightly more complex one, introducing the concept of “state” being stored at each node.

When evaluating distributed system, we can and should leverage models. Without them, we’d have to rely on building prototypes and running experiments to prove our system; this approach is untenable due to issues such as lack of resources or due to the scale of the system, which may require hundreds of thousands of nodes. When selecting a model, we want ensure that it can accurately represent the problem and be used to analyze the solution. In other words, the model should hold two qualities: accurate and tractable.

Using Models. Source: Georgia Tech

Building distributed systems can be difficult for three reasons: asynchrony, failures, consistency. Will a system assume messages are sent immediately within a fixed amount of time? Or will it be asynchronous, potentially losing messages? How will the system handle the different types of failures: total, grey, byzantine (cannot tell, misbehaving) ? How will the system remain consistent if we introduce caching or replication? All of these things are part of the larger 8 fallacies of distributed computing systems.

CAP Theorem. Source: Georgia Tech

Ultimately, when it comes to building reliable and robust systems, we cannot have our cake and eat it too. Choose 2 out of the 3 properties: consistency, availability, partitioning. This tradeoff is known as CAP theorem (which has not been proven and is only a theorey). If you want availability and partitioning (e.g. cassandra, dynamodb), then you sacrifice consistency; similarly, if you want consistency and partitioning (e.g. MySQL, Megastore), you sacrifice availability.

References
January 15, 2021
Enterprise Java Beans – notes and summary

Introduction

Key Words: EJB, enterprise java beans

Discuss how we can structure system software for large scale distributed sytem service

Inter Enterprise View

Inter Enterprise View: The motivation of using enterprise java beans

Key Words: monolithic, supply chain model, amalgam, survivability, complexity

From a user perspective, we view a system (like Ebay or Google) as a blackbox. But in reality, much more complex than that and within the system, there may be multiple enterprise working together and this can be difficult when trying to handle failures, complexity, and so on

An Example

Key Words: scheduling, parallel systems, synchronization, communication, atomicity, concurrency, object technology, reuse

Using a system like Expedia, we face same issues in parallel systems like synchronization and concurrency and scheduling and atomicity and so on

N-tier applications

N tier applications

Key Words: concurrency, parallelism, embarrassingly parallel applications

Want to reduce network communication (latency), security (for users) by not compromising business logic, and increase concurrency

Structuring N Tier Applications

Structuring N tier applications

Key Words: JEE, protection domain, bean

We can split the protection domain uses containers, at the client or at the presentation layer, business layer or database layer. Each container contains N beans, bundle of java objects

Design alternative (Coarsegrain session beans)

Coarse grain approach. Although this approach protects business logic from outside network, fails to expose concurrency at the EJB container.

Key Words: monolithic, concurrency

Each session bean is very granular, the session bean encapsulating most of the logic. The upshot of this approach is that the container offers little service and the logic is not exposed outside the corporate network. The downside is that we do not exploit concurrency. There’s a missed opportunity where the EJB container and pull in data for multiple sessions

Design Alternative

Key Words: trade offs, persistent, bean managed persistence, container managed persistence, data access object

With data access object design, we confront the short comings of the coarse grain session bean, by exploiting concurrency for data access, limiting I/O and network latency when accessing the database. The trade off? Business logic is exposed to the web container

Design alternative (session bean with entity bean)

Design alternative: session bean with entity bean. Hybrid approach of coarse grain and data access object.

Key Words: session facade, design pattern, remote interface, RMI

We now embed a session facade, that provides concurrency AND does not exposed business logic. To that end, we use RMI (remote interface) to communicate between the containers and/or between the session facade and entity bean

November 28, 2020
Distributed Systems Introduction notes

The main take away with the introduction to distributed systems lectures is that as system designers, we need to carefully inspect our program and identify what events in our system can run concurrently (as well as what cannot run concurrency or must be serialized). To this end, we need to identify what events must happen before other events. And most importantly, we should only consider running distributed systems to increase performance when the time it takes to process an event exceeds the time it takes to send a message between nodes. Otherwise, just stick to local computation.

Key Words: happened before, concurrent events, transitive

Introduction

Summary

Lots of similarity between parallel systems and distributed systems. Symbiotic relationship between hardware and software (protocol stack)

Quiz: What is a distributed system?

Summary

Distributed systems are connected via a LAN/WAN, communicate only via messages, and message time is greater than the event time (not sure what this means really)

Distributed Systems Definition

Summary

Key Words: event computation time, message transmission time; Third property is that message communication time is significantly larger than the event computation time (on a single node). Leslie’s definition is: A system is distributed if the message transmission time (TM) is not negligible to the time between events in a single process. The main take away is that the algorithms or applications run on distributed noes must take longer (in computation time) than the communication otherwise no benefit of parallelism.

A fun example

Summary

Key Words: happened before relationship; Set of beliefs ingrained in a distributed system example. Within a single process, events are totally ordered. Second belief is that you cannot receive the receipt of a message until after the initial message is sent.

Happened Before Relationship

Happened Before Event description

Summary

The happened before relationship means one of two things. For events A and B, it means that 1) A and B are in the same process or 2) A is the sender of the message and B is the receiver of the message

Quiz Relation

Summary

You cannot assume or say anything about the order of A and B

Happy before relationship (continued)

Summary

When you cannot assume the ordering of events, they might be concurrent. One event might run before the other during one invocation. Then perhaps the order gets flipped. So, these types of events are not happened before, no apparent relationship between the two events. But my question is: what are some of the synchronization and communication and timing issues that he had mentioned?

Identifying Events

Identifying concurrent and dependent and transitive events

Summary

Remember the events that “happened before” are not only between processes but within processes themselves. Also, two concurrent events are basically when we cannot guarantee whether or not an event in process A will run before an event in process B, again between processes

Example of Event Ordering

Summary

Basically the events between processes can be concurrent since there’s no guarantee in terms of wall clock time when they will execute, no ordering

October 2, 2020