RioVista picks up where LRVM left off and aims for performance-conscious transactions. In other words, how can RioVista reduce the overhead of synchronous I/O and make transactions attractive to system designers?
System Crash
Two types of failures: power failure and software failure
Key Words: power crash, software crash, UPS power supply
Super interesting concept that makes total sense (I’m guessing this is actually implemented in reality): take a portion of memory and battery-back it so that it survives power failures
LRVM Revisited
Upshot: 3 copies by LRVM
Key Words: undo record, window of vulnerability
In short, LRVM boils down to begin transaction and end transaction. In the former, a portion of the memory segment is copied into a backup (the undo record). At the end of the transaction, the data is persisted to disk (a blocking operation, though it can be bypassed with the NO_FLUSH option), trading an increased window of vulnerability to power failures for performance. So, how will a battery-backed memory region help?
Rio File Cache
Creating a battery-backed file cache to handle power failures
In a nutshell, we’ll use a battery-backed file cache so that writes to disk can be arbitrarily delayed
Vista RVM on Top of Rio
Vista – RVM on top of Rio
Key Words: undo log, file cache, end transaction, memory resident
Vista is a library that offers the same semantics as LRVM. During commit, throw away the undo log; during abort, restore the old image back to virtual memory. The application memory is now backed by the file cache, which in turn is battery-backed, so there are no more writes to disk
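A minimal sketch of these commit/abort semantics, using hypothetical names (vista_begin, vista_commit, vista_abort) rather than the real Vista API; the key point is that both the application data and the undo log live in the battery-backed file cache, so commit and abort are pure memory operations with no disk I/O:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch, not the real Vista API. `addr` points into the
 * application's address space, which Vista maps onto the battery-backed
 * Rio file cache, so ordinary stores to it are already persistent. */
typedef struct {
    void  *addr;  /* start of the range declared by the transaction */
    size_t len;
    void  *old;   /* undo log entry (before-image), also kept in the file cache */
} vista_xact;

void vista_begin(vista_xact *t, void *addr, size_t len) {
    t->addr = addr;
    t->len  = len;
    t->old  = malloc(len);       /* in the real system this lives in the file cache */
    memcpy(t->old, addr, len);   /* capture the before-image (undo log) */
}

void vista_commit(vista_xact *t) {
    free(t->old);                /* commit: just throw the undo log away */
    t->old = NULL;               /* no redo log, no flush, no write to disk */
}

void vista_abort(vista_xact *t) {
    memcpy(t->addr, t->old, t->len);  /* restore the old image into virtual memory */
    free(t->old);
    t->old = NULL;
}
```

Crash recovery (next) can reuse the abort path: any undo records still sitting in the file cache after a reboot are simply applied again.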
Crash Recovery
Key Words: idempotency
Brilliant to make the crash recovery mechanism the exact same scenario as an abort transaction: less code and fewer edge cases. And if the crash recovery itself fails, no problem: the operation is idempotent, so restoring the old image a second time leaves memory in the same state as restoring it once
Vista Simplicity
Key Words: checkpoint
RioVista simplifies the code, reducing roughly 10K lines of code down to about 700. Vista has no redo logs and no truncation, all thanks to a single assumption: battery-backed DRAM for a portion of memory
Conclusion
Key Words: assumption
By assuming there are only software crashes (not power failures), we arrive at an entirely different design
As system designers, we can bake persistence into the virtual memory manager, offering persistence to application developers. However, it’s no easy feat: we need to ensure that the solution performs well. To this end, the virtual memory manager offers an API that allows developers to wrap their code in transactions; under the hood, it uses redo logs that persist the user’s changes to disk, defending against failures.
We can bake persistence into the virtual memory manager (VMM), but building an abstraction is not enough: the solution also has to be performant. Instead of committing each VMM change to disk individually, we aggregate the changes into a log sequence (just like the previous approaches in distributed file systems) so that they are written to disk as one contiguous block (sequential I/O rather than many small random writes)
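A minimal sketch of that aggregation idea, assuming POSIX file I/O and made-up record names (not LRVM's actual format): many small changes are packed into one buffer and pushed to the log with a single sequential write instead of one random disk write per change.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical log-record header; LRVM's real on-disk format differs. */
typedef struct {
    long   offset;  /* where in the persistent segment the change applies */
    size_t len;     /* number of bytes of new data that follow */
} change_record;

/* Pack several in-memory changes into one buffer and append them to the log
 * with a single sequential write.  Sketch only: assumes everything fits in
 * one 64 KB buffer and ignores error handling. */
ssize_t flush_changes(int log_fd, const change_record *hdrs,
                      const void *const *data, int n) {
    static char buf[64 * 1024];
    size_t used = 0;
    for (int i = 0; i < n; i++) {
        memcpy(buf + used, &hdrs[i], sizeof hdrs[i]);
        used += sizeof hdrs[i];
        memcpy(buf + used, data[i], hdrs[i].len);
        used += hdrs[i].len;
    }
    return write(log_fd, buf, used);  /* one contiguous append to the log */
}
```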
Server Design
Server Design – persist metadata, normal data structures
Key Words: inodes, external data segment
The designer of the application gets to decide which virtual addresses will be persisted to external data storage
Server Design (continued)
Key Words: inodes, external data segment
The virtual memory manager offers external data segments, allowing the underlying application to map portions of its virtual address space to segments backed by disk. The model is simple, flexible, and performant. In a nutshell, when the application boots up, the application selects which portions of memory must be persisted, giving the application developer full control
RVM Primitives
Key Words: transaction
RVM Primitives: initialization, body of server code
There are three main initialization primitives: initialize, map, and unmap. Within the body of the application code, we use the transaction primitives: begin transaction, end transaction, abort transaction, and set_range. The only non-obvious one is set_range: it tells the RVM runtime the specific range of addresses a given transaction will touch. In other words, map (during initialization) establishes a larger memory range, and transactions then operate on sub-ranges within it
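A hedged sketch of how a server might string these primitives together. The names and placeholder bodies below are illustrative stand-ins for the lecture's primitives, not the real LRVM library signatures:

```c
#include <stdlib.h>

/* Illustrative stand-ins for the RVM primitives; hypothetical names and
 * no-op bodies, not the real LRVM library. */
typedef int tid_t;

static void  rvm_initialize(const char *log) { (void)log; }           /* set up the log */
static void *rvm_map(const char *seg, size_t n) { (void)seg; return calloc(1, n); }
static void  rvm_unmap(void *region) { free(region); }
static tid_t begin_xact(void) { return 1; }
static void  set_range(tid_t t, void *a, size_t n) { (void)t; (void)a; (void)n; }
static void  end_xact(tid_t t) { (void)t; }  /* commit: would flush the redo log */

/* The server maps an external data segment at startup, then wraps every
 * update to that mapped memory in a transaction. */
int main(void) {
    rvm_initialize("/tmp/server.log");                  /* hypothetical log location */
    char *meta = rvm_map("metadata_segment", 1 << 20);  /* persistent region in VM */

    tid_t tid = begin_xact();
    set_range(tid, meta, 4096);  /* declare the sub-range this transaction will touch */
    meta[0] = 'x';               /* ordinary in-place update to the mapped memory */
    end_xact(tid);               /* commit: the change is logged and made recoverable */

    rvm_unmap(meta);
    return 0;
}
```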
RVM Primitives (continued)
RVM Primitives – transaction code and miscellaneous options
Key Words: truncation, flush, truncate
Although RVM automatically handles the writing of segments (flushing to disk and truncating log records), application developers can call those procedures explicitly
How the Server uses the primitives
How the server uses the primitives – begin and end transaction
Key Words: critical section, transaction, undo record
When a transaction begins, LRVM creates an undo record: a copy of the specified range, allowing a rollback in the event an abort occurs
How the Server uses the primitives (continued)
How the server uses the primitives – transaction details
Key Words: undo record, flush, persistence
During end transaction, the in-memory redo log gets flushed to disk. However, by passing in a no-flush mode, the developer can skip the blocking flush and instead flush the log explicitly themselves later
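A small sketch of that deferred-flush pattern, again with hypothetical stand-ins (an end_xact variant that takes a mode flag, plus the explicit rvm_flush/rvm_truncate calls mentioned earlier); the exact LRVM signatures may differ:

```c
/* Hypothetical stand-ins with no-op bodies; not the real LRVM API. */
typedef int tid_t;
enum { MODE_FLUSH = 0, MODE_NO_FLUSH = 1 };

static tid_t begin_xact(void) { return 1; }
static void  set_range(tid_t t, void *a, unsigned long n) { (void)t; (void)a; (void)n; }
static void  end_xact_mode(tid_t t, int mode) { (void)t; (void)mode; }
static void  rvm_flush(void) {}     /* explicitly force the redo log to disk    */
static void  rvm_truncate(void) {}  /* explicitly apply the log to the segments */

/* Amortize the disk write across many small transactions: commit each one in
 * no-flush mode (widening the window of vulnerability), then flush once. */
void batch_of_updates(char *region) {
    for (int i = 0; i < 100; i++) {
        tid_t tid = begin_xact();
        set_range(tid, region + i * 64, 64);
        region[i * 64] = (char)i;           /* the actual update */
        end_xact_mode(tid, MODE_NO_FLUSH);  /* commit without blocking on disk */
    }
    rvm_flush();      /* one blocking write covering all 100 transactions */
    rvm_truncate();
}
```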
Transaction Optimizations
Transaction Optimizations – ways to optimize the transaction
Key Words: window of vulnerability
With no_restore mode in begin transaction, there’s no need to create an in-memory undo copy (the transaction promises it will not abort); similarly, with lazy persistence there’s no need to flush immediately. The trade-off is an increased window of vulnerability
The redo log allows traversal in both directions (reverse for recovery) and only the new values are written to the log: this layout keeps the log compact and gives good performance
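A possible layout for one redo-log record matching that description; the field names are guesses for illustration, not LRVM's actual on-disk format:

```c
#include <stddef.h>

/* Hypothetical redo-log record layout (illustrative, not LRVM's real format).
 * Only the new values of the modified ranges are logged, and each record ends
 * with a reverse displacement so recovery can also walk the log backwards. */
struct range_header {
    long   seg_offset;   /* where in the external data segment to apply it */
    size_t len;          /* number of new-value bytes that follow */
};

struct record_trailer {
    size_t reverse_disp; /* size of the whole record, i.e. the distance back to
                            its start; lets recovery traverse tail to head */
};

/* In the log, one committed transaction then looks roughly like:
 *   [range_header][new bytes] ... [range_header][new bytes][record_trailer]  */
```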
Crash Recovery
Crash Recovery – resuming from a crash
Key Words: crash recovery
In order to recover from a crash, the system traverses the redo log using the reverse displacements. Then, the logged changes for each range of memory are applied
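A rough sketch of that recovery pass, reusing the hypothetical record layout from the previous sketch; real LRVM recovery is more involved (it must handle multiple segments, torn final records, and make sure the newest value wins when ranges overlap):

```c
#include <string.h>

/* Rough sketch of crash recovery over the hypothetical log layout above:
 * walk the log from tail to head via the reverse displacements and apply
 * each record's new values to the mapped data segment. */
void recover(const char *log, size_t log_len, char *segment) {
    size_t pos = log_len;                 /* start at the tail of the log */
    while (pos > 0) {
        struct record_trailer t;
        memcpy(&t, log + pos - sizeof t, sizeof t);
        size_t rec_start = pos - t.reverse_disp;

        /* re-apply every range in this record (idempotent, so a crash during
         * recovery is harmless); a real implementation must also skip ranges
         * already recovered from a newer record */
        size_t p = rec_start;
        while (p < pos - sizeof t) {
            struct range_header h;
            memcpy(&h, log + p, sizeof h);
            p += sizeof h;
            memcpy(segment + h.seg_offset, log + p, h.len);
            p += h.len;
        }
        pos = rec_start;                  /* move on to the previous record */
    }
}
```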
Log Truncation
Log truncation – runs in parallel with forward processing
Key Words: log truncation, epoch
Log truncation is probably the most complex part of LRVM. There’s a constant tug of war between performance and crash recovery: being able to recover is the main feature, but it adds overhead and complexity, since we want the system to keep making forward progress while the log is applied to the data segments. To this end, the algorithm breaks the log up into epochs
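A very rough sketch of the epoch idea, with invented names and structure; the only point is that the records in the frozen truncation epoch are applied to the data segments while forward processing keeps appending new records past the epoch boundary, and the truncated portion is dropped once the apply completes:

```c
#include <stddef.h>

/* Invented structure for epoch-based truncation; not LRVM's actual code. */
struct log_state {
    size_t head;       /* oldest byte still in the log */
    size_t epoch_end;  /* boundary of the epoch currently being truncated */
    size_t tail;       /* next append position for new records */
};

/* Same idea as the recovery pass above: apply a run of records to the segment. */
void apply_records_to_segment(const char *records, size_t len, char *segment);

void truncate_epoch(struct log_state *s, const char *log, char *segment) {
    s->epoch_end = s->tail;   /* freeze the current epoch; new commits append past it */

    /* apply everything in [head, epoch_end) to the external data segment,
     * in parallel with forward processing that grows the log beyond tail */
    apply_records_to_segment(log + s->head, s->epoch_end - s->head, segment);

    s->head = s->epoch_end;   /* the truncated epoch can now be discarded */
}
```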