<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Notes From Heck]]></title><description><![CDATA[I’m a developer, a hobbyist biker, and a Linux enthusiast. When not riding into the sunset, and not being a general nuisance, I like to experiment with new syst]]></description><link>https://adityamukho.com</link><generator>RSS for Node</generator><lastBuildDate>Sat, 11 Apr 2026 02:36:13 GMT</lastBuildDate><atom:link href="https://adityamukho.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Exploring Temporality in Databases]]></title><description><![CDATA[Getting Started with Time
Time is of some importance to humans. The best scientific minds have yet to unravel all its mysteries. Even its definition is a recurring subject of debate. Yet, we all innately have a sense of what time is. It would be hard...]]></description><link>https://adityamukho.com/exploring-temporality-in-databases</link><guid isPermaLink="true">https://adityamukho.com/exploring-temporality-in-databases</guid><category><![CDATA[Databases]]></category><category><![CDATA[time]]></category><category><![CDATA[versioning]]></category><dc:creator><![CDATA[Aditya Mukhopadhyay]]></dc:creator><pubDate>Tue, 04 May 2021 11:06:26 GMT</pubDate><content:encoded><![CDATA[<h1 id="getting-started-with-time">Getting Started with Time</h1>
<p>Time is of some importance to humans. The best scientific minds have yet to unravel all its mysteries. Even its definition is a recurring subject of debate. Yet, we all innately have a sense of what time is. It would be hard to discuss any subject at length without reference to the aspect of time. It is a fundamental component of all knowledge. In fact, it is a fundamental component of language itself - try constructing a contextually self-contained sentence without a single verb.</p>
<p>It is evident that in all accessories we've devised to store and dispense knowledge, time has always been a deeply integrated, first-class citizen. Databases are no exception. It is not uncommon to see a time-like field associated with a fact stored in a database. </p>
<h1 id="facts-that-change-with-time">Facts That Change with Time</h1>
<p>For the purpose of this post, we can think of timestamped facts as being of one of two kinds - <strong>event</strong> and <strong>state</strong>.</p>
<p>An <strong>event</strong> is a record in the database of something that occurred in the real world. The actual event, once it has occurred, can no longer be altered (unless you're Dr. Who). However, its database record, although ideally immutable, may sometimes need to be updated or superseded (to correct an error in data entry, for example).</p>
<p>A <strong>state</strong>, on the other hand, represents a portion of our overall knowledge of an object or entity of interest. For example, it could be a person's employment record, their address, or the location of a parcel in transit. State records in a database are generally mutable.</p>
<p><strong>Note:</strong> An important distinction must be made between an entity's identity and its state. The entity itself is usually referenced with an immutable identifier. It is only the stateful attributes associated with the entity that are mutable. There could be exceptions, but they are more often due to poor schema design than to constraints imposed by the real world.</p>
<p><strong>Note:</strong> The most common relation between events and state is one of causality. An event is an agent that causes a mutation in state. For example, when a parcel in transit reaches a particular hub, its location on record is updated. In this case, the event is <em>the parcel having reached the hub</em>, and the state is its location (old location before event, new location after). The event (parcel reached hub) causes the state (location) to get updated.</p>
<h1 id="tracking-mutations">Tracking Mutations</h1>
<p>Having established that both events and state have a temporal aspect to them, we next explore how their mutations over time can be tracked. Hereon, the term <em>temporal field</em> refers to <strong>a time-like field that can be used to identify versions of an individual fact</strong>. A <em>temporal dimension</em> is the domain of a <em>temporal field</em>, i.e. the ordered set of valid values for that field.</p>
<ol>
<li>For immutable events, temporality does not actually come into play. A timestamp field to record the time of occurrence may be present, but since this record is never altered, the event only ever has a single version, and hence the timestamp may be treated like any other non-temporal field.</li>
<li><p>For mutable events and for state, change can be fundamentally recorded in one of two ways:</p>
<p> i. <strong>Overwrite the previous record with new data</strong> - this is easy to implement, but we lose the older version. The older version may be retrieved from the DB replication logs (if we diligently save them), but it is a tedious, time-consuming process.</p>
<p> ii. <strong>Append the new data to the DB</strong> - supersede the old data related to an entity or event with new (timestamped) data, but preserve the old data such that it is <strong>easily</strong> accessible. This has the added overhead of providing a mechanism to retrieve the right version of the fact, based on a timestamp and a non-temporal entity/event identifier.</p>
</li>
</ol>
<p>It is the latter case (timestamped revisions using appended updates) that is the subject of interest in this post. This is obviously much harder to implement than simple overwriting, especially if we have to design the versioning layer in the DB schema ourselves (as opposed to the DB internally tracking versions for us), but it does give us time-traveling superpowers.</p>
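<p>To make the append-based approach concrete, here is a minimal sketch (hypothetical names and data, not drawn from any particular database) of an append-only version store with timestamp-based retrieval:</p>

```python
from time import time

class VersionedStore:
    """Append-only store: every write adds a new timestamped version
    instead of overwriting the previous one."""

    def __init__(self):
        self._versions = {}  # entity id -> list of (timestamp, state)

    def put(self, entity_id, state, ts=None):
        ts = ts if ts is not None else time()
        self._versions.setdefault(entity_id, []).append((ts, state))

    def get(self, entity_id, as_of=None):
        """Return the latest state recorded at or before `as_of`
        (defaults to the present)."""
        candidates = [(ts, s) for ts, s in self._versions.get(entity_id, [])
                      if as_of is None or ts <= as_of]
        if not candidates:
            return None
        return max(candidates, key=lambda v: v[0])[1]

store = VersionedStore()
store.put("parcel-42", {"location": "Warehouse A"}, ts=100)
store.put("parcel-42", {"location": "Hub B"}, ts=200)

print(store.get("parcel-42", as_of=150))  # {'location': 'Warehouse A'}
print(store.get("parcel-42"))             # {'location': 'Hub B'}
```

<p>Note the overhead the text mentions: every read must resolve a version given a timestamp and an entity identifier, rather than a plain key lookup.</p>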
<h1 id="approaches-to-time-based-fact-versioning">Approaches to Time-Based Fact-Versioning</h1>
<p>Let us take a moment to try and visualize the two different approaches to storing and updating mutable facts that we've seen so far.</p>
<h2 id="case-0-temporal-dimensions">Case: 0 Temporal Dimensions</h2>

<img id="0-dim-mutations" src="https://cdn.hashnode.com/res/hashnode/image/upload/v1620017526391/zLKDZ4ykk.png" alt="0-dim-mutations.png" />
Figure 1: Mutation history with 0 temporal dimensions.


<p><a class="post-section-overview" href="#0-dim-mutations">Figure 1</a> is an abstract representation of what happens in a database where old data for an entity (row/document/key) is always overwritten by new data. The diagram shows the evolution of state through time of a single entity. On the timeline, each state that this entity has ever been in is indicated by a box labelled <em>S₁</em>, <em>S₂</em>, <em>S₃</em>, etc.</p>
<p>Even though from our perspective we can see historic states in the diagram, the database only gets to see whatever is projected on the line labelled <em>Projection Hyperplane</em>, i.e., the latest recorded state. This imaginary hyperplane always crosses the timeline through the present instant, and hence acts as a marker of the present moment. Past states are inaccessible to the database. This is depicted in the figure by the projections of older states getting occluded by subsequent newer states.</p>
<p>We shall see shortly why this abstract representation is a useful tool in understanding temporality in databases.</p>
<h2 id="case-1-temporal-dimension">Case: 1 Temporal Dimension</h2>

<img id="1-dim-mutations" src="https://cdn.hashnode.com/res/hashnode/image/upload/v1620017613090/mLDjllYGn.png" alt="1-dim-mutations.png" />
Figure 2: Mutation history with 1 temporal dimension.


<p><a class="post-section-overview" href="#1-dim-mutations">Figure 2</a> demonstrates the case where the database updates data such that it still retains the old versions. This is represented in the figure by reorienting the <em>Projection Hyperplane</em> so that it is now parallel to the time axis. Due to this, older states can now project their presence onto the hyperplane without occlusion. The last known state gets to project its presence to ∞ along the temporal dimension. ∞ is represented by a dotted line crossing the time axis.</p>
<p><strong>A note on implementation specifics:</strong> The visible temporal axis may be implemented as an internal database feature, or as part of explicit schema design (visible to the application layer), or as schema-transparent middleware residing in the persistence layer. In all cases, the concepts explored in this post remain the same.</p>
<h2 id="case-2-temporal-dimensions">Case: 2 Temporal Dimensions</h2>
<p>An interesting property of our representational model is that it is not constrained to just one temporal field or temporal dimension. By simply adding another temporal dimension to the diagram (and to the hyperplane), we can now track object revisions using two temporal fields instead of one. We will shortly see whether this makes any sense in a real-world context, but for now, <a class="post-section-overview" href="#2-dim-mutations">Figure 3</a> presents an example of a bi-temporal projection.</p>

<img id="2-dim-mutations" src="https://cdn.hashnode.com/res/hashnode/image/upload/v1620017961425/7x2Mn5lZQ.png" alt="2-dim-mutations.png" />
Figure 3: Mutation history with 2 temporal dimensions. For an interactive, 3D version (where you get to rotate/zoom the drawing to see it from different angles), visit https://www.geogebra.org/m/ey3sky2s.


<p>Here, the states <em>S₁₁</em>,<em>S₁₂</em>, <em>S₂₁</em>, <em>S₃₁</em>, etc. all represent doubly-timestamped historic states of the same entity. The two temporal dimensions - <em>Time₁</em> and <em>Time₂</em> are laid orthogonal to each other as coordinate axes. The <em>Projection Hyperplane</em> is now a 2-dimensional plane hovering above the bi-temporal coordinate plane.</p>
<p>Each state gets to project its presence onto the <em>Projection Hyperplane</em>, bounded below by the timestamps of its creation, and above by the timestamps of its respective successors along each temporal dimension. The last known states on each dimension get to project their presence to ∞ along that dimension. ∞ is represented by a dotted line crossing each axis.</p>
<p><strong>Note:</strong> The boxes on the bi-temporal coordinate plane, representing states at different tuples from the pair of temporal dimensions, would rarely be arranged in a perfectly aligned grid. In reality, they would more likely be scattered about haphazardly. The example in <a class="post-section-overview" href="#2-dim-mutations">Figure 3</a> adopts a less chaotic arrangement to simplify a rather complex diagram.</p>
<h1 id="putting-it-all-together">Putting it All Together</h1>
<p>We will now explore how these representational state plots map to data in the real world. For up to two time dimensions, the ideal mappings have already been extensively <a target="_blank" href="https://www.researchgate.net/publication/221212735_A_Taxonomy_of_Time_in_Databases">researched</a> and widely <a target="_blank" href="https://www.datasciencecentral.com/profiles/blogs/temporal-databases-why-you-should-care-and-how-to-get-started">adopted</a>, and I will just reiterate here the conventions already in use:</p>
<ol>
<li><p>A <em>transaction time</em> dimension consists of values from a temporal field that tracks the actual time of recording a fact. This is the point of time in the real world when the fact was recorded into the database, and is often auto-filled by the database itself. This field (along with the associated state or event data) is best kept immutable to maintain the sanctity of the system of record.</p>
<p> It represents <strong>what was known to the system at the time of record</strong>, or <strong>the truth, as it was best known at the time of record</strong>. It is impossible for fact versions stored along the <em>transaction time</em> dimension to be placed chronologically out of order. This is because they are ordered by their insertion timestamps.</p>
</li>
<li><p>A <em>valid time</em> dimension consists of values from a temporal field that tracks the <strong>effective contextual time</strong> of a fact, as it was known at the time of record. This is perhaps better explained by a few examples:</p>
<p> i. If an employee joined an organization on the 5th of June, 2020, but their employment record was preemptively created on 3rd June 2020, then the former is a <em>valid time</em> field and the latter is a <em>transaction time</em> field. This is an instance of recording <strong>before the fact</strong>.</p>
<p> ii. If a parcel got delivered to a customer at 12 PM, but the system was updated later at 2 PM to mark the delivery, then the former is the <em>valid time</em> and the latter is the <em>transaction time</em>. This is an instance of recording <strong>after the fact</strong>.</p>
<p> iii. If a user places an order on an e-commerce portal, then the time of recording the order is itself considered to be the effective time of placing the order. In this case, the <em>transaction time</em> and <em>valid time</em> are always in sync. For cases like this, a bi-temporal representation can gracefully degenerate to a uni-temporal representation. In this post, we will refer to them as <strong>actual-time systems</strong>.</p>
<p> A <em>valid time</em> field (and its associated state or event data) is generally kept mutable, and is often updated to correct erroneous data entries or to consolidate conflicting inputs from multiple systems. Fact versions stored along the <em>valid time</em> dimension may arrive preemptively, retroactively, or chronologically out of order w.r.t. each other.</p>
</li>
</ol>
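<p>The distinction between the two dimensions, and the three recording modes above, can be sketched as follows (the records and helper are illustrative, not a real schema):</p>

```python
from datetime import datetime

# Each fact carries both timestamps; the ordering between transaction time
# and valid time distinguishes the three cases described in the text.
facts = [
    # Recording *before* the fact: employment starts June 5, entered June 3.
    {"fact": "employment starts",
     "valid_time": datetime(2020, 6, 5),
     "transaction_time": datetime(2020, 6, 3)},
    # Recording *after* the fact: delivered at 12 PM, recorded at 2 PM.
    {"fact": "parcel delivered",
     "valid_time": datetime(2021, 5, 4, 12, 0),
     "transaction_time": datetime(2021, 5, 4, 14, 0)},
]

def recording_mode(fact):
    if fact["transaction_time"] < fact["valid_time"]:
        return "before the fact"
    if fact["transaction_time"] > fact["valid_time"]:
        return "after the fact"
    return "actual time"  # the degenerate, uni-temporal case

for f in facts:
    print(f["fact"], "->", recording_mode(f))
```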
<h2 id="an-example">An Example</h2>
<p>When working with the two most commonly used temporal dimensions described above, certain interesting observations can be made.</p>
<p>Let us work with an example to help us understand better. Consider a flight booking system. Typically, for a given flight, the price changes over time - you get a different price depending on when you book. There are usually a lot of other factors affecting price (nowadays, almost in real time), but for this example, let us keep things simple by having the price change on a daily basis.</p>
<p>Now, for a given flight, the following attributes would be required at the minimum:</p>
<ol>
<li>Flight number,</li>
<li>Flight date,</li>
<li>Booking date,</li>
<li>Price</li>
</ol>
<p><strong>Note:</strong> This example provides another interesting insight - that not every date field in an entity can or should be treated as temporal. For instance, in our example, the flight's "flight date" can actually be considered to be part of its immutable identifier. Together with the flight number, it forms a complete and unique identifier.</p>
<p>In our example, the booking date acts as the <em>valid time</em> dimension, since the price point (mutable state) goes in and out of validity based on the booking date. The time of record must invariably serve as the <em>transaction time</em> dimension for any sane use case.</p>
<p>Any system that stores this flight information would have to record some or all of the above fields for each flight, along with an explicit or implicit time of record. In principle, something like the following has to be recorded:</p>
<table id="flight-record">
<caption>Table 1: A tabular representation of versioned flight data</caption>
    <thead>
        <tr>
            <th>Record Time</th>
            <th>Booking Date</th>
            <th>Price</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>T₁</td>
            <td>D₁</td>
            <td>P₁</td>
        </tr>
        <tr>
            <td>T₂</td>
            <td>D₂</td>
            <td>P₂</td>
        </tr>
        <tr>
            <td>T₃</td>
            <td>D₁</td>
            <td>P₃</td>
        </tr>
        <tr>
            <td>T₄</td>
            <td>D₂</td>
            <td>P₄</td>
        </tr>
    </tbody>
</table>

<p>However, as we shall see, the actual mechanism (and therefore capabilities) of storage and retrieval vary from DB to DB (or schema to schema), depending on whether it is uni-temporal w/ <em>transaction time</em>, uni-temporal w/ <em>valid time</em>, or bi-temporal.</p>
<p><strong>Note:</strong> If you've gone through some of the literature referenced here or elsewhere on temporal dimensions in databases, you will have observed that both <em>transaction time</em> and <em>valid time</em> fields are marked using a pair of <em>start time</em> and <em>end time</em> values. This is implicitly handled in our example by marking only the start times explicitly, and assuming the end times either coincide with the start time of the next version or stretch to ∞.</p>
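<p>The start-only convention in the note above can be sketched like this (illustrative code, not tied to any particular database): each version's end time is the next version's start time, and the last version remains valid to ∞.</p>

```python
import math

def intervals(start_times):
    """Derive (start, end) validity intervals from start-only timestamps:
    each end is the next version's start; the last stretches to infinity."""
    ends = list(start_times[1:]) + [math.inf]
    return list(zip(start_times, ends))

# Record times of successive versions of a single fact (hypothetical).
print(intervals([1, 3, 7]))  # [(1, 3), (3, 7), (7, inf)]
```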
<h3 id="transaction-time-database"><em>Transaction Time</em> Database</h3>
<p>A <em>transaction time</em> database is an immutable system of versioning facts. A fact version, once stored in such a database, is indelible for the lifetime of the system. In our example, both price and booking date are essential attributes of a flight, and so this database must store them both (along with a record date).</p>
<p>However, the fact that it is a uni-temporal, <em>transaction time</em> database means that the "booking date" field is not available for use as a filter for the temporal version selection process. This could be due to the way it encodes its data for storage (storing deltas between versions instead of fully built state objects, for example).</p>
<table id="flight-record-txn-time">
    <caption>Table 2: Versioned flight data in a <em>transaction time</em> database</caption>
    <thead>
        <tr>
            <th>Record Time</th>
            <th>State</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>T₁</td>
            <td>Booking Date: D₁, Price: P₁</td>
        </tr>
        <tr>
            <td>T₂</td>
            <td>Booking Date: D₂, Price: P₂</td>
        </tr>
        <tr>
            <td>T₃</td>
            <td>Booking Date: D₁, Price: P₃</td>
        </tr>
        <tr>
            <td>T₄</td>
            <td>Booking Date: D₂, Price: P₄</td>
        </tr>
    </tbody>
</table>

<p>In order to find the last known price for a given booking date, this system would have to scan its version record (in reverse) until it finds the first version with that booking date. This is obviously a highly inefficient system for this use case.</p>
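<p>The reverse scan just described might look like this (an illustrative sketch mirroring Table 2, not a real storage engine):</p>

```python
# Versions ordered by record time T1 < T2 < T3 < T4, as in Table 2.
versions = [
    {"record_time": 1, "booking_date": "D1", "price": "P1"},
    {"record_time": 2, "booking_date": "D2", "price": "P2"},
    {"record_time": 3, "booking_date": "D1", "price": "P3"},
    {"record_time": 4, "booking_date": "D2", "price": "P4"},
]

def last_price_for(booking_date):
    # Walk the version log in reverse until the first match: O(n) per
    # lookup, since "booking date" is not a version-selection index here.
    for v in reversed(versions):
        if v["booking_date"] == booking_date:
            return v["price"]
    return None

print(last_price_for("D1"))  # P3
```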
<p>A <em>transaction time</em>-based uni-temporal system is usually only useful in <em>actual-time</em> (as defined above) cases where the <em>transaction time</em> dimension doubles as the <em>valid time</em> dimension. An example would be the booking management sub-system for an airline, since a booking, once made, is processed in actual time (i.e., neither preemptively nor retroactively) for the span of its active lifecycle (booked -&gt; minor changes like seat allocation, meals, etc. -&gt; check-in -&gt; security -&gt; boarding -&gt; destination).</p>
<p>So, although a <em>transaction time</em> database is useless for serving a catalog, it works just fine for live order management.</p>
<h3 id="valid-time-database"><em>Valid Time</em> Database</h3>
<p>A <em>valid time</em> database is a mutable system of versioning facts. A fact version stored here (along with its <em>valid</em> timestamp) may be altered whenever new knowledge of the truth surfaces, and differs from the existing record. In our example, since the booking date is the <em>valid time</em> dimension, the state stored against a particular booking date can be modified in-place.</p>
<p>By doing so, however, we lose the older price information that was known for that state. To find the last known price for a booking date, a simple filter on the booking date field yields the required state.</p>
<table id="flight-record-valid-time">
    <caption>Table 3: Versioned flight data in a <em>valid time</em> database</caption>
    <thead>
        <tr>
            <th>Booking Date</th>
            <th>State</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td><strike>D₁</strike></td>
            <td><strike>Record Time: T₁, Price: P₁</strike></td>
        </tr>
        <tr>
            <td><strike>D₂</strike></td>
            <td><strike>Record Time: T₂, Price: P₂</strike></td>
        </tr>
        <tr>
            <td>D₁</td>
            <td>Record Time: T₃, Price: P₃</td>
        </tr>
        <tr>
            <td>D₂</td>
            <td>Record Time: T₄, Price: P₄</td>
        </tr>
    </tbody>
</table>

<h3 id="bi-temporal-database">Bi-Temporal Database</h3>
<p>Finally, a bi-temporal database would have access to both "record time" and "booking date" to filter on, during its version search phase. On such a database, we can ask questions like - "What was the price for booking date D₁ as it was known to the database at record time T₂? (Answer: P₁)". The data storage scheme is identical to the scheme presented in <a class="post-section-overview" href="#flight-record">Table 1</a>.</p>
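<p>An "as-of" lookup over both dimensions can be sketched as follows (illustrative code mirroring Table 1, not a real database API). It answers exactly the question posed above:</p>

```python
# The bi-temporal version log from Table 1: T1 < T2 < T3 < T4.
versions = [
    {"record_time": 1, "booking_date": "D1", "price": "P1"},
    {"record_time": 2, "booking_date": "D2", "price": "P2"},
    {"record_time": 3, "booking_date": "D1", "price": "P3"},
    {"record_time": 4, "booking_date": "D2", "price": "P4"},
]

def price_as_known_at(booking_date, record_time):
    """Latest price for `booking_date` (valid time) among versions
    recorded at or before `record_time` (transaction time)."""
    known = [v for v in versions
             if v["booking_date"] == booking_date
             and v["record_time"] <= record_time]
    if not known:
        return None
    return max(known, key=lambda v: v["record_time"])["price"]

print(price_as_known_at("D1", 2))  # P1 -- the answer from the text
print(price_as_known_at("D1", 4))  # P3 -- the latest knowledge for D1
```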
<h1 id="appendix-a-beyond-bi-temporal">Appendix A: Beyond Bi-Temporal</h1>
<p>Continuing with our projection-based representation model, we note that we needn't stop at just two dimensions. In fact, this model can be extended to an arbitrary number of dimensions, and it is possible to design a database capable of storing and querying facts with <em>n</em> temporal dimensions. However, beyond two dimensions, the semantic mappings quickly turn murky.</p>
<p><a target="_blank" href="https://en.wikipedia.org/wiki/Temporal_database">This</a> Wikipedia article speaks of tri-temporal databases (with the 3rd temporal dimension being <em>decision time</em>), but the semantics of &gt;2 dimensions seem to remain unclear, while also adding to our cognitive load. Even Snodgrass et al., in their <a target="_blank" href="https://www.researchgate.net/publication/221212735_A_Taxonomy_of_Time_in_Databases">1985 ACM SIGMOD paper</a>, limited themselves to just the two dimensions discussed above, with every other time-like field being relegated to the category of <em>user-defined time</em> (a polite way of saying "none of the DB's temporal business").</p>
<p>I think one way of understanding multiple temporal dimensions is to imagine each dimension as the <em>transaction time</em> of some system: the first being the <em>transaction time</em> of the database itself; the second, third, and so on being the <em>transaction times</em> of external systems (including non-computer systems, such as record books, meeting minutes, journals, diaries, etc.) through which the data has traveled to reach our database; and the final being the <em>valid time</em> dimension, i.e. the <em>transaction time</em> of the real world itself, also known as <em>clock time</em> or <em>wall time</em>. Fact versions along all but the 1st dimension should be mutable.</p>
<p><strong>Disclaimer:</strong> I have not put this scheme to test. It is just a hypothesis that may or may not yield usable designs.</p>
<h1 id="appendix-b-parallelbranching-realities">Appendix B: Parallel/Branching Realities</h1>
<p>Since we're discussing version control for data, it is natural to draw parallels with file-based version control, and ask how branching fits into the scheme. My take is that all the dimensions and fact versions taken together represent one version of reality. To spawn an alternate reality is to hatch a whole new set of <em>n</em> dimensions and start recording fact versions from scratch there. In this grand scheme of things, a branch is nothing but a child reality that has been pre-populated with data from a parent reality.</p>
]]></content:encoded></item><item><title><![CDATA[Introducing foxx-tracer]]></title><description><![CDATA[About OpenTracing
Application performance monitoring and distributed tracing have become invaluable tools in the hands of modern application developers, providing critical insights into the inner workings of applications and helping isolate performan...]]></description><link>https://adityamukho.com/foxx-tracer-an-opentracing-implementation-for-foxx-microservices</link><guid isPermaLink="true">https://adityamukho.com/foxx-tracer-an-opentracing-implementation-for-foxx-microservices</guid><category><![CDATA[Open Source]]></category><dc:creator><![CDATA[Aditya Mukhopadhyay]]></dc:creator><pubDate>Mon, 17 Aug 2020 05:09:00 GMT</pubDate><content:encoded><![CDATA[<h1 id="about-opentracing">About OpenTracing</h1>
<p>Application performance monitoring and distributed tracing have become invaluable tools in the hands of modern application developers, providing critical insights into the inner workings of applications and helping isolate performance bottlenecks and stress points.</p>
<p><a target="_blank" href="https://opentracing.io/">OpenTracing</a> is a free and widely used standard for implementing distributed tracing. It allows developers and operations teams to trace execution pathways across applications built on different platforms and running on different servers, even across multiple data centers, tracking trace context across application and service boundaries as they communicate with each other as long as they all implement the common tracing semantics specified by the OpenTracing standard.</p>
<blockquote>
<p>...per-process logging and metric monitoring have their place, but neither can reconstruct the elaborate journeys that transactions take as they propagate across a distributed system. Distributed traces are these journeys.</p>
<p>Source: https://medium.com/opentracing/towards-turnkey-distributed-tracing-5f4297d1736</p>
</blockquote>
<h1 id="about-arangodb">About ArangoDB</h1>
<p>ArangoDB is a free and open-source native multi-model database system developed by ArangoDB GmbH. The database system supports three data models (key/value, documents, graphs) with one database core and a unified query language AQL (ArangoDB Query Language). The query language is declarative and allows the combination of different data access patterns in a single query. ArangoDB is a NoSQL database system but AQL is similar in many ways to SQL.</p>
<p>ArangoDB has been referred to as a universal database but its creators refer to it as a "native multi-model" database to indicate that it was designed specifically to allow key/value, document, and graph data to be stored together and queried with a common language.</p>
<h2 id="the-foxx-runtime">The Foxx Runtime</h2>
<p>Quoted from the ArangoDB website:</p>
<blockquote>
<p>Foxx is a JavaScript framework for writing data-centric HTTP microservices that run directly inside of ArangoDB.</p>
<p>Traditionally, server-side projects were developed as standalone applications that guide communications between the client-side front-end and the database back-end. Through the Foxx Microservice Framework, ArangoDB allows application developers to write their data access and domain logic as microservices and running directly within the database with native access to in-memory data.</p>
<p>Source: https://www.arangodb.com/why-arangodb/foxx/</p>
</blockquote>
<p>Foxx services consist of JavaScript code running in the V8 JavaScript runtime embedded inside ArangoDB. Each service is mounted in each available V8 context (the number of contexts can be adjusted in the server configuration). Incoming requests are distributed across these contexts automatically.</p>
<p>Great, so this is just like programming for the Node.js environment, which also runs on V8 and supports the CommonJS module loading mechanism.</p>
<p><strong>Except, there's a catch - Foxx is 100% synchronous!</strong></p>
<h1 id="why-foxx-tracer">Why <em>foxx-tracer</em>?</h1>
<p>Most tracing libraries in the nodeverse are asynchronous, and so do not work in the synchronous V8 runtime that ArangoDB uses to run its Foxx services. <a target="_blank" href="https://github.com/RecallGraph/foxx-tracer">foxx-tracer</a> bridges this gap by being a 100% synchronous, dedicated module built for the Foxx runtime.</p>
<p>It is a CommonJS-loadable package available through the NPM registry. However, it relies on a number of features only available in a Foxx environment. It also depends on a companion <a target="_blank" href="https://github.com/RecallGraph/foxx-tracer-collector">collector service</a> which itself is a Foxx microservice. <strong>These dependencies make this module incompatible with Node.js and browser-based runtimes.</strong></p>
<h2 id="the-foxx-tracer-ecosystem">The <em>foxx-tracer ecosystem</em></h2>
<p><em>foxx-tracer</em> works in conjunction with a bunch of other applications/modules which together comprise the <em>foxx-tracer ecosystem</em>. These are:</p>
<ol>
<li><a target="_blank" href="https://github.com/RecallGraph/foxx-tracer">foxx-tracer</a> - a module that you include in your foxx microservice when you want to enable distributed tracing for it.</li>
<li><a target="_blank" href="https://github.com/RecallGraph/foxx-tracer-collector">foxx-tracer-collector</a> - a collector agent that receives OpenTracing spans within the synchronous Foxx environment, and then asynchronously pushes them to multiple, configurable destinations. The collector supports a simple plugin mechanism through which one can add multiple <em>reporters</em> to talk to different destinations.</li>
<li>A bunch of <a target="_blank" href="https://www.npmjs.com/search?q=foxx-tracer-reporter">reporters</a> that let the collector push its incoming spans to different APM endpoints (like Datadog, NewRelic, etc). It is very easy to write your own reporter if you don't find one for your specific endpoint in the NPM registry. Two reporters written by me are:<ol>
<li>A <a target="_blank" href="https://github.com/RecallGraph/foxx-tracer-reporter-console">console reporter</a> that prints traces to the ArangoDB log.</li>
<li>A <a target="_blank" href="https://github.com/RecallGraph/foxx-tracer-reporter-datadog">production-ready reporter</a> available for the Datadog Cloud Monitoring Service.</li>
</ol>
</li>
</ol>
<p>All components are well documented and there is a reference implementation in <a target="_blank" href="https://github.com/RecallGraph/RecallGraph">RecallGraph</a>, which one can look up to see how all the pieces fit together.</p>
]]></content:encoded></item><item><title><![CDATA[RecallGraph Presented @ Open Source Directions]]></title><description><![CDATA[A webinar recording of RecallGraph, where I discuss its roadmap, adoption and development efforts. Apologies in advance for the occasional drop in quality.
https://www.youtube.com/watch?v=A953O3hT1Os]]></description><link>https://adityamukho.com/recallgraph-presented-open-source-directions</link><guid isPermaLink="true">https://adityamukho.com/recallgraph-presented-open-source-directions</guid><category><![CDATA[graph database]]></category><category><![CDATA[version control]]></category><dc:creator><![CDATA[Aditya Mukhopadhyay]]></dc:creator><pubDate>Tue, 28 Jul 2020 18:30:00 GMT</pubDate><content:encoded><![CDATA[<p>A webinar recording of RecallGraph, where I discuss its roadmap, adoption and development efforts. Apologies in advance for the occasional drop in quality.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=A953O3hT1Os">https://www.youtube.com/watch?v=A953O3hT1Os</a></div>
]]></content:encoded></item><item><title><![CDATA[RecallGraph v1 Released]]></title><description><![CDATA[RecallGraph is a versioned-graph data store - it retains all changes that its data (vertices and edges) have gone through to reach their current state. It supports point-in-time graph traversals, letting the user query any past state of the graph jus...]]></description><link>https://adityamukho.com/recallgraph-v1-released</link><guid isPermaLink="true">https://adityamukho.com/recallgraph-v1-released</guid><category><![CDATA[graph database]]></category><category><![CDATA[version control]]></category><dc:creator><![CDATA[Aditya Mukhopadhyay]]></dc:creator><pubDate>Tue, 28 Jul 2020 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1619242269782/2cjrqQ47T.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>RecallGraph is a versioned-graph data store - it retains all changes that its data (vertices and edges) have gone through to reach their current state. It supports point-in-time graph traversals, letting the user query any past state of the graph just as easily as the present.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/RecallGraph/RecallGraph/releases/tag/v1.0.0">https://github.com/RecallGraph/RecallGraph/releases/tag/v1.0.0</a></div>
]]></content:encoded></item><item><title><![CDATA[RecallGraph Presented at ArangoDB Online Meetup]]></title><description><![CDATA[A webinar recording of RecallGraph, presented at ArangoDB Online Meetup, as part of their Community Pioneers Initiative.
https://www.youtube.com/watch?v=UP2KDQ_kL4I]]></description><link>https://adityamukho.com/recallgraph-presented-at-arangodb-online-meetup</link><guid isPermaLink="true">https://adityamukho.com/recallgraph-presented-at-arangodb-online-meetup</guid><category><![CDATA[graph database]]></category><category><![CDATA[version control]]></category><dc:creator><![CDATA[Aditya Mukhopadhyay]]></dc:creator><pubDate>Tue, 28 Jul 2020 18:30:00 GMT</pubDate><content:encoded><![CDATA[<p>A webinar recording of RecallGraph, presented at ArangoDB Online Meetup, as part of their Community Pioneers Initiative.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=UP2KDQ_kL4I">https://www.youtube.com/watch?v=UP2KDQ_kL4I</a></div>
]]></content:encoded></item><item><title><![CDATA[Introducing CivicGraph]]></title><description><![CDATA[I would like to introduce an open source, Apache 2.0 licensed project of mine:
https://github.com/CivicGraph/CivicGraph
CivicGraph is a versioned-graph data store - it retains all changes that its data (vertices and edges) have gone through to reach ...]]></description><link>https://adityamukho.com/introducing-civicgraph</link><guid isPermaLink="true">https://adityamukho.com/introducing-civicgraph</guid><category><![CDATA[graph database]]></category><category><![CDATA[version control]]></category><dc:creator><![CDATA[Aditya Mukhopadhyay]]></dc:creator><pubDate>Fri, 31 Jan 2020 18:30:00 GMT</pubDate><content:encoded><![CDATA[<p>I would like to introduce an open source, Apache 2.0 licensed project of mine:</p>
<p>https://github.com/CivicGraph/CivicGraph</p>
<p>CivicGraph is a versioned-graph data store - it retains all changes that its data (vertices and edges) have gone through to reach their current state. It supports point-in-time graph traversals, letting the user query any past state of the graph just as easily as the present.</p>
<p>It is a  <a target="_blank" href="https://www.arangodb.com/why-arangodb/foxx/">Foxx Microservice</a>  for  <a target="_blank" href="https://www.arangodb.com/">ArangoDB </a> that features VCS-like semantics in many parts of its interface, and is backed by a transactional event tracker. It is currently being developed and tested on ArangoDB v3.5, with support for v3.6 in the pipeline.</p>
<p>CivicGraph is a potential fit for scenarios where data is best represented as a network of vertices and edges (i.e., a graph) having the following characteristics:</p>
<ol>
<li><p>Both vertices and edges can hold properties in the form of attribute/value pairs (equivalent to JSON objects).</p>
</li>
<li><p>Documents (vertices/edges) mutate within their lifespan (both in their individual attributes/values and in their relations with each other).</p>
</li>
<li><p>Past states of documents are as important as their present, necessitating retention and queryability of their change history.</p>
</li>
</ol>
<p>Its API is split into 3 top-level categories:</p>
<h1 id="document">Document</h1>
<p><strong>Create</strong> - Create single/multiple documents (vertices/edges).</p>
<p><strong>Replace</strong> - Replace entire single/multiple documents with new content.</p>
<p><strong>Delete</strong> - Delete single/multiple documents.</p>
<p><strong>Update</strong> - Add/Update specific fields in single/multiple documents.</p>
<p><strong>(Planned) Explicit Commits</strong> - Commit a document's changes separately, after it has been written to the DB via other means (AQL / Core REST API / Client).</p>
<p><strong>(Planned) CQRS/ES Operation Mode</strong> - Async implicit commits.</p>
<h1 id="event">Event</h1>
<p><strong>Log</strong> - Fetch a log of events (commits) for a given path pattern (path determines scope of documents to pick). The log can be optionally grouped/sorted/sliced within a specified time interval.</p>
<p><strong>Diff</strong> - Fetch a list of forward or reverse commands (diffs) between commits for specified documents.</p>
<p><strong>(Planned) Branch/Tag</strong> - Create parallel versions of history, branching off from a specific event point of the main timeline. Also, tag specific points in branch+time for convenient future reference.</p>
<p><strong>(Planned) Materialization</strong> - Point-in-time checkouts.</p>
<h1 id="history">History</h1>
<p><strong>Show</strong> - Fetch a set of documents, optionally grouped/sorted/sliced, that match a given path pattern, at a given point in time.</p>
<p><strong>Filter</strong> - In addition to a path pattern like in 'Show', apply an expression-based, simple/compound post-filter on the retrieved documents.</p>
<p><strong>Traverse</strong> - A point-in-time traversal (walk) of a past version of the graph, with the option to apply additional post-filters to the result.</p>
<p>I hope some of you may find this a useful service to address several types of data modelling challenges pertaining to retention and querying of historical graph data.</p>
]]></content:encoded></item><item><title><![CDATA[A Closer Look at Delta Arithmetic]]></title><description><![CDATA[In their 1996 paper published in ACM Transactions on Database Systems, Ghandeharizadeh et al. defined a formal algebra around Deltas (the encoded difference between any two states of a system) in a relational database. Their definition is actually ge...]]></description><link>https://adityamukho.com/a-closer-look-at-delta-arithmetic</link><guid isPermaLink="true">https://adityamukho.com/a-closer-look-at-delta-arithmetic</guid><category><![CDATA[version control]]></category><category><![CDATA[graph database]]></category><dc:creator><![CDATA[Aditya Mukhopadhyay]]></dc:creator><pubDate>Sun, 30 Jun 2019 18:30:00 GMT</pubDate><content:encoded><![CDATA[<p>In their <a target="_blank" href="https://www.seas.upenn.edu/~zives/03s/cis650/P370.PDF">1996 paper published in ACM Transactions on Database Systems</a>, Ghandeharizadeh et al. defined a formal algebra around <em>Deltas</em> (the encoded difference between any two states of a system) in a relational database. Their definition is actually generic enough to apply to any system whose state can be represented as a set of tuples, which is why it is still used in contemporary research on database versioning, including those involving non-relational data models (Khurana et al. <a target="_blank" href="https://www.researchgate.net/publication/282403421_Storing_and_Analyzing_Historical_Graph_Data_at_Scale">Storing and Analyzing Historical Graph Data at Scale</a>). In order to support my deep dives into a couple of modern designs that use <em>Delta Arithmetic</em> for versioning graph databases, I will spend some time in this post laying down its foundations, clarifying a few under-explained points along the way.</p>
<h2 id="delta-arithmetic-the-ground-rules">Delta Arithmetic - The Ground Rules</h2>
<p>We start with a database containing the following relations or tables, for the fictional inventory management system of a bicycle manufacturer:</p>
<ol>
<li><code>Suppliers</code> - A list of vendors and the parts they sell,</li>
<li><code>Orders</code> - A record of orders placed with vendors, listing the parts and their quantities.</li>
</ol>
<p>The contents of the two tables described above are shown below:</p>
<table id="suppliers">
    <caption>Table 1: Suppliers</caption>
    <thead>
        <tr>
            <th>Supplier</th>
            <th>Part</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Trek</td>
            <td>frame</td>
        </tr>
        <tr>
            <td>Campy</td>
            <td>brakes</td>
        </tr>
        <tr>
            <td>Trek</td>
            <td>pedals</td>
        </tr>
    </tbody>
</table>

<table id="orders">
    <caption>Table 2: Orders</caption>
    <thead>
        <tr>
            <th>Part</th>
            <th>Quantity</th>
            <th>Supplier</th>
            <th>Expected</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>frame</td>
            <td>400</td>
            <td>Trek</td>
            <td>8/31/93</td>
        </tr>
        <tr>
            <td>brakes</td>
            <td>150</td>
            <td>Campy</td>
            <td>9/1/93</td>
        </tr>
    </tbody>
</table>

<h3 id="tuple-representation">Tuple Representation</h3>
<p>Every record in the database is mapped to a classified tuple of the form \(RelName(field\_value_1, field\_value_2, ...)\). For example, the first row in the <code>Suppliers</code> table is represented as \(Suppliers(Trek, frame)\) and the first row in the <code>Orders</code> table as \(Orders(frame, 400, Trek, 8/31/93)\). The order of values in the tuple is the same as the order of field definitions in the tables above. In this way, every row in every table in the database is mapped to a tuple. For this small database, we can write the entire initial state \(S_a\) as a set of tuples, as shown below:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619245661306/1oYFhQB5F.png" alt="Screenshot 2021-04-24 115725.png" /></p>
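<p>In Python, for instance, this initial state is just a set of tuples, with the relation name as the first component. (This particular encoding is my own choice for illustration; the paper itself is representation-agnostic.)</p>

```python
# Initial state S_a: one classified tuple per row, across both tables.
S_a = {
    ("Suppliers", "Trek", "frame"),
    ("Suppliers", "Campy", "brakes"),
    ("Suppliers", "Trek", "pedals"),
    ("Orders", "frame", 400, "Trek", "8/31/93"),
    ("Orders", "brakes", 150, "Campy", "9/1/93"),
}
```

<p>Set membership semantics automatically enforce the tuple-uniqueness property discussed next.</p>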
<p>It should be noted at this point that none of the tables above sport row identifiers or primary keys. However, since the algebraic definitions assume the pure relational model, every tuple is considered unique in itself, and cannot exist more than once anywhere in the database. This constraint is captured mathematically by the property of sets that requires every element to be unique. Additionally, if one or more of the fields in the tuple constitute a key, then at most one tuple with a particular combination of values for those fields can exist at a time in the state set. For example, in the <code>Orders</code> table, if <code>(Part, Quantity, Supplier)</code> constituted the key (a bad design in reality, but it will suffice for this example), then every tuple in the set \(S_a\), apart from being unique in itself, must also be unique w.r.t. the sub-tuple formed by the above 3 fields.</p>
<p><strong>Aside:</strong> The simplest way to remediate the uniqueness problem, without enforcing uniqueness on tuples of semantic (business-relevant) fields, is to add a field representing a <em>dumb</em> primary key - for example, an <em>Order ID</em> field in the <code>Orders</code> table. This also permits repeating values for semantic sub-tuples in the <code>Orders</code> set.</p>
<h3 id="signed-atom">Signed Atom</h3>
<p>This is an expression of the form \(\pm\langle\text{RelName}\rangle\langle\text{Tuple}\rangle\) and corresponds to an insertion or deletion operation, depending on the \(+\) or the \(-\) prefix respectively. For example, \(+Suppliers(Shimano, brakes)\). This is the smallest unit of modification that can happen to the overall state set.</p>
<p><strong>Aside:</strong> An <em>update</em> operation would require use of two atoms:</p>
<ol>
<li>One for deletion of the old value, and</li>
<li>One for insertion of the new value.</li>
</ol>
<p>As far as the set algebra used for delta arithmetic is concerned, the order of the above two operations does not matter, as long as the <em>consistency constraints</em> defined in the next section are met. Most databases, of course, allow atomic updates.</p>
<p>Since we're using sets to represent a pure relational model, <strong>a signed atom representing insertion of a tuple already present in the current state results in a <code>No-Op</code></strong>. Similarly, <strong>a signed atom representing deletion of a non-existent tuple from a set also results in a <code>No-Op</code></strong>.</p>
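<p>In set terms, the <code>No-Op</code> behaviour falls straight out of set union and difference. A minimal Python sketch, modelling a signed atom as a <code>(sign, tuple)</code> pair (an encoding I am choosing for illustration, not one prescribed by the paper):</p>

```python
def apply_atom(state, atom):
    """Apply one signed atom (sign, tuple) to a state set."""
    sign, tup = atom
    if sign == "+":
        return state | {tup}  # union: re-inserting an existing tuple is a no-op
    return state - {tup}      # difference: deleting a missing tuple is a no-op

S = {("Suppliers", "Trek", "frame")}
# Re-inserting an existing tuple leaves the state unchanged:
assert apply_atom(S, ("+", ("Suppliers", "Trek", "frame"))) == S
# Deleting a non-existent tuple likewise:
assert apply_atom(S, ("-", ("Suppliers", "Campy", "brakes"))) == S
```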
<h3 id="delta">Delta</h3>
<p>A delta is an <strong>unordered</strong>, finite set of signed atoms. For example,</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619245747143/l41fBqSot.png" alt="Screenshot 2021-04-24 115839.png" /></p>
<h4 id="consistent-and-failed-deltas">Consistent and Failed Deltas</h4>
<p>A delta is called <em>consistent</em> if it does not contain both positive and negative versions of the same atom. Otherwise, it is called an inconsistent, or <em>failed</em> delta. For example, the delta defined in (2) is a <strong>consistent</strong> delta, whereas a <strong>failed</strong> delta would look like:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619245797490/hculKE-mX.png" alt="Screenshot 2021-04-24 115943.png" /></p>
<p>Also, the delta being a set subject to the same uniqueness constraints imposed by keys (if present), we have the following: For a relation with two fields denoted by \(R[A, B]\), if \(A\) is the key and there exist two signed atoms \(+R[a, b]\) and \(+R[a, c]\) in a delta, then we must necessarily have \(b = c\) for the delta to be consistent (essentially collapsing them to a single signed atom).</p>
<p><strong>Aside:</strong> One question that arose in my mind when I looked at (3) is why this should be disallowed. After all, the signed atoms within the delta are inverse operations of each other (insertion and deletion), and hence should just cancel each other out, resulting in a <code>No-Op</code> at worst. The paper does not directly address this question; however, as we shall see in a <a class="post-section-overview" href="#whyinversesignedatomsleadtoafaileddelta">later section</a>, this restriction is necessary to allow delta operations to be safely applied in any order (remember, the delta is an <strong>unordered</strong> set).</p>
<h4 id="delta-breakdown-snapshot-fragments">Delta Breakdown - Snapshot Fragments</h4>
<p>The paper defines the following relations for a <strong>consistent</strong> delta \(\Delta\):</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619246055266/16TykLHDZ.png" alt="Screenshot 2021-04-24 120359.png" /></p>
<p>The consistency requirement can now be expressed as:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619246094134/j7ZSYve-5.png" alt="Screenshot 2021-04-24 120439.png" /></p>
<p>For example, \(\Delta_1\) from (2) would be split into the following:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619246157177/hfaFUz-5s.png" alt="Screenshot 2021-04-24 120538.png" /></p>
<p>Although the paper doesn't explicitly name the definitions in (4), I will label them here as <strong>snapshot fragments</strong>. These represent the same <em>type</em> of elements in a set as the snapshot (tuples categorized by relation). The original delta, which represents events or operations, cannot be a direct algebraic operand along with the snapshot, but its derivative snapshot fragments <strong>can</strong>.</p>
<p>Now that we have defined delta components that can directly combine with the snapshot algebraically, we define the <em>application</em> of a delta \(\Delta\) to a snapshot \(S\) as the following equivalent functions:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619246201075/3atMuMCBq.png" alt="Screenshot 2021-04-24 120626.png" /></p>
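<p>A minimal Python transcription of (4)-(6), again modelling signed atoms as <code>(sign, tuple)</code> pairs (my own encoding, chosen for illustration):</p>

```python
def plus(delta):
    """The insertion fragment of a delta -- the positive relation in (4)."""
    return {t for sign, t in delta if sign == "+"}

def minus(delta):
    """The deletion fragment of a delta -- the negative relation in (4)."""
    return {t for sign, t in delta if sign == "-"}

def is_consistent(delta):
    """(5): a consistent delta never both inserts and deletes the same tuple."""
    return not (plus(delta) & minus(delta))

def apply_delta(state, delta):
    """(6): for consistent deltas, both orderings give the same result."""
    assert is_consistent(delta)
    return (state - minus(delta)) | plus(delta)
```

<p>For a consistent delta, <code>(state | plus(delta)) - minus(delta)</code> yields the same set, which is precisely the commutativity that inconsistent deltas like (3) break.</p>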
<h4 id="why-inverse-signed-atoms-lead-to-a-failed-delta">Why Inverse Signed Atoms Lead to a Failed Delta</h4>
<p>A few sections earlier, we saw that a delta of the form illustrated in (3) is a <em>failed</em> delta, but did not elaborate on why it is so. Now that we have the commutative criterion of the delta function as described in (6), we can get a clearer picture.</p>
<p>We will examine two examples of failed deltas:</p>
<ol>
<li>One where the signed atoms represent an element already present in current state, and</li>
<li>One where they represent a new element.</li>
</ol>
<p>Say our current state has the following elements:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619246299788/NK4FRdbF8.png" alt="Screenshot 2021-04-24 120754.png" /></p>
<p>First let us consider case 1. Let \(\Delta_{f} = \left\{
        \begin{array}{l}
            +Suppliers(Trek, frame), \\
            -Suppliers(Trek, frame)
        \end{array}
    \right\}\).</p>
<p>Applying (6), we get:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619246351566/eop4DQuwS.png" alt="Screenshot 2021-04-24 120855.png" /></p>
<p>We also note that 
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619246465346/CBe0g6e5T.png" alt="Screenshot 2021-04-24 121048.png" />
 violating (5). To see if this delta still satisfies the commutative criterion of (6), we need:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619246518295/zhRln38-b.png" alt="Screenshot 2021-04-24 121145.png" /></p>
<p><strong>which is a contradiction</strong>.</p>
<p>Now let us consider case 2. Let \(\Delta_{f'} = \left\{
        \begin{array}{l}
            +Suppliers(Shimano, brakes), \\
            -Suppliers(Shimano, brakes)
        \end{array}
    \right\}\).</p>
<p>Applying (6), we get:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619246737756/PKCiX29Ao.png" alt="Screenshot 2021-04-24 121524.png" /></p>
<p>We also note that 
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619246894616/SvTB3TYno.png" alt="Screenshot 2021-04-24 121654.png" />
violating (5). To see if this delta still satisfies the commutative criterion of (6), we need:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619246928031/Fuux-kOQi.png" alt="Screenshot 2021-04-24 121833.png" /></p>
<p><strong>which is also a contradiction</strong>.</p>
<p>Therefore, we see that in order to satisfy the requirements in (6), we must satisfy (5).</p>
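<p>Both contradictions can be reproduced mechanically with Python sets, playing the two orderings of (6) against each other:</p>

```python
S = {("Suppliers", "Trek", "frame")}

# Case 1: the atom's tuple is already present in the current state.
a = ("Suppliers", "Trek", "frame")
delete_then_insert = (S - {a}) | {a}   # keeps the tuple
insert_then_delete = (S | {a}) - {a}   # drops the tuple
assert delete_then_insert != insert_then_delete  # the two orderings disagree

# Case 2: the atom's tuple is new.
b = ("Suppliers", "Shimano", "brakes")
assert (S - {b}) | {b} != (S | {b}) - {b}  # again, the orderings disagree
```

<p>Since an unordered delta cannot privilege either ordering, the only safe resolution is to declare such deltas failed.</p>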
<h4 id="delta-composition">Delta Composition</h4>
<p>Finally, I examine how the paper formalizes delta composition (or chaining) operations, thereby allowing us to apply a succession of deltas to a given state to arrive at the final state.</p>
<h5 id="smash-composition">Smash Composition</h5>
<p>One type of composition operation described is called a <em>smash</em>, denoted by \("!"\). This is the composition used by most active databases. Algebraically, a smash of two deltas is their union, with conflicts resolved in favour of the second argument. Given:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619247116262/PuQt0-iy2.png" alt="Screenshot 2021-04-24 122123.png" /></p>
<p>Then using (2) and (7) we get:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619247192064/Prjk5wo-3.png" alt="Screenshot 2021-04-24 122254.png" /></p>
<p>The formal definition for the <em>smash</em> operation is given by:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619247237176/kNjm5j9U3.png" alt="Screenshot 2021-04-24 122336.png" /></p>
<p>The reader is encouraged to verify that the above holds true for our example, using (2), (7) and (4), and plugging into (9).</p>
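<p>Here is my reading of (9) in Python, modelling signed atoms as <code>(sign, tuple)</code> pairs (an illustrative encoding of my own, not code from the paper):</p>

```python
def plus(delta):
    return {t for sign, t in delta if sign == "+"}

def minus(delta):
    return {t for sign, t in delta if sign == "-"}

def apply_delta(state, delta):
    return (state - minus(delta)) | plus(delta)

def smash(d1, d2):
    """Smash composition: on conflict, the second delta's atoms win."""
    pos = (plus(d1) - minus(d2)) | plus(d2)
    neg = (minus(d1) - plus(d2)) | minus(d2)
    return {("+", t) for t in pos} | {("-", t) for t in neg}
```

<p>With this in hand, the function-composition property can be checked directly: applying <code>d1</code> then <code>d2</code> to a state gives the same result as applying <code>smash(d1, d2)</code> once.</p>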
<p>The most important characteristic of the <em>smash</em> composition is that it supports function composition, i.e.:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619247280310/1ZBBHW4sr.png" alt="Screenshot 2021-04-24 122424.png" /></p>
<h5 id="merge-composition">Merge Composition</h5>
<p>The second type of composition defined by the paper is the <em>merge</em>. This is less common in real world databases, and hence is not examined in detail. The <em>merge</em> is denoted by \("\&amp;"\) and its formal definition is given by:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619247315294/1WXVbJWs7.png" alt="Screenshot 2021-04-24 122501.png" /></p>
<p>Refer to the paper for a slightly more detailed explanation of this.</p>
<h2 id="summary">Summary</h2>
<p>The authors have used elementary set algebra to come up with some elegant mathematical formalizations for representing database changes over time. Though they have presumed the presence of a pure relational context, there is nothing exceptional being done at the set algebra level - all its rules are strictly followed. This opens up the possibility of using delta arithmetic to analyze any kind of database whose states can be represented as sets of tuples. As we shall see in a future post, this is exactly what Khurana et al. have done when designing their <em>Temporal Graph Index</em>.</p>
]]></content:encoded></item><item><title><![CDATA[Exploring Graph Database Versioning Approaches]]></title><description><![CDATA[Building on my previous post, where I emphasize the need for versioned graph databases, I explore a few of the possible approaches to designing one. This is not a ground-up approach that delves into the design of the database engine itself, but rathe...]]></description><link>https://adityamukho.com/exploring-graph-database-versioning-approaches</link><guid isPermaLink="true">https://adityamukho.com/exploring-graph-database-versioning-approaches</guid><category><![CDATA[graph database]]></category><category><![CDATA[version control]]></category><dc:creator><![CDATA[Aditya Mukhopadhyay]]></dc:creator><pubDate>Thu, 27 Jun 2019 18:30:00 GMT</pubDate><content:encoded><![CDATA[<p>Building on my <a target="_blank" href="/the-case-for-versioned-graph-databases/">previous post</a>, where I emphasize the need for versioned graph databases, I explore a few of the possible approaches to designing one. This is not a ground-up approach that delves into the design of the database engine itself, but rather an <em>overlay</em> approach that can let us run version control atop any standard graph database.</p>
<h2 id="a-naive-approach">A Naive Approach</h2>
<p>Think of a historical key-value store - one where the history of values against every key is retained, and retrieved using a composite of the key and a revision number (absolute or relative). Optionally, to get the <strong>current</strong> value of a key, we may omit the revision number. This is one of the simplest conceptual forms of a historical database. Since many real world graph databases are built on top of an underlying key-value store (<a target="_blank" href="https://www.arangodb.com/docs/stable/architecture-storage-engines.html#rocksdb">ArangoDB</a>, <a target="_blank" href="https://docs.janusgraph.org/latest/storage-backends.html">JanusGraph</a>, <a target="_blank" href="https://www.datastax.com/wp-content/uploads/resources/datasheets/DataStax-Enterprise-Graph-DS.pdf">DataStax Enterprise Graph</a>), we will try to use our historical key-value store to see if it gives our graph database some degree of history retention.</p>
<p>We can naively concoct a graph representation on this database by storing documents as key-value pairs - keys being the unique document ids and values being their respective attribute-value pairs or property lists. For the sake of this thought experiment, we will not bother with how the connections are internally represented, i.e. whether source and destination node ids are stored as edge attributes, or incoming and outgoing edge lists are stored as node attributes, or some other fancy scheme (in real-life graph databases, this is dictated by the design of the underlying implementation).</p>
<h3 id="the-historical-kv-store-a-closer-look">The Historical KV Store - A Closer Look</h3>

<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1619248652851/7jOdzUHUW.png" alt="historical_keyvalue_store.png" />
Figure 1: Evolution of stored data over time in the historical KV store


<p>The figure above shows what the contents of this key-value store might look like for a few sample key-value pairs. Every write operation (create/update/delete) for a given KV pair is associated with a <strong>revision number</strong> and a <strong>timestamp</strong>. The latest value for a key is always immediately available using a simple key lookup. This can happen in <code>O(1)</code> for an in-memory store. This is depicted in the figure by projecting the latest values of existing pairs onto the line <code>Tnow</code>. However, when we want to find out the state of the database at some point of time in the past, things get a little more complicated. The table below shows the values of all the keys depicted in the figure, at different times.</p>
<table id="projection-historical-values">
    <caption>Table 1: Projection of historical values at different times</caption>
    <thead>
        <tr>
            <th></th>
            <th>T<sub>1</sub></th>
            <th>T<sub>2</sub></th>
            <th>T<sub>3</sub></th>
            <th>T<sub>4</sub></th>
            <th>T<sub>now</sub></th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td><b>K<sub>1</sub></b></td>
            <td>V<sub>0</sub> @ (R<sub>0</sub>, T<sub>a</sub>)</td>
            <td>V<sub>0</sub> @ (R<sub>0</sub>, T<sub>a</sub>)</td>
            <td>V<sub>0</sub> @ (R<sub>0</sub>, T<sub>a</sub>)</td>
            <td>V<sub>1</sub> @ (R<sub>1</sub>, T<sub>f</sub>)</td>
            <td>V<sub>1</sub></td>
        </tr>
        <tr>
            <td><b>K<sub>2</sub></b></td>
            <td>--</td>
            <td>V<sub>0</sub> @ (R<sub>0</sub>, T<sub>c</sub>)</td>
            <td>V<sub>1</sub> @ (R<sub>1</sub>, T<sub>e</sub>)</td>
            <td>V<sub>1</sub> @ (R<sub>1</sub>, T<sub>e</sub>)</td>
            <td>D @ (R<sub>2</sub>, T<sub>h</sub>)</td>
        </tr>
        <tr>
            <td><b>K<sub>3</sub></b></td>
            <td>--</td>
            <td>V<sub>0</sub> @ (R<sub>0</sub>, T<sub>d</sub>)</td>
            <td>V<sub>0</sub> @ (R<sub>0</sub>, T<sub>d</sub>)</td>
            <td>D @ (R<sub>1</sub>, T<sub>g</sub>)</td>
            <td>D @ (R<sub>1</sub>, T<sub>g</sub>)</td>
        </tr>
        <tr>
            <td><b>K<sub>4</sub></b></td>
            <td>V<sub>0</sub> @ (R<sub>0</sub>, T<sub>b</sub>)</td>
            <td>V<sub>0</sub> @ (R<sub>0</sub>, T<sub>b</sub>)</td>
            <td>V<sub>0</sub> @ (R<sub>0</sub>, T<sub>b</sub>)</td>
            <td>V<sub>0</sub> @ (R<sub>0</sub>, T<sub>b</sub>)</td>
            <td>V<sub>0</sub></td>
        </tr>
    </tbody>
</table>

<h3 id="temporal-queries">Temporal Queries</h3>
<p>This table was easy to build from the visual depiction in <a class="post-section-overview" href="#historical-keyvalue-store">Figure 1</a>. But how does our historical KV store figure out the state of the DB at, say time <code>T3</code>? Here's what we have to work with:</p>
<ol>
<li>Revision numbers can be relatively referenced (similar to Git), so we can assume these to be integer values starting with 0. Positive numbers reflect revisions w.r.t. the beginning of revision history (<code>R0</code>) and negative numbers w.r.t. the end (<code>Rlatest</code>).</li>
<li>Since the database allows both <strong>key-only</strong> lookups (for latest revisions) and <strong>key+revision</strong> lookups, we can assume that it internally maintains a 2-field hash index <code>(key, revision)</code>, keeping the 2nd level sorted in reverse order of the revision number (to facilitate retrieving the latest version for key-only lookups).</li>
</ol>
<p>Therefore, to find the value of, say, <code>K2</code> at time <code>T3</code>, we need to perform the following steps:</p>
<ol>
<li><p>Determine if <code>T3</code> is closer to <code>T0</code> or <code>Tnow</code>. Based on this, we will start from the oldest or the newest revision of <code>K2</code> respectively.</p>
</li>
<li><p>If <code>T3</code> is closer to <code>T0</code>:</p>
<ol type="i">
<li>Set <code>n</code> to <code>0</code>.</li>
<li>Look up <code>K2 @ Rn</code> and note its timestamp <code>T</code>.</li>
<li>If <code>T &gt; T3</code> then return <code>K2 @ R(n-1)</code> (if <code>n = 0</code> then return <code>null</code>).</li>
<li>Else increase <code>n</code> by <code>1</code> and go to step (ii).</li>
</ol>
</li>
<li><p>Else if <code>T3</code> is closer to <code>Tnow</code>:</p>
<ol type="i">
<li>Get <code>K2 @ Tnow</code>. Note its revision number as <code>N</code> and its timestamp as <code>T</code>.</li>
<li>Set <code>n</code> to <code>N</code>.</li>
<li>If <code>T &lt; T3</code> then return <code>K2 @ Rn</code>.</li>
<li>Else decrease <code>n</code> by <code>1</code>.</li>
<li>Get <code>K2 @ Rn</code>, note its timestamp <code>T</code> and go to step (iii).</li>
</ol>
</li>
</ol>
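<p>Both lookup strategies can be sketched in Python. The in-memory layout below is a stand-in of my own devising for the store's real indexes: revisions per key are kept as a chronologically ordered list of <code>(timestamp, value)</code> pairs, with <code>None</code> marking a deletion.</p>

```python
import bisect

# history[key] -> list of (timestamp, value) pairs, one per revision,
# in chronological (= revision) order. None models a delete marker.
history = {
    "K2": [(2, "V0"), (3, "V1"), (8, None)],  # created, updated, deleted
}

def value_at_scan(history, key, t):
    """Linear forward scan from R0 -- the O(N(K)) branch of the algorithm."""
    result = None
    for ts, value in history.get(key, []):
        if ts > t:
            break
        result = value
    return result

def value_at_indexed(history, key, t):
    """Binary search on timestamps, standing in for the (key, timestamp)
    skiplist index -- an O(log N(K)) lookup."""
    revs = history.get(key, [])
    i = bisect.bisect_right([ts for ts, _ in revs], t)
    return revs[i - 1][1] if i else None
```

<p>Note that this sketch only covers the forward-scan branch of the stepwise algorithm above; the reverse scan from <code>Tnow</code> is symmetric. It also glosses over the ambiguity between "deleted" and "never existed", both of which surface here as <code>None</code>.</p>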
<p>This is evidently far more complex than a simple key or <em>key+revision</em> lookup, and no longer <code>O(1)</code>. It is now <code>O(N(K))</code>, where <code>N(K)</code> is the number of revisions for key <code>K</code>. This highly inefficient lookup can be sped up drastically by additionally maintaining a 2-field skiplist index (see below) on <code>(key, timestamp)</code>, keeping the 2nd level sorted in reverse chronological order. Even so, it remains at best an <code>O(log(N(K)))</code> operation.</p>
<blockquote>
<p>A <a target="_blank" href="https://en.wikipedia.org/wiki/Skip_list">skiplist</a> index supports both equality and range queries. So we can ask it to return all values for key = <code>K</code> and timestamp &lt; <code>T</code>.</p>
</blockquote>
<h2 id="conclusion">Conclusion</h2>
<p>We see that while this design provides a history of every individual document out of the box, and the history can be queried by revision number or point-in-time,</p>
<ol>
<li>It does not readily expose the structure of the graph as a whole (or subgraphs, or <code>k-hop</code> neighborhoods, all of which might be of interest to a network analyst),</li>
<li>Fetching a past state of the graph is more expensive than fetching its current state (this should be an acceptable trade-off for most <a target="_blank" href="https://en.wikipedia.org/wiki/Online_transaction_processing">OLTP</a> scenarios),</li>
<li>The revision-number based historical KV store does not offer any intrinsic benefits for time-based lookups, and</li>
<li>There is potential for significant storage and write-bandwidth overhead when storing multiple versions of large documents, since each new version redundantly stores the portions of the document that did not change.</li>
</ol>
<p>We see that this approach has several drawbacks which make it unsuitable for an efficient historical graph data store. We will explore other, hopefully better approaches in future posts.</p>
]]></content:encoded></item><item><title><![CDATA[The Case for Versioned Graph Databases]]></title><description><![CDATA[Why Graph Databases?
Graph databases have become ubiquitous over the years due to their incredible representational and querying capabilities, when it comes to highly interconnected data. Wherever data has an inherent networked structure, graph datab...]]></description><link>https://adityamukho.com/the-case-for-versioned-graph-databases</link><guid isPermaLink="true">https://adityamukho.com/the-case-for-versioned-graph-databases</guid><category><![CDATA[graph database]]></category><category><![CDATA[version control]]></category><dc:creator><![CDATA[Aditya Mukhopadhyay]]></dc:creator><pubDate>Thu, 27 Jun 2019 18:30:00 GMT</pubDate><content:encoded><![CDATA[<h2 id="why-graph-databases">Why Graph Databases?</h2>
<p>Graph databases have become ubiquitous over the years due to their incredible representational and querying capabilities when it comes to highly interconnected data. Wherever data has an inherent networked structure, graph databases fare better at storing and querying that data than other NoSQL databases and relational databases, because they naturally persist the underlying connected structure. This allows for traversal semantics in declarative graph query languages, and also better performance than SQL, especially for deep traversals. Additionally, they often help unravel emergent network topologies in legacy data that had not previously been mined for such structures. At the very least, they make the process a lot less tedious.</p>
<p>Most real-world network data intrinsically lends itself to graph representations, and hence can be modeled in graph databases. Examples include, to name a few:</p>
<ol>
<li>Wired and wireless computer networks and cellular networks</li>
<li>Road, rail, air and shipping routes</li>
<li>Supply and distribution chains</li>
<li>Biological and artificial neural networks</li>
<li>Complex chemical and nuclear reaction chains</li>
<li>Social networks</li>
<li>Software package and library dependency trees, and many more</li>
</ol>
<h2 id="why-versioned-graphs">Why Versioned Graphs?</h2>
<p>In addition to reaping the benefits of living in graph databases, many real-world applications also stand to benefit from network evolution models, i.e. a record of changes to a network over time: for example, analyzing railway track utilization efficiency as a function of signal array timing, or simulating nuclide concentration changes over time in a nuclear fission reactor. However, among the prominent mainstream graph databases freely available at the time of this writing, I have not come across any that offers built-in revision tracking (meaning older versions of data are retained for future retrieval).</p>
<p>Particularly for graph databases, the concept of revisions applies not only to individual nodes and edges, but also to the structure of the graph as a whole: it should be relatively easy to store and retrieve not only individual document (node/edge) histories, but also the structural history of the graph or a portion of it. This is a key difference between a hypothetical <em>versioned or historical graph database</em> and a <em>general purpose event store</em> (see below), which is usually tuned for document histories but not for structural history.</p>
<blockquote>
<p>An event store is a database that records entity write operations (creates/updates/deletes) as a series of deltas wrapped in events. Each delta is the difference between the contents of the updated entity and its previous version. It is part of an <em>event payload</em>, where the event represents the particular write operation (create/update/delete) that occurred. Thus, deltas encode the entire write history of the entity.</p>
</blockquote>
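<p>As a rough illustration of the delta-per-write idea described above (the class and method names here are my own, not taken from any particular event-store product):</p>
<pre><code class="language-python">class EventStore:
    """Toy event store: each write is recorded as a delta event."""

    def __init__(self):
        self.events = {}  # entity_id -> ordered list of (op, delta) events

    def write(self, entity_id, op, delta):
        # op is one of "create"/"update"/"delete"; delta carries only
        # the fields that changed relative to the previous version.
        self.events.setdefault(entity_id, []).append((op, delta))

    def replay(self, entity_id, upto=None):
        # Rebuild the entity by applying its deltas in order; stopping
        # early (upto) yields an older version of the entity.
        state = None
        for op, delta in self.events.get(entity_id, [])[:upto]:
            if op == "create":
                state = dict(delta)
            elif op == "update":
                state.update(delta)
            elif op == "delete":
                state = None
        return state

store = EventStore()
store.write("node:1", "create", {"name": "Alice", "role": "admin"})
store.write("node:1", "update", {"role": "user"})
print(store.replay("node:1"))          # latest version
print(store.replay("node:1", upto=1))  # version after the first event
</code></pre>
<p>Note how the deltas encode the full write history: truncating the replay at any event index recovers the entity as it stood at that point.</p>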
<p>I therefore submit that there is a need for a practical historical graph database with the following minimal set of characteristics:</p>
<ol>
<li>A mechanism for efficiently recording individual document (node/edge) writes (creates/updates/deletes) in such a way that they can be rewound and replayed.</li>
<li>An internal storage architecture that not only maintains the current structure of the graph, but also allows its structure at any point in time in the past to be quickly rebuilt and retrieved. This could, optionally, be optimized to retrieve recent structures faster than older ones.</li>
<li>An efficient query engine that can traverse current or past graph structures to retrieve subgraphs or <code>k-hop</code> neighborhoods of specified nodes. In the case of historical traversals, it should be optimized to rebuild only the relevant portions of the graph, where feasible.</li>
</ol>
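<p>To make these requirements concrete, here is a minimal sketch of my own (an illustrative design, not any existing engine's API) that logs timestamped edge writes and rebuilds the graph, or a <code>k-hop</code> neighborhood, as of a past instant:</p>
<pre><code class="language-python">from collections import defaultdict, deque

class HistoricalGraph:
    """Toy historical graph: edge writes are logged with timestamps
    so the structure at any past instant can be rebuilt."""

    def __init__(self):
        self.log = []  # (timestamp, op, source, target); op in {"add", "remove"}

    def add_edge(self, ts, u, v):
        self.log.append((ts, "add", u, v))

    def remove_edge(self, ts, u, v):
        self.log.append((ts, "remove", u, v))

    def snapshot(self, at):
        # Replay all edge writes up to (and including) time `at`.
        adj = defaultdict(set)
        for ts, op, u, v in sorted(self.log):
            if ts &gt; at:
                break
            if op == "add":
                adj[u].add(v)
            else:
                adj[u].discard(v)
        return adj

    def k_hop(self, at, start, k):
        # Breadth-first traversal over the reconstructed snapshot.
        adj = self.snapshot(at)
        seen, frontier = {start}, deque([(start, 0)])
        while frontier:
            node, depth = frontier.popleft()
            if depth == k:
                continue
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    frontier.append((nbr, depth + 1))
        return seen

g = HistoricalGraph()
g.add_edge(1, "a", "b")
g.add_edge(2, "b", "c")
g.remove_edge(3, "a", "b")
print(g.k_hop(at=2, start="a", k=2))  # {'a', 'b', 'c'}
print(g.k_hop(at=3, start="a", k=2))  # {'a'}
</code></pre>
<p>A real engine would not replay the full log on every query; per point 2 above, it could checkpoint periodic snapshots and replay only the writes since the nearest checkpoint, making recent structures cheaper to retrieve than older ones.</p>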
<h2 id="the-current-state-of-historical-graph-databases">The Current State of Historical Graph Databases</h2>
<p>There is a general consensus in the computing and scientific research communities on the need for a historical graph database, and to the best of my knowledge, research has been carried out along two primary forks:</p>
<ol>
<li><p>Graph databases with built-in revision support at the DB engine level, i.e. they are designed from the ground up to support revisions:</p>
<p> i. Vijitbenjaronk et al. <a target="_blank" href="https://ieeexplore.ieee.org/document/8258092">Scalable time-versioning support for property graph databases</a>.</p>
</li>
<li><p>Database designs, supplemented by external application/service layers, that provide a <em>revision tracking facade</em> on top of conventional static (per the database taxonomy proposed by Snodgrass et al. in their 1985 ACM SIGMOD paper titled <a target="_blank" href="https://www.researchgate.net/publication/221212735_A_Taxonomy_of_Time_in_Databases">A Taxonomy of Time in Databases</a>) graph database engines:</p>
<p> i. Khurana et al. <a target="_blank" href="https://www.researchgate.net/publication/229556866_Efficient_Snapshot_Retrieval_over_Historical_Graph_Data">Efficient Snapshot Retrieval over Historical Graph Data</a>,</p>
<p> ii. Khurana et al. <a target="_blank" href="https://www.researchgate.net/publication/282403421_Storing_and_Analyzing_Historical_Graph_Data_at_Scale">Storing and Analyzing Historical Graph Data at Scale</a>.</p>
</li>
</ol>
<p>A quick search would turn up many more publications and implementations, but I believe they would all fall under one of the above two categories.</p>
]]></content:encoded></item></channel></rss>