#### Introduction * GraphLab abstraction exposes asynchronous, dynamic, graph-parallel computation model in the shared-memory setting. * This paper extends the abstraction to the distributed setting. * [Link](http://vldb.org/pvldb/vol5/p716_yuchenglow_vldb2012.pdf) to the paper. #### Characteristics of MLDM (Machine Learning and Data Mining) * Graph Structured Computation * Sometimes computation requires modeling dependencies between data. * eg modeling dependencies between similar users for the recommendation use case. * Asynchronous Iterative Computation * In many cases, asynchronous procedures outperform synchronous ones. * eg linear systems, belief propagation, stochastic optimization etc. * Dynamic Computation * Iterative computation converges asymmetrically. * Convergence can be accelerated by dynamic scheduling. * eg do not update parameters that have already converged. * Serializability * Ensuring that all parallel executions have an equivalent serial execution is desirable for both correctness and faster convergence. #### GraphLab Abstraction ### Data Graph * Store program state as a directed graph. * **G = (V,E,D)** where D is the user defined data (model parameters, algorithm state, statistical data etc). * The graph data **D** is mutable but the state of the graph **(V,E)** is immutable. #### Update Function * Stateless procedure that modifies the data within the scope of a vertex and schedules the execution of the *update* function on other vertices. * **Scope** of a vertex (S) - data corresponding to the vertex, its edges and its adjacent vertices. * **update: $f (v, S_v) -> (S_v, T)$** where T is the set of vertices where *update* function is scheduled to be invoked. * Scheduling of computation id decoupled from movement of data and no message passing is required between vertices. #### Execution Model * Input to the model is G and T, the initial set of vertices to be updated. * During each step, a vertex is extracted from T, updated and a set of vertices is added to T (for future computation). * Vertices in T can be executed in any order with the only constraint that all vertices be eventually executed. #### Sync Operation * Sync operation runs in the background to maintain global aggregates concurrently. * These global values are read by *update* function and written by the sync operation. #### Consistency Models * Full consistency * Full read/write access in the *scope*. * Scope of concurrently updating vertices cannot overlap. * Edge consistency * Read/write access on the vertex and the adjacent edges but only read access to adjacent vertices. * Slightly overlapping scope. * Vertex consistency * Write access to the vertex and read access to adjacent edges and vertices. * All vertices can run update function simultaneously. ### Distributed Data Graph * Two-phase partitioning process for load balancing the graph on arbitrary cluster size. * In the first phase, partition the graph into k parts (k >> number of machines). * Each part, called **atom**, is a file of graph generating commands. * Atom also stores information about **ghosts** (set of vertices and edges adjacent to the partition boundary). * Atom index file contains connectivity structure and file location for the k atoms as a meta-graph. * In the second phase, this meta-graph is partitioned over the physical machines. ### Distributed GraphLab Engines #### Chromatic Engine * A vertex coloring (no adjacent vertices have the same color) is constructed to serialize parallel execution of dependent tasks (in our case, vertices in the graph). * For edge consistency model, execute all vertices of the same color before going to next color and run sync operation between color steps. * Changes to ghost vertices and edges are communicated asynchronously as they are made. * Vertex consistency is trivial - assign same color to all the vertices. * For full consistency, construct second-order vertex coloring (no vertex shares the same color as any of its distance two neighbors) #### Distributed Locking Engine * Associate reader-writer locks on each vertex. * Each machine can update only the local vertices. * Optimisations * Ghosting system uses caching to eliminate wait on remote, unchanged data. * Lock request and synchronization are pipelined to hide network latency. * Each machine maintains a pipeline of vertices for which locks have been requested but not granted. * A vertex is executed once lock acquisition and data synchronization are complete. * Nonblocking reader-writer locks, that work through callback functions, are used. ### Fault Tolerance * Distributed checkpointing via two modes: * Synchronous checkpointing * Suspend computation to save all modified data since the last checkpoint. * Asynchronous checkpointing based on Chandy-Lamport snapshot algorithm. * The snapshot step becomes an *update* function in the GraphLab abstraction. * Better than synchronous checkpointing. ### System Design * One instance of GraphLab runs on each machine. * These processes are symmetric and communicate via RPC. * The first process additionally acts as the master and computes placement of atoms based on atom index. * Each process maintains a local scheduler (for its vertices) and a cache to access remote data. * Distributed consensus algorithm to decide when all the schedulers are empty. ### Observations * The biggest strength of the paper are its extensive experiments. * GraphLab benefits from the use of background asynchronous communication and pipelined locking but its communication layer is not as efficient as MPI's communication layer.