Course Outline

Introduction

  • Graph databases and libraries

Understanding Graph Data

  • The graph as a data structure
  • Using vertices (dots) and edges (lines) to model real-world scenarios

Using Graph Databases to Model, Persist and Process Graph Data

  • Local graph algorithms/traversals
  • neo4j, OrientDB and Titan

Exercise: Modeling Graph Data with neo4j

  • Whiteboard data modeling

Beyond Graph Databases: Graph Computing

  • Understanding the property graph
  • Graph modeling different scenarios (software graph, discussion graph, concept graph)

Solving Real-World Problems with Traversals

  • Algorithmic/directed walk over the graph
  • Determining circular dependencies
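
As a rough illustration of the circular-dependency topic above, the following is a minimal sketch of a directed depth-first walk that reports one cycle if any exists. It is plain Python rather than any particular graph database or toolkit from this outline. The function name `find_cycle` and the sample dependency map are hypothetical, chosen only for this example.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of nodes, or None if acyclic.

    `graph` maps each node to the nodes it depends on (hypothetical format).
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}
    path: List[str] = []  # nodes on the current walk, in visit order

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path: a cycle closes here
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical software graph: app -> lib -> core -> app
deps = {"app": ["lib"], "lib": ["core"], "core": ["app"]}
print(find_cycle(deps))
```

The walk is recursive, so very deep graphs would need an iterative variant; for whiteboard-sized examples the recursive form is easier to follow.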

Case Study: Ranking Discussion Contributors

  • Ranking by number and depth of contributed discussions
  • A note on sentiment and concept analysis

Graph Computing: Local, In-Memory Graph Toolkits

  • Graph analysis and visualization
  • JUNG, NetworkX, and iGraph

Exercise: Modeling Graph Data with NetworkX

  • Using NetworkX to model a complex system

Graph Computing: Batch Processing Graph Frameworks

  • Leveraging Hadoop for storage (HDFS) and processing (MapReduce)
  • Overview of iterative algorithms
  • Hama, Giraph, and GraphLab

Graph Computing: Graph-Parallel Computation

  • Unifying ETL, exploratory analysis, and iterative graph computation within a single system
  • GraphX

Setup and Installation

  • Hadoop and Spark

GraphX Operators

  • Property, structural, join, neighborhood aggregation, caching and uncaching

Iterating with the Pregel API

  • Passing arguments for sending, receiving and computing

Building a Graph

  • Using vertices and edges in an RDD or on disk

Designing Scalable Algorithms

  • GraphX Optimization

Accessing Additional Algorithms

  • PageRank, Connected Components, Triangle Counting

Exercise: PageRank and Top Users

  • Building and processing graph data using text files as input
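
To give a feel for this exercise, here is a minimal sketch of building a graph from text lines and ranking users with a basic PageRank power iteration. It is plain Python for illustration only. The exercise itself presumably uses GraphX, and nothing here reflects the GraphX API. The input format (one `follower followed` pair per line), the function names, and the sample users are all assumptions made for this example.

```python
from collections import defaultdict


def parse_edges(lines):
    """Parse 'src dst' pairs, skipping blank lines and '#' comments."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        edges.append((src, dst))
    return edges


def pagerank(edges, damping=0.85, iterations=50):
    """Basic power-iteration PageRank over a directed edge list."""
    out_links = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        out_links[src].append(dst)
        nodes.update((src, dst))
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly
        dangling = sum(rank[v] for v in nodes if v not in out_links)
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in out_links.items():
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


def top_users(rank, k=3):
    """Return the k highest-ranked users, best first."""
    return sorted(rank, key=rank.get, reverse=True)[:k]


# Hypothetical input, as it might appear in a text file
lines = ["# follower followed", "alice bob", "carol bob", "bob alice"]
ranks = pagerank(parse_edges(lines))
print(top_users(ranks, k=2))
```

In practice the lines would come from a file, for example `open(path)` passed directly to `parse_edges`. A fixed iteration count keeps the sketch short; a convergence threshold is the more common stopping rule.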

Deploying to Production

Closing Remarks

Requirements

  • An understanding of Java programming and frameworks
  • A general understanding of Python is helpful but not required
  • A general understanding of database concepts

Audience

  • Developers

Duration

  • 28 hours

Dates are subject to availability and take place between 09:30 and 16:30.
Open Training Courses require 5+ participants.
