PyTorch-BigGraph: A Large-scale Graph Embedding System

Adam Lerer 1, Ledell Wu 1, Jiajun Shen 1, Timothee Lacroix 1, Luca Wehrstedt 1, Abhijit Bose 1, Alex Peysakhovich 1

ABSTRACT

Graph embedding methods produce unsupervised node features from graphs that can then be used for a variety of machine learning tasks. Modern graphs, particularly in industrial applications, contain billions of nodes and trillions of edges, which exceeds the capability of existing embedding systems. We present PyTorch-BigGraph (PBG), an embedding system that incorporates several modifications to traditional multi-relation embedding systems that allow it to scale to graphs with billions of nodes and trillions of edges. PBG uses graph partitioning to train arbitrarily large embeddings on either a single machine or in a distributed environment. We demonstrate comparable performance with existing embedding systems on common benchmarks, while allowing for scaling to arbitrarily large graphs and parallelization on multiple machines. We train and evaluate embeddings on several large social network graphs as well as the full Freebase dataset, which contains over 100 million nodes and 2 billion edges.

1 INTRODUCTION

Graph structured data is a common input to a variety of machine learning tasks (Wu et al., 2017; Cook & Holder, 2006; Nickel et al., 2016a; Hamilton et al., 2017b). Working with graph data directly is difficult, so a common technique is to use graph embedding methods to create vector representations for each node so that distances between these vectors predict the occurrence of edges in the graph. Graph embeddings have been shown to serve as useful features for downstream tasks such as recommender systems in e-commerce (Wang et al., 2018), link prediction in social media (Perozzi et al., 2014), and predicting drug interactions and characterizing protein-protein networks (Zitnik & Leskovec, 2017).

Graph data is common at modern web companies and poses an extra challenge to standard embedding methods: scale. For example, the Facebook graph includes over two billion user nodes and over a trillion edges representing friendships, likes, posts and other connections (Ching et al., 2015). The graph of users and products at Alibaba also consists of more than one billion users and two billion items (Wang et al., 2018). At Pinterest, the user-to-item graph includes over 2 billion entities and over 17 billion edges (Ying et al., 2018).

There are two main challenges for embedding graphs of this size. First, an embedding system must be fast enough to embed graphs with 10^11 - 10^12 edges in a reasonable time. Second, a model with two billion nodes and even 32 embedding parameters per node (expressed as floats) would require 800GB of memory just to store its parameters, so many standard methods exceed the memory capacity of typical commodity servers.

We present PyTorch-BigGraph (PBG), an embedding system that incorporates several modifications to standard models. The contribution of PBG is to scale to graphs with billions of nodes and trillions of edges. Important components of PBG are:

• A block decomposition of the adjacency matrix into N buckets, training on the edges from one bucket at a time. PBG then either swaps embeddings from each partition to disk to reduce memory usage, or performs distributed execution across multiple machines.

• A distributed execution model that leverages the block decomposition for the large parameter matrices, as well as a parameter server architecture for global parameters and feature embeddings for featurized nodes.

• Efficient negative sampling for nodes that samples negative nodes both uniformly and from the data, and reuses negatives within a batch to reduce memory bandwidth.

• Support for multi-entity, multi-relation graphs with per-relation configuration options such as edge weight and choice of relation operator.

1 Facebook AI Research, New York, NY, USA. Correspondence to: Adam Lerer.

Proceedings of the 2nd SysML Conference, Palo Alto, CA, USA, 2019. Copyright 2019 by the author(s).
We evaluate PBG on the Freebase, LiveJournal and YouTube graphs and show that it matches the performance of existing embedding systems.

We also report results on larger graphs. We construct an embedding of the full Freebase knowledge graph (121 million entities, 2.4 billion edges), which we release publicly with this paper. Partitioning of the Freebase graph reduces memory consumption by 88% with only a small degradation in the embedding quality, and distributed execution on 8 machines decreases training time by a factor of 4. We also perform experiments on a large Twitter graph showing similar results with near-linear scaling.

PBG has been released as an open source project at https://github.com/facebookresearch/PyTorch-BigGraph. It is written entirely in PyTorch (Paszke et al., 2017) with no external dependencies or custom operators.

2 RELATED WORK

Many types of models have been developed for multi-relation graphs (Bordes et al., 2011; 2013; Nickel et al., 2011; Trouillon et al., 2016). Typically these models have been used in the context of entity representations in knowledge bases (e.g. Freebase or WordNet). Entities are given a base vector, these vectors are transformed by a learned function for each relation, and the existence of edges is predicted by some distance measure in the new space. More recent work by Wu et al. proposes modeling some entities as bags of other entities (rather than giving them explicit embeddings). PBG borrows many insights on loss functions and transformations from this literature.

There are significant engineering challenges to scaling graph embedding models. Proposed approaches in the literature include multi-level methods (Liang et al., 2018), distributed embedding systems (Ordentlich et al., 2016; Shazeer et al., 2016), as well as specialized methods for standard algorithms such as SVD and k-means on large graphs (Ching et al., 2015). Gains from large embedding systems have been documented in e-commerce (Wang et al., 2018) and other applications.

There is an extensive literature on distributional semantics in natural language processing. A key breakthrough in this literature came from algorithms such as word2vec, which allowed word embedding methods to scale to larger corpora (Mikolov et al., 2013). Recent work has shown that there is economic value from ingesting even larger data sets using distributed word2vec systems (Ordentlich et al., 2016).

There is substantial prior work on scalable parallel algorithms for training machine learning models (Dean et al., 2012). Highly related to PBG is work on scaling various forms of matrix factorization (Gupta et al., 1997; Gemulla et al., 2011). Matrix factorization is closely related to embeddings, and has had widespread success in recommender systems (Koren et al., 2009).

Recent work proposes to construct embeddings by using graph convolutional neural networks (GCNs, Kipf & Welling 2016). These methods have shown success when applied to problems at large-scale web companies (Hamilton et al., 2017a; Ying et al., 2018). The problem studied by the GCN is different than the one solved by PBG (mostly in that GCNs are typically applied to graphs where the nodes are already featurized). Combining ideas from graph embedding and GCN models is an interesting future direction both for theory and applications.

3 MULTI-RELATION EMBEDDINGS

3.1 Model

A multi-relation graph is a directed graph G = (V, R, E) where V are the nodes (aka entities), R is a set of relations, and E is a set of edges, where a generic element e = (s, r, d) (source, relation, destination) has s, d ∈ V and r ∈ R. We also discuss graphs that have multiple entity types. Such graphs have a set of entity types and a mapping from nodes to entity types, and each relation specifies a single entity type for source and destination nodes for all edges of that relation.

We will represent each entity and relation type with a vector of parameters. We will denote this vector as θ. A multi-relation graph embedding uses a score function f(θ_s, θ_r, θ_d) that produces a score for each edge, attempting to maximize f(θ_s, θ_r, θ_d) for any (s, r, d) ∈ E and minimize it for (s, r, d) ∉ E.

PBG considers scoring functions between a transformed version of an edge's source and destination entities' vectors (θ_s, θ_d):

    f(θ_s, θ_r, θ_d) = sim( g_(s)(θ_s, θ_r), g_(d)(θ_d, θ_r) )

where θ_r corresponds to parameters of the relation-specific transformation operator. Using a factorized scoring function produces embeddings where the (transformed) similarity between node embeddings has semantic meaning.

PBG uses dot product or cosine similarity scoring functions, and a choice of relation operator g, which includes linear transformation, translation, and complex multiplication. This combination of scoring functions and relation operators allows PBG to train RESCAL, DistMult, TransE, and ComplEx models (Nickel et al., 2011; Yang et al., 2014; Bordes et al., 2013; Trouillon et al., 2016) [1]. A subset of relation types may use the identity relation, so that the untransformed entity embeddings predict edges of this relation.

[1] For knowledge base datasets, state-of-the-art performance is achieved with ComplEx embeddings, but this may not generalize to all graphs. On small knowledge graphs, a general linear transform (RESCAL) does not perform as well as transformations with fewer parameters such as translation (as well as transformations that can be represented in the RESCAL model) because the relation operators overfit (Nickel et al., 2016b). However, we are interested in web interaction graphs, which have a very small number of relations relative to entities, so the relation parameters do not contribute substantially to model size, nor are they prone to overfitting.
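To make the factorized form above concrete, the following minimal PyTorch sketch scores a batch of edges with a configurable relation operator and similarity function. It is an illustration of the equation, not PBG's actual implementation; the function names and the choice to apply the operator only to the destination side are assumptions of the sketch.

```python
import torch
import torch.nn.functional as F

def translation(x, theta_r):
    # TransE-style operator: g(x, theta_r) = x + theta_r
    return x + theta_r

def diagonal(x, theta_r):
    # DistMult-style operator: g(x, theta_r) = x * theta_r (elementwise)
    return x * theta_r

def score_edges(theta_s, theta_r, theta_d, g=translation, sim="cos"):
    """Factorized score: sim(theta_s, g(theta_d, theta_r)).

    In this sketch the relation operator is applied to the destination
    side only; theta_s, theta_d are (n, d), theta_r is (d,).
    """
    rhs = g(theta_d, theta_r)
    if sim == "cos":
        return F.cosine_similarity(theta_s, rhs, dim=-1)
    return (theta_s * rhs).sum(dim=-1)  # dot product

# Score a small batch of edges sharing one relation type.
d = 100
src, dst, rel = torch.randn(8, d), torch.randn(8, d), torch.randn(d)
print(score_edges(src, rel, dst).shape)  # torch.Size([8])
```

Swapping the operator and similarity reproduces the different model families listed in the table below.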
Figure 1. The PBG partitioning scheme for large graphs. Left: nodes are divided into P partitions that are sized to fit in memory. Edges are divided into buckets based on the partition of their source and destination nodes. In distributed mode, multiple buckets with non-overlapping partitions can be executed in parallel (red squares). Center: entity types with small cardinality do not have to be partitioned; if all entity types used for tail nodes are unpartitioned, then edges can be divided into P buckets based only on source node partitions. Right: the 'inside-out' bucket order guarantees that buckets have at least one previously-trained embedding partition. Empirically, this ordering produces better embeddings than other alternatives (or random ordering).

Model      g(x, θ_r)    sim(a, b)
RESCAL     A_r x        <a, b>
TransE     x + θ_r      cos(a, b)
DistMult   x ⊙ θ_r      <a, b>
ComplEx    x ⊙ θ_r      Re{<a, b>}

We consider sparse graphs, so the input to PBG is a list of positive-labeled (existing) edges. Negative edges are constructed by sampling. In PBG, negative samples are generated by corrupting positive edges by sampling either a new source or a new destination for each existing edge (Bordes et al., 2013).

Because edge distributions in real world graphs are heavy tailed, the choice of how to sample nodes to construct negative examples can affect model quality (Mikolov et al., 2013). On one hand, if we sample negatives strictly according to the data distribution, there is no penalty for the model predicting high scores for edges with rare nodes. On the other hand, if we sample negatives uniformly, the model can perform very well (especially in large graphs) by simply scoring edges proportional to their source and destination node frequency in the dataset. Both of these results are undesirable, so in PBG we sample a fraction α of negatives according to their prevalence in the training data and (1 − α) of them uniformly at random. By default PBG uses α = 0.5.

In multi-entity graphs, negatives are only sampled from the correct entity type for an edge's relation. Thus, in our model, the score for an 'invalid' edge (wrong entity types) is undefined. The approach of using entity types has been studied before in the context of knowledge graphs (Krompaß et al., 2015), but we found it to be particularly important in graphs that have entity types with highly unbalanced numbers of nodes, e.g. 1 billion users vs. 1 million products. With uniform negative sampling over all nodes, the loss would be dominated by user negative nodes and would not optimize for ranking between user-product edges.

PBG optimizes a margin-based ranking objective between each edge e in the training data and a set of edges e′ constructed by corrupting e with either a sampled source or destination node (but not both):

    L = Σ_{e ∈ G} Σ_{e′ ∈ S′_e} max(f(e) − f(e′) + λ, 0)

where λ is a margin hyperparameter and

    S′_e = {(s′, r, d) | s′ ∈ V} ∪ {(s, r, d′) | d′ ∈ V}.

Logistic and softmax loss functions may also be used instead of a ranking objective in order to reproduce certain graph embedding models (e.g. Trouillon et al. 2016).

Model updates to the embeddings and relation parameters are performed via minibatch stochastic gradient descent (SGD). We use the Adagrad optimizer, and sum the accumulated gradient G over each embedding vector to reduce memory usage on large graphs (Duchi et al., 2011).
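The following toy sketch illustrates the two ingredients above: mixing data-prevalence and uniform negative sampling with a fraction α, and a margin-based hinge over corrupted edges. It is not PBG's code; the hinge is written for similarity scores where higher is better, and the function names and the way prevalence sampling is approximated (resampling observed destinations) are assumptions of the sketch.

```python
import torch

def sample_negative_nodes(pos_dst, num_nodes, n_neg, alpha=0.5):
    """Sample corrupted destination nodes: a fraction alpha according to
    their prevalence in the training data (approximated here by resampling
    observed destinations), the rest uniformly over all nodes."""
    n_data = int(alpha * n_neg)
    data_neg = pos_dst[torch.randint(len(pos_dst), (len(pos_dst), n_data))]
    unif_neg = torch.randint(num_nodes, (len(pos_dst), n_neg - n_data))
    return torch.cat([data_neg, unif_neg], dim=1)       # (B, n_neg)

def margin_ranking_loss(pos_scores, neg_scores, margin=0.1):
    # Hinge for similarity scores: penalize negatives that score within
    # `margin` of (or above) their positive edge.
    return torch.clamp(neg_scores - pos_scores.unsqueeze(1) + margin, min=0).sum()

# Toy usage: 4 positive edges over 10 nodes, 6 negatives each.
pos_dst = torch.tensor([1, 2, 3, 4])
negs = sample_negative_nodes(pos_dst, num_nodes=10, n_neg=6)
pos_scores, neg_scores = torch.randn(4), torch.randn(4, 6)
print(margin_ranking_loss(pos_scores, neg_scores))
```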
4 TRAINING AT SCALE

PBG is designed to operate on arbitrarily large graphs, running on either a single machine or distributed across multiple machines. In either case, training occurs on a number of CPU threads equal to the number of machine cores, with no explicit synchronization between cores as described in (Recht et al., 2011).

4.1 Partitioning of Entities and Edges

PBG uses a partitioning scheme to support models that are too large to fit in memory on a single machine. This partitioning also allows for distributed training of the model.

Each entity type in G can be either partitioned or remain unpartitioned. Partitioned entities are split into P parts. P is chosen such that each part fits into memory or to support the desired level of parallelism for execution.

After entities are partitioned, edges are divided into buckets based on their source and destination entities' partitions. For example, if an edge has a source in partition p_1 and destination in partition p_2 then it is placed into bucket (p_1, p_2). This creates P^2 buckets when both source and destination entity types are partitioned and P buckets if only source (or destination) entities are partitioned.

Each epoch of training iterates through each of the edge buckets. For edge bucket (p_i, p_j), source and destination partitions i and j respectively are swapped from disk, and then the edges (or a subset of edges) are loaded and subdivided among the threads for training.

This graph partitioning introduces two functional changes to the base algorithm described in the last section. First, each candidate edge (s, r, d) is only compared to negatives (s, r, d′) in the ranking loss where d′ is drawn from the same partition (the same holds for source nodes) [2].

Second, edges are no longer sampled i.i.d. but are grouped by partition. Convergence under SGD to a stationary or chain-recurrent point is still guaranteed under this modification (see (Gemulla et al., 2011), Sec. 4.2), but may suffer from slower convergence [3][4].

We observe that the order of iterating through edge buckets may affect the final model. Specifically, for each edge bucket (p_1, p_2) except the first, it is important that an edge bucket (p_1, *) or (*, p_2) was trained in a previous iteration. This constraint ensures that embeddings in all partitions are aligned in the same space. For single-machine embeddings, we found that an 'inside-out' ordering, illustrated in Figure 1, achieved the best performance while minimizing the number of swaps to disk.

4.2 Distributed Execution

Existing distributed embedding systems typically use a parameter server architecture. In this architecture, a (possibly sharded) parameter server contains a key-value store of embeddings. At each SGD iteration, the embedding parameters required by a minibatch of data are requested from the parameter server, and gradients are (asynchronously) sent to the server to update the parameters.

The parameter server paradigm has been effective for training large sparse models (Li et al., 2014), but it has a number of drawbacks. One issue is that parameter-server based embedding frameworks require too much network bandwidth to run efficiently, since all embeddings for each minibatch of edges and their associated negative samples must be transferred at each SGD step (Ordentlich et al., 2016) [5]. Furthermore, we found it necessary for effective research use that the same models could be run in a single-machine or distributed context, but the parameter server architecture limits the size of models that can be run on a single machine. Finally, we would like to avoid the potential convergence problems from asynchronous model updates since our embeddings are already partitioned into independent sets.

Given partitioned entities and edges, PBG employs a parallelization scheme that combines a locking scheme over the model partitions described in Section 4.1 with an asynchronous parameter server architecture for shared parameters, i.e. the relation operators as well as unpartitioned or featurized entity types.

In this parallelization scheme, illustrated in Figure 2, partitioned embeddings are locked by machines for training. Multiple edge buckets can be trained in parallel as long as they operate on disjoint sets of partitions, as shown in Figure 1 (left). Training can proceed in parallel on up to P/2 machines. The locking of partitions is handled by a centralized lock server on one machine, which parcels out buckets to the workers in order to minimize communication (i.e. favors re-using a partition). The lock server also maintains the invariant described in Section 4.1, that only the first bucket should operate on two uninitialized partitions.

The partitioned embeddings themselves are stored in a partition server sharded across the N training machines. A machine fetches the source and destination partitions, which are often multiple GB in size, from the partition server, and trains on a bucket of edges loaded from shared disk. Checkpoints of the partitioned entities are intermittently saved to shared disk.

[2] This would not matter if we were using an independent loss for positives and negatives, e.g. a binary cross-entropy loss.

[3] The slower convergence may be ameliorated by switching between the buckets ('stratum losses' (Gemulla et al., 2011)) more frequently, i.e. in each epoch divide the edges from each bucket into N parts and iterate over the buckets N times, operating on one edge part each time.

[4] In practice, we use Adagrad rather than SGD.

[5] In fact, our approach to batched negative sampling, described in Section 4.3, reduces the number of negatives that must be retrieved, so it would require less bandwidth than (Ordentlich et al., 2016) if a parameter server was used.
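As a concrete illustration of the bucket bookkeeping in Sections 4.1 and 4.2, the toy sketch below assigns edges to buckets by their endpoint partitions and hands out buckets so that concurrently trained buckets touch disjoint partitions, preferring to reuse a partition the worker already holds. It is a single-process stand-in, not PBG's lock server; the class and method names are invented, and the initialization invariant and 'inside-out' ordering are omitted for brevity.

```python
import random

def bucket_of(src_part, dst_part):
    # An edge whose source is in partition p1 and destination in p2
    # belongs to bucket (p1, p2).
    return (src_part, dst_part)

class SimpleLockServer:
    """Toy stand-in for a lock server: hands out edge buckets so that
    buckets being trained concurrently touch disjoint partitions, and
    prefers buckets that reuse a partition the worker already holds
    (to reduce swapping)."""

    def __init__(self, num_parts):
        self.pending = {(i, j) for i in range(num_parts) for j in range(num_parts)}
        self.locked_parts = set()

    def acquire(self, previous_bucket=None):
        free = [b for b in self.pending
                if b[0] not in self.locked_parts and b[1] not in self.locked_parts]
        if not free:
            return None
        if previous_bucket is not None:
            reuse = [b for b in free if set(b) & set(previous_bucket)]
            if reuse:
                free = reuse
        bucket = random.choice(free)
        self.pending.discard(bucket)
        self.locked_parts.update(bucket)
        return bucket

    def release(self, bucket):
        self.locked_parts.difference_update(bucket)

# Toy usage with P = 4 partitions and two workers interleaving requests.
server = SimpleLockServer(num_parts=4)
b1 = server.acquire()                     # e.g. (2, 0)
b2 = server.acquire()                     # disjoint from b1's partitions
server.release(b1)
b3 = server.acquire(previous_bucket=b1)   # prefers to reuse a partition of b1
print(b1, b2, b3)
```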
Figure 2. A block diagram of the modules used for PBG's distributed mode. Arrows illustrate the communications that the Rank 2 Trainer performs for the training of one bucket. First, the trainer requests a bucket from the lock server on Rank 1, which locks that bucket's partitions. The trainer then saves any partitions that it is no longer using and loads new partitions that it needs to and from the sharded partition servers, at which point it can release its old partitions on the lock server. Edges are then loaded from a shared filesystem, and training occurs on multiple threads without inter-thread synchronization (Recht et al., 2011). In a separate thread, a small number of shared parameters are continuously synchronized with a sharded parameter server. Model checkpoints are occasionally written to the shared filesystem from the trainers.

Some model parameters are global and thus cannot be partitioned. This most importantly includes relation parameters, as well as entity types that have very small cardinality or use featurized embeddings. There are a relatively small number of such parameters (< 10^6), and they are handled via asynchronous updates with a sharded parameter server. Specifically, each trainer maintains a background thread that has access to all unpartitioned model parameters. This thread asynchronously fetches the parameters from the server and updates the local model, and pushes accumulated gradients from the local model to the parameter server. This thread performs continuous synchronization with some throttling to avoid saturating network bandwidth.

4.3 Batched Negative Sampling

The negative sampling approach used by most graph embedding methods is highly memory (or network) bound because it requires B · B_n · d floats of memory access to perform O(B · B_n · d) floating-point operations (B · B_n dot products). Indeed, Wu et al. report that training speed "is close to an inverse linear function of [number of negatives]".

To increase memory efficiency on large graphs, we observe that a single batch of B_n sampled source or destination nodes can be reused to construct multiple negative examples. In a typical setup, PBG takes a batch of B = 1000 positive edges from the training set and breaks it into chunks of 50 edges. The destination (equivalently, source) embeddings from each chunk are concatenated with 50 embeddings sampled uniformly from the tail entity type. The outer product of the 50 positives with the 200 sampled nodes equates to 9900 negative examples (excluding the induced positives). The training computation is summarized in Figure 3.

This approach is much cheaper than sampling negatives for each batch. For each batch of B positive edges, only 3B embeddings are fetched from memory and 3BB_n edge scores (dot products) are computed. The edge scores for a batch can be computed as a batched B_n × B_n matrix multiply, which can be executed with high efficiency. Figure 4 shows the performance of PBG with different numbers of negative samples, with and without batched negatives.

In multi-relation graphs with a small number of relations, we construct batches of edges that all share the same relation type r. This improves training speed specifically for the linear relation operator f_r(t) = A_r t, because it can be formulated as a matrix multiply f_r(T) = A_r T.
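A minimal sketch of this chunked, batched scoring is shown below. It corrupts only the destination side and uses illustrative shapes (B = 1000, chunks of 50), so it should be read as an approximation of Figure 3 rather than PBG's implementation.

```python
import torch

def batched_negative_scores(src, dst, uniform_neg, chunk=50):
    """src, dst: (B, d) embeddings of a batch of positive edges.
    uniform_neg: (B, d) embeddings of uniformly sampled nodes.
    Each positive source is scored against every in-chunk destination and
    every uniform sample in its chunk, via one batched matrix multiply.
    Returns scores of shape (B // chunk, chunk, 2 * chunk)."""
    B, d = src.shape
    n_chunks = B // chunk
    s = src[: n_chunks * chunk].view(n_chunks, chunk, d)
    candidates = torch.cat(
        [dst[: n_chunks * chunk].view(n_chunks, chunk, d),
         uniform_neg[: n_chunks * chunk].view(n_chunks, chunk, d)],
        dim=1)                                   # (n_chunks, 2*chunk, d)
    # (n_chunks, chunk, d) @ (n_chunks, d, 2*chunk) -> (n_chunks, chunk, 2*chunk)
    return torch.bmm(s, candidates.transpose(1, 2))

B, d = 1000, 100
scores = batched_negative_scores(torch.randn(B, d), torch.randn(B, d), torch.randn(B, d))
print(scores.shape)   # torch.Size([20, 50, 100])
# The diagonal of each chunk's first 50 columns holds the positive scores;
# the "induced positives" are masked out when forming the loss.
```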
Figure 3. Memory-efficient batched negative sampling. Embeddings are fetched for the B source and destination entities in a batch of edges, as well as B uniformly-sampled source and destination entities. Each chunk of B_n/2 edges is corrupted with all source or destination entities in its chunk, as well as the corresponding chunk of the uniform embeddings, resulting in B_n negative examples per positive edge. The negative scores are computed via a batch matrix multiply.

Figure 4. Effect of number of negative samples per edge on training speed (d = 100). With unbatched negatives, training speed is inversely proportional to the number of negatives, but with batched negatives, speed is nearly constant for B_n ≤ 100.

5 EXPERIMENTS

We evaluate PBG on two types of graphs common in both the academic literature and practical applications.

In one set of experiments we focus on embedding real online social networks. We evaluate PBG-constructed embeddings of the user-user interaction graph from LiveJournal (Backstrom et al., 2006; Leskovec et al., 2009), a user-user follow graph from Twitter (Kwak et al., 2010; Boldi & Vigna, 2004; Boldi et al., 2011), and a user-user interaction graph from YouTube (Tang & Liu, 2009). The LiveJournal and Twitter data sets we used are from SNAP (Leskovec & Krevl, 2014).

We consider two types of tasks: link prediction in the graph and the use of the graph embedding vectors to predict other attributes of the nodes. We find that, first, PBG is much faster and more scalable than existing methods while achieving comparable performance; second, the distributed partitioning does not impact the quality of the learned embeddings on large graphs; and third, PBG allows for parallel execution and thus can decrease wallclock training time proportionally to the number of partitions.

We also consider using PBG to embed the Freebase knowledge graph. Knowledge graphs have a very different structure from social networks, and the presence of many relation types allows us to study the effect of using various relation operators from the literature.

Here we find that PBG can again match (or exceed) state of the art performance, but that some types of relation operators (e.g. ComplEx) require care when using distributed training.

5.1 Experimental Setup

For each dataset, we report the best results from a grid search of learning rates from 0.001 - 0.1, margins from 0.05 - 0.2 and negative batch sizes of 100 - 500, and choose the parameter settings based on the validation split. Results for FB15k are reported on the separate test split.

All experiments are performed on machines with 24 Intel® Xeon® cores (two sockets) and two hyperthreads per core, for a total of 48 virtual cores, and 256 GB of RAM.
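Within each machine, PBG trains with many worker threads and no explicit synchronization, in the HOGWILD style of Recht et al. (2011) referenced in Section 4 and in the thread counts reported below. The sketch that follows shows the general shared-memory pattern with torch.multiprocessing; the toy update rule and names are illustrative assumptions, not PBG's training loop.

```python
import torch
import torch.multiprocessing as mp

def worker(emb, edges, lr=0.1):
    # Each worker applies sparse updates to the shared embedding table
    # without any locking (HOGWILD-style).
    for e in edges:
        s, d = int(e[0]), int(e[1])
        diff = emb[s] - emb[d]
        emb[s] -= lr * diff        # toy update: pull connected nodes together
        emb[d] += lr * diff

if __name__ == "__main__":
    num_nodes, dim, num_workers = 1000, 32, 4
    emb = torch.randn(num_nodes, dim)
    emb.share_memory_()            # place the embedding table in shared memory
    edges = torch.randint(num_nodes, (4000, 2))
    chunks = torch.chunk(edges, num_workers)
    procs = [mp.Process(target=worker, args=(emb, chunk)) for chunk in chunks]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```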
LiveJournal (link prediction)
Metric              MRR     MR      Hits@10   Memory
DeepWalk*           0.691   234.6   0.842     61.23 GB
MILE (1 level)*     0.629   174.4   0.785     60.88 GB
MILE (5 levels)*    0.505   462.8   0.632     22.78 GB
PBG (1 partition)   0.749   245.9   0.857     20.88 GB

YouTube (node classification)
Metric              Micro-F1   Macro-F1
DeepWalk†           45.2%      34.7%
MILE (6 levels)†    46.1%      38.5%
MILE (8 levels)†    44.3%      35.3%
PBG (1 partition)   48.0%      40.9%

Table 1. Performance of PBG, DeepWalk, and MILE on the LiveJournal dataset and the YouTube dataset. Left: link prediction evaluation and peak memory usage for the trained embeddings of the LiveJournal dataset. The ranking metrics on the test set are obtained by ranking positive edges among randomly sampled corrupted edges. Right: Micro-F1 and Macro-F1 on the user category prediction task of the YouTube dataset when using learned embeddings as features. * Results obtained running software provided by the original authors. † Results reported in (Liang et al., 2018).

Figure 5. Learning curve for PBG and competing embedding methods on the LiveJournal dataset. For each approach, we evaluate the MRR after each epoch. DeepWalk takes more than 20 hours to train for one epoch; therefore, we limit the number of walks during training (performance marked as 'x') to further reduce the computation time.

We use 40 HOGWILD threads for training. For distributed execution, we use a cluster of machines connected via 50Gb/s ethernet. We use the TCP backend for torch.distributed, which in practice achieves approximately 1 GB/s send/receive bandwidth. For memory usage measurements we report peak resident set size sampled at 0.1 second intervals.

5.2 LiveJournal

We evaluate PBG performance on the LiveJournal dataset (Backstrom et al., 2006; Leskovec et al., 2009) collected from the blogging site LiveJournal [6], where users can follow others to form a social network. The dataset contains 4,847,571 nodes and 68,993,773 edges. We construct train and test splits of the dataset that contain 75% and 25% of the total edges, respectively.

We compare the PBG embedding performance with MILE, which can also scale to large graphs. MILE repeatedly coarsens the graph into smaller ones, applies traditional embedding methods on the coarsened graph at each level, and uses a final refinement step to get the embeddings of the original graph. We also show the performance of DeepWalk, which is used as the base embedding method for MILE. We evaluate the embeddings using the same methodology described in Section 5.4.2. To compare the computation time across different approaches, we report the learning curve of test MRR obtained by different approaches during training with respect to time (see Figure 5).

5.3 YouTube

To show that PBG embeddings are useful for downstream supervised tasks, we apply PBG to the YouTube dataset (Tang & Liu, 2009). The dataset contains a social network between users on YouTube [7], as well as labels for these users that represent the categories of groups they subscribed to. This social network dataset contains 1,138,499 nodes and 2,990,443 edges.

We compare the performance of PBG embeddings with MILE embeddings and DeepWalk embeddings by applying those embeddings as features to perform a multi-label classification of users. We follow the typical methods (Perozzi et al., 2014; Liang et al., 2018) to evaluate the embedding performance, where we run a 10-fold cross validation by randomly selecting 90% of the labeled data as training data and the rest as testing data. We use the learned embeddings as features and train a one-vs-rest logistic regression model to solve the multi-label node classification problem.

We find that PBG embeddings perform comparably to (slightly better than) competing methods (see Table 1).

[6] https://www.livejournal.com

[7] www.youtube.com
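A single fold of this evaluation protocol might look like the following scikit-learn sketch, where X holds the learned node embeddings and Y the multi-hot user labels; the arrays here are random placeholders and the exact setup is an assumption, not the evaluation script used in the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# Placeholder data: embeddings X (nodes x d) and multi-hot labels Y (nodes x classes).
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 128))
Y = (rng.random((5000, 10)) < 0.1).astype(int)

# One fold of the 10-fold protocol: 90% of labeled nodes for training, 10% for test.
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.1, random_state=0)

clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X_tr, Y_tr)
pred = clf.predict(X_te)

print("Micro-F1:", f1_score(Y_te, pred, average="micro"))
print("Macro-F1:", f1_score(Y_te, pred, average="macro"))
```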
5.4 Freebase Knowledge Graph

Freebase (FB) is a large knowledge graph that contains general facts extracted from Wikipedia, etc. The FB15k dataset consists of a subset of Freebase with 14,951 entities, 1,345 relations and 592,213 edges.

Method                                          MRR (Raw)   MRR (Filtered)   Hit@10
RESCAL (Nickel et al., 2011)                    0.189       0.354            0.587
TransE (Bordes et al., 2013)                    0.222       0.463            0.749
HolE (Nickel et al., 2016b)                     0.232       0.524            0.739
ComplEx (Trouillon et al., 2016)                0.242       0.692            0.840
R-GCN+ (Schlichtkrull et al., 2018)             0.262       0.696            0.842
StarSpace (Wu et al., 2017)                     -           -                0.838
IRN (Shen et al., 2016)                         -           -                0.927
Reciprocal ComplEx-N3 (Lacroix et al., 2018)    -           0.860            0.910
PBG (TransE)                                    0.265       0.594            0.785
PBG (ComplEx)                                   0.242       0.790            0.872

Table 2. Comparison of PBG with other embedding methods on the FB15k dataset. PBG embeddings are trained with both a TransE and a ComplEx model, and in both cases perform similarly to the reported results for that model. The best reported results on FB15k (Lacroix et al., 2018) use an extremely large embedding dimension, which we do not reproduce here.

5.4.1 FB15k

We compare the performance of PBG embeddings on a link prediction task with existing embedding methods for knowledge graphs. We compare mean reciprocal rank and Hits@10 with existing methods for knowledge graph embeddings reported in (Trouillon et al., 2016) [8]. Results are shown in Table 2.

We embed FB15k with a complex multiplication relation operator as in (Trouillon et al., 2016). We evaluate PBG using two different configurations: one that is similar to the TransE model, and one similar to the ComplEx model. As in that work, we also find it beneficial to use separate relation embeddings for source negatives and destination negatives (described as 'reciprocal predicates' in Lacroix et al. 2018). For ComplEx, we train a 400-dimensional embedding for 50 epochs with a softmax loss over negatives using dot product similarity.

PBG performs comparably to the reported results for TransE and ComplEx models. In addition, recent papers have reported even stronger results for FB15k (and other small knowledge graphs like WordNet) using ComplEx with very large embeddings of thousands of dimensions (Lacroix et al., 2018). We managed to reproduce these architectures and results in the PBG framework but do not report the details here due to space constraints.

[8] We report both raw and filtered ranking metrics for FB15k as described in (Bordes et al., 2013). For the filtered metrics, all edges that exist in the training, validation or test sets are removed from the set of candidate corrupted edges for ranking. This avoids artificially poor results due to true edges from the data being ranked above a test edge.

5.4.2 Full Freebase

Next, we compare different numbers of partitions and distributed training using the full Freebase dataset [9] (Google, 2018). We use all entities and relations that appeared at least 5 times in the full dataset, resulting in a total of 121,216,723 nodes, 25,291 relations and 2,725,070,599 edges. We construct train, validation and test splits of the dataset, which contain 90%, 5%, and 5% of the total edges, respectively. The data format we use for the full Freebase dataset is the same as for the FB15k dataset described in Section 5.4.1.

To investigate the effect of the number of partitions, we partition Freebase nodes uniformly into different numbers of partitions and measure model performance, training time, and peak memory usage. We then consider parallel training on different numbers of machines. For each number of machines M, we use 2M partitions (which is the minimum number of partitions that allows this level of parallelism). Note that the full model size (d = 100) is 48.5 GB.

We train each model for 10 epochs, using the same hyperparameter grid search as for FB15k for each number of partitions. For the multi-machine evaluation, we use the consistent set of hyperparameters that had the best performance in single-machine training.

[9] Google, Freebase Data Dumps, https://developers.google.com/freebase, Sept. 10, 2018.
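The raw vs. filtered ranking distinction in note [8] can be sketched as follows: rank the true destination among corrupted candidates, optionally dropping candidates that form edges known from the train/validation/test sets. The function and the toy scorer below are illustrative assumptions, not PBG's evaluation code.

```python
import torch

def edge_rank(score_fn, s, r, d_true, candidates, known_edges=None):
    """Rank the true destination among candidate corruptions of (s, r, ?).

    score_fn(s, r, d_batch) -> (n,) scores, higher is better.
    known_edges: optional set of (s, r, d) triples; candidates that form a
    known true edge are dropped ("filtered" metric), otherwise the metric
    is "raw".
    """
    if known_edges is not None:
        candidates = [c for c in candidates
                      if c == d_true or (s, r, c) not in known_edges]
    cand = torch.tensor(candidates)
    scores = score_fn(s, r, cand)
    true_score = score_fn(s, r, torch.tensor([d_true]))[0]
    # Rank = 1 + number of candidates scoring strictly higher than the true edge.
    return 1 + int((scores > true_score).sum())

# Toy usage with a random scoring function over 4 candidate corruptions.
score_fn = lambda s, r, d: torch.randn(len(d))
known = {(0, 0, 3)}
print(edge_rank(score_fn, s=0, r=0, d_true=2, candidates=[1, 3, 4, 5],
                known_edges=known))
```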
Single machine:
# Parts   MRR     Hits@10   Time (h)   Mem (GB)
1         0.170   0.285     30         59.6
4         0.174   0.286     31         30.4
8         0.172   0.288     33         15.5
16        0.174   0.290     40         6.8

Distributed:
# Machines   # Parts   MRR     Hits@10   Time (h)   Mem (GB)
1            1         0.170   0.285     30         59.6
2            4         0.170   0.280     23         64.4
4            8         0.171   0.285     13         30.5
8            16        0.163   0.276     7.7        15.0

Table 3. Model evaluation, training time (10 epochs) and peak memory usage for embeddings of the full Freebase knowledge graph. MRR and Hits@10 are evaluated in the raw setting. Top: training with different numbers of partitions on a single machine. Bottom: distributed training on different numbers of machines.

Figure 6. Learning curve for PBG models on the Freebase dataset with different numbers of machines used in training. MRR of learned embeddings is plotted as a function of epoch (top) and wallclock time (bottom).

We evaluate the models with a link prediction task similar to that described in Section 5.4.1. However, due to the large number of candidate nodes, for each edge in the eval set we select 10,000 candidate negative nodes sampled from the set of entities according to their prevalence in the training data to produce negative edges, which we use to compute mean reciprocal rank and Hits@10 [10]. We report these results raw (unfiltered), following prior work on large graphs (Bordes et al., 2013).

Results are reported in Table 3, along with training time and memory usage.

We observe that on a single machine, peak memory usage decreases almost linearly with the number of partitions, but training time increases somewhat due to extra time spent on I/O [11]. On multiple machines, the full model is sharded across the machines rather than on disk, so memory usage is higher with 2 machines, but decreases linearly as the number of machines increases. Training time also decreases with an increasing number of machines, although there is again some overhead for training on multiple machines. This consists of a combination of I/O overhead and incomplete occupancy. The occupancy issue arises because there may not always be an available bucket with non-locked partitions for a machine to work on. Increasing the number of partitions relative to the number of machines will thus increase occupancy, but we don't examine this tradeoff in detail.

Freebase embeddings have nearly identical link prediction accuracy after 10 epochs of training with and without node partitioning and parallelization up to four machines. For the highest parallelization condition (8 machines), a small degradation in MRR from 0.171 to 0.163 is observed.

PBG embeddings trained with the ComplEx model perform better than TransE on the link prediction task, achieving MRR of 0.24 and Hits@10 of 0.32 with d = 200 and a single partition. However, our experiments show that training ComplEx models with multiple partitions and machines is unstable, and MRR varies from 0.15 to 0.22 across replicates. Further investigation of the performance of ComplEx models with PBG partitioning is left for future work.

5.5 Twitter

Finally, we consider the scaling of PBG on a social network graph in comparison to the Freebase knowledge graph studied in Section 5.4.2. We embed a publicly available Twitter [12] subgraph (Kwak et al., 2010; Boldi & Vigna, 2004; Boldi et al., 2011) containing a social network of 41,652,230 nodes and 1,468,365,182 edges with a single relation called "follow". We construct train, validation and test splits of the dataset, which contain 90%, 5%, and 5% of the total edges, respectively.

[10] We sample candidate negative nodes according to their prevalence in the data because the full Freebase dataset has such a long-tailed degree distribution that we find models can achieve > 50% Hit@1 against 10,000 uniformly-sampled negatives, which suggests that they are just performing ranking based on the degree distribution.

[11] This I/O overhead is higher on sparser graphs and lower on denser graphs.

[12] www.twitter.com
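A sketch of the evaluation loop used above is shown below: for each test edge, the true destination is ranked against candidate negatives drawn in proportion to their frequency in the training data (approximated here by sampling from the list of training destinations), and MRR and Hits@10 are computed from the resulting ranks. The names and the placeholder scorer are assumptions, not PBG's evaluation code.

```python
import torch

def mrr_hits_at_k(ranks, k=10):
    ranks = torch.as_tensor(ranks, dtype=torch.float)
    return (1.0 / ranks).mean().item(), (ranks <= k).float().mean().item()

def evaluate_edges(score_fn, test_edges, train_dst, n_neg=10000, k=10):
    """For each test edge (s, r, d), rank d against n_neg destination nodes
    sampled according to their prevalence in the training data (approximated
    by sampling uniformly from the list of training destinations)."""
    ranks = []
    for s, r, d in test_edges:
        neg = train_dst[torch.randint(len(train_dst), (n_neg,))]
        neg_scores = score_fn(s, r, neg)
        true_score = score_fn(s, r, torch.tensor([d]))[0]
        ranks.append(1 + int((neg_scores > true_score).sum()))
    return mrr_hits_at_k(ranks, k)

# Toy usage with a random scorer, 3 test edges and a small negative pool.
score_fn = lambda s, r, d: torch.randn(len(d))
train_dst = torch.randint(100, (5000,))
test_edges = [(0, 0, 7), (1, 0, 42), (2, 0, 13)]
print(evaluate_edges(score_fn, test_edges, train_dst, n_neg=100))
```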
Single machine:
# Parts   MRR     Hits@10   Time (h)   Mem (GB)
1         0.136   0.233     18.0       95.1
4         0.137   0.235     16.8       43.4
8         0.137   0.237     19.1       20.7
16        0.136   0.235     23.8       10.2

Distributed:
# Machines   # Parts   MRR     Hits@10   Time (h)   Mem (GB)
1            1         0.136   0.233     18.0       95.1
2            4         0.137   0.235     9.8        79.4
4            8         0.137   0.235     6.5        40.5
8            16        0.137   0.235     3.4        20.4

Table 4. Model evaluation, training time (10 epochs) and peak memory usage for embeddings of the Twitter graph. Top: training with different numbers of partitions on a single machine. Bottom: distributed training on different numbers of machines.

Figure 7. Learning curve for PBG models on the Twitter dataset with different numbers of machines used in training. MRR of learned embeddings is plotted as a function of epoch (top) and wallclock time (bottom).

In Table 4 we report MRR and Hits@10 after 10 training epochs, as well as training time and peak memory usage, for different partitioning and parallelization schemes. The results are consistent with Table 3: we observe a decrease in training time with multiple machines, without a loss in link prediction accuracy up to 8 machines.

In Figure 7 we report the learning curve of test MRR obtained with different numbers of machines used during training, with respect to epoch and time. Compared to the Freebase knowledge base learning curves in Figure 6, the Twitter graph shows more linear scaling of training time as the graph is partitioned and trained in parallel.

6 CONCLUSION

In this paper, we present PyTorch-BigGraph, an embedding system that scales to graphs with billions of nodes and trillions of edges. PBG supports multi-entity, multi-relation graphs with per-relation configuration such as edge weight and choice of relation operator. To save on memory usage and to allow parallelization, PBG performs a block decomposition of the adjacency matrix into N buckets, training on the edges from one bucket at a time.

We show that the quality of embeddings trained with PBG is comparable to that of existing embedding systems, while requiring less time to train. We show that partitioning of the Freebase graph reduces memory consumption by 88% without degrading embedding quality, and distributed execution on 8 machines speeds up training by a factor of 4. Our experiments have shown that embedding quality is quite robust to partitioning and parallelization in social network datasets, but may be more sensitive to parallelization when the number of relations is large, the degree distribution is highly skewed, or relation operators such as ComplEx are used. Thus improving the scaling for these more complicated models is an important area for future research.

We have presented PBG's performance on the largest publicly available graph datasets that we are aware of. However, the largest benefits of the PBG architecture come from graphs that are 1-2 orders of magnitude larger than these, where more fine-grained partitioning is necessary and exposes more parallelism. We hope that this work and the open source release of PBG help to motivate the release of larger graph datasets and an increase in research and reported results on larger graphs.

7 ACKNOWLEDGEMENTS

We would like to acknowledge Adam Fisch, Keith Adams, Jason Weston, Antoine Bordes and Serkan Piantino for helping to formulate the initial ideas that led to this work, as well as Maximilian Nickel, who provided helpful feedback on the manuscript.

REFERENCES

Backstrom, L., Huttenlocher, D., Kleinberg, J., and Lan, X. Group formation in large social networks: membership, growth, and evolution. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 44-54. ACM, 2006.
Boldi, P. and Vigna, S. The WebGraph framework I: Compression techniques. In Proceedings of the 13th International Conference on World Wide Web, pp. 595-602. ACM, 2004.

Boldi, P., Rosa, M., Santini, M., and Vigna, S. Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks. In Proceedings of the 20th International Conference on World Wide Web, pp. 587-596. ACM, 2011.

Bordes, A., Weston, J., Collobert, R., Bengio, Y., et al. Learning structured embeddings of knowledge bases. In AAAI, volume 6, pp. 6, 2011.

Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., and Yakhnenko, O. Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems, pp. 2787-2795, 2013.

Ching, A., Edunov, S., Kabiljo, M., Logothetis, D., and Muthukrishnan, S. One trillion edges: Graph processing at Facebook-scale. Proceedings of the VLDB Endowment, 8(12):1804-1815, 2015.

Cook, D. J. and Holder, L. B. Mining Graph Data. John Wiley & Sons, 2006.

Dean, J., Corrado, G. S., Monga, R., Chen, K., Devin, M., Le, Q. V., Mao, M. Z., Ranzato, M., Senior, A., Tucker, P., Yang, K., and Ng, A. Y. Large scale distributed deep networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1, NIPS'12, pp. 1223-1231, USA, 2012. Curran Associates Inc. URL http://dl.acm.org/citation.cfm?id=2999134.2999271.

Duchi, J., Hazan, E., and Singer, Y. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(Jul):2121-2159, 2011.

Gemulla, R., Nijkamp, E., Haas, P. J., and Sismanis, Y. Large-scale matrix factorization with distributed stochastic gradient descent. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '11, pp. 69-77, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0813-7. doi: 10.1145/2020408.2020426. URL http://doi.acm.org/10.1145/2020408.2020426.

Google. Freebase data dumps. https://developers.google.com/freebase/data, 2018.

Gupta, A., Karypis, G., and Kumar, V. Highly scalable parallel algorithms for sparse matrix factorization. IEEE Transactions on Parallel and Distributed Systems, 8(5):502-520, 1997.

Hamilton, W., Ying, Z., and Leskovec, J. Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems, pp. 1024-1034, 2017a.

Hamilton, W. L., Ying, R., and Leskovec, J. Representation learning on graphs: Methods and applications. arXiv preprint arXiv:1709.05584, 2017b.

Kipf, T. N. and Welling, M. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.

Koren, Y., Bell, R., and Volinsky, C. Matrix factorization techniques for recommender systems. Computer, (8):30-37, 2009.

Krompaß, D., Baier, S., and Tresp, V. Type-constrained representation learning in knowledge graphs. In International Semantic Web Conference, pp. 640-655. Springer, 2015.

Kwak, H., Lee, C., Park, H., and Moon, S. What is Twitter, a social network or a news media? In Proceedings of the 19th International Conference on World Wide Web, pp. 591-600. ACM, 2010.

Lacroix, T., Usunier, N., and Obozinski, G. Canonical tensor decomposition for knowledge base completion. arXiv preprint arXiv:1806.07297, 2018.

Leskovec, J. and Krevl, A. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, June 2014.

Leskovec, J., Lang, K. J., Dasgupta, A., and Mahoney, M. W. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics, 6(1):29-123, 2009.

Li, M., Andersen, D. G., Park, J. W., Smola, A. J., Ahmed, A., Josifovski, V., Long, J., Shekita, E. J., and Su, B.-Y. Scaling distributed machine learning with the parameter server. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation, OSDI'14, pp. 583-598, Berkeley, CA, USA, 2014. USENIX Association. ISBN 978-1-931971-16-4. URL http://dl.acm.org/citation.cfm?id=2685048.2685095.

Liang, J., Gurukar, S., and Parthasarathy, S. MILE: A multi-level framework for scalable graph embedding. arXiv preprint arXiv:1802.09612, 2018.

Mikolov, T., Chen, K., Corrado, G., and Dean, J. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.

Nickel, M., Tresp, V., and Kriegel, H.-P. A three-way model for collective learning on multi-relational data. In ICML, volume 11, pp. 809-816, 2011.

Nickel, M., Murphy, K., Tresp, V., and Gabrilovich, E. A review of relational machine learning for knowledge graphs. Proceedings of the IEEE, 104(1):11-33, Jan 2016a. ISSN 0018-9219. doi: 10.1109/JPROC.2015.2483592.

Nickel, M., Rosasco, L., Poggio, T. A., et al. Holographic embeddings of knowledge graphs. 2016b.

Ordentlich, E., Yang, L., Feng, A., Cnudde, P., Grbovic, M., Djuric, N., Radosavljevic, V., and Owens, G. Network-efficient distributed word2vec training system for large vocabularies. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM), pp. 1139-1148, 2016.

Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., and Lerer, A. Automatic differentiation in PyTorch. In NIPS-W, 2017.

Perozzi, B., Al-Rfou, R., and Skiena, S. DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, pp. 701-710, New York, NY, USA, 2014. ACM. ISBN 978-1-4503-2956-9. doi: 10.1145/2623330.2623732. URL http://doi.acm.org/10.1145/2623330.2623732.

Recht, B., Re, C., Wright, S., and Niu, F. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In Advances in Neural Information Processing Systems, pp. 693-701, 2011.

Schlichtkrull, M., Kipf, T. N., Bloem, P., van den Berg, R., Titov, I., and Welling, M. Modeling relational data with graph convolutional networks. In European Semantic Web Conference, pp. 593-607. Springer, 2018.

Shazeer, N., Doherty, R., Evans, C., and Waterson, C. Swivel: Improving embeddings by noticing what's missing. CoRR, abs/1602.02215, 2016. URL http://arxiv.org/abs/1602.02215.

Shen, Y., Huang, P.-S., Chang, M.-W., and Gao, J. Implicit ReasoNet: Modeling large-scale structured relationships with shared memory. 2016.

Tang, L. and Liu, H. Scalable learning of collective behavior based on sparse social dimensions. In Proceedings of the 18th ACM Conference on Information and Knowledge Management, pp. 1107-1116. ACM, 2009.

Trouillon, T., Welbl, J., Riedel, S., Gaussier, É., and Bouchard, G. Complex embeddings for simple link prediction. In International Conference on Machine Learning, pp. 2071-2080, 2016.

Wang, J., Huang, P., Zhao, H., Zhang, Z., Zhao, B., and Lee, D. L. Billion-scale commodity embedding for e-commerce recommendation in Alibaba. arXiv preprint arXiv:1803.02349, 2018.

Wu, L., Fisch, A., Chopra, S., Adams, K., Bordes, A., and Weston, J. StarSpace: Embed all the things! arXiv preprint arXiv:1709.03856, 2017.

Yang, B., Yih, W.-t., He, X., Gao, J., and Deng, L. Embedding entities and relations for learning and inference in knowledge bases. arXiv preprint arXiv:1412.6575, 2014.

Ying, R., He, R., Chen, K., Eksombatchai, P., Hamilton, W. L., and Leskovec, J. Graph convolutional neural networks for web-scale recommender systems. arXiv preprint arXiv:1806.01973, 2018.

Zitnik, M. and Leskovec, J. Predicting multicellular function through multi-layer tissue networks. CoRR, abs/1707.04638, 2017. URL http://arxiv.org/abs/1707.04638.