EBay Shopbot Conversational AI Recommendation System - SYMPOSIUM - Big Data & Business Analytics

Page created by Bill Reeves
 
eBay Shopbot Conversational AI Recommendation System

Getting Your Transformation Right
SYMPOSIUM
March 22-23, 2018

     5th Annual Big Data & Business Analytics Symposium – March 22-23, 2018
eBay Shopbot Presentation Origin Story

Presentation based on “eBay ShopBot: Graph-powered Conversational Commerce”

Ajinkya Kale, Senior Applied Researcher, AI, New Product Development, eBay,
Anju Vasta, New Product Development, eBay
eBay ShopBot: Graph-powered Conversational Commerce, GraphConnect
(2017, Oct. 23-24) [Video].

eBay Marketplace at a Glance

One of the world’s largest and most vibrant marketplaces

● $20B GMV
● 1.1B Live listings
● 80% Items sold as new
● 67% Transactions that ship for free (US, UK, DE)
● 12M New listings added via mobile per week
● 60% Platform GMV touched by mobile

Takeaway: Scale, Volume, Variety, and Velocity matter

Data as of Q1 2017

eBay Velocity and Variety

2017 Fun Facts: Velocity Stats by Region
Frequency of product purchases via desktop and mobile
Takeaway: Behavior is complex

US
● A tool is purchased every 11 sec
● A smartphone is purchased every 5 sec
● A watch is purchased every 4 sec

UK
● A makeup product is purchased every 3 sec
● A car part is purchased every 1 sec
● An appliance is purchased every 8 sec

DE
● A tire is purchased every 16 sec
● A tablet is purchased every 3 sec
● A Lego is purchased every 18 sec

AU
● A wedding item is purchased every 26 sec
● A home decor item is purchased every 14 sec
● A car or truck part is purchased every 4 sec

eBay Shopbot – Natural Language Understanding
NLU is not a rule based search engine – Personalization is KEY!

● Personal Shopping Assistant: Conversational Commerce bridges the gap between a stateless search engine and a shopper’s actual intent
● Powered By AI: eBay’s goal is to be as close as possible using AI tech such as Natural Language Understanding, Knowledge Graphs and Computer Vision
● Developed at eBay’s NPD Group: Shopbot launched in 2016 on the Facebook Messenger platform. Now also available through Google Home

eBay Shopbot

Why – did eBay build their own Shopbot?
• 3rd Party Bot Frameworks
    Tuned for general purpose bots
      o Intent detection (e.g. weather, flight schedules,…)
      o Entity extraction (e.g. number, temperature,…)
    Limited non-linear conversation support
    Coarse grained bot memory
    No inherent gain with experience
    Existing offerings in various states of maturity
      o Scalability (data size, complexity, and speed)
      o Capabilities, APIs, tools, extensibility…

What – is eBay trying to solve with Shopbot?
• Conversational Commerce not search
• Dialog systems
• What’s the next best question to ask
• How do we build the learning system that humans gain inherently with experience
• Can collaborative inferencing be used

Why - is it hard to build a Shopbot
Most production chatbots have curated dialog flows
• It’s easy for restricted domains like flight booking bots
• With 20k+ categories of products with 150K+ attributes, it’s hard to get right and scale!
• Need a solution that scales for dialog conversations on thousands of products
• Different interaction / input formats such as voice and image
• Speed and effectiveness at (any) scale is required by users

Why - is Shopbot important to eBay
• Unreasonable Effectiveness of Data – Halevy, Norvig, Pereira
• eBay has seen “almost” all queries that are important
• No production system runs dialogs just based on Deep Learning models

In other terms… To sell more stuff!
Natural Language Understanding

eBay Shopbot

How eBay addressed the need for continuous NLU Shopbot functionality

● Build a probabilistic inference graph

   o Data comes from eBay.com user behavioral patterns and other sources

   o When a user is looking for “running shoes”, what’s the most important aspect of
     the shoe that they care about? (maybe NLU style query flow here with next
     questions)
● Enter GraphDB (Neo4j) for eBay Shopbot knowledge cache
   o Tried RDBMS
   o Needed query / speed / schema flexibility / production system capabilities
     provided by the Neo4j property graph database technology and tools
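
One way to picture the probabilistic inference idea above is a small sketch that estimates which aspect shoppers care about most for a given query, using nothing but counts of past refinements. The log entries, the aspect names, and the helper functions `build_aspect_model` and `next_best_aspect` are all hypothetical and chosen for illustration; the slides do not describe eBay's actual model at this level of detail.

```python
from collections import Counter, defaultdict

# Hypothetical behavior log: (query, aspect the shopper refined on).
# The values are illustrative only, not eBay data.
BEHAVIOR_LOG = [
    ("running shoes", "size"),
    ("running shoes", "size"),
    ("running shoes", "size"),
    ("running shoes", "brand"),
    ("running shoes", "brand"),
    ("running shoes", "color"),
    ("handbag", "brand"),
    ("handbag", "brand"),
    ("handbag", "color"),
]


def build_aspect_model(log):
    """Estimate P(aspect | query) from raw refinement counts."""
    counts = defaultdict(Counter)
    for query, aspect in log:
        counts[query][aspect] += 1
    model = {}
    for query, aspect_counts in counts.items():
        total = sum(aspect_counts.values())
        model[query] = {a: c / total for a, c in aspect_counts.items()}
    return model


def next_best_aspect(model, query, already_known=()):
    """Return the most probable aspect the shopper has not answered yet."""
    candidates = {
        aspect: prob
        for aspect, prob in model.get(query, {}).items()
        if aspect not in already_known
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


model = build_aspect_model(BEHAVIOR_LOG)
first_question = next_best_aspect(model, "running shoes")
second_question = next_best_aspect(model, "running shoes", {"size"})
```

With this toy log, the first question would be about size and the follow-up about brand. In a graph database these probabilities could be stored as weights on relationships between query and aspect nodes.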
eBay ShopKnowledge Workflow at a Glance

[Workflow diagram with the labels: Data Sources; Apache Airflow; Knowledge Graph; Microservices; Product Knowledge; Query Understanding; Trends; Price Prediction; Aspect Reco; Entity Extraction]

eBay Data
Not just commerce data

Data Sources
● Latest wikidata stats
● Items / txn items
● Buyer behavioral data…
● WiW – What’s It Worth
● And…

Graph DB Characteristics and High Level Functionality
● > 0.5B nodes
● > 16B relationships
● Functionality
      o Probabilistic graphical models
      o Supervised and semi-supervised models

Under the Shopbot hood ...

● Probabilistic inferencing on behavioral patterns from past users

● Almost like Expert system navigation

   o Parse, categorize, group, contextualize natural language

● Easy to combine World Knowledge to augment user behavior data with
  external datasets:

   o Curation to augment data

   o Machine learning models to augment data

● ML model as a cache in Neo4j
Inferencing From the Graph
How to ask the “next best question”

[Graph diagram with the nodes: shoes; Women’s Shoes; Men’s Shoes; Color (Red, Blue); Brand (Coach, ALDO)]

Ask for shoes by color – likely to be a woman or man?
Basis to “ask the next best question”
e.g. F: Brand question / suggestion
     M: Color question / suggestion
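
The reasoning on this slide can be sketched as a two-step calculation: update the belief about which category the shopper is in from the aspect they mentioned, then ask about the aspect that matters most for the likely category. The edge weights and the uniform prior below are invented for illustration, and `category_posterior` and `next_question` are hypothetical helper names, not part of the Shopbot.

```python
# Hypothetical edge weights P(aspect | category); illustrative numbers only.
ASPECT_GIVEN_CATEGORY = {
    "Women's Shoes": {"Brand": 0.6, "Color": 0.4},
    "Men's Shoes": {"Brand": 0.3, "Color": 0.7},
}

# Assumed uniform prior over the two categories.
PRIOR = {"Women's Shoes": 0.5, "Men's Shoes": 0.5}


def category_posterior(mentioned_aspect):
    """Bayes update: P(category | the shopper mentioned this aspect)."""
    joint = {
        category: PRIOR[category] * ASPECT_GIVEN_CATEGORY[category][mentioned_aspect]
        for category in PRIOR
    }
    total = sum(joint.values())
    return {category: p / total for category, p in joint.items()}


def next_question(category):
    """Pick the aspect with the highest weight for the given category."""
    weights = ASPECT_GIVEN_CATEGORY[category]
    return max(weights, key=weights.get)


posterior = category_posterior("Color")
likely_category = max(posterior, key=posterior.get)
```

Under these made-up weights, a shopper who asks by color is more likely to be in Men’s Shoes, and the suggested question for Women’s Shoes is Brand while the one for Men’s Shoes is Color, matching the F/M example on the slide.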

An expanded look at part of a Knowledge Graph
        Knowledge Graph Encapsulates Shopping Behavior

An expanded look at part of a Knowledge Graph
        Looked simple. Why need queries and algorithms?

[Graph diagram centered on the “Bags” node]

An expanded look at part of a Knowledge Graph
         What’s the “next best question” (a.k.a. query the graph)

[Graph diagram with the labels: Handbag; Men’s Backpack]

Transfer Learning From Interaction
Example Product Conversation

“I am looking for the eggplant foamposites.”

Category:     Athletic Shoes
Product:      Eggplant Foamposite Sneakers
Brand:        Nike
Style:        Basketball Shoes
Release Date: 2009
Color:        Purple
Material:     Foamposite

Image Source: https://www.sneakershouts.com/sneaker-deals/2017/8/4/nike-air-foamposite-one-eggplant-under-retail

Which one would you pick?
“Knowledge” added to graph

“I am looking for the eggplant iphone case.”

Color:   Purple
Product: Phone Case

Data mine from the graph that eggplant == purple
Derived meaning from data (unsupervised)
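
A minimal sketch of how an “eggplant == purple” association might be mined without labels: count, for each query token, the color aspect of the items shoppers went on to buy, and keep tokens that are dominated by a single color. The session data, the thresholds, and the function `mine_color_synonyms` are assumptions made for this example; the slides do not specify the actual mining method.

```python
from collections import Counter, defaultdict

# Hypothetical (query, color of purchased item) pairs; not real eBay data.
SESSIONS = [
    ("eggplant foamposites", "Purple"),
    ("eggplant dress", "Purple"),
    ("eggplant iphone case", "Purple"),
    ("eggplant heels", "Black"),
    ("navy blazer", "Blue"),
    ("navy dress", "Blue"),
    ("dress", "Red"),
    ("dress", "Black"),
]


def mine_color_synonyms(sessions, min_support=3, min_share=0.7):
    """Map a query token to a color when one color dominates its purchases."""
    token_colors = defaultdict(Counter)
    for query, color in sessions:
        for token in set(query.lower().split()):
            token_colors[token][color] += 1

    synonyms = {}
    for token, colors in token_colors.items():
        total = sum(colors.values())
        color, count = colors.most_common(1)[0]
        # Require enough evidence and a clear majority for one color.
        if total >= min_support and count / total >= min_share:
            synonyms[token] = color
    return synonyms


synonyms = mine_color_synonyms(SESSIONS)
```

In this toy data only “eggplant” passes both thresholds, so a later query such as “eggplant iphone case” could be expanded with Color: Purple even though no rule ever stated that eggplant is a color.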

Why Neo4j ?!

● Graph database is the right choice to store
  Knowledge Graphs
● Neo4j is battle tested and fast!
● New data sources are easy to integrate to
  extend Graph inferencing ability
● Great set of support tools, to name a few –
    ○   Interactive browser
    ○   In-Database Procedures
    ○   Bulk imports
    ○   Graph-algorithms

* Walter first tried to build with RDBMS

Graphs can do some amazing things
Walter: “Say data joins one more time”

eBay “initially tried RDBMS for knowledge graph”. What happened?

Find all direct reports and how many people they manage, up to 3 levels down

SQL Query: [figure]

Graph DB Query (using Cypher Query Language):

    MATCH (sub)-[:REPORTS_TO*0..3]->(boss),
          (report)-[:REPORTS_TO*1..3]->(sub)
    WHERE boss.name = 'John Doe'
    RETURN sub.name AS Subordinate,
           count(report) AS Total

                                                            Graph database queries can do amazing
                                                            things fast, but graph algorithms add a
                                                            whole new level of capability…
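
To make the traversal behind the Cypher query concrete, here is a rough in-memory analogue in Python. It follows the slide's plain-language task (direct reports and how many people each manages, up to 3 levels down) rather than translating the Cypher pattern exactly. The org chart and the function names are hypothetical.

```python
from collections import deque

# Hypothetical org chart: manager -> direct reports.
REPORTS = {
    "John Doe": ["Ann", "Bob"],
    "Ann": ["Cid", "Dee"],
    "Cid": ["Eve"],
    "Eve": ["Fay"],
    "Fay": ["Gus"],
    "Bob": [],
}


def count_managed(manager, max_depth=3):
    """Count people below `manager`, at most `max_depth` levels down (BFS)."""
    seen = set()
    queue = deque([(manager, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for report in REPORTS.get(person, []):
            if report not in seen:
                seen.add(report)
                queue.append((report, depth + 1))
    return len(seen)


def direct_report_totals(boss, max_depth=3):
    """For each direct report of `boss`, count the people they manage."""
    return {
        report: count_managed(report, max_depth)
        for report in REPORTS.get(boss, [])
    }


totals = direct_report_totals("John Doe")
```

The depth limit plays the role of the `*1..3` bound in the Cypher pattern. In a relational schema the same question typically needs one self-join per level or a recursive query, which is the contrast the slide is drawing.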

eBay – Graphs are the future for AI…

 … and recommendation engines, and …

 ● Business logic and inference, and “creative juices” in graph.

 ● Interactive search

 ● Search science pieces as graph

 ● Everything except backend source systems (origin indexes, etc.)

 ● Semi-supervised techniques like label propagation
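
As an illustration of the semi-supervised label propagation mentioned in the last bullet, the sketch below starts from a few seed labels and lets unlabeled nodes adopt the majority label of their neighbors. The graph, the seed labels, and the alphabetical tie-break are assumptions for this example only; it is not the Neo4j implementation.

```python
from collections import Counter

# Hypothetical undirected similarity graph (adjacency lists).
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d", "f"],
    "f": ["e"],
}

# Hypothetical seed labels; every other node starts unlabeled.
SEEDS = {"a": "shoes", "f": "bags"}


def propagate_labels(graph, seeds, max_rounds=10):
    """Synchronous label propagation with fixed seed labels."""
    labels = dict(seeds)
    for _ in range(max_rounds):
        updated = dict(labels)
        for node in sorted(graph):
            if node in seeds:
                continue  # seed labels never change
            votes = Counter(labels[n] for n in graph[node] if n in labels)
            if votes:
                # Most common neighbor label; ties broken alphabetically
                # (an arbitrary choice made to keep the sketch deterministic).
                updated[node] = min(votes, key=lambda lab: (-votes[lab], lab))
        if updated == labels:
            break  # converged
        labels = updated
    return labels


labels = propagate_labels(GRAPH, SEEDS)
```

On this small graph the two seeds spread until the nodes split into a “shoes” group (a, b, c) and a “bags” group (d, e, f).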

Neo4j – Some Graph Algorithms
Path Finding and Traversal Algorithms

Purpose: Find the shortest path or evaluate the availability and quality of routes.

● Parallel Breadth-First Search (BFS)
   o What It Does: Traverses a tree data structure by fanning out to explore the nearest neighbors and then their sub-level neighbors. It’s used to locate connections and is a precursor to many other algorithms. BFS is preferred when the tree is less balanced or the target is closer to the starting point. It can also be used to find the shortest path between nodes or avoid recursive processes of DFS.
   o Example Uses: BFS can be used to locate neighbor nodes in peer-to-peer networks like BitTorrent, GPS systems to pinpoint nearby locations and social network services to find people within a specific distance.

● Parallel Depth-First Search (DFS)
   o What It Does: Traverses a tree data structure by exploring as far as possible down each branch before backtracking. It’s used on deeply hierarchical data and is a precursor to many other algorithms. DFS is preferred when the tree is more balanced or the target is closer to an endpoint.
   o Example Uses: DFS is often used in gaming simulations where each choice or action leads to another, expanding into a tree-shaped graph of possibilities. It will traverse the choice tree until it discovers an optimal solution path (e.g., win).

● Single-Source Shortest Path
   o What It Does: Calculates a path between a node and all other nodes whose summed value (weight of relationships such as cost, distance, time or capacity) to all other nodes is minimal.
   o Example Uses: Single-Source Shortest Path is often applied to automatically obtain directions between physical locations, such as driving directions via Google Maps. It’s also essential in logical routing such as telephone call routing (least cost routing).

● All-Pairs Shortest Path
   o What It Does: Calculates a shortest path forest (group) containing all shortest paths between the nodes in the graph. Commonly used for understanding alternate routing when the shortest route is blocked or becomes suboptimal.
   o Example Uses: All-Pairs Shortest Path can be used to evaluate alternate routes for situations such as a freeway backup or network capacity. It’s also key in logical routing to offer multiple paths, for example, call routing alternatives.

● Minimum Weight Spanning Tree (MWST)
   o What It Does: Calculates the paths along a connected tree structure with the smallest value (weight of the relationship such as cost, time or capacity) associated with visiting all nodes in the tree. It’s also employed to approximate some NP-hard problems such as the traveling salesman problem and randomized or iterative rounding.
   o Example Uses: MWST is widely used for network designs: least cost logical or physical routing such as laying cable, fastest garbage collection routes, capacity for water systems, efficient circuit designs and much more. It also has real-time applications with rolling optimizations such as processes in a chemical refinery or driving route corrections.
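
Two of the path-finding entries can be illustrated with plain Python on a toy graph: breadth-first search for hop counts, and Dijkstra's algorithm as one common way to compute single-source shortest paths with non-negative weights. The `ROADS` graph and its costs are made up, and this is a single-threaded sketch, not the parallel implementation in Neo4j's graph algorithms library.

```python
import heapq
from collections import deque

# Hypothetical directed, weighted graph: node -> {neighbor: cost}.
ROADS = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}


def bfs_hops(graph, start):
    """Breadth-first search: fewest hops from `start`, ignoring weights."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


def dijkstra(graph, start):
    """Single-source shortest path by total cost (non-negative weights)."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, cost in graph[node].items():
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


hops = bfs_hops(ROADS, "A")
costs = dijkstra(ROADS, "A")
```

Here B is one hop from A but cheapest to reach through C (cost 3 rather than 4), which shows why hop count and weighted distance are treated as separate algorithms in the list above.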

Neo4j – Some Graph Algorithms
Centrality Algorithms

Purpose: Determine the importance of distinct nodes in a network of connected data.

● PageRank
   o What It Does: Estimates a current node’s importance from its linked neighbors and then again from their neighbors. A node’s rank is derived from the number and quality of its transitive links to estimate influence. Although popularized by Google, it’s widely recognized as a way of detecting influential nodes in any network.
   o Example Uses: PageRank is used in quite a few ways to estimate importance and influence. It’s used to suggest Twitter accounts to follow and for general sentiment analysis. PageRank is also used in machine learning to identify the most influential features for extraction. In biology, it’s been used to identify which species extinctions within a food web would lead to the biggest chain reaction of species death.

● Degree Centrality
   o What It Does: Measures the number of relationships a node (or an entire graph) has. It’s broken into indegree (flowing in) and outdegree (flowing out) where relationships are directed.
   o Example Uses: Degree Centrality looks at immediate connectedness for uses such as evaluating the near-term risk of a person catching a virus or hearing information. In social studies, indegree of friendship can be used to estimate popularity and outdegree as gregariousness.

● Closeness Centrality
   o What It Does: Measures how central a node is to all its neighbors within its cluster. Nodes with the shortest paths to all other nodes are assumed to be able to reach the entire group the fastest.
   o Example Uses: Closeness centrality is applicable in a number of resources, communication and behavioral analysis, especially when interaction speed is significant. It has been used to identify the best location of new public services for maximum accessibility. In social analysis, it can be used to find people with the ideal social network location for faster dissemination of information.

● Betweenness Centrality
   o What It Does: Measures the number of shortest paths (first found with BFS) that pass through a node. Nodes that most frequently lie on shortest paths have higher betweenness centrality scores and are the bridges between different clusters. It is often associated with the control over the flow of resources and information.
   o Example Uses: Betweenness Centrality applies to a wide range of problems in network science and can be used to pinpoint bottlenecks or likely attack targets in communication and transportation networks. In genomics, it has been used to understand the control certain genes have in protein networks for improvements such as better drug-disease targeting. Betweenness Centrality has also been used to evaluate information flows between multiplayer online gamers and expertise sharing communities of physicians.
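
The PageRank and degree centrality entries can be sketched in a few lines of Python. The link graph is invented, and the damping factor of 0.85 and the fixed iteration count are conventional but assumed choices; this is a simple power-iteration sketch, not Neo4j's implementation.

```python
# Hypothetical directed link graph: node -> nodes it points to.
# Every target also appears as a key.
LINKS = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank; dangling nodes spread rank evenly."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


def in_degree(graph):
    """Degree centrality (indegree): number of incoming relationships."""
    degree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            degree[target] += 1
    return degree


ranks = pagerank(LINKS)
most_influential = max(ranks, key=ranks.get)
indegrees = in_degree(LINKS)
```

Node c collects links from three nodes and ends up with the highest rank, while a ranks second despite having only one incoming link because that link comes from c. That is the “quality of links” effect described in the PageRank entry, which indegree alone does not capture.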

Neo4j – Some Graph Algorithms
Community Detection Algorithms (a.k.a. clustering and partitioning)

Purpose: Evaluate how a group is clustered or partitioned, as well as its tendency to strengthen or break apart.

● Label Propagation
   o What It Does: Spreads labels based on neighborhood majorities as a means of inferring clusters. This extremely fast graph partitioning requires little prior information and is widely used in large-scale networks for community detection. It’s a key method for understanding the organization of a graph and is often a primary step in other analysis.
   o Example Uses: Label Propagation has diverse applications from understanding consensus formation in social communities to identifying sets of proteins that are involved together in a process (functional modules) for biochemical networks. It’s also used in semi- and unsupervised machine learning as an initial preprocessing step.

● Strongly Connected
   o What It Does: Locates groups of nodes where each node is reachable from every other node in the same group following the direction of relationships. It’s often applied from a depth-first search.
   o Example Uses: Strongly Connected is often used to enable running other algorithms independently on an identified cluster. As a preprocessing step for directed graphs, it can help quickly identify disconnected groups. In retail recommendations, it can help identify groups with strong affinities that then can be used for suggesting commonly preferred items to those within that group who have not yet purchased the item.

● Union-Find / Connected Components / Weakly Connected
   o What It Does: Finds groups of nodes where each node is reachable from any other node in the same group, regardless of the direction of relationships. It provides near constant-time (independent of input size) operations to add new groups, merge existing groups and determine whether two nodes are in the same group.
   o Example Uses: Union-Find / Connected Components is often used in conjunction with other algorithms, especially for high-performance grouping. As a preprocessing step for undirected graphs, it can help quickly identify disconnected groups.

● Louvain Modularity
   o What It Does: Measures the quality (i.e., presumed accuracy) of a community grouping by comparing its relationship density to a suitably defined random network. It’s often used to evaluate the organization of complex networks, in particular, community hierarchies. It’s also useful for initial data preprocessing in unsupervised machine learning.
   o Example Uses: Louvain is used to evaluate social structures in Twitter, LinkedIn and YouTube. It’s used in fraud analytics to evaluate whether a group has just a few bad behaviors or is acting as a fraud ring that would be indicated by a higher relationship density than average. Louvain revealed a six-level customer hierarchy in a Belgian telecom network.
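
The Union-Find entry describes three operations: add a group, merge groups, and check whether two nodes share a group. A compact sketch with path compression and union by size is shown below. The class, the `EDGES` list, and the node names are illustrative and not taken from the presentation.

```python
class UnionFind:
    """Disjoint-set structure with path compression and union by size."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, x):
        """Return the representative of x's group, creating it if needed."""
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, a, b):
        """Merge the groups containing a and b."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return
        if self.size[root_a] < self.size[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.size[root_a] += self.size[root_b]

    def connected(self, a, b):
        return self.find(a) == self.find(b)


# Hypothetical relationships; direction is ignored (weakly connected).
EDGES = [("u1", "u2"), ("u2", "u3"), ("u4", "u5")]

uf = UnionFind()
for left, right in EDGES:
    uf.union(left, right)

component_count = len({uf.find(node) for node in list(uf.parent)})
```

After processing the three relationships there are two components: u1, u2, u3 in one and u4, u5 in the other.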

Neo4j – Some Graph Algorithms
Community Detection Algorithms (a.k.a. clustering and partitioning)

Purpose: Evaluate how a group is clustered or partitioned, as well as its tendency to strengthen or break apart.

● Local Clustering Coefficient / Node Clustering Coefficient
   o What It Does: For a particular node, it quantifies how close its neighbors are to being a clique (every node is directly connected to every other node). For example, if all your friends knew each other directly, your local clustering coefficient would be 1. Small values for a cluster would indicate that although a grouping exists, the nodes are not tightly connected.
   o Example Uses: Local Cluster Coefficient is important to estimating resilience by understanding the likelihood of group coherence or fragmentation. Analysis of a European power grid using this method found that clusters with sparsely connected nodes were more resilient against widespread failures.

● Triangle-Count and Average Clustering Coefficient
   o What It Does: Measures how many nodes have triangles and the degree to which nodes tend to cluster together. The average clustering coefficient is 1 when there is a clique, and 0 when there are no connections. For the clustering coefficient to be meaningful it should be significantly higher than a version of the network where all of the relationships have been shuffled randomly.
   o Example Uses: The Average Clustering Coefficient is often used to estimate whether a network might exhibit “small-world” behaviors which are based on tightly knit clusters. It’s also a factor for cluster stability and resiliency. Epidemiologists have used the average clustering coefficient to help predict various infection rates for different communities.
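
The clustering measures in the table can be sketched in a few lines of plain Python, assuming an undirected graph given as an edge list. The helper names (`build_adjacency`, `triangles_at`, `local_clustering`, `average_clustering`) and the toy graph are illustrative assumptions, not the Neo4j procedures themselves.

```python
from itertools import combinations


def build_adjacency(edges):
    """Map each node to the set of its direct neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangles_at(adj, node):
    """Count pairs of neighbors that are themselves connected."""
    return sum(1 for a, b in combinations(adj[node], 2) if b in adj[a])


def local_clustering(adj, node):
    """Fraction of possible neighbor pairs that are connected (1 = clique)."""
    k = len(adj[node])
    if k < 2:
        return 0.0  # fewer than two neighbors: no pair to connect
    return 2.0 * triangles_at(adj, node) / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Hypothetical toy graph: two triangles joined by one relationship (2-3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = build_adjacency(edges)

# Each triangle is seen once from each of its three corners.
total_triangles = sum(triangles_at(adj, n) for n in adj) // 3

print(local_clustering(adj, 0))            # 1.0: both neighbors know each other
print(round(local_clustering(adj, 2), 3))  # about 0.333
print(round(average_clustering(adj), 3))   # about 0.778
print(total_triangles)                     # 2
```

Node 0 matches the "all your friends know each other" case from the table, while node 2 scores lower because its neighbor across the bridge is not connected to its other neighbors.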

Some related and interesting material

eBay Presentation / Article (one of many)
 Ajinkya Kale and Anju Vasta, New Product Development, eBay
 https://www.youtube.com/watch?v=hRpmIeJjx-Y
 https://www.youtube.com/playlist?list=PL9Hl4pk2FsvVdYIoyksOyAMyDEDYv6O4K
 https://www.forbes.com/sites/rachelarthur/2017/07/19/conversational-commerce-ebay-ai-chatbot/#a38d7361efb9

Why Knowledge Graphs Are Foundational to Artificial Intelligence
 Jim Webber, Chief Data Scientist, Neo4j, March 20, 2018
 https://www.datanami.com/2018/03/20/why-knowledge-graphs-are-foundational-to-artificial-intelligence

Graph Analytics: Graph Algorithms inside Neo4j
 Amy Hodler and Michael Hunger, Neo4j, January 26, 2018
 https://www.youtube.com/watch?v=y10Bt7OkCRM


Thank you! Neo4j

Remember – “The world is connected. Value
comes from being able to make and
understand the connections”*

                                   * A wise quote from my wonderful wife before saying: “I know you did it, and don’t do it again.”
