Databases on AWS: How To Choose The Right Database

Page created by Louis Casey
 
CONTINUE READING
Databases on AWS: How To Choose The Right Database
Databases on AWS: How To
Choose The Right Database
Randall Hunt               Markus Ostertag                                                               Dr. Sebastian Brandt
Software Engineer at AWS   CEO at Team Internet AG                                                       Senior Key Expert - Knowledge
@jrhunt                    @Osterjour                                                                    Graph at Siemens CT
randhunt@amazon

    SUMMIT                   © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Databases on AWS: How To Choose The Right Database
SUMMIT   © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Databases on AWS: How To Choose The Right Database
DynamoDB     DocumentDB
                                                                                                                       Redis
                                                                                                                                   Aurora      QLDB
                                SQL Server                      PostgreSQL                                             Elasticsearch    Timestream
                Oracle    DB2                     MySQL                                                             MongoDB
                                                                                                                                      Neptune
                                        Access                                                             Cassandra

1970               1980                1990                                            2000                              2010

       SUMMIT                           © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Databases on AWS: How To Choose The Right Database
Modern apps have modern requirements

                                                                                                                       Users: 1 million+
                                                                                                 Data volume: TB–PB–EB
                                                                                                                Locality: Global
                                                                                                 Performance: Milliseconds–microseconds
                                                                                                  Request rate: Millions per second
                                                                                                                    Access: Web, Mobile, IoT, Devices
                                                                                                                        Scale: Up-down, Out-in
                                                                                                        Economics: Pay for what you use
Ride hailing   Media streaming   Social media   Dating                               Developer access: No assembly required

      SUMMIT                                    © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Databases on AWS: How To Choose The Right Database
Common data categories and use cases

   Relational           Key-value           Document                      In-memory                                  Graph      Time-series            Ledger

    Referential         High            Store                            Query by key                           Quickly and     Collect, store,    Complete,
  integrity, ACID throughput, low- documents and                             with                              easily create     and process    immutable, and
   transactions,    latency reads   quickly query                        microsecond                           and navigate    data sequenced verifiable history
     schema-         and writes,       on any                              latency                             relationships       by time      of all changes to
     on-write       endless scale     attribute                                                                  between                        application data
                                                                                                                    data

 Lift and shift, ERP, Real-time bidding,       Content                 Leaderboards,       Fraud detection,                    IoT applications,         Systems
    CRM, finance        shopping cart,      management,              real-time analytics, social networking,                    event tracking      of record, supply
                        social, product    personalization,                caching         recommendation                                          chain, health care,
                      catalog, customer        mobile                                           engine                                                registrations,
                         preferences                                                                                                                     financial

   SUMMIT                                          © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Databases on AWS: How To Choose The Right Database
AWS: Purpose-built databases

         Relational       Key-value   Document            In-memory                                  Graph                  Search        Time-series   Ledger

     Amazon RDS           Amazon        Amazon               Amazon                                Amazon                   Amazon
                         DynamoDB                                                                                                           Amazon      Amazon
                                      DocumentDB           ElastiCache                             Neptune                Elasticsearch
                                                                                                                                          Timestream    Quantum
                                                                                                                             Service
                                                                                                                                                         Ledger
Aurora    Community Commercial                             Redis Memcached                                                                              Database

           SUMMIT                             © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Databases on AWS: How To Choose The Right Database
SUMMIT   © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Databases on AWS: How To Choose The Right Database
Amazon Relational Database Service (RDS)
Managed relational database service with a choice of six popular database engines

     Easy to administer             Available and durable                                          Highly scalable            Fast and secure

   No need for infrastructure        Automatic Multi-AZ data                                Scale database compute       SSD storage and guaranteed
  provisioning, installing, and   replication; automated backup,                             and storage with a few         provisioned I/O; data
   maintaining DB software               snapshots, failover                                   clicks with no app          encryption at rest and in
                                                                                                    downtime                        transit

    SUMMIT                                   © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Databases on AWS: How To Choose The Right Database
Traditional SQL
• TCP based wire protocol                                               INSERT INTO table1
• Well Known, lots of uses                                                 (id, first_name, last_name)
• Common drivers (JDBC)                                                 VALUES (1, Randall’, Hunt’);
• Frequently used with ORMs
• Scale UP individual instances                                         SELECT col1, col2, col3
• Scale OUT with read replicas                                          FROM table1
• Sharding at application level                                         WHERE col4 = 1 AND col5 = 2
• Lots of flavors but very similar                                      GROUP BY col1
  language                                                              HAVING count(*) > 1
• Joins                                                                 ORDER BY col2

    SUMMIT                  © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Databases on AWS: How To Choose The Right Database
Amazon Aurora
MySQL and PostgreSQL-compatible relational database built for the cloud
Performance and availability of commercial-grade databases at 1/10th the cost

        Performance                      Availability
                                                                                                     Highly secure          Fully managed
       and scalability                  and durability

   5x throughput of standard       Fault-tolerant, self-healing                                   Network isolation,        Managed by RDS:
   MySQL and 3x of standard        storage; six copies of data                                      encryption at       No hardware provisioning,
  PostgreSQL; scale-out up to    across three Availability Zones;                                    rest/transit       software patching, setup,
        15 read replicas        continuous backup to Amazon S3                                                          configuration, or backups

    SUMMIT                                  © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
SQL vs NoSQL
                  SQL                                                                                            NoSQL

      Optimized for storage                                                                Optimized for compute
          Normalized/relational                                                              Denormalized/hierarchical

             Ad hoc queries                                                                                 Instantiated views

             Scale vertically                                                                               Scale horizontally

            Good for OLAP                                                                          Built for OLTP at scale

 SUMMIT                         © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon DynamoDB
Fast and flexible key value database service for any scale

                                                                                                     Comprehensive            Global database for
     Performance at scale                    Serverless                                                 security             global users and apps

      Consistent, single-digit          No server provisioning,                               Encrypts all data by        Build global applications with
  millisecond response times at          software patching, or                             default and fully integrates fast access to local data by easily
 any scale; build applications with   upgrades; scales up or down                            with AWS Identity and      replicating tables across multiple
  virtually unlimited throughput      automatically; continuously                           Access Management for                 AWS Regions
                                          backs up your data                                     robust security

      SUMMIT                                    © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon DynamoDB Data Structure
                                                                                                                                Table
                                                                                                                                Items

    Attributes

                       Partition    Sort                                                                       All items for key
                         Key        Key                                                                        ==, , >=,
DynamoDB Schema and Queries
•   Connects over HTTP                                             import boto3
•   Global Secondary Indexes and Local                             votes_table =
    Secondary Indexes                                              boto3.resource('dynamodb').Table('votes')
•   Speed up queries with DAX                                      resp = votes_table.update_item(
•   Global tables (mutli-region-multi-                               Key={'name': editor},
    master)                                                          UpdateExpression="ADD votes :incr",
•   Transactions across multiple tables                              ExpressionAttributeValues={":incr": 1},
•   Change Streams                                                   ReturnValues="ALL_NEW"
•   Rich query language with expressions                           )
•   Provision read and write capacity units
    separately with or without autoscaling
•   Also supports pay per request model

     SUMMIT                          © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
DynamoDB
 Advancements over the last 22 months
February 2017               April 2017                                              April 2017                               June 2017

            Time To                      VPC                                                            DynamoDB                         Auto
            Live (TTL)                   endpoints                                                      Accelerator (DAX)                scaling

November 2017              November 2017                               November 2017                                  March 2018

                                  On-demand                                                                                  Point-in-time
          Global tables                                                                 Encryption at rest
                                  backup                                                                                     recovery

June 2018                 August 2018                                  November 2018                                   November 2018

                                    Adaptive
          99.999% SLA                                                                        Transactions                      On-demand
    SLA
                                    capacity                                     ACID

      SUMMIT                              © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon DocumentDB: Modern cloud-native architecture
   What would you do to improve scalability and availability?
        1                     2                      3

    Decouple          Distribute data in        Increase the
  compute and         smaller partitions       replication of
     storage                                      data (6x)
Amazon DocumentDB
Fast, scalable, and fully managed MongoDB-compatible database service

              Fast                       Scalable                                            Fully managed                      MongoDB
                                                                                                                               compatible

 Millions of requests per second Separation of compute and                                Managed by AWS:                Compatible with MongoDB 3.6;
 with millisecond latency; twice storage enables both layers                          no hardware provisioning;           use the same SDKs, tools, and
  the throughput of MongoDB        to scale independently;                            auto patching, quick setup,           applications with Amazon
                                 scale out to 15 read replicas                          secure, and automatic                     DocumentDB
                                          in minutes                                           backups

     SUMMIT                                  © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Document databases
• Data is stored in JSON-like
  documents                                               JSON documents are first-class objects
• Documents map naturally                                          of the database
                                             {
  to how humans model data                         id: 1,
• Flexible schema and                              name: "sue",
                                                   age: 26,
  indexing                                         email: "sue@example.com",
• Expressive query language                        promotions: ["new user", "5%", "dog lover"],
                                                   memberDate: 2018-2-22,
  built for documents (ad                          shoppingCart: [
                                                     {product:"abc", quantity:2, cost:19.99},
  hoc queries and                                    {product:"edf", quantity:3, cost: 2.99}
  aggregations)                                    ]
                                             }

  SUMMIT                © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon ElastiCache
Redis and Memcached compatible, in-memory data store and cache

    Redis & Memcached                 Extreme                                                     Secure and             Easily scalable
        compatible                  performance                                                    reliable

   Fully compatible with open   In-memory data store and                               Network isolation,           Scale writes and reads with
  source Redis and Memcached      cache for microsecond                             encryption at rest/transit,        sharding and replicas
                                     response times                                   HIPAA, PCI, FedRAMP,
                                                                                     multi AZ, and automatic
                                                                                             failover

    SUMMIT                              © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Redis
•   Redis Serialization Protocol over TCP (RESP)

•   Supports Strings, Hashes, Lists, Sets, and Sorted Sets

•   Simple commands for manipulating in memory data structures

•   Pub/Sub features

•   Supports clustering via partitions

       SUMMIT                                © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Memcached
•   Text and Binary protocols over TCP

•   Small number of commands: set, add, replace, append, prepend, cas, get, gets, delete, incr,decr

•   Client/Application based partitioning

       SUMMIT                               © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Relationships enable new applications

   Social networks   Restaurant recommendations                                                     Retail fraud detection

  SUMMIT                © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Use cases for highly connected data

    Social networking                   Recommendations                                                Knowledge graphs

     Fraud detection                            Life Sciences                                       Network & IT operations

  SUMMIT                © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Different approaches for highly connected data

                                                                                        Purpose-built to answer questions about
 Purpose-built for a business process
                                                                                                     relationships

SUMMIT                          © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Neptune
Fully managed graph database

                                                                                                                 Easy                 Open
              Fast                           Reliable

 Query billions of relationships   Six replicas of your data across                           Build powerful queries       Supports Apache TinkerPop &
   with millisecond latency        three AZs with full backup and                             easily with Gremlin and         W3C RDF graph models
                                                restore                                               SPARQL

    SUMMIT                                     © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
LEADING GRAPH MODELS AND
                FRAMEWORKS

   PROPERTY GRAPH                                                        RESOURCE DESCRIPTION
                                                                           FRAMEWORK (RDF)
Open Source Apache TinkerPop™                                                              W3C Standard
  Gremlin Traversal Language                                                           SPARQL Query Language

SUMMIT                  © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Neptune high level architecture

                                                                                                Bulk load
                                                                                                from S3

                                                                                                Database
                                                                                                 Mgmt.

SUMMIT              © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Gremlin
g.addV('person').property(id, 1).property('name', 'randall')

g.V('1').property(single, 'age', 27)

g.addV('person').property(id, 2).property('name', 'markus')

g.addE('knows').from(g.V('1')).to(g.V('2')).property('weight', 1.0)

g.V().hasLabel('person')

g.V().has('name', 'randall').out('knows').valueMap()

http://tinkerpop.apache.org/docs/current/reference/#graph-traversal-steps

     SUMMIT                              © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
SPARQL and RDF

https://www.w3.org/TR/sparql11-query/

     SUMMIT                             © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
THE BENEF IT OF URIs: LINKED DATA
                Linking across datasets by referencing globally unique URIs
     Wikidata

     GeoNames
                                                                                                             PermID

                                   Example: PermID (re)uses 
                                   as a global Identifier for the USA, which is an identifier rooted in GeoNames.
 SUMMIT                      © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Timestream (sign up for the preview)
Fast, scalable, fully managed time-series database

 1,000x faster and 1/10th the          Trillions of                                       Time-series analytics               Serverless
 cost of relational databases          daily events

   Collect data at the rate of   Adaptive query processing                                Built-in functions for              Automated setup,
     millions of inserts per      engine maintains steady,                             interpolation, smoothing,            configuration, server
     second (10M/second)          predictable performance                                  and approximation           provisioning, software patching

     SUMMIT                                © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Quantum Ledger Database (QLDB) (Preview)
Fully managed ledger database
Track and verify history of all changes made to your application’s data
                                     Cryptographically
           Immutable                     verifiable                                               Highly scalable             Easy to use

 Maintains a sequenced record of        Uses cryptography to                                  Executes 2–3X as many        Easy to use, letting you
  all changes to your data, which    generate a secure output                                transactions than ledgers      use familiar database
  cannot be deleted or modified;     file of your data’s history                               in common blockchain       capabilities like SQL APIs
 you have the ability to query and                                                                  frameworks              for querying the data
       analyze the full history

     SUMMIT                                   © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS Database Migration Service (AWS DMS)

                                                                         Migrate between on-premises and AWS
MIGRATING
DATABASES                                                                Migrate between databases
TO AWS
                                                                         Automated schema conversion
100,000+
databases migrated
                                                                         Data replication for
                                                                         zero-downtime migration

    SUMMIT           © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS DMS—Logical replication

                  Customer                                                                                                                    Aurora
                  Premises,                                VPN/Network                                                                      PostgreSQL
                  EC2, RDS

Start a replication instance                                                                                         Let the AWS Database Migration
                                                                                                                     Service create tables and load data
Connect to source and target databases                                                                               Uses change data capture to keep
                                                                Application Users                                    them in sync
Select tables, schemas, or databases                                                                                 Switch applications over to the target
                                                                                                                     at your convenience
       SUMMIT                            © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
SUMMIT   © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Customers are moving to AWS Databases
           Verizon is migrating over 1,000 business-critical applications and database backend systems to AWS,
           several of which also include the migration of production databases to Amazon Aurora.
           By December 2018, Amazon.com will have migrated 88% of their Oracle DBs (and 97% of
           critical system DBs) moved to Amazon Aurora and Amazon DynamoDB. They also migrated their
           50 PB Oracle Data Warehouse to AWS (Amazon S3, Amazon Redshift, and Amazon EMR).

           Trimble migrated their Oracle databases to Amazon RDS and project they will pay about 1/4th
           of what they paid when managing their private infrastructure.

           Wappa migrated from their Oracle database to Amazon Aurora and improved their
           reporting time per user by 75 percent.

            Samsung Electronics migrated their Cassandra clusters to Amazon DynamoDB for their Samsung
            Cloud workload with 70% cost savings.
           Intuit migrated from Microsoft SQL Server to Amazon Redshift to reduce data-processing timelines
           and get insights to decision makers faster and more frequently.

           Equinox Fitness migrated its Teradata on-premises data warehouse to Amazon Redshift. They went
           from static reports to a modern data lake that delivers dynamic reports.

           Eventbrite moved from Cloudera to Amazon EMR and were able to cut costs dramatically, spinning
           clusters up/down on-demand and using Spot (saving > 80%) and Reserved Instances.
  SUMMIT                    © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Most enterprise database & analytics cloud customers

  SUMMIT           © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Most startup database & analytics cloud customers

  SUMMIT           © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
SUMMIT   © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Problem: Matching IPs to Ranges in Real-Time
Bidding

Real-Time Bidding Requirements:
• High throughput (>25000 req/s)
• Response time is critical
• Scalability

   SUMMIT               © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Possible choices

                          Relational                           Key-value                       In-memory

                        Well known, High throughput,                                           Query by key
                      Flexible Indices,  low-latency                                               with
                           Query            reads,                                             microsecond
                         language,      endless scale                                            latency
                       Data scheme
                          obvious

                         Latency?                           Data scheme?                       Data scheme?
                      Query volume?                           Indices?                          Sharding?
                       Future scale?                         Sharding?                         Future scale?

  SUMMIT           © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Possible choices

                          Relational                           Key-value                       In-memory

                        Well known, High throughput,                                           Query by key
                      Flexible Indices,  low-latency                                               with
                           Query            reads,                                             microsecond
                         language,      endless scale                                            latency
                       Data scheme
                          obvious

                         Latency?                           Data scheme?                       Data scheme?
                      Query volume?                           Indices?                          Sharding?
                       Future scale?                         Sharding?                         Future scale?

  SUMMIT           © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
DynamoDB?
But how?
• Sharding and Index downsides addressed by transforming data during
  ingest
• Accept multiple entries per range -> sharding & query
  possible/optimized

   SUMMIT                © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
DynamoDB it is!

  SUMMIT          © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
DynamoDB it is!

  SUMMIT          © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
DynamoDB it is!

  SUMMIT          © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Learnings
•   Focus on “core requirements“
•   Be creative about data schemes
•   Accept minor downsides – very often there is no perfect solution
•   Use upsides of chosen database to optimize (i.e. adding DAX for
    caching)

     SUMMIT                 © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Knowledge Graphs
@ Siemens
Sebastian Brandt
Steffen Lamparter

Semantics and Reasoning Group
Siemens Corporate Technology
CT RDA BAM SMR-DE

February 2019

siemens.com                     Unrestricted © Siemens AG 2019
Knowledge graphs become especially powerful for
managing complex queries and heterogeneous data

 Why Knowledge Graphs?
                                                              Game-changing data integration
    • Graphs are a natural way to represent
      entities and their relationships
                                                              Robust data quality assurance
    • Graphs can capture a broad spectrum
      of data (structured / unstructured)
                                                              Intuitive domain modelling
    • Graphs can be managed efficiently
                                                              Flexibility & performance

                                                              Low up-front investment

                           Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology   Siemens AG 2019
What are Graphs? Knowledge representation formalism
semantic descriptions of entities and their relationships

                                June 23, 1957
                                                                                    Werner v. Siemens and
                                                                                                                               Objects
                                                                                    Friedrich Halske                           Real-world objects (things, places,
                              Is born on
    Roesemarie
                                                  Arnbruck
                                                                                                                               people) and abstract concepts (genres,
                                                                       Was founded by
      Kaeser
                                                                                                                               religions, professions)
                                                   Is born in
              Is married to

    Person                                           Is working at
                 Is a                                                                                                          Relationships
                                                                                               Is a
                                   Joe Kaeser
                                                                                                                               Logical connection between two objects
             Was promoted                                                                             Company                  e.g. Joe Kaeser is born in Arnbruck
                                           Has pictures
       August 1, 2013                                                    Is headquartered in

                                                           Munich, Berlin
                                                                                                                               Semantic descriptions
                                                                                                                               The semantic description indicates the
                                                                                                                               meaning of an object or relation, e.g. Joe
Rules make it possible to add further expert knowledge, e.g.                                                                   Kaeser is a person
"Siemens has to be a company, as a person is working there"

                                                                Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology                   Siemens AG 2019
Knowledge graphs become a powerful addition to traditional data
warehouses for managing heterogeneous data with complex relations

                                                                                                                         Knowledge graphs
                             Traditional data warehouses

                                                                      low                                 high
             Dimensions      Limited relations across data,                                                      High number & complex
         of data relations   e.g. time series data                                                               relations, e.g. social networks

                Variety of   "We have a clear scope of                low                                 high   "We do not yet know all the
          questions to be    user questions"  fixed set of                                                      different questions, users will
               answered      dashboard needed                                                                    ask"  Chatbot

                             Only few different data types            low                                 high   Analysis needs to combine he-
       Data heterogeneity    and data sources e.g.                                                               terogeneous data, e.g. sensor
                             invoicing database in ERP                                                           data, text data, product data

                                                                      low                                 high   Need to handle imperfect,
                             Ensured consistency of stored
           Quality of data                                                                                       incomplete or inconsistent
                             data
                                                                                                                 data

                                   Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology                         Siemens AG 2019
Use cases for knowledge graphs can be clustered into
five categories – overview and use case examples

                                                             Degree of complexity

                               Data access &                                                      Recommender                   Constraints &
 Data quality                                                  Digital companion
                               dashboarding                                                       system                        planning

 Improving data availability   Maintaining up-to-date         Enhancing features of               Providing users high          Enabling autonomous
 and quality by combining      meta-data, creating            existing products or                quality recommendations       systems
 and comparing data from       transparency on all            services with digital               by identifying similarities   to understand data and its
 various sources to fill in    available data and             companions that are able            in historical data            dependencies and take
 missing data sets or          making them accessible         to understand and                                                 own decisions, such as
 identify potentially wrong    to users via queries           process user questions                                            autonomous planning of
 data and data duplicates                                     and providing the needed                                          production proces-ses
                                                              data insights

                                            Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology                       Siemens AG 2019
Example: gas-turbine maintenance planning

• Cross-life-cycle data integration at PS DO:                    Fleet browsing and search                  Maintenance planning             Analytics for R&D
  gas-turbines, including maintenance, repair,
  monitoring, and configuration data

• Holistic engineering-centric domain model

• Intuitive graph queries independent of
  source data schemata

• Virtual integration of time-series data                                      Siemens Knowledge Graph

                                                                                    DB
                                                                                                               DB                     Docs              CSV, XML,
                                                              Time-series
                                                                                                                                                        Excel, …

                                                            Power turbine                           Maintenance                    Repair      Configuration
                                             Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology                        Siemens AG 2019
A semantic data model enables flexible linking and an integrated,
intuitive API for applications

  Planning    Simulation    Building     Security aaS      Building    Service Portal   Location Ba-
                                                                                                                                                          Drive
    tools        tool      Operations                    Performance                    sed Services
                                                                                                                                                   applications

                                                                                                                                                BT Knowledge
                                        Semantic Data Model                                                                                            Graph

                                              Building API                        Product API                Data API

     www

                                                                                                                                      Digital Lifecycle Platform
      Customer Data                 Building Structure                  Product Data                   MindSphere
       Weather Data                        BIM
     Public Energy Data

                                                          Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology                   Siemens AG 2019
Creating perfect places based on Services –
a user-centric holistic approach to the modern workplace …

Customer Interest       Relevant KPI’s
                            Optimizing CAPEX and OPEX
Energy and                  CO2 emissions
asset efficiency
                            Asset Performance/Useful Life

                            Cost per space unit

                            Workplace Utilization
Space
efficiency                  Revenue per space unit

                            Vacancy Rate

                            Employee productivity
Individual efficiency
and comfort                 Employee satisfaction

                                Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology   Siemens AG 2019
Industrializing Knowledge Graphs

Industrial Knowledge Graph                           Relevant technologies                                 R&D Areas
                                                                                                                                            ML for
                                                     Decision Making                                       Decision Making
                                  Decision                                                                                                  Graphs
                                                     • Reasoning and Constraint Solving                    • Explanation of AI decisions
                                   Making            • Machine/Deep Learning                               • Data access: Semantic Search
                                                     • Question Answering                                  • Machine Learning on Graphs for
               Knowledge       Storage and                                                                   recommendations, quality, etc.
               Automation       Integration          Storage and Integration                               Storage and Integration            Ontology
                                                     • Graph/NoSQL databases                               • Reusable Semantic Modelling       Library
                                                     • Constraints and Rules                                 and Knowledge Graphs
                                Generation
                                                     • Probabilistic programming                           • Data integration and cleaning (e.g.
                                                     • Ontologies                                            entity reconciliation)            Simple
                                                                                                                                             RDF +
                                                     Generation                                            Generation                        PGM
                                                     • NLP/Text understanding                              • Extraction from unstructured data
                                                     • Machine/Deep Learning                                 (inclusive text, audio, image)
                                                     • Computer vision                                     • Automatic semantic annotation
                                                     • Sound recognition                                     of structured data
                                                                                                                                               ML for
                                                     • Virtual data Integration                            • Learning of domain-specific
                                                                                                                                            automated
                                                                                                             rules/patterns
                                                     • Information retrieval                                                              annotation
                                                     • …
           Machines   Humans

                               Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology                           Siemens AG 2019
Get in touch!

                             Dr. Sebastian Brandt
                             Senior Key Expert – Knowledge Graph and Data Management
                             CT RDA BAM SMR-DE
                             Dr. Steffen Lamparter
                             Head of Research Group Semantics & Reasoning
                             CT RDA BAM SMR-DE
                             Siemens AG
                             Corporate Technology
                             Otto-Hahn-Ring 6
                             81739 München Germany

                             E-mail
                             steffen.lamparter@siemens.com
                             Intranet
                             intranet.siemens.com/ct

                Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology   Siemens AG 2019
Thank you!

 SUMMIT   © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
SUMMIT   © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
You can also read