CESSDA Expert Seminar 2018 - CESSDA Technical Infrastructure Session 2: Cloud Computing (part 1) - An introduction to the technical foundations of ...

Page created by Gloria French
 
CESSDA Expert Seminar 2018
CESSDA Technical Infrastructure

Session 2: Cloud Computing (part 1) - An introduction to the technical foundations of CESSDA

John Shepherdson
CESSDA Platform Delivery Director

60 minutes
CESSDA Technical Framework
» A guide for the development of the various (software)
  tools and services that form part of the CESSDA
  Research Infrastructure
» Promote good practice for software development
» Infrastructure for Development, Staging and Production
    ○ Harmonise development tool chain for SPs
    ○ Apply consistent set of tests
» Stable, scalable deployment environment
Documents and Forms
»   Technical Architecture
»   User Experience Guide
»   API and Developer Guidelines
»   Software Maturity Levels form
»   Contributor’s Agreement form
»   Software Adoption
»   Repository Request form
Technical Architecture

» Promote good software development
  practice across the Service Provider
  community, in respect of the provision of
  software artefacts for CESSDA Research
  Infrastructure
» Publication of basic standards for source
  code quality so SPs know what is expected of
  them
» bit.ly/tech_arch3_0
User Experience Guide

» Describes the general user experience for
  CESSDA ERIC tools and search applications
» Wireframes and visual examples are
  provided to illustrate functionality
» bit.ly/tool_branding_1_5
‘How to’ Guidelines

API Design Guidelines:
» https://bitbucket.org/cessda/cessda.guidelines.api/wiki/Home

Developer CIT Guidelines:
» https://bitbucket.org/cessda/cessda.guidelines.cit/wiki/Developers
Software Maturity Levels

» Approach for assessing maturity of software
  components
   ○ so CESSDA can mandate minimum levels
      that SPs and others have to meet
   ○ prerequisites for supplying software
      artefacts to CESSDA
» bit.ly/sml_doc1
Software Maturity Levels - scoring
» 1. Initial usability; software use is not recommended
» 2. Use is feasible; the software can be used by skilled
  personnel but with considerable effort, cost and risk
» 3. Use is possible by most users; with some effort, cost,
  and risk. A risk assessment should be made before use
» 4. Software is usable; with little effort, cost, and risk
» 5. Demonstrable usability; there is clear evidence that
  the software is widely used by many users
Software Maturity Levels form

» Online mechanism for assessing 11 criteria:
   ○ Documentation, Intellectual property issues
   ○ Extensibility, Modularity, Packaging
   ○ Portability, Standards compliance, Support
   ○ Verification and testing, Security
   ○ Internationalisation and Localization
» bit.ly/sml_2
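
As a rough illustration of how a mandated minimum level could be checked against a completed form, the sketch below scores each criterion from 1 to 5 and reports the criteria that fall short. The criterion names are taken from the slide; the function name, the example scores and the minimum level of 3 are assumptions made for illustration, not CESSDA's actual thresholds or form logic.

```python
# Hypothetical sketch: compare Software Maturity Level scores with a minimum.
# Criteria names come from the slide; scores and threshold are invented.
CRITERIA = [
    "Documentation", "Intellectual property issues", "Extensibility",
    "Modularity", "Packaging", "Portability", "Standards compliance",
    "Support", "Verification and testing", "Security",
    "Internationalisation and Localization",
]


def criteria_below_minimum(scores, minimum):
    """Return the criteria whose score (1-5) is below the required minimum."""
    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"no score supplied for: {missing}")
    for name, level in scores.items():
        if not 1 <= level <= 5:
            raise ValueError(f"{name}: level must be 1-5, got {level}")
    return [name for name in CRITERIA if scores[name] < minimum]


# Example: every criterion at level 4 except Security at level 2.
example_scores = {name: 4 for name in CRITERIA}
example_scores["Security"] = 2
print(criteria_below_minimum(example_scores, minimum=3))
```

Reporting the failing criteria, rather than a single pass/fail flag, tells a Service Provider where work is still needed.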
Contributor’s Agreement form

» For completion, prior to accessing CESSDA’s
  repositories
» http://bit.ly/contrib_req
Software Adoption

» Software Adoption Policy
   ○ bit.ly/sa_pol2
» Software Adoption Procedure
   ○ bit.ly/sa_proc2
Repository Request form

» http://bit.ly/repo_req
Introduction to Cloud Computing
»   Definitions
»   Main players
»   GDPR considerations
»   Benefits to CESSDA
Definitions
» ‘SaaS applications are designed for end-users,
  delivered over the web
» PaaS is the set of tools and services designed
  to make coding and deploying those
  applications quick and efficient
» IaaS is the hardware and software that powers
  it all – servers, storage, networks, operating
  systems’
Source: Rackspace Whitepaper ‘Understanding the Cloud Computing Stack: SaaS,
PaaS, IaaS’, content licensed under CC BY-NC-ND 3.0

Main players
» Wide choice of cloud provider platforms:
  ○ AWS
  ○ Azure
  ○ Google Cloud
  ○ IBM Cloud
  ○ and many more

See e.g. Rightscale Cloud Comparison 2018
GDPR: Personal Data
» GDPR defines ‘personal data’ as ‘any information
  relating to an identified or identifiable natural person’
  (‘data subject’).
» An identifiable natural person is defined as one ‘who can
  be identified, directly or indirectly, by reference to an
  identifier such as a name, an identification number,
  location data, an online identifier or to one or more
  factors specific to the physical, physiological, genetic,
  mental, economic, cultural or social identity of that
  natural person.’
Source: GDPR Article 4(1)
GDPR: Processing Personal Data
» Process lawfully, fairly and transparently
     ○ The data subject is informed of what will be done with the data, and
       data processing should be done accordingly
» Keep to the original purpose
     ○ Data should be collected for specified, explicit and legitimate
       purposes and not further processed in a manner that is incompatible
       with those purposes
» Minimise data size
     ○ Personal data that are collected should be adequate, relevant and
       limited to what is necessary
GDPR: What we collect
» Present
   ○ User registration -> need clear ‘sign up’ statement
» Future
   ○ Usage data -> Pseudonymisation may help

» Google is ‘committed to GDPR’ and is Privacy Shield Certified
   ○ guide to aid compliance
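
One common way to pseudonymise usage data, as suggested for future collection above, is to replace the user identifier with a keyed hash before the event is stored. The sketch below uses HMAC-SHA256 from the Python standard library. The function name, the key and the event fields are assumptions for illustration; this is not a description of what CESSDA actually does. Note that pseudonymised data can still count as personal data under GDPR while the key exists, so the key must be stored separately from the usage log.

```python
import hashlib
import hmac


def pseudonymise(user_id: str, secret_key: bytes) -> str:
    """Replace a user identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key: in practice it would be held outside the usage log.
key = b"replace-with-a-secret-kept-outside-the-usage-log"

# The stored event carries the pseudonym, never the original identifier.
event = {"user": pseudonymise("alice@example.org", key), "action": "search"}
print(event)
```

The same user always maps to the same pseudonym, so usage can still be analysed per user without recording who the user is.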
Benefits to CESSDA

» In the past, relied on members (‘Service Providers’)
  to develop and host standalone products
» Cloud is a greenfield site for technical
  development
  ○ elasticity
  ○ pay for what you use
  ○ establish common standards
Overview of Google Cloud Platform
»   Obtaining access
»   GCP dashboard
»   Main features
»   Pricing model
Obtaining access

» By invitation
   ○ access restricted to essential users
   ○ temporary access for one off activities
Google Cloud Platform Dashboard
Google Cloud Platform Features

» Very extensive - see
  https://cloud.google.com/terms/services
» Software networking
» Containers - Docker and more
» Clusters - Kubernetes
   ○ Auto scale/upgrade/repair
Google Cloud Platform Pricing

»   Pay as you go
»   Per second billing
»   Custom machine types
»   Rightsizing recommendations
»   Pricing calculator
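
The effect of per-second billing can be shown with simple arithmetic. The sketch below compares the cost of a short job when charged by the second with the cost when usage is rounded up to whole hours. The hourly rate and job length are invented for illustration, and real GCP rates, discounts and any minimum charges are not modelled.

```python
from decimal import Decimal


def per_second_cost(seconds_used: int, hourly_rate: Decimal) -> Decimal:
    """Charge only for the seconds actually used."""
    return hourly_rate * seconds_used / Decimal(3600)


def whole_hour_cost(seconds_used: int, hourly_rate: Decimal) -> Decimal:
    """Charge for every started hour (usage rounded up to whole hours)."""
    hours = -(-seconds_used // 3600)  # ceiling division on integers
    return hourly_rate * hours


# Hypothetical job: 10 minutes 30 seconds at an invented rate of 0.10 per hour.
rate = Decimal("0.10")
job_seconds = 10 * 60 + 30
print("per second:", per_second_cost(job_seconds, rate))
print("whole hour:", whole_hour_cost(job_seconds, rate))
```

For short-lived workloads such as build jobs, the difference between the two models is where the 'pay for what you use' benefit comes from.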
Technical Infrastructure and GCP

» Code Repositories
» Development, staging and production
  environments
» Containers and clusters
» Management and monitoring
Code Repositories
Code repositories in Bitbucket
 ● Organised into projects
    ○   CESSDA Architectural Guidelines - CAG
    ○   CESSDA Managed Content - CMC
    ○   CESSDA Operations - COPS
    ○   CESSDA Public Helpdesk - CPH
    ○   CESSDA Research Infrastructure - CRI
●   Repository URLs
    ○   https://bitbucket.org/cessda/
Code Repositories
●   Request access via form
●   Specify who and what
●   Agree to depositors’ conditions
●   Admin creates Bitbucket repo(s) and
    accounts
●   Devs check in code and add documentation
Multiple environments

» Integration testing, user testing, go live
   ○ development has various tools
   ○ staging and production are very similar
» Different subnets for each
   ○ different firewall rules
Multiple environments
Specify basic parameters per product
REGION=europe-west1
ZONE=europe-west1-b
PROJECT=cessda-development
NET=jenkins-net
SUBNET=jenkins-subnet
PRODUCT=cessda-pasc
# TO EDIT
MODULE=certbot
ENVIRONMENT=dev

# Point the gcloud CLI at the chosen project, region and zone
gcloud config set project "$PROJECT"
gcloud config set compute/region "$REGION"
gcloud config set compute/zone "$ZONE"
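
The per-product parameters above can also be held in a small, validated structure so that a mismatched region and zone is caught before anything is deployed. The sketch below is a hypothetical Python equivalent: the class name and the validation rule are assumptions, while the values and the three `gcloud config set` commands are those shown on the slide. The commands are only built as argument lists here, not executed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentSettings:
    """Hypothetical container for the per-product parameters on the slide."""
    region: str
    zone: str
    project: str
    product: str
    module: str
    environment: str

    def __post_init__(self):
        # A GCP zone name is its region name plus a letter suffix.
        if not self.zone.startswith(self.region + "-"):
            raise ValueError(f"zone {self.zone} is not in region {self.region}")

    def gcloud_commands(self):
        """Return the three `gcloud config set` calls as argument lists."""
        return [
            ["gcloud", "config", "set", "project", self.project],
            ["gcloud", "config", "set", "compute/region", self.region],
            ["gcloud", "config", "set", "compute/zone", self.zone],
        ]


settings = DeploymentSettings(
    region="europe-west1", zone="europe-west1-b",
    project="cessda-development", product="cessda-pasc",
    module="certbot", environment="dev",
)
for command in settings.gcloud_commands():
    print(" ".join(command))
```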
Containers
» Containers are predictable, repeatable and
  immutable
» Use Docker containers to run components
  ○ Working to 12 Factor App guidelines
  ○ Move from ‘monolithic’ to ‘composed’ apps
    ■ microservices (one app per container)
  ○ Maintain application environment
  ○ Version management and ease of reuse
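
One of the 12 Factor App guidelines (factor 3, Config) is to keep configuration in the environment rather than in the code, so the same container image can run unchanged in development, staging and production. A minimal sketch of that idea follows. The variable names `ENVIRONMENT`, `PORT` and `DATABASE_URL` and their defaults are assumptions for illustration only.

```python
import os


def load_config(environ=None):
    """Read settings from environment variables (12 Factor App, factor 3)."""
    env = os.environ if environ is None else environ
    return {
        # Optional settings fall back to development-friendly defaults.
        "environment": env.get("ENVIRONMENT", "dev"),
        "port": int(env.get("PORT", "8080")),
        # Required setting: fail fast with KeyError if it is not provided.
        "database_url": env["DATABASE_URL"],
    }


# Example with an explicit mapping standing in for the container environment.
print(load_config({"ENVIRONMENT": "staging", "DATABASE_URL": "postgres://db/app"}))
```

Passing the mapping in as a parameter keeps the function easy to test without changing the real process environment.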
Containers - basic vocabulary
 ●    Container Image - file
 ●    Container Image Format - as defined by Open Container Initiative (OCI)
 ●    Container Engine - typically uses OCI compliant runtime like runc
 ●    Container - runtime instantiation of a Container Image
 ●    Container Host - system that runs the containerized processes
 ●    Container Registry - storage space for Container Images
 ●    Container Orchestration - dynamic scheduling of container workloads
      within a cluster of computers

Source: A Practical Introduction to Container Terminology
Docker at a glance

Source: Docker Reference Architecture: Designing Scalable, Portable Docker Container Networks, Mark Church
Containers

Average Start/Stop Times*

Technology            Start Time    Stop Time
Docker Containers     < 50 ms       < 50 ms
Virtual Machines      30-45 sec     5-10 sec

* Source: https://www.slideshare.net/Flux7Labs/performance-of-docker-vs-vms
Docker and 12 Factor App

Source: 12 Factor App with docker
Containers and Clusters

Use Kubernetes to orchestrate containers
» Provisions and manages underlying cloud resources
  automatically
» Routine health checks detect and replace hung/crashed
  applications
» Autoscaling (up and down)
» Portable across clouds and on-premises
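
One form of autoscaling in Kubernetes is the Horizontal Pod Autoscaler, whose documented core rule is: desired replicas = ceil(current replicas x current metric / target metric), kept within configured bounds. The sketch below implements only that rule. The function name, the default bounds and the example figures are assumptions, and the real autoscaler applies further checks (tolerances, stabilisation windows) that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Core Horizontal Pod Autoscaler rule, clamped to [min, max] replicas."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    wanted = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, wanted))


# Scale up: 3 pods at 90% CPU against a 60% target.
print(desired_replicas(3, 90, 60))
# Scale down: 4 pods at 20% CPU against a 60% target.
print(desired_replicas(4, 20, 60))
```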
Clusters - basic vocabulary
» Kubernetes Master - collection of 3 processes that run
  on single (master) node:
   ○ kube-apiserver, kube-controller-manager, kube-scheduler
» Each non-master node in cluster runs two processes:
   ○ kubelet, which communicates with the Kubernetes
      Master
   ○ kube-proxy, network proxy which reflects
      Kubernetes networking services on each node
Source: https://kubernetes.io/docs/concepts/
Clusters - basic vocabulary
» Basic Kubernetes objects:
   ○ Pod - runs single instance of given application
   ○ Node - worker machine (virtual or physical)
   ○ ReplicaSet - create/destroy Pods dynamically (e.g.
      scaling up or down)
   ○ Deployment - manages ReplicaSets

Source: https://kubernetes.io/docs/concepts/
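
The job of a ReplicaSet can be pictured as a reconciliation step: compare the desired number of Pods with those actually running, then start or stop Pods to close the gap. The sketch below is a much simplified model of that idea and not the real controller. The function name and the Pod naming scheme are assumptions.

```python
import itertools


def reconcile(desired, running, prefix="pod"):
    """Return (pods_to_start, pods_to_stop) needed to reach the desired count."""
    if desired < 0:
        raise ValueError("desired must not be negative")
    running = list(running)
    kept = set(running[:desired])
    to_stop = running[desired:]      # surplus Pods are stopped
    to_start = []
    counter = itertools.count()
    while len(kept) + len(to_start) < desired:
        name = f"{prefix}-{next(counter)}"
        if name not in kept:         # do not reuse the name of a running Pod
            to_start.append(name)
    return to_start, to_stop


print(reconcile(3, ["pod-0"]))                    # scaling up
print(reconcile(1, ["pod-0", "pod-1", "pod-2"]))  # scaling down
```

A Deployment sits one level above this: it creates and replaces ReplicaSets, for example when a new container image version is rolled out.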
Clusters - basic vocabulary
» Basic Kubernetes objects:
   ○ Service - defines logical set of Pods plus access
      policy
   ○ Volume - file persistence and sharing
   ○ Label - K/V pair used to organize and to select
      subsets of objects
   ○ Namespace - multiple virtual clusters backed by the
      same physical cluster

Source: https://kubernetes.io/docs/concepts/
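
Labels and Services fit together through selectors: a Service names a set of key/value pairs, and every Pod carrying all of those pairs belongs to the Service. The sketch below models equality-based selection only. The Pod names and label values are invented for illustration.

```python
def matches(labels, selector):
    """True if the object carries every key/value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects, selector):
    """Return the names of the objects whose labels satisfy the selector."""
    return [obj["name"] for obj in objects if matches(obj["labels"], selector)]


# Hypothetical Pods in one cluster, labelled by application and environment.
pods = [
    {"name": "search-1", "labels": {"app": "search", "env": "staging"}},
    {"name": "search-2", "labels": {"app": "search", "env": "production"}},
    {"name": "vocab-1", "labels": {"app": "vocab", "env": "production"}},
]
print(select(pods, {"app": "search"}))
print(select(pods, {"app": "search", "env": "production"}))
```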
Kubernetes at a glance

Source: Kubernetes in three diagrams, Tsuyoshi Ushio
Build, Test and Deploy

Combination of Bitbucket and Jenkins
» Commit code to Bitbucket repository
» Post commit hook
» Jenkins job
Jenkins Jobs

» Continuous integration (CI) and continuous
  delivery (CD) application
» Job (or project) is basic unit of work
   ○ build and test software projects
      continuously
   ○ monitor, backup, deploy, notify ...
» Jenkins glossary
Jenkins Jobs

» Old way - create via Jenkins UI
  ○ cannot version, need local backup/restore
  ○ difficult to edit/review/iterate by team
» New way - Jenkinsfile
  ○ just another source code file
  ○ manage via SCM system (such as
     Bitbucket)
Jenkins Jobs - Pipelines

Automated expression of process for getting software
from version control to users
Source: Jenkins Pipeline documentation
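
A pipeline in this sense is an ordered list of stages in which a failure stops everything after it, so broken code never reaches deployment. The sketch below models that behaviour in plain Python. It is not Jenkinsfile syntax, and the stage names and the stand-in stage functions are assumptions for illustration.

```python
def run_pipeline(stages):
    """Run (name, step) stages in order and stop at the first failure."""
    results = []
    for name, step in stages:
        succeeded = bool(step())
        results.append((name, succeeded))
        if not succeeded:
            break  # later stages, such as deploy, are never started
    return results


# Hypothetical stages: each callable stands in for a real build or test step.
stages = [
    ("build", lambda: True),
    ("test", lambda: False),   # a failing test run
    ("deploy", lambda: True),
]
print(run_pipeline(stages))
```

Keeping the stage list in a source file, as a Jenkinsfile does, means the process itself is versioned and reviewed like any other code.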
Build and Deploy - standard view
» Jenkins job - build CDC from ‘develop’ branch
Build and Deploy - Blue Ocean
» Jenkins job - build CDC from ‘develop’ branch
Management and monitoring

Combination of Jenkins, Stackdriver,
UptimeRobot
» Jenkins jobs - backups
» Stackdriver - error reporting and logging
» UptimeRobot - external polling
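
External polling means a monitor outside the cluster requests each public URL at intervals and raises an alert when the response is not healthy. The sketch below shows a single polling pass using the Python standard library. It is not UptimeRobot's API, the URL in the usage comment is a placeholder, and treating only HTTP 200 as healthy is an assumption.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code          # the server answered, but with an error code
    except (urllib.error.URLError, OSError):
        return None                # DNS failure, refused connection, timeout


def check(urls, fetch=http_status):
    """Map each URL to True if it answered with HTTP 200, otherwise False."""
    return {url: fetch(url) == 200 for url in urls}


# Usage (placeholder URL, performs a real request when run):
# print(check(["https://example.org/"]))
```

The `fetch` parameter lets the check be exercised with a stand-in function, so the logic can be tested without network access.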
Thanks for listening

Any Questions?
Additional Slides
Common interoperability characteristics
CESSDA defines 5 Common Interoperability Characteristics (CICs), but how to achieve them?
   • REST APIs c/w API design standards
   • Architectural standards
   • Common development environment
   • Adoption of 12 Factor App principles
   • Software acceptance criteria
1. Loosely coupled but coordinated

 Adopt microservices architecture based on RESTful
 web service APIs
  • provides a mechanism for reusing and
   combining software artefacts
 See also 12 factor app, number 7 (Port binding - Export services via port binding)
2. Sustainable
The provision of common standards
    • Technical Architecture document
Common development and test environment
    • via the technical infrastructure
Deployment environment
    • via extensions to the technical infrastructure
Central source-code repository

See also 12 factor app, number 1 (Codebase - One codebase tracked in revision control,
many deploys)
3. Extensible
Service API is key
    • Integration point for new services
    • Combination point for building new features
Version and support two versions simultaneously
    • Allows services to evolve, without breaking contract provided
      to consumers
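
Supporting two API versions at once can be as simple as routing on a version prefix, so existing consumers keep the response shape they were built against while new consumers use the extended one. The sketch below illustrates the idea. The paths, the field names and the extra `languages` field in version 2 are assumptions and do not describe a real CESSDA API.

```python
def study_v1(study):
    """Version 1 contract: identifier and title only."""
    return {"id": study["id"], "title": study["title"]}


def study_v2(study):
    """Version 2 contract: adds a field without changing the version 1 fields."""
    return {"id": study["id"], "title": study["title"],
            "languages": study.get("languages", [])}


HANDLERS = {"v1": study_v1, "v2": study_v2}


def handle(path, study):
    """Dispatch /<version>/study to the matching handler; return (status, body)."""
    version, _, resource = path.strip("/").partition("/")
    if resource != "study" or version not in HANDLERS:
        return 404, {"error": "unknown version or resource"}
    return 200, HANDLERS[version](study)


record = {"id": "s1", "title": "Example survey", "languages": ["en", "de"]}
print(handle("/v1/study", record))
print(handle("/v2/study", record))
```

When version 3 arrives, version 1 can be retired, keeping exactly two supported versions at any time.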

See also 12 factor app, number 8 (Concurrency - Scale out via the process model)
See also 12 factor app, number 9 (Disposability - Maximize robustness with fast startup
and graceful shutdown)
4. Maintainable

Again, service API is key
 • implementation of a service can be changed as
  required, to take advantage of developments in
  software technology
 • location of services can be changed as required, to
  take advantage of developments in hardware
  technology
See also 12 factor app, number 2 (Dependencies - Explicitly declare and isolate
dependencies)
5. Standards Based

 • Provision of common architectural standards (via
   Technical Architecture)
 • A consistent (in both the calling and return
   structures and formats) and versioned API

See also 12 factor app, number 4 (Backing services - Treat backing services as attached
resources)