Quilt
Ethan J. Jackson, Aurojit Panda, Kevin Lin,
Johann Schleier-Smith, Nicholas Sun, Luise Valentin,
Yuen Mei Wan, Scott Shenker
quilt.io
Everything has an API
Compute
Network
DevOps
1. Choose a Compute API
2. Choose a Network API
3. Write a Deployment Script
Deployment Script
Simple, right?
spark-ec2.py
• Official Spark Script
• 1528 Lines of Code
• Incomprehensible
Network Security
• Status Quo
– Secure the Perimeter
• A Better Way
– Tight East-West Firewall
– Increased script complexity
Portability
?
Quilt
Automated Deployment
Quilt DSL: Stitch
• Declarative Application Specification
• Lisp Dialect
• Declaration Includes:
– Application Network and Compute
– Infrastructure
Example: WordPress
ZooKeeper
Spark
HAProxy
MySQL
WordPress
Memcached
WordPress
(import "haproxy")
(import "memcached")
(import "mysql")
(import "spark")
(import "wordpress")
(import "zookeeper")

(let ((db (mysql.New "db" 2))
      (memcd (memcached.New "memcd" 3))
      (wp (wordpress.New "wp" 8 db memcd))
      (hap (haproxy.New "hap" 2 wp))
      (zk (zookeeper.New "zk" 3))
      (spark (spark.New "spark" 2 4 zk)))
  (connect 7077 (hmapValues spark)
                (hmapValues db))
  (connect 80 "public" hap))
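The connect statements in this specification can also be read as a firewall policy: a flow is allowed only if it was declared. The Python sketch below illustrates that reading and is not Quilt's implementation. The Policy class and its method names are hypothetical, and containers are represented by plain label strings taken from the deployment shown in this deck.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Default-deny allowlist built from connect-style declarations."""
    allowed: set = field(default_factory=set)

    def connect(self, port, sources, destinations):
        # Record every (source, destination, port) triple that is permitted.
        for src in sources:
            for dst in destinations:
                self.allowed.add((src, dst, port))

    def permits(self, src, dst, port):
        # Anything that was never declared is rejected.
        return (src, dst, port) in self.allowed


policy = Policy()
policy.connect(7077, ["spark-ms-0", "spark-wk-0"], ["db-dbm-1"])
policy.connect(80, ["public"], ["hap-0", "hap-1"])

print(policy.permits("public", "hap-0", 80))       # declared flow
print(policy.permits("public", "db-dbm-1", 3306))  # undeclared flow
```

Storing directed triples means a declaration from "public" to the load balancer does not implicitly open the reverse direction, which is one plausible way to keep east-west traffic tight.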
wordpress.New
(define (New name n db memcd)
  (let ((dk (makeList n (docker image)))
        (labelNames (strings.Range name n))
        (wp (map label labelNames dk)))
    (configure wp db memcd)
    (connect 3306 wp (hmapGet db "master"))
    (connect 3306 wp (hmapGet db "slave"))
    (connect 11211 wp memcd)
    wp))
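The module function above packages three things: creating n containers, labelling them, and declaring what they may talk to. A rough Python analogue is sketched here to make that structure explicit. The names string_range and new_wordpress are hypothetical, the "name-i" label format is an assumption based on the container labels shown elsewhere in this deck, and the configure step is omitted.

```python
def string_range(prefix, n):
    """Hypothetical stand-in for strings.Range: prefix-0 .. prefix-(n-1)."""
    return [f"{prefix}-{i}" for i in range(n)]


def new_wordpress(name, n, db, memcd, image="quay.io/netsys/di-wordpress"):
    """Return (label -> image, connection list) for n WordPress containers."""
    labels = string_range(name, n)
    containers = {label: image for label in labels}
    connections = []
    for wp in labels:
        # MySQL on 3306 to master and slave, memcached on 11211.
        connections += [(wp, target, 3306) for target in db["master"] + db["slave"]]
        connections += [(wp, target, 11211) for target in memcd]
    return containers, connections


db = {"master": ["db-dbm-1"], "slave": ["db-dbs-2", "db-dbs-3"]}
memcd = ["memcd-0", "memcd-1", "memcd-2"]
containers, connections = new_wordpress("wp", 8, db, memcd)
print(len(containers), len(connections))
```

The caller passes in the database and cache it wants used, so the module never needs to know how those were created. That is the modularity the slide is pointing at.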
WordPress
db-dbm-1: quay.io/netsys/di-wp-mysql
db-dbs-2: quay.io/netsys/di-wp-mysql
db-dbs-3: quay.io/netsys/di-wp-mysql
memcd-0: quay.io/netsys/di-memcached
memcd-1: quay.io/netsys/di-memcached
memcd-2: quay.io/netsys/di-memcached
wp-0: quay.io/netsys/di-wordpress
wp-1: quay.io/netsys/di-wordpress
wp-3: quay.io/netsys/di-wordpress
wp-4: quay.io/netsys/di-wordpress
wp-5: quay.io/netsys/di-wordpress
wp-6: quay.io/netsys/di-wordpress
wp-7: quay.io/netsys/di-wordpress
hap-0: quay.io/netsys/di-wp-haproxy
hap-1: quay.io/netsys/di-wp-haproxy
zk-0: quay.io/netsys/zookeeper
zk-1: quay.io/netsys/zookeeper
zk-2: quay.io/netsys/zookeeper
spark-ms-0: quay.io/netsys/spark
spark-ms-1: quay.io/netsys/spark
spark-wk-0: quay.io/netsys/spark
spark-wk-1: quay.io/netsys/spark
spark-wk-2: quay.io/netsys/spark
spark-wk-3: quay.io/netsys/spark
public: [ ]
Declarative Application Specification with Quilt
Ethan J. Jackson*, Aurojit Panda*, Kevin Lin*, Johann Schleier-Smith*, Nicholas Sun*, Luise Valentin*, Yuen Mei Wan*, Scott Shenker*†
* UC Berkeley, † ICSI
Abstract
Despite the recent emergence of container orchestrators and software defined networks, operators still face daunting challenges managing their distributed systems. In this paper we present Stitch, a new language that specifies distributed system policy directly, and Quilt, a system that automatically deploys Stitch specifications on whatever infrastructure is available. By disentangling application policy from application infrastructure, Quilt supports portable distributed applications and automatically enforces strict network isolation.
1 Introduction
In recent years it has become easier to deploy distributed systems. Script-friendly cloud APIs [1, 50, 35, 20] and container orchestrators [25, 33, 12, 52, 46, 7] allow administrators to ...
(import "haproxy")
(import "memcached")
(import "mysql")
(import "spark")
(import "wordpress")
(import "zookeeper")

(let ((db (mysql.New "db" 2))
      (memcd (memcached.New "memcd" 3))
      (wp (wordpress.New "wp" 8 db memcd))
      (hap (haproxy.New "hap" 2 wp))
      (zk (zookeeper.New "zk" 3))
      (spark (spark.New "spark" 2 4 zk)))
  (connect 7077 (hmapValues spark)
                (hmapValues db))
  (connect 80 "public" hap))
Figure 1: Stitch specification for a complex multi-tier WordPress deployment, motivated in detail in §2.
Infrastructure
(define cfg
  (list (provider "Amazon") (region "us-west-1")
        (ram 32 64) (cpu 4 8) (sshkey "elided")))

(makeList 3 (machine (role "Master") cfg))
(makeList 32 (machine (role "Worker") cfg))
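The infrastructure declaration describes machines by constraints rather than by instance type, so retargeting it should only mean changing the provider and region. The Python sketch below illustrates that idea under two loudly stated assumptions: that (ram 32 64) and (cpu 4 8) are inclusive minimum and maximum bounds, and that some per-provider catalog exists to resolve them. CATALOG and every size name in it are invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MachineSpec:
    provider: str
    region: str
    ram: tuple  # (min, max) in GB; assumed meaning of (ram 32 64)
    cpu: tuple  # (min, max) cores; assumed meaning of (cpu 4 8)


# Hypothetical catalog: provider -> list of (size name, ram_gb, cores).
CATALOG = {
    "Amazon": [("aws-small", 16, 4), ("aws-large", 32, 8)],
    "Azure": [("az-medium", 32, 4), ("az-huge", 128, 16)],
}


def pick_size(spec):
    """Return the first catalog entry inside both ranges, or None."""
    for name, ram, cores in CATALOG[spec.provider]:
        if spec.ram[0] <= ram <= spec.ram[1] and spec.cpu[0] <= cores <= spec.cpu[1]:
            return name
    return None


cfg = MachineSpec("Amazon", "us-west-1", ram=(32, 64), cpu=(4, 8))
workers = [cfg] * 32

# Moving to another cloud changes two fields; the constraints stay put.
moved = replace(cfg, provider="Azure", region="Central US")
print(pick_size(cfg), pick_size(moved))
```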
Infrastructure
Azure Central US
?
Geographical Distribution
Figure 4 shows a simple way a Stitch operator may instantiate our WordPress example. In addition to the application spec...
(define cfg (list (ram 32 64) (cpu 4 8)
                  (sshkey "<elided>")))

(define db (mysql.New "db" 2))
(define zk (zookeeper.New "zk" 3))
(define spark (spark.New "spark" 2 4 zk))
(connect 7077 (hmapValues spark) (hmapValues db))

(define (makeLoc prvd rgn)
  (list (provider prvd) (region rgn)))

(define (makePod name)
  (let ((memcd (memcached.New (+ name "-mem") 1))
        (wp (wordpress.New (+ name "-wp")
                           2 db memcd))
        (hap (haproxy.New (+ name "-hap") 1 wp)))
    (connect 80 "public" hap)
    (list memcd wp hap)))

(define (deploy pod loc)
  (makeList 16 (machine (role "Worker") cfg loc))
  (place (machineRule "on" loc) pod))

(deploy (makePod "gce")
        (makeLoc "Google" "europe-west1-b"))

(deploy (makePod "azure")
        (makeLoc "Azure" "Central US"))

(let ((loc (makeLoc "Amazon" "ap-southeast-2"))
      (nodes (append (makePod "aws") zk
                     (hmapValues db)
                     (hmapValues spark))))
  (machine (role "Master") cfg loc)
  (deploy nodes loc))
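The place call with an "on" machine rule pins a pod to machines at one location. The Python sketch below shows one way such a rule could be resolved into concrete assignments. It is an illustration only: the place_on function, the machine records, the container labels, and the round-robin spreading are all assumptions, not behaviour documented for Quilt.

```python
from itertools import cycle


def place_on(location, pod, machines):
    """Assign each container in pod to a machine at the given location."""
    eligible = [m["id"] for m in machines
                if (m["provider"], m["region"]) == location]
    if not eligible:
        raise ValueError(f"no machine available at {location}")
    # Round-robin over eligible machines (an assumed spreading policy).
    return dict(zip(pod, cycle(eligible)))


machines = [
    {"id": "vm-0", "provider": "Google", "region": "europe-west1-b"},
    {"id": "vm-1", "provider": "Google", "region": "europe-west1-b"},
    {"id": "vm-2", "provider": "Azure", "region": "Central US"},
]
pod = ["gce-mem-0", "gce-wp-0", "gce-wp-1", "gce-hap-0"]
assignment = place_on(("Google", "europe-west1-b"), pod, machines)
print(assignment)
```

Raising when no machine matches keeps a bad location from silently placing containers somewhere the rule forbids.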
Stitch
• Lisp (Scheme)
– Variables
– Arithmetic
– Functions
– Modules
• Domain Specific Primitives
Stitch — Primitives
• Application Primitives
– “docker”, “label”, “connect”, “place”, “setEnv”
• Infrastructure Primitives
– “machine”
– “role”, “provider”, “region”, “ram”, “cpu”, “size”
Stitch — Primitives
spark-master: quilt/spark
spark-worker: [ 10 quilt/spark ]
Quilt Architecture
Goals
• Simple
• Robust
• Portable
Infrastructure Controller
• Import Infrastructure Spec
• Update Cluster
• Cloud Provider Plugins
– Amazon EC2
– Google Compute Engine
– Microsoft Azure
[Diagram labels: Cluster, Foreman, Database, Engine; VMs on AWS, GCE, and Azure]
Cloud Provider
• Boot, Stop, List
• Network Reachability
• Application Agnostic
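A provider interface limited to boot, stop, and list is enough to drive a cluster toward a declared machine count. The Python sketch below shows such a reconciliation step against an in-memory stand-in. FakeProvider and reconcile are hypothetical names and do not reflect Quilt's actual plugin API.

```python
class FakeProvider:
    """In-memory stand-in for a cloud provider plugin (hypothetical)."""

    def __init__(self):
        self.running = {}
        self._next = 0

    def boot(self, role):
        vm_id = f"vm-{self._next}"
        self._next += 1
        self.running[vm_id] = role
        return vm_id

    def stop(self, vm_id):
        del self.running[vm_id]

    def list(self):
        return dict(self.running)


def reconcile(provider, desired):
    """Boot or stop machines until per-role counts match desired."""
    current = provider.list()
    for role, want in desired.items():
        have = [vm for vm, r in current.items() if r == role]
        for _ in range(want - len(have)):  # empty when nothing is missing
            provider.boot(role)
        for vm in have[want:]:             # empty when nothing is surplus
            provider.stop(vm)
    # Stop machines whose role no longer appears in the spec.
    for vm, r in current.items():
        if r not in desired:
            provider.stop(vm)


provider = FakeProvider()
reconcile(provider, {"Master": 3, "Worker": 32})
first_count = len(provider.list())
reconcile(provider, {"Master": 1, "Worker": 4})
print(first_count, len(provider.list()))
```

Because the loop only compares counts per role, the same function works for any provider that implements the three calls, which is the sense in which the provider layer stays application agnostic.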
Quilt Cluster
• Virtual Machines Running …
• Application Containers
• Open Virtual Network
– SDN Overlay
• Infrastructure Agnostic
Unsolved Problems
• Application Configuration
• Container Security
• State
• External Services
Related Work
• Container Orchestrators
– Kubernetes, Docker Swarm, Mesos, Nomad
– No explicit application specification
– No tight network firewall
• Quilt is a policy layer above these systems
Related Work
• Docker Compose / Kubernetes Helm
– Declare Groups of Containers to Boot
• Static Data Serialization Format
– Poor modularity
• Missing network graph
Future Work
Stitch: New Domains
• Security policy
– Key Management
– User Management
• Data
• Application Configuration
Stitch Analysis
• Verification
– Stitch specifies app entirely
– Simpler to verify than deployed systems
• Reachability
• Availability
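Because a specification lists every permitted connection, reachability questions can be answered on the declared graph before anything is deployed. The Python sketch below runs a breadth-first search over connect-style triples. It illustrates the kind of analysis the slide proposes and is not an existing Quilt feature. The label names are simplified, and the hap to wp edge on port 80 is an assumption, since the deck does not show what haproxy.New declares.

```python
from collections import deque


def reachable(connections, start):
    """Return every label reachable from start by following connections."""
    graph = {}
    for src, dst, _port in connections:
        graph.setdefault(src, set()).add(dst)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


connections = [
    ("public", "hap", 80),
    ("hap", "wp", 80),       # assumed; not shown in the deck
    ("wp", "db", 3306),
    ("wp", "memcd", 11211),
    ("spark", "db", 7077),
]

direct = {dst for src, dst, _ in connections if src == "public"}
transitive = reachable(connections, "public")
print(sorted(direct), sorted(transitive))
```

Comparing the direct and transitive sets separates what the public internet may connect to from what a compromised front end could eventually touch.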
Summary
• Portable Application Deployment
• Strict Network Security
• Modular, Shareable, Reusable Specifications
• In Future — Formal Analysis
Thank you
quilt.io
ejj@eecs.berkeley.edu

Automated Spark Deployment With Declarative Infrastructure
