1
Resilience: The key requirement of a [big]
[data] architecture
Let 'em know
@HuffPostCode @edwardcapriolo
2
About me

Data Architect @ Huffpo

Apache Hive
Committer/PMC

Author: Programming Hive
− 2nd edition coming. Save up!

Husband & dad

Crazed inventor:
github.com/edwardcapriolo
3
Huffingtonpost & Me

What is huffingtonpost?
− News, blogs, and video
− Desktop and mobile
− Multiple editions worldwide

What do I do there?
− Provide APIs, dashboards, reports
− Crunch BigData using uber tech
− Say no to bad tech decisions
via 'ed says no' meme
4
For the next hour...
I am going to pretend that everything I have designed and use is perfect and it never breaks!
5
Reality check: Things break
all the time.

Anomalous cloud outages

External software bugs

Internal software bugs
− a.k.a. 'anomalous cloud outage' in the post mortem

Fat fingers

Preventable failures
6

To be resilient, design a system
that causes minimal panic
when something does break
7
What does Resilience
not sound like?

'HADOOP IS DOWN'
− “We are losing data!
Call OPS!”

“One of the 10
NoSQL nodes is
down”
− “Users are seeing
inaccurate numbers,
and requests are
failing!”

Why is this bad?
8
What does Resilience sound like?

'Hadoop is down'
− No problem. The
process loading
hadoop can queue
messages for up to
40 hours

'One of the 10
NoSQL nodes is
down'
− No Problem. We can
tolerate multiple node
failures with minimal
9
Agenda

Software stacks (especially our 'Fright stack')

Planning and building a resilient service

Redundancy

Component Overview

Case study: Building the Lifetime API

Questions
10
Data Eng Stack
at Huffpo (Fright Stack)
11
Don't be scurred!

Components are named after horror movies

Batch & Realtime aka 'Lamb Duh' architecture
− Handle the low-hanging fruit in real time
− Expensive/complex processing in batch

Designed for throughput

Designed for horizontal scale

Less is more
12
Components of streaming stack

Kafka : The strong silent type
− Persistent, distributed commit log
− Massive Throughput without GC issues at scale

Cassandra : In my Column Family
− Cells : Columns that hold values (last update wins)
− Counters : Columns that support the increment operation

Teknek: KISS stream platform
− No Single Point of Failure
− Simple: take data off the feed, apply a function (see the sketch below)
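Not Teknek itself, but a minimal sketch of the same take-data-off-the-feed-and-apply-a-function loop, assuming the kafka-python and cassandra-driver libraries and hypothetical topic/keyspace/table names:

    from kafka import KafkaConsumer
    from cassandra.cluster import Cluster

    # Hypothetical names: topic 'page_views', keyspace 'stats', counter table 'entry_counters'.
    consumer = KafkaConsumer('page_views', group_id='stats-counters',
                             bootstrap_servers='kafka1:9092')
    session = Cluster(['cassandra1', 'cassandra2']).connect('stats')

    for msg in consumer:                       # take data off the feed...
        entry_id = msg.value.decode('utf-8')
        session.execute(                       # ...apply a function: bump a counter cell
            "UPDATE entry_counters SET views = views + 1 WHERE entry_id = %s",
            (entry_id,))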
13
Key components of the batch stack

Hive / Hadoop : The big hammer
− SQL on Hadoop
− Flexibility of formats
− UDF / Streaming
− MetaStore

Impala: The scalpel
− Interactive speeds for reasonable datasets
− Avoids having to bulk load into an OLAP datastore
14
Planning a [big][data] service
15
What developers want

Hadoop

Hive

Mahout

Spark

Cascading

Python

Scala

Thrift

Storm

SQL

NoSQL

Transactions

Web Services

Micro Services

Message Queue

Zookeeper

Akka

NodeJs
16
What Operations want

MySQL on RDS

[end slide]
17
What users want

Cloud

Big Data

Real Time

Reactive
− Wait, I think I meant... Responsive?
18
After the initial excitement
of X, everyone:

Expects someone else to manage X

Is more excited to work with Y

Will preach to you that X is a backwards
technology holding everything back
− Even though they were a staunch advocate for X
months ago

Everyone includes you
Source:
http://www.chrisunderwoodsblog.com/2014/0
1/new-deal-trough-or-plateau.html
19
Planning the service life-cycle

Build a playbook of setup/administration tasks

Get buy-in from multiple groups

Determine who carries out schema changes, plans upgrades, etc.

Build monitoring and determine escalations
20
Performance demands
on the service

Acceptable performance
− Request latency
− 99th percentile
− Job time

Requests per second

Storage requirements

Acceptable caching/delay
21
Redundancy
22
23
Redundancy
24
What redundancy does for you

Less chance that a single event causes panic

Fewer manuals/wikis about what to do if...

Fewer user-facing issues

More peace of mind

Availability of N services

Active/Passive is old school
Active/Active/Active + scalable is hip
25
Do not agile your redundancy

Be very afraid if someone tries to convince you of anything that sounds like this:
− For MVP we do not need Namenode HA. We can
get it running now and add the HA later.
− For MVP we need to get solution X working. We
can worry about scaling it later.
− For MVP it does not have to respond quickly. We
won't have much load and can speed it up
later.
26
Later always comes
before you are ready for it
27
Component Overview
28
Criteria for software selection

Initial setup and ongoing administration

General Utility (duct tape vs star screwdriver)

'Web Scale' design effort

Customizable/pluggable

No 'at scale' gotchas

Insane specialty superpower
29
Apache Kafka

Replication: set per topic (2)

Scale: Partition count dictates consumer parallelism (10, 100)

Durability: sync vs async producers (producer sketch below)

Idempotence: Messages persisted to disk

Idempotence: Messages are multiplexed

Performance: Insane throughput
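A minimal kafka-python sketch of the sync vs async producer trade-off above; the broker address and topic name are placeholders:

    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers='kafka1:9092',
                             acks='all',   # wait for in-sync replicas before acking
                             retries=3)

    # Async: fire and forget; highest throughput, but failures surface later (or never).
    producer.send('page_views', b'entry:5656')

    # Sync: block on the future until the broker acknowledges, or raise on failure.
    producer.send('page_views', b'entry:5656').get(timeout=10)
    producer.flush()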
30
Message Queues
without persistence

Producer might be too fast for consumers and
messages are dropped
− You would need 100%+ extra capacity to safely absorb all surges

Consumer crash results in dropped messages
− You cannot stop for anything, not even an update, without losing data
31
Apache Kafka
with persistence

Can handle traffic surges

Can safely queue data for upgrades
− Disk is cheap

Can replay data (bad release/backfill)

Multiplex data to multiple consumer groups (consumer sketch below)
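A sketch of multiplexing and replay with kafka-python: two consumer groups each receive every message from the persisted topic, and a fresh group id with auto_offset_reset='earliest' replays the retained log (names and the reprocess() function are hypothetical):

    from kafka import KafkaConsumer

    # The dashboard loader and a backfill job read the same topic independently.
    dashboard = KafkaConsumer('page_views', group_id='dashboard-loader',
                              bootstrap_servers='kafka1:9092')

    # A brand-new group id plus auto_offset_reset='earliest' starts from the oldest
    # retained message, e.g. to recover from a bad release or to backfill.
    backfill = KafkaConsumer('page_views', group_id='backfill-2015-09',
                             auto_offset_reset='earliest',
                             bootstrap_servers='kafka1:9092')

    for msg in backfill:
        reprocess(msg.value)   # hypothetical reprocessing function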
32
Apache Cassandra

Replication: At the keyspace level (3); see the CQL sketch below

Redundant: No Single Point of Failure

Durability: Self healing with quorum

Idempotence: Cell writes

Idempotence: Compare and Swap

Performance: Lightning fast writes
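A cassandra-driver sketch of keyspace-level replication and quorum consistency as listed above; the keyspace, table, and host names are assumptions:

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(['cassandra1', 'cassandra2', 'cassandra3']).connect()

    # Replication factor is a property of the keyspace, not of individual tables.
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS stats
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}""")
    session.execute("""
        CREATE TABLE IF NOT EXISTS stats.lifetime (
            entry_id text, metric text, value bigint,
            PRIMARY KEY (entry_id, metric))""")

    # Quorum writes/reads (2 of 3 replicas) stay available through a single node failure.
    insert = SimpleStatement(
        "INSERT INTO stats.lifetime (entry_id, metric, value) VALUES (%s, %s, %s)",
        consistency_level=ConsistencyLevel.QUORUM)
    session.execute(insert, ('5555', 'views', 30))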
33
Cassandra @ Work

Counters and column families can model a good number of low-latency stats problems (counter sketch below)

BatchMutations and stream save round trips

Clients do not need shard awareness

Masterless design ideal for high availability

To read it you have to be able to write it first
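A sketch of the counter and batch-mutation patterns above, assuming the same hypothetical 'stats' keyspace and a counter table:

    from cassandra.cluster import Cluster
    from cassandra.query import BatchStatement, BatchType

    session = Cluster(['cassandra1']).connect('stats')
    session.execute("""
        CREATE TABLE IF NOT EXISTS entry_counters (
            entry_id text PRIMARY KEY, views counter, clicks counter)""")

    # A counter batch ships several increments in one round trip.
    batch = BatchStatement(batch_type=BatchType.COUNTER)
    batch.add("UPDATE entry_counters SET views  = views  + 1 WHERE entry_id = %s", ('5555',))
    batch.add("UPDATE entry_counters SET clicks = clicks + 1 WHERE entry_id = %s", ('5555',))
    session.execute(batch)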
34
Apache Hadoop

Replication: Set per file (3)

Scale: Storage is incremental

Durability: Limited semantics

Performance: Typically brute force

Tuning: Too many tunes

Redundancy: Too many parts
35
Case Study: Lifetime API
36
Lifetime API

Result data per entry
− GET /api/lifetime/5656
− { "views": 45454545, "clicks": 343434 }

Provide the total lifetime sum
− Views
− Facebook shares
− Etc

Also provide 28 day counts
37
Planning

Acceptable performance
− Used in edit dashboards via web service calls

Requests per second
− Hundreds to thousands

Storage requirements
− Single value for each column *

Freshness
− Update hourly
38
Previous Vertica implementation

Does some queries sick fast

Enforces primary key on read
− If you double insert, later reads fail

Query slots limiting (OLAP)

Many projections can be problematic

Updates and deletes are a PITA

Stonebraker and I have beef
− http://www.edwardcapriolo.com/roller/edwardcapri
olo/entry/hadoop_is_the_best_thing
39
Let's NoSQL it!

Design for the read path
− Only fetch one entry at a time

Fetch entire history

Fetch last 28 days

Many entries have short shelf life

Do not store a single value, store a by-day
timeline instead!
40
Data modeling:
'Fixed' columns by day

Key = Entry:5555
− [2015-09-01:Views] = 30
− [2015-09-01:Clicks] = 10
− [2015-09-02:Views] = 2

Sparse data

Ordered by time
− Allows us to efficiently ask for ranges of data (CQL sketch below)
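A hedged CQL/Python sketch of this by-day layout, with the day and metric as clustering columns so the last-28-days read is a single ordered slice; table and column names are assumptions:

    from datetime import date, timedelta
    from cassandra.cluster import Cluster

    session = Cluster(['cassandra1']).connect('stats')
    session.execute("""
        CREATE TABLE IF NOT EXISTS lifetime_by_day (
            entry_id text, day text, metric text, value bigint,
            PRIMARY KEY ((entry_id), day, metric))""")   # sparse, ordered by time

    # Last 28 days for one entry: a single range scan within the row.
    since = (date(2015, 9, 28) - timedelta(days=28)).isoformat()
    rows = session.execute(
        "SELECT day, metric, value FROM lifetime_by_day "
        "WHERE entry_id = %s AND day >= %s", ('5555', since))
    for row in rows:
        print(row.day, row.metric, row.value)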
41
Data modeling:
Dealing with $hipforaday
social networks

Key = Entry:5555
− [2015-09-01:networks/zintrest/zshares] = 22
− [2015-09-01:networks/zintrest/zlikes] = 10
− [2015-09-01:networks/dug/dougs] = 2

Two level dynamic:
− network/type
− True old schoolers mash strings bra

Schema-less is elegant for the social networks!
− Explain ire of schema and social networks
42
Updating hourly

Had hourly updates descoped until:

Someone asks, 'Do we update hourly?'
− Of course, someone says yes
43
Data Modeling: Multiple granularity
in same row with TTL

Compute daily data once a day

Hourly data with a time-to-live (TTL) during the day (sketch below)

Entry:5555
− [2015-09-01-01:Views] = 30 *ttl 24 hours
− [2015-09-01-01:Clicks] = 10 *ttl 24 hours
− [2015-09-01-02:Views] = 2 *ttl 24 hours

API needs some intelligence not to count
hourly data if the daily column exists
− Could have named these columns so that they
always appear at the beginning or end of the data
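A sketch of mixing granularities in one row with a TTL, reusing the hypothetical lifetime_by_day table from the previous sketch; the hour-suffixed key and the 24-hour TTL are illustrative:

    from cassandra.cluster import Cluster

    session = Cluster(['cassandra1']).connect('stats')

    # Daily cell: written once a day by the batch job, no TTL.
    session.execute(
        "INSERT INTO lifetime_by_day (entry_id, day, metric, value) "
        "VALUES (%s, %s, %s, %s)", ('5555', '2015-09-01', 'views', 30))

    # Hourly cell: expires on its own after 24 hours, so the daily rollup
    # supersedes it without an explicit delete.
    session.execute(
        "INSERT INTO lifetime_by_day (entry_id, day, metric, value) "
        "VALUES (%s, %s, %s, %s) USING TTL 86400",
        ('5555', '2015-09-01-01', 'views', 30))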
44
Compute in batch write to NoSQL

Hive queries run from the scheduler produce hourly data (HiveQL sketch below)

Hourly data aggregated into a day table

TheRing: HCat API
[table] -> Cassandra
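A hedged sketch of the kind of scheduled hourly rollup described above, run here through the Hive CLI; the source/target tables, columns, and partition layout are assumptions, and the actual pipeline loads results into Cassandra via the HCatalog-based TheRing step:

    import subprocess

    # Hypothetical tables: raw_events (source) -> entry_stats_hourly (rollup).
    HOURLY_ROLLUP = """
        INSERT OVERWRITE TABLE entry_stats_hourly PARTITION (dt='2015-09-01', hr='01')
        SELECT entry_id,
               SUM(CASE WHEN event = 'view'  THEN 1 ELSE 0 END) AS views,
               SUM(CASE WHEN event = 'click' THEN 1 ELSE 0 END) AS clicks
        FROM raw_events
        WHERE dt = '2015-09-01' AND hr = '01'
        GROUP BY entry_id
    """

    # The day table is built the same way by summing the 24 hourly partitions;
    # re-running either query is safe because the output partition is overwritten.
    subprocess.check_call(['hive', '-e', HOURLY_ROLLUP])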
45
Results

Entry data divided evenly across the cluster

Survive multiple node
failures

API sums data on the read path (sketch below)

Horizontally scalable
http://sparkletechthoughts.blogspot.com/2013/03/how-to-setup-cassandra-cluster-using.html
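A sketch of that read-path summing, preferring a daily cell over the hourly cells for the same day; it reuses the hypothetical lifetime_by_day layout and leaves the web framework out:

    from collections import defaultdict
    from cassandra.cluster import Cluster

    session = Cluster(['cassandra1']).connect('stats')

    def lifetime_totals(entry_id):
        """Sum the timeline for one entry, skipping hourly cells once the daily cell exists."""
        rows = list(session.execute(
            "SELECT day, metric, value FROM lifetime_by_day WHERE entry_id = %s",
            (entry_id,)))
        daily_days = {r.day for r in rows if len(r.day) == 10}   # '2015-09-01'
        totals = defaultdict(int)
        for r in rows:
            hourly = len(r.day) > 10                             # '2015-09-01-01'
            if hourly and r.day[:10] in daily_days:
                continue                                         # the daily rollup wins
            totals[r.metric] += r.value
        return dict(totals)   # e.g. {'views': 45454545, 'clicks': 343434}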
46
How Resilient is this service?

Hourly/Daily processing can easily be re-run

Bulk loading cells is idempotent

NoSQL (Cassandra) has fault tolerance

NoSQL can take massive load

API server is stateless and easily load balanced
47
? Questions ?