PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story

Bill Monkman
Lead Technical Engineer - Platform
@bmonkman

Overview - Selected Current Architecture
Users
Nginx load balancers: lb1, lb2, lb3, ...
Nginx web servers: web1, web2, web3, ... (PHP-FPM on each web server)
Memcached cluster: mem1, ...
MySQL cluster: master, slave
MongoDB cluster:
  shard1: master, slave
  shard2: master, slave
Gearman cluster: geard1, geard2
  workers: worker1, ...
Services

Technologies - at first
• Apache
• PHP
• MySQL

Problem
It’s hard to scale MySQL horizontally

Solution - Caching
Memcached.
● Distributed cache, cluster of boxes with lots of RAM, trivial to scale
● Cache as much as possible, invalidate only when necessary
● Use cache instead of DB
● No joins - decouple entities (collection caching)
● Twemproxy!

“There are only two hard things in
Computer Science: cache invalidation and
naming things.”
- Phil Karlton

Solution - Caching
MvcModelBaseCaching
MvcModelBase
MvcModelMysql
SocialNetwork

Solution - Caching
SELECT * FROM member WHERE org_id=888
set individual cache records
member_1 {data}
member_5 {data}
member_9 {data}
set collection cache
member_org_888 [1,5,9]
Automatic invalidation of collection cache
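
The two cache levels above (individual records plus a collection of ids) could be sketched roughly as follows. This is a minimal sketch, not Hootsuite's MvcModelBaseCaching code: ArrayCache is a hypothetical in-memory stand-in for Memcached, and MemberRepository, findByOrg and invalidateOrg are names invented for illustration.

```php
<?php
// Hypothetical in-memory stand-in for Memcached (get/set/delete only).
// Unlike the real Memcached extension, a miss returns null here.
final class ArrayCache
{
    private array $data = [];

    public function get(string $key)
    {
        return $this->data[$key] ?? null;
    }

    public function set(string $key, $value): void
    {
        $this->data[$key] = $value;
    }

    public function delete(string $key): void
    {
        unset($this->data[$key]);
    }
}

// Hypothetical repository showing record caching plus collection caching.
final class MemberRepository
{
    private ArrayCache $cache;
    /** @var callable(int): array loads member rows for an org from the database */
    private $loadByOrg;

    public function __construct(ArrayCache $cache, callable $loadByOrg)
    {
        $this->cache = $cache;
        $this->loadByOrg = $loadByOrg;
    }

    /** Returns all members of an organization, reading through the cache. */
    public function findByOrg(int $orgId): array
    {
        $collectionKey = "member_org_{$orgId}";
        $ids = $this->cache->get($collectionKey);

        if ($ids === null) {
            // Collection miss: one query, then populate both cache levels.
            $rows = ($this->loadByOrg)($orgId);
            $ids = [];
            foreach ($rows as $row) {
                $this->cache->set("member_{$row['id']}", $row);
                $ids[] = $row['id'];
            }
            $this->cache->set($collectionKey, $ids);
            return $rows;
        }

        // Collection hit: assemble the result from individual records.
        $members = [];
        foreach ($ids as $id) {
            $member = $this->cache->get("member_{$id}");
            if ($member === null) {
                // A record was evicted: drop the collection and reload it.
                $this->cache->delete($collectionKey);
                return $this->findByOrg($orgId);
            }
            $members[] = $member;
        }
        return $members;
    }

    /** Call when org membership changes: only the collection key is dropped. */
    public function invalidateOrg(int $orgId): void
    {
        $this->cache->delete("member_org_{$orgId}");
    }
}
```

Because the collection stores only ids, changing one member's data touches a single member_N key, while adding or removing a member only needs the member_org_N key dropped.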

Solution - Caching
It’s hard to scale MySQL horizontally
Now:
● No need to scale MySQL
● Able to serve the whole site on 1 MySQL server
● 500 MySQL SELECTs per second vs. 50,000 Memcached GETs per second
● 99+% hit rate

Problem
Need a way to perform asynchronous, distributed tasks using a
single-threaded language.

Solution - Gearman
Gearman.
● Distribute work to other servers to handle (workers also using
PHP, same codebase)
● Precursor to SOA where everything is truly distributed
● Many other solutions, queueing systems.

Solution - Gearman
geard1 geard2
gearworker1 gearworker2 gearworker6

Solution - Gearman
Need a way to perform asynchronous, distributed tasks using a
single-threaded language.
Now:
● Moved key tasks to Gearman
● Another cluster, scalable separately from web
● Discrete tasks, callable sync or async
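
The "discrete tasks, callable sync or async" idea can be illustrated with a small in-process sketch. TaskQueue and its methods are hypothetical and are not the Gearman API; they only mimic the shape of a job server with named functions. With the PECL gearman extension the corresponding calls would be along the lines of GearmanClient::doNormal() and GearmanClient::doBackground() on the client, and GearmanWorker::addFunction() on the worker.

```php
<?php
// Hypothetical in-process stand-in for a job server. It only illustrates
// registering named tasks and invoking them synchronously or asynchronously.
final class TaskQueue
{
    /** @var array<string, callable> task name => handler */
    private array $handlers = [];
    /** @var array<int, array{0: string, 1: string}> queued background jobs */
    private array $background = [];

    public function register(string $name, callable $handler): void
    {
        $this->handlers[$name] = $handler;
    }

    /** Synchronous call: run the task now and return its result. */
    public function doNow(string $name, string $workload): string
    {
        if (!isset($this->handlers[$name])) {
            throw new InvalidArgumentException("Unknown task: {$name}");
        }
        return ($this->handlers[$name])($workload);
    }

    /** Asynchronous call: queue the job and return immediately. */
    public function doBackground(string $name, string $workload): void
    {
        $this->background[] = [$name, $workload];
    }

    /** What a worker loop would do: drain queued jobs, return how many ran. */
    public function work(): int
    {
        $processed = 0;
        while (($job = array_shift($this->background)) !== null) {
            [$name, $workload] = $job;
            $this->doNow($name, $workload);
            $processed++;
        }
        return $processed;
    }
}

$queue = new TaskQueue();
$queue->register('reverse', function (string $workload): string {
    return strrev($workload);
});

$result = $queue->doNow('reverse', 'hello');   // caller waits for the result
$queue->doBackground('reverse', 'fire and forget'); // caller does not wait
$queue->work();                                 // a worker picks it up later
```

In a real deployment the queue lives on the job servers and work() runs on separate worker machines, which is what lets the worker cluster scale independently of the web tier.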

Problem
Need to store data with the potential to grow too big to handle
effectively with MySQL.

Solution - MongoDB
MongoDB.
● Certain data did not need to be highly relational
● NoSQL DB, many other solutions these days
● Mongo can be a pain, lots of moving parts
● Had to make our own sequencer where auto-incremented ids were
necessary
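
A sequencer of the kind mentioned above can be as small as one named counter per sequence with an atomic increment. The sketch below is an assumption about the general shape, not Hootsuite's implementation: CounterStore, InMemoryCounterStore and Sequencer are hypothetical names, and the in-memory store stands in for a MongoDB counters collection updated with an atomic $inc.

```php
<?php
// Hypothetical counter store. A MongoDB-backed version would keep one
// document per sequence and bump it with an atomic $inc update.
interface CounterStore
{
    /** Adds 1 to the named counter and returns the new value. */
    public function increment(string $name): int;
}

// In-memory stand-in so the sketch runs without a database.
final class InMemoryCounterStore implements CounterStore
{
    /** @var array<string, int> */
    private array $counters = [];

    public function increment(string $name): int
    {
        $this->counters[$name] = ($this->counters[$name] ?? 0) + 1;
        return $this->counters[$name];
    }
}

// Hands out auto-incrementing ids, one independent series per sequence name.
final class Sequencer
{
    private CounterStore $store;

    public function __construct(CounterStore $store)
    {
        $this->store = $store;
    }

    public function nextId(string $sequence): int
    {
        return $this->store->increment($sequence);
    }
}

$sequencer = new Sequencer(new InMemoryCounterStore());
$messageId = $sequencer->nextId('message');
```

The important property is that the increment and the read happen as one atomic operation in the backing store; otherwise two concurrent requests could receive the same id.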

Solution - MongoDB
Need to store data with the potential to grow too big to handle
effectively with MySQL.
Now:
● Multiple clusters containing amounts of data that likely would
have crushed MySQL
● Billions of rows per collection, many TB of data on disk

Technologies
• Apache
• PHP
• MySQL
• Memcached
• Gearman
• MongoDB

Problem
With a codebase and an engineering team increasing in size, how do
we keep up the pace of development and maintain control of the
system?
(SVN, big branches, merge hell)

Solution - Dark Launching
Dark Launching.
● Wrap code in a block with a specific name
● That name will appear in a management page
● Can control whether or not that block is executed by modifying its value
● Boolean, random percentage, session-based, member list, organization
list, etc.

if (In_Feature::isEnabled('TWITTER_ADS')) {
    // execute new code
} else {
    // execute old code
}
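
The deck does not show how In_Feature evaluates a flag, so the following is only a sketch of what such a check might look like. FeatureFlags, the rule array format and the rule type names are all hypothetical. Note one deliberate swap: the slide mentions a random percentage, whereas this sketch uses a deterministic hash bucket so that a given member gets the same answer on every request.

```php
<?php
// Hypothetical flag evaluator; none of these names or rule formats come
// from Hootsuite's code.
final class FeatureFlags
{
    /** @var array<string, array> flag name => rule */
    private array $rules;

    public function __construct(array $rules)
    {
        $this->rules = $rules;
    }

    public function isEnabled(string $name, ?int $memberId = null): bool
    {
        // Unknown flags default to off, i.e. the old code path.
        $rule = $this->rules[$name] ?? ['type' => 'boolean', 'value' => false];

        switch ($rule['type']) {
            case 'boolean':
                return (bool) $rule['value'];
            case 'member_list':
                return $memberId !== null
                    && in_array($memberId, $rule['members'], true);
            case 'percentage':
                if ($memberId === null) {
                    return false;
                }
                // Deterministic bucket in 0..99 per flag and member.
                $bucket = abs(crc32($name . ':' . $memberId)) % 100;
                return $bucket < $rule['percent'];
            default:
                return false; // unrecognized rule types fail closed
        }
    }
}

$flags = new FeatureFlags([
    'TWITTER_ADS' => ['type' => 'percentage', 'percent' => 10],
    'NEW_COMPOSER' => ['type' => 'member_list', 'members' => [1, 5, 9]],
]);

if ($flags->isEnabled('TWITTER_ADS', 5)) {
    // execute new code
} else {
    // execute old code
}
```

In practice the rules would be loaded from wherever the management page stores them, so that changing a value takes effect without a deploy.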

Dark Launching - Reasons
• Control your code
• Limit risk -> raise confidence -> speed up pace of releases
• “Branching in Production”
• Learning happens in Production

With a codebase and an engineering team increasing in size, how do
we keep up the pace of development and maintain control of the
system?
Now:
● Work fast with more confidence
● Huge amount of control over production systems
● Typically 10+ code releases to production per day
● Push-based distribution with Consul

Problem
With a rapidly growing codebase, user base, and traffic,
how do we keep visibility into the performance of the code?

Solution - Monitoring
Statsd / Graphite.
Logstash / Elasticsearch / Kibana.
Sensu
● Statsd for metrics
● Logstash for log events
● Sensu for monitoring / alerting

Statsd::timing('apiCall.facebookGraph', microtime(true) - $startTime);
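
The Statsd wrapper class used above is not shown in the deck. As a rough sketch of what a timing call does underneath, a StatsD timer is a small UDP datagram of the form "metric:value|ms". The function names, host and port below are assumptions for illustration (8125 is the usual StatsD port), and the sketch assumes the wrapper converts the elapsed seconds from microtime() into milliseconds.

```php
<?php
/** Formats a StatsD timing datagram: "<metric>:<milliseconds>|ms". */
function statsdTimingPacket(string $metric, float $elapsedSeconds): string
{
    $milliseconds = (int) round($elapsedSeconds * 1000);
    return sprintf('%s:%d|ms', $metric, $milliseconds);
}

/** Sends one datagram over UDP, fire-and-forget. Host and port are assumed. */
function statsdSend(string $packet, string $host = '127.0.0.1', int $port = 8125): bool
{
    $socket = @fsockopen("udp://{$host}", $port, $errno, $errstr, 0.1);
    if ($socket === false) {
        return false; // metrics should never break the request
    }
    $written = @fwrite($socket, $packet);
    fclose($socket);
    return $written !== false;
}

$startTime = microtime(true);
usleep(1000); // stand-in for the API call being measured
statsdSend(statsdTimingPacket('apiCall.facebookGraph', microtime(true) - $startTime));
```

UDP is used so that a slow or missing metrics server costs the request almost nothing; a lost datagram just means one missing sample.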

Logger::event('user liked from in-stream', In_Log::CATEGORY_UX, $logData);

• Visibility into the performance and behaviour of your application
• Iterate upon your code, measure results
• Pairs well with dark launching
• Also systems like New Relic

With a rapidly growing codebase, user base, and traffic,
how do we keep visibility into the performance of the code?
Now:
● Able to watch performance / behaviour in real time.
● Able to view important events both in aggregate and at a very
granular level
● Able to control the system and watch the effect of changes

Optimizations
• Things expand beyond their initial scope
• Case in point: Translations

Optimizations - Push work to users
• Within reason, push work up to users
• Make your users into a distributed processing grid
• e.g. Stream rendering

Optimizations - Performance / Risks
• Performance is more important than clean code or business requirements
(in the instances where they may be mutually exclusive)
• Fine line between future proofing and premature optimization
• Don’t add burdensome processes, but make it easy for your team
to do things the right way
• Know your weak spots, protect against abuse

Technologies
Linux
Nginx
Elasticsearch
Varnish
PHP-FPM
MySQL
Jenkins
Scala
MongoDB
Consul
Gearman
Redis
Akka
Python
Memcached
HAProxy
jQuery
ZeroMQ
Backbone
RabbitMQ
EC2
Zend
Docker
Cloudfront CDN
Logstash
Zookeeper
Kibana
Statsd/Graphite
Packer
Vagrant
Nagios
VirtualBox
Spark/Shark
Sensu
Symfony
Riak
Composer
Websockets
Comet
Hadoop
Ansible
Git
Webpack
Redshift

Problem
With a huge and growing monolithic codebase and over 80
engineers, how to keep scaling in a manageable way?

Solution - SOA
SOA.
● Split up the system into independent services which communicate only via APIs
● Teams can work on their own services with encapsulated business logic and have their own
deployment schedules.
● We chose to use Scala/Akka for services, communicating via ZeroMQ
● SOA transition made easier by the “no joins” philosophy
● Tons of work

Solution - SOA
SOM.
● “Service Oriented Monolith”
● When splitting up a monolithic codebase, dependencies are what kill you
● Fulfill dependencies by writing interim services using existing PHP code
● Maintain the contract and future Scala services will be drop-in
replacements
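
The "maintain the contract" point can be sketched as one interface with two interchangeable implementations. Every name here is hypothetical (MemberService, LegacyPhpMemberService, RemoteMemberService, the 'member.get' method name), and the transport is passed in as a plain callable rather than a real ZeroMQ client.

```php
<?php
// The contract that callers depend on. All names are hypothetical.
interface MemberService
{
    /** @return array{id: int, name: string}|null */
    public function getMember(int $memberId): ?array;
}

/** Interim service: fulfils the contract by calling existing monolith code. */
final class LegacyPhpMemberService implements MemberService
{
    /** @var callable(int): ?array existing PHP lookup code */
    private $legacyLookup;

    public function __construct(callable $legacyLookup)
    {
        $this->legacyLookup = $legacyLookup;
    }

    public function getMember(int $memberId): ?array
    {
        $row = ($this->legacyLookup)($memberId);
        if ($row === null) {
            return null;
        }
        // Expose only what the contract promises, not the whole legacy row.
        return ['id' => (int) $row['id'], 'name' => (string) $row['name']];
    }
}

/** Later replacement: same contract, backed by a remote service call. */
final class RemoteMemberService implements MemberService
{
    /** @var callable(string, array): ?array transport to the remote service */
    private $send;

    public function __construct(callable $send)
    {
        $this->send = $send;
    }

    public function getMember(int $memberId): ?array
    {
        return ($this->send)('member.get', ['id' => $memberId]);
    }
}

/** Callers depend only on the interface, so either implementation drops in. */
function greeting(MemberService $members, int $memberId): string
{
    $member = $members->getMember($memberId);
    return $member === null ? 'Hello, guest' : 'Hello, ' . $member['name'];
}
```

Keeping the interim implementation strict about the returned shape matters: if legacy fields leak through the contract, callers start to depend on them and the later replacement is no longer drop-in.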

Solution - SOA
With a huge and growing monolithic codebase and over 130
engineers, how to keep scaling in a manageable way?
Today:
● Transitioning to Scala SOA
● PHP will still be used as the Façade, a thin layer built on top of
the business logic of the services it interacts with.

Thank You!
Bill Monkman
@bmonkman
More Info:
code.hootsuite.com
