PHP at 5000 Requests / Sec 
Hootsuite’s Scaling Story 
Bill Monkman 
Lead Technical Engineer - Platform 
@bmonkman
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
Overview - Selected Current Architecture 
Users 
Nginx load balancers: lb1, lb2, lb3, ... 
Nginx web servers: web1, web2, web3, ... (each running PHP-FPM) 
Memcached cluster: mem1, ... 
MySQL cluster: master, slave 
MongoDB cluster: shard1 (master, slave), shard2 (master, slave) 
Gearman cluster: geard1, geard2, with workers worker1, ... 
Services
Technologies - at first 
• Apache 
• PHP 
• MySQL
Then...
Problem 
It’s hard to scale MySQL horizontally
Solution - Caching 
Memcached. 
● Distributed cache, cluster of boxes with lots of RAM, trivial to scale 
● Cache as much as possible, invalidate only when necessary 
● Use cache instead of DB 
● No joins - decouple entities (collection caching) 
● Twemproxy!
“There are only two hard things in 
Computer Science: cache invalidation and 
naming things.” 
• Phil Karlton
Solution - Caching 
MvcModelBaseCaching 
MvcModelBase 
MvcModelMysql 
SocialNetwork
Solution - Caching 
SELECT * FROM member WHERE org_id=888 
set individual cache records 
member_1 {data} 
member_5 {data} 
member_9 {data} 
set collection cache 
member_org_888 [1,5,9] 
Automatic invalidation of collection cache
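The collection-caching idea above can be sketched roughly as follows. This is a minimal illustration, not Hootsuite's implementation: `ArrayCache` is a hypothetical in-memory stand-in for a Memcached client, and the function names are invented for the example. Only the key shapes (`member_<id>`, `member_org_<org_id>`) come from the slide.

```php
<?php
// Hypothetical in-memory stand-in for a Memcached client (get/set/delete only).
final class ArrayCache
{
    /** @var array<string, mixed> */
    private array $data = [];

    public function get(string $key)
    {
        return $this->data[$key] ?? null;
    }

    public function set(string $key, $value): void
    {
        $this->data[$key] = $value;
    }

    public function delete(string $key): void
    {
        unset($this->data[$key]);
    }
}

// Store each row under its own key, then store the collection as a list of ids.
function cacheMembersForOrg(ArrayCache $cache, int $orgId, array $rows): void
{
    $ids = [];
    foreach ($rows as $row) {
        $cache->set('member_' . $row['id'], $row);
        $ids[] = $row['id'];
    }
    $cache->set('member_org_' . $orgId, $ids);
}

// Resolve the collection from cache. Returns null on any miss so that the
// caller can fall back to the database and repopulate the cache.
function getMembersForOrg(ArrayCache $cache, int $orgId): ?array
{
    $ids = $cache->get('member_org_' . $orgId);
    if ($ids === null) {
        return null;
    }
    $members = [];
    foreach ($ids as $id) {
        $member = $cache->get('member_' . $id);
        if ($member === null) {
            return null;
        }
        $members[] = $member;
    }
    return $members;
}

// On write: refresh the individual record and drop the collection that lists it.
function saveMember(ArrayCache $cache, array $row): void
{
    // (the MySQL write would happen here)
    $cache->set('member_' . $row['id'], $row);
    $cache->delete('member_org_' . $row['org_id']);
}

$cache = new ArrayCache();
cacheMembersForOrg($cache, 888, [
    ['id' => 1, 'org_id' => 888, 'name' => 'a'],
    ['id' => 5, 'org_id' => 888, 'name' => 'b'],
    ['id' => 9, 'org_id' => 888, 'name' => 'c'],
]);
```

Because the collection holds only ids, a change to one member rewrites one small record, and the id list is rebuilt lazily the next time it is requested.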
Solution - Caching 
It’s hard to scale MySQL horizontally 
Now: 
● No need to scale MySQL 
● Able to serve the whole site on 1 MySQL server 
● 500 MySQL SELECTs per second vs. 50,000 Memcached GETs per second 
● 99+% hit rate
Then...
Problem 
Need a way to perform asynchronous, distributed tasks using a 
single-threaded language.
Solution - Gearman 
Gearman. 
● Distribute work to other servers to handle (workers also using 
PHP, same codebase) 
● Precursor to SOA where everything is truly distributed 
● Many other solutions, queueing systems.
Solution - Gearman 
geard1 geard2 
gearworker1 gearworker2 gearworker6
Solution - Gearman 
Need a way to perform asynchronous, distributed tasks using a 
single-threaded language. 
Now: 
● Moved key tasks to Gearman 
● Another cluster, scalable separately from web 
● Discrete tasks, callable sync or async
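To illustrate what "discrete tasks, callable sync or async" can look like from the calling code, here is a toy sketch. It runs in a single process and is not the Gearman API: `TaskQueue`, its methods, and the task name are all hypothetical. With a real job server the queue would live on the Gearman daemons and `work()` would run on separate worker machines.

```php
<?php
// Hypothetical in-process stand-in for a job server; this is not the Gearman API.
final class TaskQueue
{
    /** @var array<string, callable> registered task handlers, keyed by task name */
    private array $handlers = [];

    /** @var array<int, array> queued [task name, payload] pairs */
    private array $pending = [];

    public function register(string $name, callable $handler): void
    {
        $this->handlers[$name] = $handler;
    }

    // Synchronous call: run the task now and hand back its result.
    public function runSync(string $name, string $payload)
    {
        return ($this->handlers[$name])($payload);
    }

    // Asynchronous call: queue the task and return immediately.
    public function runAsync(string $name, string $payload): void
    {
        $this->pending[] = [$name, $payload];
    }

    // What a worker process would do: drain queued jobs one at a time.
    // Returns the number of jobs processed.
    public function work(): int
    {
        $done = 0;
        while (($job = array_shift($this->pending)) !== null) {
            [$name, $payload] = $job;
            ($this->handlers[$name])($payload);
            $done++;
        }
        return $done;
    }
}

$sent = [];
$queue = new TaskQueue();
$queue->register('send_welcome_email', function (string $payload) use (&$sent): bool {
    $data = json_decode($payload, true);
    $sent[] = $data['member_id']; // the real task would send the email here
    return true;
});

$queue->runSync('send_welcome_email', json_encode(['member_id' => 1]));  // caller waits
$queue->runAsync('send_welcome_email', json_encode(['member_id' => 5])); // caller moves on
$queue->work();                                                          // worker catches up
```

Payloads are passed as serialized strings so that the same task definition could be handed to a worker on another machine without change.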
Then...
Problem 
Need to store data with the potential to grow too big to handle 
effectively with MySQL.
Solution - MongoDB 
MongoDB. 
● Certain data did not need to be highly relational 
● NoSQL DB, many other solutions these days 
● Mongo can be a pain, lots of moving parts 
● Had to make our own sequencer where auto-incremented ids were 
necessary
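The slides do not say how the sequencer was built. One common approach, sketched here purely as an assumption, is a named counter that is incremented and read in one step. The `Sequencer` class is hypothetical and keeps its counters in memory; against MongoDB the same step would need to be a single atomic update (for example an upserting `$inc` that returns the new value).

```php
<?php
// Hypothetical counter store; an in-memory array stands in for a counters collection.
final class Sequencer
{
    /** @var array<string, int> last issued id per sequence name */
    private array $counters = [];

    // Increment and return in one step. In a shared datastore this must be a
    // single atomic operation, otherwise two requests could get the same id.
    public function next(string $name): int
    {
        $this->counters[$name] = ($this->counters[$name] ?? 0) + 1;
        return $this->counters[$name];
    }
}

$seq = new Sequencer();
$first = $seq->next('message');
$second = $seq->next('message');
```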
Solution - MongoDB 
Need to store data with the potential to grow too big to handle 
effectively with MySQL. 
Now: 
● Multiple clusters containing amounts of data that likely would 
have crushed MySQL 
● Billions of rows per collection, many TB of data on disk
Technologies 
• Apache 
• PHP 
• MySQL 
• Memcached 
• Gearman 
• MongoDB
Then...
Problem 
With a codebase and an engineering team increasing in size, how do 
we keep up the pace of development and maintain control of the 
system? 
(SVN, big branches, merge hell)
Solution - Dark Launching 
Dark Launching. 
● Wrap code in a block with a specific name 
● That name appears on a management page 
● Control whether or not that block is executed by modifying its value 
● Boolean, random percentage, session-based, member list, organization 
list, etc.
Solution - Dark Launching 
if (In_Feature::isEnabled('TWITTER_ADS')) { 
    // execute new code 
} else { 
    // execute old code 
}
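The slides show only the call site, not how `In_Feature` evaluates a flag. A minimal sketch of such a check might look like the following. `FeatureFlags`, the rule format, and the rule type names are hypothetical. Note one deliberate swap: where the slide mentions a random percentage, this sketch hashes the flag name and member id into a stable bucket, so a given member gets a consistent answer across requests.

```php
<?php
// Hypothetical flag store; the real In_Feature class and its management page
// are not shown in the slides.
final class FeatureFlags
{
    /** @var array<string, array> rule per flag name */
    private array $rules;

    public function __construct(array $rules)
    {
        $this->rules = $rules;
    }

    public function isEnabled(string $name, int $memberId): bool
    {
        // Unknown flags are treated as off, so new code stays dark by default.
        $rule = $this->rules[$name] ?? ['type' => 'boolean', 'value' => false];

        switch ($rule['type']) {
            case 'boolean':
                return (bool) $rule['value'];
            case 'member_list':
                return in_array($memberId, $rule['value'], true);
            case 'percentage':
                // Stable bucket in 0..99 (assumes 64-bit PHP, where crc32()
                // is never negative). Raising the percentage only adds members.
                $bucket = crc32($name . ':' . $memberId) % 100;
                return $bucket < $rule['value'];
            default:
                return false;
        }
    }
}

$flags = new FeatureFlags([
    'TWITTER_ADS'  => ['type' => 'boolean', 'value' => true],
    'NEW_COMPOSER' => ['type' => 'member_list', 'value' => [7, 42]],
    'NEW_STREAMS'  => ['type' => 'percentage', 'value' => 10],
]);

if ($flags->isEnabled('NEW_COMPOSER', 42)) {
    // execute new code
} else {
    // execute old code
}
```

In practice the rules would be loaded from wherever the management page stores them, so that a flag can be changed without a code release.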
Dark Launching - Reasons 
• Control your code 
• Limit risk -> raise confidence -> speed up pace of releases 
• “Branching in Production” 
• Learning happens in Production
Solution - Dark Launching 
With a codebase and an engineering team increasing in size, how do 
we keep up the pace of development and maintain control of the 
system? 
Now: 
● Work fast with more confidence 
● Huge amount of control over production systems 
● Typically 10+ code releases to production per day 
● Push-based distribution with Consul
Then...
Problem 
With a rapidly growing codebase and number of users / amount of traffic, 
how do we keep visibility into the performance of the code?
Solution - Monitoring 
Statsd / Graphite. 
Logstash / Elasticsearch / Kibana. 
Sensu 
● Statsd for metrics 
● Logstash for log events 
● Sensu for monitoring / alerting
Solution - Monitoring 
Statsd::timing('apiCall.facebookGraph', microtime(true) - $startTime);
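For readers unfamiliar with what a call like this sends, here is a rough sketch of a timing helper. The function names, host, and port are assumptions (8125 is the usual StatsD default), and the slides do not show whether Hootsuite's `Statsd` wrapper converts units. This sketch takes seconds, as produced by `microtime(true)` differences, and emits whole milliseconds in the StatsD line format `<name>:<value>|ms`.

```php
<?php
// Format a timing sample in the StatsD line protocol: "<name>:<value>|ms".
// Input is in seconds; output value is rounded to whole milliseconds.
function statsdTimingLine(string $name, float $seconds): string
{
    return sprintf('%s:%d|ms', $name, (int) round($seconds * 1000));
}

// Fire-and-forget send over UDP. Errors are suppressed on purpose: a metrics
// outage should never break the request being measured.
function statsdSend(string $line, string $host = '127.0.0.1', int $port = 8125): void
{
    $socket = @fsockopen('udp://' . $host, $port);
    if ($socket !== false) {
        @fwrite($socket, $line);
        fclose($socket);
    }
}

$startTime = microtime(true);
// ... call the external API here ...
statsdSend(statsdTimingLine('apiCall.facebookGraph', microtime(true) - $startTime));
```

UDP keeps the overhead per metric small, which matters when timings are recorded on every request.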
Solution - Monitoring 
Logger::event('user liked from in-stream', In_Log::CATEGORY_UX, $logData);
Solution - Monitoring 
• Visibility into the performance and behaviour of your application 
• Iterate upon your code, measure results 
• Pairs well with dark launching 
• Also systems like New Relic
Solution - Monitoring 
With a rapidly growing codebase and number of users / amount of traffic, 
how do we keep visibility into the performance of the code? 
Now: 
● Able to watch performance / behaviour in real time 
● Able to view important events both in aggregate and at a very 
granular level 
● Able to control the system and watch the effect of changes
Optimizations
Optimizations 
• Things expand beyond their initial scope 
• Case in point: Translations
Optimizations - Push work to users 
• Within reason, push work up to users 
• Make your users into a distributed processing grid 
• e.g. Stream rendering
Optimizations - Performance / Risks 
• Performance is more important than clean code or business requirements 
(in the instances where they may be mutually exclusive) 
• Fine line between future proofing and premature optimization 
• Don’t add burdensome processes, but make it easy for your team 
to do things the right way 
• Know your weak spots, protect against abuse
Technologies 
Linux 
Nginx 
ElasticSearch 
Varnish 
PHP-FPM 
MySQL 
Jenkins 
Scala 
MongoDB 
Consul 
Gearman 
Redis 
Akka 
Python 
Memcached 
HAProxy 
jQuery 
ZeroMQ 
Backbone 
RabbitMQ 
EC2 
Zend 
Docker 
Cloudfront CDN 
Logstash 
Zookeeper 
Kibana 
Statsd/Graphite 
Packer 
Vagrant 
Nagios 
VirtualBox 
Spark/Shark 
Sensu 
Symfony 
Riak 
Composer 
Websockets 
Comet 
Hadoop 
Ansible 
Git 
Webpack 
Redshift
Problem 
With a huge and growing monolithic codebase and over 80 
engineers, how to keep scaling in a manageable way?
Solution - SOA 
SOA. 
● Split up the system into independent services which communicate only via APIs 
● Teams can work on their own services with encapsulated business logic and have their own 
deployment schedules. 
● We chose to use Scala/Akka for services, communicating via ZeroMQ 
● SOA transition made easier by the “no joins” philosophy 
● Tons of work
Solution - SOA 
SOM. 
● “Service Oriented Monolith” 
● When splitting up a monolithic codebase, dependencies are what kill you 
● Fulfill dependencies by writing interim services using existing PHP code 
● Maintain the contract and future Scala services will be drop-in 
replacements
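The "maintain the contract" point can be sketched as follows. All names here are hypothetical: the slides do not show the actual service interfaces. The remote transport is injected as a plain callable so that the example does not invent a ZeroMQ or Scala client API.

```php
<?php
// Hypothetical contract; calling code depends only on this interface.
interface MemberService
{
    /** @return array|null member data, or null if not found */
    public function getMember(int $id): ?array;
}

// Interim service: fulfils the contract using existing PHP code in the monolith.
final class LegacyPhpMemberService implements MemberService
{
    private array $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function getMember(int $id): ?array
    {
        return $this->rows[$id] ?? null;
    }
}

// Later replacement: same contract, but the work happens in a remote service.
final class RemoteMemberService implements MemberService
{
    /** @var callable takes (string $method, array $params), returns ?array */
    private $call;

    public function __construct(callable $call)
    {
        $this->call = $call;
    }

    public function getMember(int $id): ?array
    {
        return ($this->call)('member.get', ['id' => $id]);
    }
}

// Calling code is written once against the contract and does not change
// when the implementation behind it is swapped.
function memberDisplayName(MemberService $service, int $id): string
{
    $member = $service->getMember($id);
    return $member === null ? 'unknown' : $member['name'];
}

$legacy = new LegacyPhpMemberService([1 => ['id' => 1, 'name' => 'Ada']]);
$remote = new RemoteMemberService(function (string $method, array $params): ?array {
    // Stand-in for a network call to the future service.
    return $params['id'] === 1 ? ['id' => 1, 'name' => 'Ada'] : null;
});
```

Swapping `$legacy` for `$remote` at the point where the service is constructed is the only change the caller sees, which is what makes the later service a drop-in replacement.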
Solution - SOA 
With a huge and growing monolithic codebase and over 130 
engineers, how to keep scaling in a manageable way? 
Today: 
● Transitioning to Scala SOA 
● PHP will still be used as the Façade, a thin layer built on top of 
the business logic of the services it interacts with.
Conclusion
Thank You! 
Bill Monkman 
@bmonkman 
More Info: 
code.hootsuite.com