PHP & Continuous Data Processing
Michael Peacock, October 2011

No. Not milk floats (anymore)
All-electric commercial vehicles.
Photo courtesy of kenjonbro: http://www.flickr.com/photos/kenjonbro/4037649210/in/set-72157623026469013

About Michael Peacock
Senior/Lead Web Developer
Web Systems Developer
Telemetry Team – Smith Electric Vehicles US Corp
Author
PHP 5 Social Networking, PHP 5 E-Commerce Development, Drupal Social Networking (6 & 7), Selling online with Drupal e-Commerce, Building Websites with TYPO3
PHPNE Volunteer
Occasional technical speaker
PHP North-East, PHPNW 2010, SuperMondays, PHPNW 2011 Unconference, ConFoo 2012

Smith Electric Vehicles & Telemetry
World's largest manufacturer of commercial, all-electric vehicles
Smith Link: an on-board vehicle telematics system, capturing over 2,500 data points each second on the vehicle and broadcasting them over the mobile network
~400 telemetry-enabled vehicles on the road
World's largest telemetry project outside of F1
System Architecture
System Architecture
Problem #1: We Can’t Lose Any Data
Data is required as part of a $32 million grant from the US Department of Energy
Thousands of pieces of information collected every second from a range of remote collection devices
Unpredictable amounts of data at any one time
More vehicles rolling off the production line with telemetry enabled
What about system downtime, upgrades, roll-outs and connectivity problems?

Message Queuing
Solution: We use a fast, reliable, scalable, secure, hosted message queue
If our systems are offline, data builds up in the external message queue
If we are processing at full capacity, the surplus builds up in the message queue
If the vehicle loses GPRS signal, or the message queue were to be inaccessible, vehicles have an internal buffer of up to 7 days

Secret Weapon #1: StormMQ
Based on AMQP, an open standard
Secure: All data is encrypted and sent over SSL
Reliable: Huge investment in server infrastructure
Hosted: Backed up with an SLA
Scalable: Capable of processing huge numbers of incoming messages, with capacity to store the messages when we perform maintenance on our systems

Problem #2: Processing data quickly
We utilise a dedicated server and a number of dedicated applications to pull these messages and process them
This needs to happen quickly enough for live data to be seen through the web interface
Data is rapidly converted into batch SQL files, which are imported to MySQL via “LOAD DATA INFILE”
Results in high number of inserts per second (20,000 – 80,000)
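
The batch-file step can be sketched roughly as follows. This is a minimal illustration, not the team's actual code: the table name `data_points`, its columns, and the helper names `writeBatchFile` and `buildLoadStatement` are all assumptions made for the example. It writes rows to a tab-separated file and builds the matching LOAD DATA INFILE statement; running the statement needs a MySQL connection, and the server-side form reads the file on the database host (subject to the FILE privilege and `secure_file_priv`).

```php
<?php
// Hypothetical row layout: [timestamp, variable name, value].
// Table and column names are illustrative, not the schema from the talk.

/** Write rows to a tab-separated batch file; returns the number of rows written. */
function writeBatchFile(string $path, array $rows): int
{
    $handle = fopen($path, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }
    $count = 0;
    foreach ($rows as $row) {
        // Tabs and newlines inside a field would break the format, so replace them.
        $clean = array_map(
            fn($field) => str_replace(["\t", "\n", "\r"], ' ', (string) $field),
            $row
        );
        fwrite($handle, implode("\t", $clean) . "\n");
        $count++;
    }
    fclose($handle);
    return $count;
}

/** Build the LOAD DATA INFILE statement for a batch file (simplified escaping). */
function buildLoadStatement(string $path, string $table, array $columns): string
{
    return sprintf(
        "LOAD DATA INFILE '%s' INTO TABLE `%s` "
        . "FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n' (%s)",
        addslashes($path),
        $table,
        implode(', ', array_map(fn($c) => "`$c`", $columns))
    );
}

$rows = [
    ['2011-10-01 12:00:00', 'battery_soc', 87],
    ['2011-10-01 12:00:01', 'motor_rpm', 3150],
];
$path = tempnam(sys_get_temp_dir(), 'batch_');
$written = writeBatchFile($path, $rows);
$sql = buildLoadStatement($path, 'data_points', ['recorded_at', 'variable', 'value']);
```

One statement then loads the whole file, which is where the gain over row-by-row INSERTs comes from.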
LOAD DATA INFILE isn’t enough on its own...

Secret Weapon #2: DBA
Sam Lambert – DBA Extraordinaire
Constantly tweaking the servers and configuration to get more and more performance
Pushing the capabilities of our SAN, tweaking configs where no DBA has gone before
www.samlambert.com
http://www.samlambert.com/2011/07/how-to-push-your-san-with-open-iscsi_13.html
http://www.samlambert.com/2011/07/diagnosing-and-fixing-mysql-io.html
sam.lambert@smithelectric.com

Sharding
Huge volumes of data being stored
We shard the data based on the truck it came from; each truck has its own database
Databases are held on one of many database servers in our cluster, each with ~100GB RAM

Live, Real Time Information
[live screen photo]
Real Time Status and Tracking
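
Returning to the per-truck sharding described earlier, routing a request to the right database might look something like this. It is only a sketch under stated assumptions: the `truck_<id>` naming scheme, the host names, and the modulo-based server assignment are all hypothetical, since the talk does not say how trucks are mapped to servers.

```php
<?php
// Hypothetical shard locator: naming scheme and server assignment are assumptions.
final class ShardLocator
{
    /** @var string[] */
    private array $servers;

    public function __construct(array $servers)
    {
        if ($servers === []) {
            throw new InvalidArgumentException('At least one database server is required');
        }
        $this->servers = array_values($servers);
    }

    /** Each truck has its own database. */
    public function databaseName(int $truckId): string
    {
        $this->assertValid($truckId);
        return sprintf('truck_%d', $truckId);
    }

    /** Pick the server holding this truck's database (simple modulo, for illustration). */
    public function serverFor(int $truckId): string
    {
        $this->assertValid($truckId);
        return $this->servers[$truckId % count($this->servers)];
    }

    /** PDO-style DSN for the truck's shard. */
    public function dsn(int $truckId): string
    {
        return sprintf(
            'mysql:host=%s;dbname=%s;charset=utf8mb4',
            $this->serverFor($truckId),
            $this->databaseName($truckId)
        );
    }

    private function assertValid(int $truckId): void
    {
        if ($truckId < 1) {
            throw new InvalidArgumentException('Truck id must be positive');
        }
    }
}

$locator = new ShardLocator(['db1.example.test', 'db2.example.test', 'db3.example.test']);
$dsn = $locator->dsn(7);
```

In practice a lookup table (truck to server) is likely easier to rebalance than a modulo rule, because moving one truck's database does not change where any other truck lives.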

Live, Real Time Information: Problem
Original database design dictated:
All data points were stored in the same table
Each type of data point required a separate query, sub-query or join to obtain
Workings of the remote device collecting the data, and the processing server, dictated:
GPS co-ordinates can be up to 6 separate data points, including: Longitude; Latitude; Altitude; Speed; Number of satellites used to get location; Direction

Real Time Information: Concurrent
Initial solution from the original developers:
Pull as many pieces of real time information through asynchronously
Involved the use of Flash-based “widgets” which called separate PHP scripts to query the data
Data points took a little time to load
Not good enough

Real Time Information: Caching
High volumes of data, and varying levels of concurrent processing, mean query times are often inconsistent
Memcache was used when processing the data from the message queue, keeping a copy of the most recent value of each data point for each truck
Live, real-time information is accessed directly from memcache, bypassing the database

Caching: Registry/DI is Ideal
Sporadic use of memcache within the web application is an ideal use case for a lazy-loading registry or DI container
Give the registry or container details of memcache
Object only instantiated, and connection only made, when data is requested from memcache

Lazy Loading
    public function getObject( $key )
    {
        if( in_array( $key, array_keys( $this->objects ) ) )
        {
            // already instantiated: return the stored object
            return $this->objects[$key];
        }
        elseif( in_array( $key, array_keys( $this->objectSetup ) ) )
        {
            // known but not yet loaded: include its files, instantiate, store and return
            if( ! is_null( $this->objectSetup[ $key ]['abstract'] ) )
            {
                require_once( FRAMEWORK_PATH . 'registry/aspects/' . $this->objectSetup[ $key ]['folder'] . '/' . $this->objectSetup[ $key ]['abstract'] .'.abstract.php' );
            }
            require_once( FRAMEWORK_PATH . 'registry/aspects/' . $this->objectSetup[ $key ]['folder'] . '/' . $this->objectSetup[ $key ]['file'] . '.class.php' );
            $o = new $this->objectSetup[ $key ]['class']( $this );
            $this->storeObject( $o, $key );
            return $o;
        }
        elseif( $key == 'memcache' )
        {
            // requesting memcache for the first time, instantiate, connect, store and return
            $mc = new Memcache();
            $mc->connect( MEMCACHE_SERVER, MEMCACHE_PORT );
            $this->storeObject( $mc, 'memcache' );
            return $mc;
        }
    }
Becomes the limit for the registry pattern; a DI container is more suitable
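
The closing point, that a DI container handles this better than special-casing keys inside the registry, can be sketched with closures. This is a minimal illustration rather than the framework from the talk: `LazyContainer` and `ArrayCache` are hypothetical names, and `ArrayCache` is an in-memory stand-in for Memcache so the example runs without the extension or a server.

```php
<?php
// Hypothetical lazy-loading container: a service is built on first request only.
final class LazyContainer
{
    /** @var array<string, callable> */
    private array $factories = [];
    /** @var array<string, object> */
    private array $instances = [];

    /** Register how to build a service, without building it yet. */
    public function register(string $key, callable $factory): void
    {
        $this->factories[$key] = $factory;
        unset($this->instances[$key]);
    }

    /** Build the service on first use, then reuse the same instance. */
    public function get(string $key): object
    {
        if (!isset($this->instances[$key])) {
            if (!isset($this->factories[$key])) {
                throw new OutOfBoundsException("No service registered for '$key'");
            }
            $this->instances[$key] = ($this->factories[$key])($this);
        }
        return $this->instances[$key];
    }
}

// In-memory stand-in for a Memcache connection (assumption, for illustration).
final class ArrayCache
{
    public static int $connections = 0;
    private array $values = [];

    public function __construct()
    {
        self::$connections++; // counts how many "connections" were opened
    }

    public function set(string $key, $value): void
    {
        $this->values[$key] = $value;
    }

    public function get(string $key)
    {
        return $this->values[$key] ?? null;
    }
}

$container = new LazyContainer();
$container->register('cache', fn(LazyContainer $c) => new ArrayCache());

$before = ArrayCache::$connections; // nothing has been constructed yet

// Processing side: keep the most recent value of a data point for a truck.
$container->get('cache')->set('truck42.battery_soc', 87);
// Web side: read the live value straight from the cache.
$value = $container->get('cache')->get('truck42.battery_soc');

$after = ArrayCache::$connections;  // exactly one instance was created
```

Because each service is described by a factory, adding a new lazily connected resource means registering another closure rather than adding another `elseif` branch.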

Real Time Information: Extrapolate and Assume
Our telemetry unit broadcasts each data point once per second
Data doesn’t change every second, e.g.
Battery state of charge may take several minutes to lose a percentage point
Fault flags only change to 1 when there is a fault
We compare the data to the last known value… if it’s the same we don’t insert; instead we assume it was the same
Unfortunately, this requires us to put additional checks and balances in place

Extrapolate and Assume: “Interlation”
Built a special library which:
Accepted a number of arrays, each representing a collection of data points for one variable on the truck
Used key indicators and time differences to work out if/when the truck was off, and extrapolation should stop
For each time data was recorded, pull down data for other variables for consistency

Interlace
    /** Add an array to the interlation */
    public function addArray( $name, $array )
    /** Get the time that we first receive data in one of our arrays */
    public function getFirst( $field )
    /** Get the time that we last received data in any of our arrays */
    public function getLast( $field )
    /** Generate the interlaced array */
    public function generate( $keyField, $valueField )
    /** Break the interlaced array down into separate days */
    public function dayBreak( $interlationArray )
    /** Generate an interlaced array and fill for all timestamps within the range of _first_ to _last_ */
    public function generateAndFill( $keyField, $valueField )
    /** Populate the new combined array with key fields using the common field */
    public function populateKeysFromField( $field, $valueField=null )
http://www.michaelpeacock.co.uk/interlation-library
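
The store-only-changes idea, and the fill step needed when reading the data back, can be sketched as below. This is not the Interlation library's code: the function names `compressSeries` and `fillSeries` are hypothetical, and the sketch ignores the harder part the slides mention, which is detecting when the truck was off so that extrapolation should stop. Here the caller simply supplies the end timestamp.

```php
<?php
/**
 * Keep only readings whose value differs from the previous reading.
 * $readings maps timestamp (seconds) => value and is ordered by timestamp.
 */
function compressSeries(array $readings): array
{
    $stored = [];
    $hasLast = false;
    $last = null;
    foreach ($readings as $timestamp => $value) {
        if (!$hasLast || $value !== $last) {
            $stored[$timestamp] = $value; // value changed: this one gets inserted
            $last = $value;
            $hasLast = true;
        }
    }
    return $stored;
}

/**
 * Rebuild a per-second series from change-only data by carrying the last
 * known value forward, from the first stored timestamp up to $until.
 */
function fillSeries(array $stored, int $until): array
{
    if ($stored === []) {
        return [];
    }
    ksort($stored);
    $first = array_keys($stored)[0];
    $filled = [];
    $current = null;
    for ($t = $first; $t <= $until; $t++) {
        if (array_key_exists($t, $stored)) {
            $current = $stored[$t];
        }
        $filled[$t] = $current; // assume unchanged since the last stored value
    }
    return $filled;
}

// Battery state of charge broadcast once per second, changing only once.
$readings = [100 => 87, 101 => 87, 102 => 87, 103 => 86, 104 => 86];
$stored = compressSeries($readings);
$filled = fillSeries($stored, 104);
```

In this example only two of the five readings are stored, and filling up to the last broadcast time reproduces the original series.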

Real Time Information: Single Request
Currently, each piece of “live data” is loaded into a Flash graph or widget, which updates every 30 seconds using an AJAX request
The move from MySQL to Memcache reduces database load, but the large number of requests still adds strain to the web server
Moving to image and JavaScript widgets, which are updated from a single AJAX request

Lots of Data: Race Conditions
Sessions in PHP close at the end of the execution cycle
Unpredictable query times
Large number of concurrent requests per screen

Session Locking
Completely locks out a user’s session, as PHP hasn’t closed the session

Race Conditions: PHP & Sessions
session_write_close()
Added after each write to the $_SESSION array. Closes the current session.
(Requires a call to session_start() immediately before any further reads or writes)
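
The pattern described above might be wrapped up along these lines. It is a sketch, not the project's code: `updateSession` is a hypothetical helper, and the explicit `session_save_path()` call is only there so the example can run from the command line. Note that the session can only be reopened if no output has been sent yet.

```php
<?php
// For running this sketch from the command line; a web application would
// normally rely on its configured session save path instead.
session_save_path(sys_get_temp_dir());

/**
 * Open the session, apply a change, then write and close immediately so the
 * session lock is released before any slow work starts.
 * (Hypothetical helper name.)
 */
function updateSession(callable $change): void
{
    if (session_status() !== PHP_SESSION_ACTIVE) {
        session_start();       // acquires the lock on this user's session
    }
    $change();                 // read or write $_SESSION while the lock is held
    session_write_close();     // persist the data and release the lock
}

updateSession(function () {
    $_SESSION['selected_truck'] = 42;
});

// Slow queries or report generation would run here without holding the lock,
// so other concurrent requests from the same user are not blocked.

updateSession(function () {
    $_SESSION['last_report'] = 'daily';
});
```

Each call holds the lock only for the duration of the callback, which keeps the many concurrent widget requests on one screen from queueing behind a single slow request.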
  • 58. Race Conditions: Use a ******* Template Engine
V1 of the system mixed PHP and HTML
  • 59. You can’t re-initialise your session once output has been sent
  • 60. All new code uses a template engine, so session interaction has no bearing on output. When the template is processed and output, all database and session work has been completed long before.
Race Conditions: Use a Single Entry Point
Race conditions are further exacerbated by the PHP timeout values
  • 61. Certain exports, actions and processes take longer than 30 seconds, so the default execution time limit is set longer than that
  • 62. Initially the project lacked a single entry point, and execution flow was muddled
  • 63. Single Entry Point makes it easier to enforce a lower time out, which is overridden by intensive controllers or models
Intensive queries & Calculations
How far did this vehicle travel?
  • 64. Motor RPM x Various vehicle specific constants
  • 65. Calculated for every RPM value held during drive process
  • 66. How much energy did the vehicle use?
  • 67. Battery Current x Battery Voltage x Time
  • 68. For every current and voltage value combination held during the driving process
  • 69. How well was the vehicle driven?
  • 71. Harshness of accelerator and brake pedal usage
  • 72. Inappropriate duration of AC / Heater on time?
  • 73. What about for a customer's fleet, or all of our vehicles sold?
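The energy calculation listed above (Battery Current x Battery Voltage x Time) can be sketched as a running sum over consecutive samples. The function name `energyUsedWh()`, the sample layout and the units (amps, volts, Unix timestamps in seconds) are assumptions made for illustration; the talk does not describe the real data format or code.

```php
<?php
// Hypothetical sample format: [unix timestamp (s), battery current (A), battery voltage (V)].
function energyUsedWh(array $samples): float
{
    $joules = 0.0;

    // Treat each reading as constant until the next one arrives.
    for ($i = 1, $n = count($samples); $i < $n; $i++) {
        [$prevTime, $current, $voltage] = $samples[$i - 1];
        $seconds = $samples[$i][0] - $prevTime;
        $joules += $current * $voltage * $seconds; // watts x seconds = joules
    }

    return $joules / 3600; // 3600 joules per watt-hour
}

// Three one-second samples (made-up values).
$wh = energyUsedWh([
    [0, 100.0, 300.0],
    [1, 120.0, 300.0],
    [2, 0.0, 300.0],
]);
```

Even this simple loop touches every recorded current and voltage pair for the period, which shows why the calculation becomes expensive across a whole fleet.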
  • 74. Intensive queries & Calculations
Involves a fair number of queries per vehicle
  • 75. Calculations involve holding this data in memory
  • 76. Processing required for every single record for that piece of data during that day
Takes a while!
Solution:
Calculate information overnight
  • 77. Save it as a compiled report
  • 78. Lookups and comparisons only need to look at the compiled / saved reports in the database
Reports
In addition to our calculated reports, we also need to export key bits of information to grant authorities
Initially our PHP based export scripts held one database connection per database (~400 databases)
  • 79. Re-wrote to maintain only one connection per server, and switch the database used
  • 80. Toggles to instruct the export to only apply for 1 of the servers at a time
  • 81. Modulus magic to run multiple export scripts per server
Triggers and Events
Currently a work-in-progress R&D project, evaluating two options:
Golden hammer: Use PHP
  • 82. Run PHP as a daemon
  • 84. Continually monitor for specific changes to memcache variables
  • 88. Link into PHP based API to run triggers
The Future
More sharding
  • 89. Based on time – keep the individual tables smaller
  • 91. Currently investigating NoSQL solutions as alternatives
  • 93. Do we need as much data as we collect?
  • 95. We need to continually abstract concepts and ideas to make on-going maintenance and expansion easier; especially in terms of mapping code to database shards
  • 97. Expand our DB cluster, more RAM, R&D
  • 99. A much needed design refresh
Conclusions
Make the solution scalable from the start
  • 100. Where data collection is critical, use a message queue, ideally hosted or “cloud based”
  • 101. Hire a genius DBA to push your database engine
  • 102. Make use of data caching systems to reduce strain on the database
  • 103. Calculations and post-processing should be done during dead time and automated
  • 104. Add more tools to your toolbox – PHP needs lots of friends in these situations
  • 105. Watch out for Session race conditions: where they can’t be avoided, use session_write_close, a template engine and a single entry point
  • 106. Reduce the number of continuous AJAX calls
Q & A
Michael Peacock
Web Systems Developer – Telemetry Team – Smith Electric Vehicles US [email protected] / Lead Developer, Author & [email protected]
www.michaelpeacock.co.uk
@michaelpeacock
http://joind.in/3808
http://www.slideshare.net/michaelpeacock
Extra information!

Editor's Notes

  • #16: Imagine viewing a customer's fleet of 30 vehicles on a map? 60 queries refreshing every 30 seconds