#CCCEU14
TROUBLESHOOTING APACHE CLOUDSTACK
#CCCEU14 
JORIS VAN LIESHOUT 
Working @ Schuberg Philis since 2010 
Mission Critical 
+3 year IaaS team 
Part of the initial CS vs OS 
Started with CloudStack 2.2.14
#CCCEU14 
Reading the logs 
Understanding System VMs 
Use the source, Luke! 
DB API, hands off? 
Employee Cloud 
Questions 
Side notes 
• Have you worked with ACS? 
• CloudStack 4.4.1 
• XenServer 6.2.0 SP1 
• Default log settings 
AGENDA
#CCCEU14 
Less is more <=> 
• less /var/log/cloudstack/management/management-server.log 
Keep management server ids at hand 
• select * from cloud.mshost where removed is null; 
Stack traces 
• Look at the first instead of the last 
Search for API key 
Log4j 1.2 EnhancedPatternLayout 
• /etc/cloudstack/management/log4j-cloud.xml 
READING THE LOGS
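The "look at the first instead of the last" advice can be sketched in a few lines of Python. This is a minimal sketch, assuming stack frames are printed in the usual Java form (an indented line starting with "at "); the function name and the sample lines are hypothetical and not taken from a real management-server.log.

```python
from typing import Optional, Sequence


def first_stack_trace(lines: Sequence[str]) -> Optional[str]:
    """Return the line that heads the first stack trace, or None if there is none."""
    for previous, current in zip(lines, lines[1:]):
        # Java stack frames look like "<whitespace>at package.Class.method(File.java:NN)".
        is_frame = current.lstrip().startswith("at ")
        previous_is_frame = previous.lstrip().startswith("at ")
        if is_frame and not previous_is_frame:
            # The line right before the first frame names the exception.
            return previous.rstrip()
    return None


# Illustrative lines only (assumption): two failures, the second caused by the first.
SAMPLE = [
    "2014-11-19 10:00:01,000 ERROR [c.c.e.Example] (Thread-1:ctx-aaaa1111) First failure",
    "java.lang.IllegalStateException: root cause",
    "\tat com.example.Foo.bar(Foo.java:10)",
    "2014-11-19 10:00:02,000 ERROR [c.c.e.Example] (Thread-2:ctx-bbbb2222) Follow-up failure",
    "java.lang.RuntimeException: consequence",
    "\tat com.example.Baz.qux(Baz.java:20)",
]

if __name__ == "__main__":
    print(first_stack_trace(SAMPLE))
```

Reading from the top of the file rather than tailing it is what makes the earlier, usually more informative, trace show up first.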
#CCCEU14 
READING THE LOGS 
(Date, Time), Log priority 
Class 
• Will match a .java file in the source
#CCCEU14 
READING THE LOGS 
Thread Name 
Thread context 
Optional Job id
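The fields named on the last two slides (date and time, priority, class, thread name, thread context, optional job id) can be pulled out of a line with one regular expression. This is a sketch under assumptions: the layout "timestamp PRIORITY [class] (thread:context) message" is inferred from the slides and may differ if log4j-cloud.xml was changed, and the sample line is made up for illustration.

```python
import re
from typing import Dict, Optional

# Assumed layout: "<date> <time> <PRIORITY> [<class>] (<thread>:<context>) <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<priority>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\] "
    r"\((?P<thread>[^:)]+):(?P<context>[^)]*)\) "
    r"(?P<message>.*)$"
)
JOB_ID = re.compile(r"\bjob-(\d+)\b")


def parse_line(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Split one log line into its fields; return None for lines that do not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    fields: Dict[str, Optional[str]] = dict(match.groupdict())
    # The job id is optional and, when present, is part of the thread context.
    job = JOB_ID.search(match.group("context"))
    fields["job_id"] = job.group(1) if job else None
    return fields


# Hypothetical example line, not copied from a real log.
SAMPLE = (
    "2014-11-19 14:05:12,345 DEBUG [c.c.a.ApiServlet] "
    "(API-Job-Executor-12:ctx-1a2b3c4d job-1234) Executing job"
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Lines that belong to a stack trace do not match the pattern and come back as None, which makes them easy to skip or to attach to the preceding entry.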
#CCCEU14 
READING THE LOGS 
When forwarding commands 
• Host_id-Sequence_nr 
• MgmtId 
• Can be the other server 
• Via 
• Command
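The forwarding fields listed above can be extracted in the same way. This is a sketch that assumes the message contains "Seq <host_id>-<sequence_nr>:", "MgmtId: <id>" and "via: <id>" in that spelling; the sample message and all values in it are invented for illustration.

```python
import re
from typing import Dict, Optional

# Assumed spellings of the fields inside a forwarded-command message.
SEQ = re.compile(r"Seq (?P<host_id>\d+)-(?P<sequence>\d+): (?P<action>\w+)")
MGMT = re.compile(r"MgmtId: (?P<mgmt_id>\d+)")
VIA = re.compile(r"via: (?P<via>\d+)")


def parse_forward(message: str) -> Optional[Dict[str, Optional[str]]]:
    """Extract host id, sequence number, action, MgmtId and via from a message."""
    seq = SEQ.search(message)
    if seq is None:
        return None
    result: Dict[str, Optional[str]] = dict(seq.groupdict())
    mgmt = MGMT.search(message)
    via = VIA.search(message)
    result["mgmt_id"] = mgmt.group("mgmt_id") if mgmt else None
    result["via"] = via.group("via") if via else None
    return result


# Hypothetical message; ids and host name are made up.
SAMPLE = (
    "Seq 7-1208352781: Sending  { Cmd , MgmtId: 345049296663, "
    "via: 7(xenserver01), Ver: v1, Flags: 100011 }"
)

if __name__ == "__main__":
    print(parse_forward(SAMPLE))
```

Comparing the extracted MgmtId with the ids from cloud.mshost shows whether the command was handled by the other management server.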
#CCCEU14 
READING THE LOGS 
Find the first call 
• API call 
• API key 
• Name 
Thread name and first context 
Async creates a job
#CCCEU14 
READING THE LOGS 
Find the first call 
Thread name and first context 
Might create a job 
Picked up by…
#CCCEU14 
READING THE LOGS 
Thread name and first context 
Sending… 
• Track sequence id 
• Keep an eye on MgmtID 
Executing…
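Following a job from the first call to "Sending…" and "Executing…" comes down to collecting every line that mentions the job id, whatever thread wrote it. This is a minimal sketch, assuming job ids appear as "job-<number>" in the thread context; the function name and sample lines are hypothetical.

```python
import re
from typing import Iterable, List


def trace_job(lines: Iterable[str], job_id: int) -> List[str]:
    """Return all lines that mention job-<job_id>, in their original order."""
    # Word boundaries keep job-12 from also matching job-123.
    token = re.compile(r"\bjob-%d\b" % job_id)
    return [line for line in lines if token.search(line)]


# Illustrative lines only (assumption): one job handled by two different threads.
SAMPLE = [
    "(http-exec-1:ctx-aaaa1111) submit async job-12",
    "(API-Job-Executor-3:ctx-bbbb2222 job-12) Executing",
    "(API-Job-Executor-4:ctx-cccc3333 job-123) Executing",
    "(Work-Job-Executor-9:ctx-dddd4444 job-12/job-13 ctx-eeee5555) Sending",
]

if __name__ == "__main__":
    for line in trace_job(SAMPLE, 12):
        print(line)
```

When a job jumps to the other management server, the same function has to be run over that server's log as well and the results merged by timestamp.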
#CCCEU14 
UNDERSTANDING SYSTEM VMS 
ssh to SVM 
• From hypervisor 
• Port 3922 
• /root/.ssh/id_rsa.cloud 
xensource.log, SMlog and cloud/vmops.log 
XAPI call to vmops plugin
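The ssh details on this slide combine into a single command run on the hypervisor. The sketch below only builds the argument list; the root login and the example address are assumptions, and the real link-local address has to be looked up in ACS.

```python
from typing import List

SYSTEM_VM_SSH_PORT = 3922                   # port from the slide
SYSTEM_VM_KEY = "/root/.ssh/id_rsa.cloud"   # key path from the slide


def system_vm_ssh_command(link_local_ip: str, user: str = "root") -> List[str]:
    """Build the ssh argument list for logging in to a system VM from its hypervisor.

    The default user "root" is an assumption, not stated on the slide.
    """
    return [
        "ssh",
        "-i", SYSTEM_VM_KEY,
        "-p", str(SYSTEM_VM_SSH_PORT),
        "%s@%s" % (user, link_local_ip),
    ]


if __name__ == "__main__":
    # 169.254.0.10 is a placeholder address (assumption).
    print(" ".join(system_vm_ssh_command("169.254.0.10")))
```

Returning a list rather than a string means the result can be passed to subprocess.run without shell quoting concerns.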
#CCCEU14 
USE THE SOURCE, LUKE! 
GitHub 
• https://github.com/apache/cloudstack 
Eclipse for Java EE 
• https://cwiki.apache.org/confluence/display/CLOUDSTACK/Using+Eclipse+With+CloudStack 
DevCloud4 
• https://github.com/imduffy15/devcloud4
#CCCEU14 
Read only! 
Unless… 
• After code review 
• Bug solution also changes db 
Why? 
• Change state => data mismatch 
• Incorrect value => ACS start fails 
• DB data is not always leading 
• Effects of DB change can stick around 
Marvin and CloudMonkey 
• And find a way out without DB change 
DB API, HANDS OFF!
#CCCEU14 
EMPLOYEE CLOUD 
Realistic workload 
Use as UAT environment 
To reproduce bugs and workarounds 
Technology test ground
#CCCEU14 
jvanlieshout@schubergphilis.com 
@JorizvL 
QUESTIONS?

Troubleshooting Apache CloudStack at #ccceu14 by @jorizvl


Editor's Notes

  • #3: Everyone has seen <screenshot>. Troubleshooting can be daunting. <CLICK> Is it an infra issue, a bug, or something else? Quick poll: Dev, Ops, other? Today's talk: a fair share of outages (boot launch failed), from an operational perspective. Some bugs only appear in an environment with PRD workload. With this: a better DevOps relationship.
  • #4: As MCE (Mission Critical Engineer). CloudStack vs OpenStack: easily won by CS.
  • #5: Reading the logs: a bit much text, but understanding them will help a lot. Rohit: life of an API call. How to see that commands are forwarded. System VMs: check out Sten's presentation. Use the source code and dev tools. DB: a common discussion, this is my take on it. Employee cloud: the best tip I can give you. Questions: if there is time. <CLICK> Side notes: ask who has worked with ACS. Based on DevCloud4 (4.4.1 and XS62ESP1). We never had the need to adjust the log4j config.
  • #6: Use less instead of grep or tail, or something like Logstash or Splunk. Keep management server ids at hand for when a command gets forwarded to the other management server, and host ids as well (including CVM and SSVM). The last stack trace is frequently the result of an earlier fault. Search for instance name, network name, API call or API key. API key: trace what a user did, for instance Citrix Studio. *Browser plugin to see calls: Firefox, Firebug.
  • #7: Breakdown of a log line. Date and time are removed in all examples. Debug is the default. Class (abbreviated as of 4.4) matches a source file in git.
  • #8: Name and number: there can be many, for instance a DirectAgent thread. ID: unique per cycle. Usually search for this combination. For async calls there is a job id to track the work across threads. We'll look at this in a bit.
  • #9: Forwarding commands to other hosts: hosts, CVMs, SSVMs. The host_id can be found in the DB (cloud.host); seq is for tracking. MgmtId might be the other server when the host is managed by that server. Via is the same as host_id; as of 4.4 this is expanded to the full name.
  • #10: Finding the call: search for the API call, the API key, or the name of an instance, network, template, etc., between ===START=== and ===END===. Thread name and context can be seen as a conversation. An async call will create a job for async execution.
  • #11: Search for the job id. Again search for thread name and context. Repeat for the next job; it can jump management server, with a different thread and context.
  • #12: In a conversation, a task for a host is sent to a thread. It is picked up by a thread and executed; this can jump management server. In the case of XenServer it results in XAPI calls.
  • #13: Very brief on SVMs: check out the next talk by Sten. This applies to XenServer. Get the control IP from ACS (169.254.0.0). Which log to read depends on the call. Calls for SVMs use vmops.
  • #14: Have a look at GitHub; a clone is essential. Eclipse is recommended. DevCloud4 is good to have.
  • #15: As a read-only source the DB is really valuable, although most info is available using the API. Be really careful. Unless: know how the code responds to a change in the DB, and stay in line with the fix in the code. State change => example: NIC not removed, reference tables. Incorrect state => example: instance state Expunged, network state Destroyed. Host ping => the DB is a backup. Instance removed but NIC still there => network will never return to Allocated. Marvin and CloudMonkey: both very powerful. Example => host in alert with ping timeout: unmanage the cluster.
  • #16: Load bugs, race conditions, and also hardware load. Not only ACS upgrades but also other components: hypervisors, storage, networking. Tested the solution for the snapshot Dom0 load bug. Adoption of cloud tech: in use for Chef, Graphite, new OS, many PoCs. The employee rack is now the employee cloud => lower power consumption.