1
z/VM Performance Analysis
Lívio Sousa - livios@br.ibm.com
IBM zEnterprise Client Technical Specialist
2
Overview
• Guidelines
• Commands
• *MONITOR
• Performance Toolkit
• Omegamon XE
3
Definition of Performance
• Performance definitions:
– Response time
– Batch elapsed time
– Throughput
– Resource consumed per unit of work done
– Utilization
– Users supported
– Phone ringing
– Consistency
• All of the above
4
Performance Guidelines
• Processor
• Storage
• Paging
• Minidisk cache
• Server machines
5
Processor Guidelines
• Dedicated processors - mostly political
– Absolute share can be almost as effective
– Gets wait state assist and 500 ms minor time slice
– Perhaps not a good idea if you are CPU-constrained
– A virtual machine should have all dedicated or all shared processors
• Share settings
– Use absolute if you can judge percent of resources required
– Use relative if the requirement is difficult to judge and a lower share as system
load increases is acceptable
– Be aware that the share value is split across the guest's virtual CPUs
– Do not use LIMITHARD settings unnecessarily
• Masks looping users
• More scheduler overhead
• Use the right number of virtual processors for the guest's workload
• Don't share all available IFLs with all LPARs
– Suspend Time can be high
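The note that the share value is split by vCPUs can be illustrated with a little arithmetic. The following is a minimal sketch, assuming the split is even across the guest's virtual CPUs; the function name and the sample share value of 100 are illustrative and do not come from the slides.

```python
def share_per_vcpu(share_value: float, virtual_cpus: int) -> float:
    """Return the portion of a guest's SHARE value that lands on each virtual CPU.

    Assumes an even split across virtual CPUs (an assumption of this sketch).
    """
    if virtual_cpus < 1:
        raise ValueError("a guest needs at least one virtual CPU")
    return share_value / virtual_cpus


# Hypothetical guests with the same share value but different vCPU counts.
for vcpus in (1, 2, 4):
    print(f"{vcpus} vCPU(s): {share_per_vcpu(100, vcpus):.1f} per vCPU")
```

Under that assumption, adding virtual processors without raising the share setting dilutes what each one receives, which is one reason to define only as many virtual processors as the workload needs.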
6
Storage Guidelines
• Virtual-to-real ratio should be <= 3:1; otherwise make sure the paging system is robust
– To avoid any performance impact for production workloads, you may need to keep the ratio at 1:1
– See also http://www.vm.ibm.com/perf/tips/memory.html
– VIR2REAL EXEC (Bruce Hayden)
http://www.vm.ibm.com/download/packages/descript.cgi?VIR2REAL
• Define some processor storage as expanded storage to provide a paging hierarchy
– For more background, see http://www.vm.ibm.com/perf/tips/storconf.html
• Size guests appropriately
– Avoid overprovisioning
– Do not put them in a high guest paging position
– Right-sized usually means "just barely swapping"
• Exploit shared memory where possible
– IPL your Linux guests from a segment
– Use the Linux XIP (execute-in-place) file system
Sample VIR2REAL output:
Total Virtual storage (all logged on userids): 388308 MB (379.2 GB)
Usable real storage (pageable) for this system: 202927 MB (198.2 GB)
Total LPAR Real storage: 204800 MB (200.0 GB)
Expanded storage usable for paging: 25600 MB ( 25.0 GB)
Total Virtual disk (VDISK) space defined: 50176 MB ( 49.0 GB)
Average Virtual disk size: 512 MB
Virtual + VDISK to Real storage ratio: 2.2 : 1
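The ratio on the last line of the output above can be reproduced by hand. This is a small sketch, assuming the denominator is the usable (pageable) real storage figure; that choice matches the reported 2.2 : 1, but the slide does not spell out the formula, and the variable names are illustrative.

```python
# Figures copied from the sample output above, in MB.
TOTAL_VIRTUAL_MB = 388308
TOTAL_VDISK_MB = 50176
USABLE_REAL_MB = 202927


def virtual_to_real_ratio(virtual_mb: float, vdisk_mb: float, real_mb: float) -> float:
    """Return (virtual + VDISK) storage divided by real storage."""
    return (virtual_mb + vdisk_mb) / real_mb


ratio = virtual_to_real_ratio(TOTAL_VIRTUAL_MB, TOTAL_VDISK_MB, USABLE_REAL_MB)
print(f"Virtual + VDISK to real storage ratio: {ratio:.1f} : 1")

# Guideline from this slide: stay at or below 3:1 unless paging is robust.
print("within 3:1 guideline" if ratio <= 3.0 else "above 3:1 guideline")
```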
7
Paging Guidelines
• DASD paging allocations less than or equal to 50%
– QUERY ALLOC PAGE
• Watch blocks read per paging request (keep >10)
– Long block runs make paging I/O efficient
• Multiple volumes and multiple paths
– Remember, one I/O per real device at a time
– Use lots of little volumes rather than a few big volumes
– Pay attention to response time and wait queues
• Do not mix sizes of paging DASD
– Use all -3s, or all -9s, or whatever
• Paging to FCP SCSI (EDEVICES) may offer higher paging bandwidth with
higher processor requirements
– See also http://www.vm.ibm.com/perf/tips/prgpage.html
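The two numeric guidelines on this slide (page space at most 50% allocated, more than 10 blocks read per paging request) lend themselves to a simple check. The sketch below is illustrative only: the function name and the sample inputs are assumptions, and the values would have to be taken from QUERY ALLOC PAGE output or monitor data by hand or by another tool.

```python
from typing import List


def check_paging(page_alloc_pct: float, blocks_per_read: float) -> List[str]:
    """Return warnings for the two numeric paging guidelines on this slide."""
    warnings = []
    if page_alloc_pct > 50:
        warnings.append(
            f"page space is {page_alloc_pct:.0f}% allocated; guideline is <= 50%"
        )
    if blocks_per_read <= 10:
        warnings.append(
            f"{blocks_per_read:.1f} blocks per paging read; guideline is > 10"
        )
    return warnings


# Hypothetical measurements for two systems.
print(check_paging(25, 14))   # no warnings expected
print(check_paging(62, 7.5))  # both guidelines violated
```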
8
Reorder Processing - Background
• Page reorder is the process in z/VM of managing
user frame owned lists as input to demand scan
processing.
– It includes resetting the HW reference bit.
– Serializes the virtual machine (all virtual processors).
– In all releases of z/VM
• It is done periodically on a virtual machine basis.
• The cost of reorder is proportional to the number
of resident frames for the virtual machine.
– Roughly 130 ms/GB resident
– Delays of ~1 second for a guest with 8 GB resident
– This can vary by +/- 40% for various reasons
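The rule of thumb above is easy to turn into an estimate. A minimal sketch, using only the slide's figures (roughly 130 ms per resident GB, varying by about 40% either way); the function and constant names are illustrative.

```python
MS_PER_RESIDENT_GB = 130  # "roughly 130 ms/GB resident" from the slide
VARIATION = 0.40          # "+/- 40%" from the slide


def reorder_delay_ms(resident_gb: float):
    """Return (low, nominal, high) estimates of one reorder pass, in milliseconds."""
    nominal = MS_PER_RESIDENT_GB * resident_gb
    return nominal * (1 - VARIATION), nominal, nominal * (1 + VARIATION)


low, nominal, high = reorder_delay_ms(8)
print(f"8 GB resident: about {nominal:.0f} ms (range {low:.0f} to {high:.0f} ms)")
```

For 8 GB resident the nominal value is 1040 ms, which matches the "~1 second" figure quoted on the slide.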
9
Reorder Processing - Diagnosing
Performance Toolkit
– Check resident page fields (“R<2GB” & “R>2GB”) on
FCX113 UPAGE report
• Remember, Reorder works against the resident pages, not total
virtual machine size.
– Check Console Function Mode Wait (“%CFW”) on
FCX114 USTAT report
• A virtual machine may be brought through console function mode to
serialize Reorder. There are other ways to serialize for Reorder and
there are other reasons for CFW, so this is not conclusive.
REORDMON
– Available from the VM Download Page
http://www.vm.ibm.com/download/packages/
– Works against raw MONWRITE data for all monitored
virtual machines
– Works in real time for a specific virtual machine
– Reports how often Reorder processing occurs in each
monitor interval
10
REORDMON Example
         Num. of    Average   Average
Userid   Reorders Rsdnt(MB) Ref'd(MB) Reorder Times
-------- -------- --------- --------- -------------------
LINUX002        2     18352     13356 13:29:05 14:15:05
LINUX001        1     22444      6966 13:44:05
LINUX005        1     14275      5374 13:56:05
LINUX003        2     21408     13660 13:43:05 14:10:05
LINUX007        1     12238      5961 13:51:05
LINUX006        1      9686      4359 13:31:05
LINUX004        1     21410     11886 14:18:05
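Output like the above can be inspected programmatically. The following sketch parses a few of the rows shown and lists the guests by resident size, since the reorder cost scales with resident frames. It assumes whitespace-separated columns in the order shown on the slide; the parser and its names are illustrative and are not part of REORDMON.

```python
from typing import List, NamedTuple


class ReorderRow(NamedTuple):
    userid: str
    reorders: int
    resident_mb: int
    referenced_mb: int
    times: List[str]


def parse_reordmon(text: str) -> List[ReorderRow]:
    """Parse data rows: a userid, three integers, then one or more timestamps."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headings and separators: data rows have integers in columns 2-4.
        if len(fields) < 5 or not all(f.isdigit() for f in fields[1:4]):
            continue
        rows.append(
            ReorderRow(fields[0], int(fields[1]), int(fields[2]),
                       int(fields[3]), fields[4:])
        )
    return rows


# Three rows copied from the example above.
SAMPLE = """\
Userid Reorders Rsdnt(MB) Ref'd(MB) Reorder Times
LINUX002 2 18352 13356 13:29:05 14:15:05
LINUX001 1 22444 6966 13:44:05
LINUX003 2 21408 13660 13:43:05 14:10:05
"""

rows = parse_reordmon(SAMPLE)
by_size = sorted(rows, key=lambda r: r.resident_mb, reverse=True)
for row in by_size:
    print(f"{row.userid}: {row.resident_mb} MB resident, {row.reorders} reorder(s)")
```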
11
Reorder Processing - Mitigations
• Try to keep the virtual machine as small as possible.
• Virtual machines with multiple applications may need to be
split into multiple virtual machines with fewer applications.
• See http://www.vm.ibm.com/perf/tips/reorder.html for more
details.
• Apply APAR VM64774 if necessary:
– SET and QUERY commands, system wide settings
– Corrects problem in earlier “patch” solution that inhibits paging of
PGMBKs (Page Tables) for virtual machines where Reorder is set
off.
– z/VM 5.4.0 PTF UM33167 RSU 1003
– z/VM 6.1.0 PTF UM33169 RSU 1003
12
Minidisk Cache Guidelines
• In general, enable MDC for everything
• Configure some real storage for MDC
• Set maximum MDC limits
– SET MDC STOR 0M 256M and SET MDC XSTOR 0M 0M
• Disable MDC for
– Write-mostly or read-once disks (logs, accounting, Linux swap)
– Target volumes in backup scenarios
• Better performer than Virtual Disk in Storage (VDISK) for read I/Os
13
Server Machine Guidelines
• Server Virtual Machine (SVM)
• TCP/IP, RACFVM, etc.
• QUICKDSP ON to avoid eligible list
• Higher SHARE setting
• Ensure performance data includes these virtual
machines
14
CP INDICATE Command
• LOAD: shows total system load
– Processors, XSTORE, paging, MDC, queue lengths, storage load
– STORAGE value not very meaningful
• USER EXP: more useful than plain USER
• QUEUES EXP: great for scheduler problems and quick state sampling
– Mostly useful for eligible list assessments
• PAGING: lists users in page wait
• I/O: lists users in I/O wait
• ACTIVE: displays number of active users over given interval
• Consider using MONITOR DATA instead for "serious" examinations
15
CP INDICATE LOAD Example
INDICATE LOAD
AVGPROC-088% 03
XSTORE-000000/SEC MIGRATE-0000/SEC
MDC READS-000035/SEC WRITES-000001/SEC HIT RATIO-099%
PAGING-0023/SEC STEAL-000%
Q0-00007(00000) DORMANT-00410
Q1-00000(00000) E1-00000(00000)
Q2-00001(00000) EXPAN-002 E2-00000(00000)
Q3-00013(00000) EXPAN-002 E3-00000(00000)
PROC 0000-087% PROC 0001-088%
PROC 0002-089%
LIMITED-00000
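For quick sampling it can help to pull a few numbers out of this response. The sketch below extracts the average processor utilization, paging rate, MDC hit ratio, and per-processor utilization with regular expressions. It assumes the response layout shown on this slide; the field names in the result are invented for this example, and reading the processor address as hexadecimal is also an assumption.

```python
import re

# Part of the INDICATE LOAD response shown above.
SAMPLE = """\
AVGPROC-088% 03
XSTORE-000000/SEC MIGRATE-0000/SEC
MDC READS-000035/SEC WRITES-000001/SEC HIT RATIO-099%
PAGING-0023/SEC STEAL-000%
PROC 0000-087% PROC 0001-088%
PROC 0002-089%
"""


def parse_indicate_load(text: str) -> dict:
    """Extract a few fields from an INDICATE LOAD response."""

    def grab(pattern: str):
        match = re.search(pattern, text)
        return int(match.group(1)) if match else None

    return {
        "avgproc_pct": grab(r"AVGPROC-(\d+)%"),
        "paging_per_sec": grab(r"PAGING-(\d+)/SEC"),
        "mdc_hit_pct": grab(r"HIT RATIO-(\d+)%"),
        # "PROC nnnn-ppp%": processor address (assumed hex) and its utilization.
        "proc_pct": {
            int(addr, 16): int(pct)
            for addr, pct in re.findall(r"PROC ([0-9A-F]{4})-(\d+)%", text)
        },
    }


load = parse_indicate_load(SAMPLE)
print(load)
```

As the previous slide notes, monitor data is the better source for serious examinations; a parser like this is only suited to quick state sampling.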
16
Selected CP QUERY Commands
USERS: number and type of users on system
SRM: scheduler/dispatcher settings (LDUBUF, etc.)
SHARE: type and intensity of system share
FRAMES: real storage allocation
PATHS: physical paths to device and status
ALLOC MAP: DASD allocation
ALLOC PAGE: how full your paging space is
XSTORE: assignment of expanded storage
MONITOR: current monitor settings
MDC: MDC usage
VDISK: virtual disk in storage usage
SXSPAGES: System Execution Space
17
5,000 Foot View
[Diagram. Data sources: CP control blocks, application data, VM events. Collection: *MONITOR System Service, MONDCSS segment. Consumers: MONWRITE utility (raw MONWRITE history files), Performance Toolkit (3270 and browser access over the TCP/IP network), VMRM.]
18
19
Processor
REPORT NAME                REPORT CODE  COMMAND
CPU Load and Transactions  FCX100       CPU
LPAR Load                  FCX126       LPAR
Processor Log              FCX144       PROCLOG
LPAR Load Log              FCX202       LPARLOG
User Wait States           FCX114       USTAT
System Summary             FCX225       SYSSUMLG
20
FCX126 Run 2011/09/20 18:00:56 Logical Partition Activity
From 2011/09/13 09:19:15 To 2011/09/13 10:09:15
For 3000 Secs 00:50:00 Result of 13092011 Run
__________________________________________________________________________________________
Processor type and model : 2817-401
Nr. of configured partitions: 6
Nr. of physical processors : 25
Partition Nr. Upid #Proc Weight Wait-C Cap %Load CPU %Busy %Ovhd %Susp %VMld %Logld Type
LPAR1 1 00 24 100 NO NO 89.0 0 94.3 2.1 6.5 92.0 98.4 IFL
100 NO 1 93.4 2.4 7.7 90.8 98.3 IFL
100 NO 2 93.6 2.3 7.4 91.1 98.3 IFL
100 NO 3 93.6 2.4 7.5 91.1 98.4 IFL
100 NO 4 93.6 2.3 7.4 91.1 98.4 IFL
100 NO 5 93.5 2.3 7.5 91.0 98.3 IFL
100 NO 6 93.4 2.4 7.6 90.9 98.3 IFL
100 NO 7 93.2 2.4 7.7 90.6 98.1 IFL
100 NO 8 93.4 2.4 7.5 90.8 98.2 IFL
100 NO 9 93.2 2.4 7.7 90.7 98.2 IFL
100 NO 10 93.1 2.5 7.8 90.4 98.0 IFL
100 NO 11 93.2 2.4 7.7 90.6 98.0 IFL
100 NO 12 93.4 2.4 7.5 90.8 98.1 IFL
100 NO 13 93.3 2.3 7.5 90.8 98.1 IFL
100 NO 14 93.3 2.4 7.5 90.7 98.1 IFL
100 NO 15 93.2 2.5 7.6 90.5 97.9 IFL
100 NO 16 91.1 2.9 9.0 88.0 96.6 IFL
100 NO 17 91.3 2.8 8.8 88.2 96.7 IFL
100 NO 18 91.4 2.9 8.9 88.3 96.8 IFL
100 NO 19 91.5 2.7 8.8 88.5 97.0 IFL
100 NO 20 91.7 2.8 8.7 88.6 97.1 IFL
100 NO 21 91.5 2.8 8.9 88.5 97.1 IFL
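The %Susp column is the one to watch when IFLs are shared across many LPARs. As a sketch, the per-processor rows of this report can be parsed by counting fields from the right, because the first row carries extra partition columns on the left. The column order is taken from the heading above; the parser itself is illustrative and is not part of Performance Toolkit.

```python
from typing import List

# Four rows copied from the report above (CPUs 0, 1, 16 and 21).
SAMPLE_ROWS = """\
Partition Nr. Upid #Proc Weight Wait-C Cap %Load CPU %Busy %Ovhd %Susp %VMld %Logld Type
LPAR1 1 00 24 100 NO NO 89.0 0 94.3 2.1 6.5 92.0 98.4 IFL
100 NO 1 93.4 2.4 7.7 90.8 98.3 IFL
100 NO 16 91.1 2.9 9.0 88.0 96.6 IFL
100 NO 21 91.5 2.8 8.9 88.5 97.1 IFL
"""


def parse_cpu_rows(text: str) -> List[dict]:
    """Read CPU, %Busy, %Ovhd, %Susp, %VMld and %Logld from each processor row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue
        try:
            # The last seven fields are: CPU, five percentages, processor type.
            cpu = int(fields[-7])
            busy, ovhd, susp, vmld, logld = (float(f) for f in fields[-6:-1])
        except ValueError:
            continue  # heading or separator line
        rows.append({"cpu": cpu, "busy": busy, "ovhd": ovhd,
                     "susp": susp, "vmld": vmld, "logld": logld})
    return rows


cpu_rows = parse_cpu_rows(SAMPLE_ROWS)
worst = max(cpu_rows, key=lambda r: r["susp"])
print(f"Highest suspend time: CPU {worst['cpu']} at {worst['susp']}%")
```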
21
FCX225 Run 2011/09/20 18:00:56 SYSSUMLG
System Performance Summary by Time
From 2011/09/13 09:19:15
To 2011/09/13 10:09:15
For 3000 Secs 00:50:00 Result of 13092011 Run
_________________________________________________________________________________
<------- CPU --------> <Vec> <--Users--> <---I/O---> <Stg> <-Paging-->
<--Ratio--> SSCH DASD Users <-Rate/s-->
Interval Pct Cap- On- Pct Log- +RSCH Resp in PGIN+ Read+
End Time Busy T/V ture line Busy ged Activ /s msec Elist PGOUT Write
>>Mean>> 90.0 1.10 .9293 24.0 .... 117 108 571.2 .4 .0 2610 1051
09:20:15 92.4 1.13 .9059 24.0 .... 117 108 523.0 .5 .0 1992 527.8
09:21:15 92.9 1.07 .9523 24.0 .... 117 108 399.2 .5 .0 1669 301.4
09:22:15 93.2 1.08 .9458 24.0 .... 117 108 557.4 .3 .0 2817 633.9
09:23:15 94.5 1.07 .9535 24.0 .... 117 108 590.3 .3 .0 1410 482.7
09:24:15 93.4 1.07 .9537 24.0 .... 117 108 649.5 .4 .0 2363 488.5
09:25:15 90.4 1.09 .9347 24.0 .... 117 108 684.7 .4 .0 2485 768.9
09:26:15 92.4 1.08 .9436 24.0 .... 117 108 666.8 .4 .0 2940 1215
09:27:15 90.9 1.09 .9344 24.0 .... 117 108 607.2 .4 .0 3179 726.7
09:28:15 92.2 1.08 .9469 24.0 .... 117 108 664.2 .5 .0 2179 896.0
09:29:17 90.8 1.10 .9318 24.0 .... 117 108 645.9 .6 .0 3404 804.5
09:30:16 89.5 1.19 .8579 24.0 .... 117 108 670.8 .7 .0 5402 3487
09:31:15 92.7 1.08 .9412 24.0 .... 117 108 588.7 .4 .0 3091 1807
09:32:15 91.2 1.09 .9421 24.0 .... 117 108 602.8 .3 .0 2635 1076
09:33:16 89.3 1.14 .9047 24.0 .... 117 108 255.2 .5 .0 3140 710.5
09:34:15 88.5 1.10 .9374 24.0 .... 117 108 205.2 .6 .0 2513 897.4
09:35:15 85.9 1.12 .9257 24.0 .... 117 108 320.4 .5 .0 3117 953.5
09:36:16 86.1 1.13 .9144 24.0 .... 117 108 213.5 .5 .0 3642 1144
09:37:16 83.0 1.14 .9090 24.0 .... 117 108 245.6 .5 .0 3414 2133
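The T/V column can be read as a measure of CP overhead. The sketch below assumes T/V is total processor time divided by virtual (guest) processor time, so the fraction of total time spent in CP rather than in guests is 1 - 1/(T/V). That reading is an assumption of this example, not something stated on the slide.

```python
def cp_overhead_fraction(tv_ratio: float) -> float:
    """Return the share of total CPU time not spent running guests.

    Assumes T/V = total time / virtual time, so overhead = 1 - V/T.
    """
    if tv_ratio < 1:
        raise ValueError("T/V ratio cannot be below 1")
    return 1 - 1 / tv_ratio


# T/V values taken from the report above.
for label, tv in (("mean", 1.10), ("09:21:15", 1.07), ("09:30:16", 1.19)):
    print(f"{label}: T/V {tv:.2f} -> {cp_overhead_fraction(tv):.1%} outside guests")
```

In this sample, the 09:30:16 interval has both the highest T/V and the highest paging rate.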
22
Storage
REPORT NAME                     REPORT CODE  COMMAND
Auxiliary Storage Log           FCX146       AUXLOG
CP Owned Device                 FCX109       DEVICE CPOWNED
User Page Data                  FCX113       UPAGE
Shared Data Spaces              FCX134       DSPACESH
SXS Available Page Queues Mgnt  FCX261       SXSAVAIL
Mini Disk Storage               FCX178       MDCSTOR
Storage Utilization             FCX103       STORAGE
Available List Log              FCX254       AVAILLOG
23
FCX109 Run 2011/05/31 17:44:26 DEVICE CPOWNED
Load and Performance of CP Owned Disks
From 2011/05/12 16:48:41 To 2011/05/12 17:31:41
For 2580 Secs 00:43:00 Result of 20110512 Run
_______________________________________________________________________________
Page / SPOOL Allocation Summary
PAGE slots available 25165k SPOOL slots available 3605928
PAGE slot utilization 25% SPOOL slot utilization 65%
T-Disk cylinders avail. ....... DUMP slots available 0
T-Disk space utilization ...% DUMP slot utilization ..%
. . . . . . . . . . _____ . .<
Device Descr. -> <------------- Rate/s -------------> User Serv MLOAD
Volume Area Area Used <--Page---> <--Spool--> SSCH Inter Queue Time Resp
Addr Devtyp Serial Type Extent % P-Rds P-Wrt S-Rds S-Wrt Total +RSCH feres Lngth /Page Time
EDF1 9336 ZDPAG1 PAGE 12583k 25 196.5 199.9 ... ... 396.4 .0 0 8.18 5.5 88.0
EDF2 9336 ZDPAG2 PAGE 12583k 24 194.2 206.1 ... ... 400.4 .0 0 7.23 6.0 58.4
4374 3390 610SP1 SPOOL 802880 61 .0 .0 .0 .0 .0 .1 0 0 .4 .4
4672 3390 610SP2 SPOOL 803060 68 .0 .0 .0 .0 .0 .0 0 0 1.0 1.0
24
I/O
REPORT NAME                  REPORT CODE  COMMAND
General I/O Device           FCX108       DEVICE
SCSI Device                  FCX249       SCSI
DASD Performance Log         FCX131       DEVCONF
FICON Channel Load           FCX215       INTERIM FCHANNEL
General I/O Device Data Log  FCX168       DEVLOG
I/O Processor Log            FCX232       IOPROCLG
25
Studying MONWRITE Data
• z/VM Performance Toolkit
• Interactively – possible, but not so useful
• PERFKIT BATCH command – pretty useful
– Control files tell Perfkit which reports to produce
– You can then inspect the reports by hand or
programmatically
• See z/VM Performance Toolkit Reference for
information on how to use PERFKIT BATCH
• PRFIT (Brian Wade)
http://www.vm.ibm.com/download/packages/descript.cgi?PRFIT
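As an illustration of inspecting reports programmatically, the sketch below splits a listing into individual reports keyed by report code. It assumes each report starts with a heading line of the form "FCXnnn Run ...", as in the samples earlier in this deck; the real layout of a PERFKIT BATCH listing may differ, and the function name is invented for this example.

```python
import re
from typing import Dict, List

# Heading lines in the samples look like "FCX126 Run 2011/09/20 18:00:56 ...".
HEADING = re.compile(r"^(FCX\d{3})\s+Run\s")


def split_reports(listing: str) -> Dict[str, str]:
    """Group the lines of a listing under the report code of the latest heading."""
    reports: Dict[str, List[str]] = {}
    current = None
    for line in listing.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(1)
            reports.setdefault(current, [])  # later pages append to the same report
        if current is not None:
            reports[current].append(line)
    return {code: "\n".join(lines) for code, lines in reports.items()}


# Shortened sample built from headings shown earlier in the deck.
LISTING = """\
FCX126 Run 2011/09/20 18:00:56 Logical Partition Activity
Nr. of physical processors : 25
FCX225 Run 2011/09/20 18:00:56 SYSSUMLG
>>Mean>> 90.0 1.10 .9293 24.0
"""

reports = split_reports(LISTING)
print(sorted(reports))
```

Each report body can then be handed to a report-specific parser.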
26
Some Notes on z/VM Limits
• Sheer hardware:
– z/VM 5.2: 24 engines, 128 GB real
– z/VM 5.3: 32 engines, 256 GB real
– zSeries: 65,000 I/O devices
• Workloads we’ve run in test have included:
– 54 engines
– 440 GB real storage
– 128 GB XSTORE
– 240 1-GB Linux guests
– 8 1-TB guests
• Utilizations we routinely see in customer
environments
– 85% to 95% CPU utilization without worry
– Tens of thousands of pages per second without
worry
• Our limits tend to have two distinct shapes
– Performance drops off slowly with utilization (CPUs)
– Performance drops off rapidly when wall is hit
(storage)
Performance
Utilization
Precipitous (e.g., storage)
Gradual (e.g., CPUs)
27
Some Final Thoughts
• Define what performance means for your case
• Collect data for a baseline of good
performance
• Implement a change management process
• Make as few changes as possible at a time
• Relieving one bottleneck will reveal another
28
THANK YOU!
Contact Information:
Livio Sousa
IBM Tutóia – SP
livios@br.ibm.com
+55 11 9 7203 6637

More Related Content

PPT
Samba server
PDF
Oracle MAA (Maximum Availability Architecture) 18c - An Overview
PDF
MVS ABEND CODES
PDF
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
PDF
z/OS Communications Server Overview
PDF
VMware
PDF
Autonomous Database Explained
Samba server
Oracle MAA (Maximum Availability Architecture) 18c - An Overview
MVS ABEND CODES
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
z/OS Communications Server Overview
VMware
Autonomous Database Explained

What's hot (20)

PPT
Shell and its types in LINUX
PDF
A Different Way to Perform zOS Maintenance.pdf
PPTX
Microkernel architecture
PDF
Hints for a successful hfs to zfs migration
PDF
A deep dive about VIP,HAIP, and SCAN
PDF
Users and groups in Linux
PDF
Understanding oracle rac internals part 1 - slides
PPT
Red hat linux 9 ppt2003
PDF
RTOS implementation
PPTX
IBM DS8880 and IBM Z - Integrated by Design
PDF
z16 zOS Support - March 2023 - SHARE in Atlanta.pdf
PPTX
Database administration commands
PPT
Mainframe Architecture & Product Overview
PDF
DB2 for z/OS Real Storage Monitoring, Control and Planning
PPT
DB2 for z/O S Data Sharing
PDF
RADIUS and LDAP on pfSense 2.4 - pfSense Hangout February 2018
PDF
Oracle RAC 19c: Best Practices and Secret Internals
PDF
DB2 for z/OS - Starter's guide to memory monitoring and control
PDF
Active/Active Database Solutions with Log Based Replication in xDB 6.0
 
PPTX
Linux basics
Shell and its types in LINUX
A Different Way to Perform zOS Maintenance.pdf
Microkernel architecture
Hints for a successful hfs to zfs migration
A deep dive about VIP,HAIP, and SCAN
Users and groups in Linux
Understanding oracle rac internals part 1 - slides
Red hat linux 9 ppt2003
RTOS implementation
IBM DS8880 and IBM Z - Integrated by Design
z16 zOS Support - March 2023 - SHARE in Atlanta.pdf
Database administration commands
Mainframe Architecture & Product Overview
DB2 for z/OS Real Storage Monitoring, Control and Planning
DB2 for z/O S Data Sharing
RADIUS and LDAP on pfSense 2.4 - pfSense Hangout February 2018
Oracle RAC 19c: Best Practices and Secret Internals
DB2 for z/OS - Starter's guide to memory monitoring and control
Active/Active Database Solutions with Log Based Replication in xDB 6.0
 
Linux basics
Ad

Similar to z/VM Performance Analysis (20)

PPTX
CPN302 your-linux-ami-optimization-and-performance
PDF
Ims05 ims 100 k benchmark
PDF
VMworld 2013: How SRP Delivers More Than Power to Their Customers
PPTX
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
PDF
Session 7362 Handout 427 0
PPTX
VMworld 2015: Extreme Performance Series - vSphere Compute & Memory
PDF
Advanced performance troubleshooting using esxtop
PDF
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
PPTX
Cloud Performance Benchmarking
PPT
Chapter 4
PDF
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
PDF
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
PPTX
Oracle Basics and Architecture
PDF
PowerDRC/LVS 2.2 released by POLYTEDA
PDF
Slow things down to make them go faster [FOSDEM 2022]
PDF
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
PPT
Troubleshooting SQL Server
PDF
Tuning Android for low RAM
PDF
PPTX
Zendcon scaling magento
CPN302 your-linux-ami-optimization-and-performance
Ims05 ims 100 k benchmark
VMworld 2013: How SRP Delivers More Than Power to Their Customers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Session 7362 Handout 427 0
VMworld 2015: Extreme Performance Series - vSphere Compute & Memory
Advanced performance troubleshooting using esxtop
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
Cloud Performance Benchmarking
Chapter 4
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
Oracle Basics and Architecture
PowerDRC/LVS 2.2 released by POLYTEDA
Slow things down to make them go faster [FOSDEM 2022]
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
Troubleshooting SQL Server
Tuning Android for low RAM
Zendcon scaling magento
Ad

More from Rodrigo Campos (20)

PDF
Velocity Conference NYC 2014 - Real World DevOps
PDF
DevOps no mundo real - QCON 2014
PDF
7Masters Webops in the Cloud
PPT
14 guendert pres
PDF
Large and Giant Pages
PDF
Otimização holistica de ambiente computacional
PDF
Desempenho e Escalabilidade de Banco de Dados em ambiente x86
PPT
13 coelho final-pres
PPT
Mistério ou tecnologia? Paralelismo!
PPTX
Sistemas de proteção de perímetro
PDF
Devops at Walmart GeC Brazil
PDF
Disk IO Benchmarking in shared multi-tenant environments
PDF
Cloud Computing Oportunidades e Desafios
PDF
The good, the bad and the big... data
PPTX
CMG 2012 - Tuning where it matters - Gerry Tuddenham
PDF
A Consumerização da TI e o Efeito BYOT
PPT
CMG Brasil 2012 - Uso de Lines nos z196
PDF
Racionalização e Otimização de Energia em Computação na Nuvem
PDF
SDN - Openflow + OpenVSwitch + Quantum
PDF
AWS RDS Benchmark - CMG Brasil 2012
Velocity Conference NYC 2014 - Real World DevOps
DevOps no mundo real - QCON 2014
7Masters Webops in the Cloud
14 guendert pres
Large and Giant Pages
Otimização holistica de ambiente computacional
Desempenho e Escalabilidade de Banco de Dados em ambiente x86
13 coelho final-pres
Mistério ou tecnologia? Paralelismo!
Sistemas de proteção de perímetro
Devops at Walmart GeC Brazil
Disk IO Benchmarking in shared multi-tenant environments
Cloud Computing Oportunidades e Desafios
The good, the bad and the big... data
CMG 2012 - Tuning where it matters - Gerry Tuddenham
A Consumerização da TI e o Efeito BYOT
CMG Brasil 2012 - Uso de Lines nos z196
Racionalização e Otimização de Energia em Computação na Nuvem
SDN - Openflow + OpenVSwitch + Quantum
AWS RDS Benchmark - CMG Brasil 2012

Recently uploaded (20)

PPTX
SOPHOS-XG Firewall Administrator PPT.pptx
PPTX
Chapter 5: Probability Theory and Statistics
PDF
NewMind AI Weekly Chronicles - August'25-Week II
PDF
gpt5_lecture_notes_comprehensive_20250812015547.pdf
PDF
Microsoft Solutions Partner Drive Digital Transformation with D365.pdf
PPTX
A Presentation on Artificial Intelligence
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PPTX
Programs and apps: productivity, graphics, security and other tools
PDF
1 - Historical Antecedents, Social Consideration.pdf
PDF
Enhancing emotion recognition model for a student engagement use case through...
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
DP Operators-handbook-extract for the Mautical Institute
PPTX
A Presentation on Touch Screen Technology
PDF
WOOl fibre morphology and structure.pdf for textiles
PPTX
cloud_computing_Infrastucture_as_cloud_p
PDF
A comparative study of natural language inference in Swahili using monolingua...
PDF
Approach and Philosophy of On baking technology
PPTX
TechTalks-8-2019-Service-Management-ITIL-Refresh-ITIL-4-Framework-Supports-Ou...
PPTX
TLE Review Electricity (Electricity).pptx
PDF
Hybrid model detection and classification of lung cancer
SOPHOS-XG Firewall Administrator PPT.pptx
Chapter 5: Probability Theory and Statistics
NewMind AI Weekly Chronicles - August'25-Week II
gpt5_lecture_notes_comprehensive_20250812015547.pdf
Microsoft Solutions Partner Drive Digital Transformation with D365.pdf
A Presentation on Artificial Intelligence
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
Programs and apps: productivity, graphics, security and other tools
1 - Historical Antecedents, Social Consideration.pdf
Enhancing emotion recognition model for a student engagement use case through...
Digital-Transformation-Roadmap-for-Companies.pptx
DP Operators-handbook-extract for the Mautical Institute
A Presentation on Touch Screen Technology
WOOl fibre morphology and structure.pdf for textiles
cloud_computing_Infrastucture_as_cloud_p
A comparative study of natural language inference in Swahili using monolingua...
Approach and Philosophy of On baking technology
TechTalks-8-2019-Service-Management-ITIL-Refresh-ITIL-4-Framework-Supports-Ou...
TLE Review Electricity (Electricity).pptx
Hybrid model detection and classification of lung cancer

z/VM Performance Analysis

  • 1. 1 z/VM Performance Analysis Lívio Sousa - [email protected] IBM zEnterprise Client Technical Specialist
  • 2. 2 Overview • Guidelines • Commands • *MONITOR • Performance Toolkit • Omegamon XE
  • 3. 3 Definition of Performance • Performance definitions: – Response time – Batch elapsed time – Throughput – Resource consumed per unit of work done – Utilization – Users supported – Phone ringing – Consistency • All of the above
  • 4. 4 Performance Guidelines • Processor • Storage • Paging • Minidisk cache • Server machines
  • 5. 5 Processor Guidelines • Dedicated processors - mostly political – Absolute share can be almost as effective – Gets wait state assist and 500 ms minor time slice – Perhaps not a good idea if you are CPU-constrained – A virtual machine should have all dedicated or all shared processors • Share settings – Use absolute if you can judge percent of resources required – Use relative if difficult to judge and if lower share as system load increases is acceptable – Be aware that share value is split by vCPUs – Do not use LIMITHARD settings unnecessarily • Masks looping users • More scheduler overhead • Use the right number of virtual processors for the guest's workload • Don’t share all available IFLs to all LPARs – Suspend Time can be high
  • 6. 6 Storage Guidelines Virtual-to-real ratio should be <= 3:1 or make sure paging system is robust – To avoid any performance impact for production workloads, you may need to keep ratio to 1:1 – See also http://www.vm.ibm.com/perf/tips/memory.html – VIR2REAL EXEC (Bruce Hayden) http://www.vm.ibm.com/download/packages/descript.cgi?VIR2REAL Define some processor storage as expanded storage to provide paging hierarchy – For more background, see http://www.vm.ibm.com/perf/tips/storconf.html Size guests appropriately – Avoiding over provisioning – Do not put them in a high guest paging position – Right-sized usually means "just barely swapping" Exploit shared memory where possible – IPL your Linux guests from a segment – Use the Linux XIP (execute-in-place) file system Total Virtual storage (all logged on userids): 388308 MB (379.2 GB) Usable real storage (pageable) for this system: 202927 MB (198.2 GB) Total LPAR Real storage: 204800 MB (200.0 GB) Expanded storage usable for paging: 25600 MB ( 25.0 GB) Total Virtual disk (VDISK) space defined: 50176 MB ( 49.0 GB) Average Virtual disk size: 512 MB Virtual + VDISK to Real storage ratio: 2.2 : 1
  • 7. 7 Paging Guidelines • DASD paging allocations less than or equal to 50% – QUERY ALLOC PAGE • Watch blocks read per paging request (keep >10) – Long block runs make paging I/O efficient • Multiple volumes and multiple paths – Remember, one I/O per real device at a time – Use lots of little volumes rather than a few big volumes – Pay attention in Response Time and Wait Queues • Do not mix sizes of paging DASD – Use all -3s, or all -9s, or whatever • Paging to FCP SCSI (EDEVICES) may offer higher paging bandwidth with higher processor requirements – See also http://www.vm.ibm.com/perf/tips/prgpage.html
  • 8. 88 Reorder Processing - Background • Page reorder is the process in z/VM of managing user frame owned lists as input to demand scan processing. – It includes resetting the HW reference bit. – Serializes the virtual machine (all virtual processors). – In all releases of z/VM • It is done periodically on a virtual machine basis. • The cost of reorder is proportional to the number of resident frames for the virtual machine. – Roughly 130 ms/GB resident – Delays of ~1 second for guest having 8 GB resident – This can vary for different reasons +/- 40%
  • 9. 9 9 Reorder Processing - Diagnosing Performance Toolkit – Check resident page fields (“R<2GB” & “R>2GB”) on FCX113 UPAGE report • Remember, Reorder works against the resident pages, not total virtual machine size. – Check Console Function Mode Wait (“%CFW”) on FCX114 USTAT report • A virtual machine may be brought through console function mode to serialize Reorder. There are other ways to serialize for Reorder and there are other reasons that for CFW, so this is not conclusive. REORDMON – Available from the VM Download Page http://www.vm.ibm.com/download/packages/ – Works against raw MONWRITE data for all monitored virtual machines – Works in real time for a specific virtual machine – Provides how often Reorder processing occurs in each monitor interval
  • 10. 10 10 REORDMON Example Num. of Average Average Userid Reorders Rsdnt(MB) Ref'd(MB) Reorder Times -------- -------- --------- --------- ------------------- LINUX002 2 18352 13356 13:29:05 14:15:05 LINUX001 1 22444 6966 13:44:05 LINUX005 1 14275 5374 13:56:05 LINUX003 2 21408 13660 13:43:05 14:10:05 LINUX007 1 12238 5961 13:51:05 LINUX006 1 9686 4359 13:31:05 LINUX004 1 21410 11886 14:18:05
  • 11. 11 11 Reorder Processing - Mitigations • Try to keep the virtual machine as small as possible. • Virtual machines with multiple applications may need to be split into multiple virtual machines with fewer applications. • See http://www.vm.ibm.com/perf/tips/reorder.html for more details. • Apply APAR VM64774 if necessary: – SET and QUERY commands, system wide settings – Corrects problem in earlier “patch” solution that inhibits paging of PGMBKs (Page Tables) for virtual machines where Reorder is set off. – z/VM 5.4.0 PTF UM33167 RSU 1003 – z/VM 6.1.0 PTF UM33169 RSU 1003
  • 12. 12 Minidisk Cache Guidelines • In general, enable MDC for everything • Configure some real storage for MDC • Set maximum MDC limits – SET MDC STOR 0M 256M and SET MDC XSTOR 0M 0M • Disable MDC for – Write-mostly or read-once disks (logs, accounting, Linux swap) – Target volumes in backup scenarios • Better performer than Virtual Disk in Storage (VDISK) for read I/Os
  • 13. 13 Server Machine Guidelines • Server Virtual Machine (SVM) • TCP/IP, RACFVM, etc. • QUICKDSP ON to avoid eligible list • Higher SHARE setting • Ensure performance data includes these virtual machines
  • 14. 14 CP INDICATE Command • LOAD: shows total system load – Processors, XSTORE, paging, MDC, queue lengths, storage load – STORAGE value not very meaningful • USER EXP: more useful than plain USER • QUEUES EXP: great for scheduler problems and quick state sampling – Mostly useful for eligible list assessments • PAGING: lists users in page wait • I/O: lists users in I/O wait • ACTIVE: displays number of active users over given interval • Consider using MONITOR DATA instead for "serious" examinations
  • 15. 15 CP INDICATE LOAD Example INDICATE LOAD AVGPROC-088% 03 XSTORE-000000/SEC MIGRATE-0000/SEC MDC READS-000035/SEC WRITES-000001/SEC HIT RATIO-099% PAGING-0023/SEC STEAL-000% Q0-00007(00000) DORMANT-00410 Q1-00000(00000) E1-00000(00000) Q2-00001(00000) EXPAN-002 E2-00000(00000) Q3-00013(00000) EXPAN-002 E3-00000(00000) PROC 0000-087% PROC 0001-088% PROC 0002-089% LIMITED-00000
  • 16. 16 Selected CP QUERY Commands USERS: number and type of users on system SRM: scheduler/dispatcher settings (LDUBUF, etc.) SHARE: type and intensity of system share FRAMES: real storage allocation PATHS: physical paths to device and status ALLOC MAP: DASD allocation ALLOC PAGE: how full your paging space is XSTORE: assignment of expanded storage MONITOR: current monitor settings MDC: MDC usage VDISK: virtual disk in storage usage SXSPAGES: System Execution Space
  • 17. 17 5,000 Foot View CP Control Blocks Application Data VM Events *MONITOR System Service MONDCSS Segment MONWRITE Utility Performance Toolkit Raw Monwrite History Files TCP/IP Network 3270 Browser VMRM
  • 18. 18
  • 19. 19 Processor REPORT NAME REPORT CODE COMMAND CPU Load and Transactions FCX100 CPU LPAR Load FCX126 LPAR Processor Log FCX144 PROCLOG LPAR Load Log FCX202 LPARLOG User Wait States FCX114 USTAT System Summary FCX225 SYMSUMLG
  • 20. 20 FCX126 Run 2011/09/20 18:00:56 Logical Partition Activity From 2011/09/13 09:19:15 To 2011/09/13 10:09:15 For 3000 Secs 00:50:00 Result of 13092011 Run __________________________________________________________________________________________ Processor type and model : 2817-401 Nr. of configured partitions: 6 Nr. of physical processors : 25 Partition Nr. Upid #Proc Weight Wait-C Cap %Load CPU %Busy %Ovhd %Susp %VMld %Logld Type LPAR1 1 00 24 100 NO NO 89.0 0 94.3 2.1 6.5 92.0 98.4 IFL 100 NO 1 93.4 2.4 7.7 90.8 98.3 IFL 100 NO 2 93.6 2.3 7.4 91.1 98.3 IFL 100 NO 3 93.6 2.4 7.5 91.1 98.4 IFL 100 NO 4 93.6 2.3 7.4 91.1 98.4 IFL 100 NO 5 93.5 2.3 7.5 91.0 98.3 IFL 100 NO 6 93.4 2.4 7.6 90.9 98.3 IFL 100 NO 7 93.2 2.4 7.7 90.6 98.1 IFL 100 NO 8 93.4 2.4 7.5 90.8 98.2 IFL 100 NO 9 93.2 2.4 7.7 90.7 98.2 IFL 100 NO 10 93.1 2.5 7.8 90.4 98.0 IFL 100 NO 11 93.2 2.4 7.7 90.6 98.0 IFL 100 NO 12 93.4 2.4 7.5 90.8 98.1 IFL 100 NO 13 93.3 2.3 7.5 90.8 98.1 IFL 100 NO 14 93.3 2.4 7.5 90.7 98.1 IFL 100 NO 15 93.2 2.5 7.6 90.5 97.9 IFL 100 NO 16 91.1 2.9 9.0 88.0 96.6 IFL 100 NO 17 91.3 2.8 8.8 88.2 96.7 IFL 100 NO 18 91.4 2.9 8.9 88.3 96.8 IFL 100 NO 19 91.5 2.7 8.8 88.5 97.0 IFL 100 NO 20 91.7 2.8 8.7 88.6 97.1 IFL 100 NO 21 91.5 2.8 8.9 88.5 97.1 IFL
  • 21. 21 FCX225 Run 2011/09/20 18:00:56 SYSSUMLG System Performance Summary by Time From 2011/09/13 09:19:15 To 2011/09/13 10:09:15 For 3000 Secs 00:50:00 Result of 13092011 Run _________________________________________________________________________________ <------- CPU --------> <Vec> <--Users--> <---I/O---> <Stg> <-Paging--> <--Ratio--> SSCH DASD Users <-Rate/s--> Interval Pct Cap- On- Pct Log- +RSCH Resp in PGIN+ Read+ End Time Busy T/V ture line Busy ged Activ /s msec Elist PGOUT Write >>Mean>> 90.0 1.10 .9293 24.0 .... 117 108 571.2 .4 .0 2610 1051 09:20:15 92.4 1.13 .9059 24.0 .... 117 108 523.0 .5 .0 1992 527.8 09:21:15 92.9 1.07 .9523 24.0 .... 117 108 399.2 .5 .0 1669 301.4 09:22:15 93.2 1.08 .9458 24.0 .... 117 108 557.4 .3 .0 2817 633.9 09:23:15 94.5 1.07 .9535 24.0 .... 117 108 590.3 .3 .0 1410 482.7 09:24:15 93.4 1.07 .9537 24.0 .... 117 108 649.5 .4 .0 2363 488.5 09:25:15 90.4 1.09 .9347 24.0 .... 117 108 684.7 .4 .0 2485 768.9 09:26:15 92.4 1.08 .9436 24.0 .... 117 108 666.8 .4 .0 2940 1215 09:27:15 90.9 1.09 .9344 24.0 .... 117 108 607.2 .4 .0 3179 726.7 09:28:15 92.2 1.08 .9469 24.0 .... 117 108 664.2 .5 .0 2179 896.0 09:29:17 90.8 1.10 .9318 24.0 .... 117 108 645.9 .6 .0 3404 804.5 09:30:16 89.5 1.19 .8579 24.0 .... 117 108 670.8 .7 .0 5402 3487 09:31:15 92.7 1.08 .9412 24.0 .... 117 108 588.7 .4 .0 3091 1807 09:32:15 91.2 1.09 .9421 24.0 .... 117 108 602.8 .3 .0 2635 1076 09:33:16 89.3 1.14 .9047 24.0 .... 117 108 255.2 .5 .0 3140 710.5 09:34:15 88.5 1.10 .9374 24.0 .... 117 108 205.2 .6 .0 2513 897.4 09:35:15 85.9 1.12 .9257 24.0 .... 117 108 320.4 .5 .0 3117 953.5 09:36:16 86.1 1.13 .9144 24.0 .... 117 108 213.5 .5 .0 3642 1144 09:37:16 83.0 1.14 .9090 24.0 .... 117 108 245.6 .5 .0 3414 2133
  • 22. 22 Storage
    REPORT NAME                     REPORT CODE  COMMAND
    Auxiliary Storage Log           FCX146       AUXLOG
    CP Owned Device                 FCX109       DEVICE CPOWNED
    User Page Data                  FCX113       UPAGE
    Shared Data Spaces              FCX134       DSPACESH
    SXS Available Page Queues Mgnt  FCX261       SXSAVAIL
    Mini Disk Storage               FCX178       MDCSTOR
    Storage Utilization             FCX103       STORAGE
    Available List Log              FCX254       AVAILLOG
  • 23. 23 FCX109 Run 2011/05/31 17:44:26 DEVICE CPOWNED Load and Performance of CP Owned Disks
    From 2011/05/12 16:48:41 To 2011/05/12 17:31:41 For 2580 Secs 00:43:00 Result of 20110512 Run
    Page / SPOOL Allocation Summary
    PAGE slots available       25165k   SPOOL slots available    3605928
    PAGE slot utilization         25%   SPOOL slot utilization       65%
    T-Disk cylinders avail.   .......   DUMP slots available           0
    T-Disk space utilization     ...%   DUMP slot utilization        ..%
    < Device Descr. ->                  <------------- Rate/s -------------> User         Serv MLOAD
                Volume Area  Area   Used <--Page---> <--Spool-->        SSCH Inter Queue  Time  Resp
    Addr Devtyp Serial Type  Extent    % P-Rds P-Wrt S-Rds S-Wrt Total +RSCH feres Lngth /Page  Time
    EDF1 9336   ZDPAG1 PAGE  12583k   25 196.5 199.9   ...   ... 396.4    .0     0  8.18   5.5  88.0
    EDF2 9336   ZDPAG2 PAGE  12583k   24 194.2 206.1   ...   ... 400.4    .0     0  7.23   6.0  58.4
    4374 3390   610SP1 SPOOL 802880   61    .0    .0    .0    .0    .0    .1     0     0    .4    .4
    4672 3390   610SP2 SPOOL 803060   68    .0    .0    .0    .0    .0    .0     0     0   1.0   1.0
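The figures in the FCX109 report on slide 23 can be cross-checked with a little arithmetic. The sketch below is only an illustration: the values are copied from the two PAGE rows of that report, while the dictionary layout, field names, and function names are hypothetical and do not correspond to any Perfkit data format.

```python
# Values copied from the two PAGE volumes in the FCX109 report (slide 23).
# The dictionary layout and key names are illustrative assumptions.
PAGE_VOLUMES = [
    {"serial": "ZDPAG1", "used_pct": 25, "p_rds": 196.5, "p_wrt": 199.9},
    {"serial": "ZDPAG2", "used_pct": 24, "p_rds": 194.2, "p_wrt": 206.1},
]


def total_page_rate(volumes):
    """Sum page reads and writes per second across all paging volumes."""
    return round(sum(v["p_rds"] + v["p_wrt"] for v in volumes), 1)


def mean_utilization(volumes):
    """Plain average of slot utilization.

    A plain (unweighted) mean is only appropriate here because both
    extents in the report have the same size (12583k slots).
    """
    return sum(v["used_pct"] for v in volumes) / len(volumes)


if __name__ == "__main__":
    print("page I/O per second:", total_page_rate(PAGE_VOLUMES))
    print("mean slot utilization %:", mean_utilization(PAGE_VOLUMES))
```

The summed read and write columns for ZDPAG2 come to 400.3, while the report's own Total column shows 400.4; the difference is presumably rounding in the printed columns. The mean utilization of 24.5% is consistent with the 25% shown in the allocation summary.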
  • 24. 24 I/O
    REPORT NAME                  REPORT CODE  COMMAND
    General I/O Device           FCX108       DEVICE
    SCSI Device                  FCX249       SCSI
    DASD Performance Log         FCX131       DEVCONF
    FICON Channel Load           FCX215       INTERIM FCHANNEL
    General I/O Device Data Log  FCX168       DEVLOG
    I/O Processor Log            FCX232       IOPROCLG
  • 25. 25 Studying MONWRITE Data
    • z/VM Performance Toolkit
    – Interactively: possible, but not so useful
    – PERFKIT BATCH command: pretty useful
    – Control files tell Perfkit which reports to produce
    – You can then inspect the reports by hand or programmatically
    – See z/VM Performance Toolkit Reference for information on how to use PERFKIT BATCH
    • PRFIT (Brian Wade)
    http://www.vm.ibm.com/download/packages/descript.cgi?PRFIT
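As a minimal sketch of inspecting a report programmatically, the Python below scans FCX225 SYSSUMLG data rows and flags intervals whose T/V ratio exceeds a limit. It assumes the report has already been saved as plain text. The sample rows are copied from slide 21; the column positions, the function names, and the 1.15 limit are assumptions made for illustration, not Perfkit definitions or tuning guidance.

```python
import re

# Data rows copied from the FCX225 SYSSUMLG report on slide 21.
# Assumed column order, read from that report's header:
# end time, CPU %busy, T/V ratio, capture ratio, CPUs online, vector %busy,
# users logged on, users active, SSCH+RSCH/s, DASD resp (msec),
# users in eligible list, PGIN+PGOUT/s, page read+write/s.
SAMPLE = """\
09:20:15 92.4 1.13 .9059 24.0 .... 117 108 523.0 .5 .0 1992 527.8
09:29:17 90.8 1.10 .9318 24.0 .... 117 108 645.9 .6 .0 3404 804.5
09:30:16 89.5 1.19 .8579 24.0 .... 117 108 670.8 .7 .0 5402 3487
09:31:15 92.7 1.08 .9412 24.0 .... 117 108 588.7 .4 .0 3091 1807
"""

# Interval rows start with an hh:mm:ss timestamp; headers and the
# ">>Mean>>" summary row do not, so they are skipped.
ROW = re.compile(r"^\d{2}:\d{2}:\d{2}\s")


def parse_rows(text):
    """Turn SYSSUMLG interval rows into dictionaries of selected fields."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not ROW.match(line):
            continue
        f = line.split()
        rows.append({
            "end": f[0],
            "cpu_busy": float(f[1]),
            "tv_ratio": float(f[2]),
            "capture": float(f[3]),
            "page_rw": float(f[12]),
        })
    return rows


def flag_intervals(rows, tv_limit=1.15):
    """Return end times of intervals whose T/V ratio exceeds tv_limit.

    tv_limit is a hypothetical threshold chosen for this example.
    """
    return [r["end"] for r in rows if r["tv_ratio"] > tv_limit]


if __name__ == "__main__":
    print(flag_intervals(parse_rows(SAMPLE)))
```

With the sample rows above, only the 09:30:16 interval is flagged; in the slide 21 report that same interval also shows the lowest capture ratio and the highest paging rates.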
  • 26. 26 Some Notes on z/VM Limits
    • Sheer hardware:
    – z/VM 5.2: 24 engines, 128 GB real
    – z/VM 5.3: 32 engines, 256 GB real
    – zSeries: 65,000 I/O devices
    • Workloads we’ve run in test have included:
    – 54 engines
    – 440 GB real storage
    – 128 GB XSTORE
    – 240 1-GB Linux guests
    – 8 1-TB guests
    • Utilizations we routinely see in customer environments
    – 85% to 95% CPU utilization without worry
    – Tens of thousands of pages per second without worry
    • Our limits tend to have two distinct shapes
    – Performance drops off slowly with utilization (CPUs)
    – Performance drops off rapidly when wall is hit (storage)
    [Figure: performance versus utilization, showing a gradual curve (e.g., CPUs) and a precipitous curve (e.g., storage)]
  • 27. 27 Some Final Thoughts
    • Define what performance means in your case
    • Collect data to establish a baseline of good performance
    • Implement a change management process
    • Make as few changes as possible at a time
    • Relieving one bottleneck will reveal another
  • 28. 28 THANK YOU! Contact information: Livio Sousa, IBM Tutóia – SP, [email protected], +55 11 9 7203 6637