1
PHP's Virtual Machine
2
Hello
● Julien PAULI
● Programming in PHP since early 2000s
● PHP Internals hacker and trainer
● PHP 5.5/5.6 Release Manager
● Working at SensioLabs in Paris - Blackfire
● Writing PHP tech articles and books
● http://phpinternalsbook.com
● @julienpauli - github.com/jpauli - jpauli@php.net
● Like working on OSS such as PHP :-)
3
PHP
● A program in itself
● Written in C
● Goal: define a programming language for the Web
● High level, interpreted
● Interpreted language
● Less efficient than a language compiled to native instructions
● But simpler to handle
4
PHP
5
PHP from inside
● A software virtual machine
● Compiler/Executor
● intermediate OPCode
● Single thread, single process
● Automatic dynamic memory management
● Memory Manager
● Garbage collector
6
Zend Engine
● The heart of PHP
● An extensible part
● extensions and zend_extensions can change it
● A Virtual Machine
● A compiler
● An executor
● Some utilities
● OPCache
● A Zend extension that plays with the engine deeply
● The compiler's optimizer lives in OPCache
7
Request handling steps
● Startup (memory allocations)
● Compilation
● Lexical and syntactic analysis
● Compilation (OP Code generation)
● Execution
● OPCode interpretation
● Several VM flavors
● Include/require/eval = go back to compilation
● Shutdown (free resources)
● "Share nothing architecture"
[Diagram: Startup → zend_compile_file() → zend_execute() → Shutdown]
8
Script execution
● Compilation
● Optimization (OPCache)
● Execution
● Destruction
9
Lexical analysis (lexing)
● Character recognition
● Transform chars to tokens
● Lexer generator: re2c
● http://re2c.org/
● http://www.php.net/manual/fr/tokens.php
● highlight_file()
● highlight_string()
● compile_file()
● compile_string()
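The lexing step can be observed from userland. A minimal sketch, assuming ext/tokenizer is loaded (it is by default): `token_get_all()` and `token_name()` are real functions from that extension, while `describe_tokens()` is only an illustrative helper name.

```php
<?php
// List the tokens the lexer produces for a source string.
function describe_tokens(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        // Multi-character tokens are arrays: [id, text, line].
        // Single-character tokens such as ';' come back as plain strings.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

print_r(describe_tokens("<?php print 'foo';"));
// T_OPEN_TAG, T_PRINT, T_WHITESPACE, T_CONSTANT_ENCAPSED_STRING, ;
```

The T_PRINT entry is the token the "print" lexer rule returns.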
10
Syntactic analysis (parsing)
● "Understands" a set of tokens
● Defines the language syntax
● Parser generator: GNU Bison (LALR)
● For each token or token set
● → Execute a function to generate an AST statement
● → Go to the next token
● → Can generate a "Parse error" and halt
● Tightly coupled to the lexical analyzer
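The "Parse error and halt" behaviour is visible from userland too. A small sketch, assuming PHP 7 or later, where `eval()` throws a `ParseError` instead of stopping the script; `parse_error_of()` is a hypothetical helper name, and because `eval()` really executes valid code it should only be fed harmless snippets.

```php
<?php
// Return the parser's complaint for a code fragment, or null if it parses.
function parse_error_of(string $code): ?string
{
    try {
        // eval() sends the string back through lexing, parsing and compilation,
        // then executes it if all three steps succeed.
        eval($code);
        return null;
    } catch (ParseError $e) {
        return $e->getMessage();
    }
}

var_dump(parse_error_of('$x = 1 + 1;'));  // valid code: NULL
var_dump(parse_error_of('$x = (1 + ;'));  // invalid code: the syntax error message
```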
11
zend_language_parser.y
● ext/tokenizer
statement:
'{' inner_statement_list '}' { $$ = $2; }
| if_stmt { $$ = $1; }
| alt_if_stmt { $$ = $1; }
| T_WHILE '(' expr ')' while_statement
{ $$ = zend_ast_create(ZEND_AST_WHILE, $3, $5); }
| T_DO statement T_WHILE '(' expr ')' ';'
{ $$ = zend_ast_create(ZEND_AST_DO_WHILE, $2, $5); }
| T_FOR '(' for_exprs ';' for_exprs ';' for_exprs ')' for_statement
{ $$ = zend_ast_create(ZEND_AST_FOR, $3, $5, $7, $9); }
| T_SWITCH '(' expr ')' switch_case_list
{ $$ = zend_ast_create(ZEND_AST_SWITCH, $3, $5); }
| T_BREAK optional_expr ';' { $$ = zend_ast_create(ZEND_AST_BREAK, $2); }
| T_CONTINUE optional_expr ';' { $$ = zend_ast_create(ZEND_AST_CONTINUE, $2); }
| T_RETURN optional_expr ';' { $$ = zend_ast_create(ZEND_AST_RETURN, $2); }
$(YACC) -p zend -v -d $(srcdir)/zend_language_parser.y -o zend_language_parser.c
12
Compilation
● Invoked on final AST
● Userland AST: https://github.com/nikic/php-ast
● Creates an OPCodes array
● OPCode = low level VM instruction
● Somewhat similar to low-level assembly
● Compilation step is very heavy
● Lots of checks and memory accesses
● address resolutions and computations
● many stacks and memory pools
● Some early optimizations/computations are performed
13
Optimization
● Optimizations are done by ext/opcache
● The optimizer is very heavy (in PHP 7)
● Steps are defined in opcache.optimization_level INI setting
#define ZEND_OPTIMIZER_PASS_1 (1<<0) /* CSE, STRING construction */
#define ZEND_OPTIMIZER_PASS_2 (1<<1) /* Constant conversion and jumps */
#define ZEND_OPTIMIZER_PASS_3 (1<<2) /* ++, +=, series of jumps */
#define ZEND_OPTIMIZER_PASS_4 (1<<3) /* INIT_FCALL_BY_NAME -> DO_FCALL */
#define ZEND_OPTIMIZER_PASS_5 (1<<4) /* CFG based optimization */
#define ZEND_OPTIMIZER_PASS_6 (1<<5) /* DFA based optimization */
#define ZEND_OPTIMIZER_PASS_7 (1<<6) /* CALL GRAPH optimization */
#define ZEND_OPTIMIZER_PASS_8 (1<<7) /* SCCP (constant propagation) */
#define ZEND_OPTIMIZER_PASS_9 (1<<8) /* TMP VAR usage */
#define ZEND_OPTIMIZER_PASS_10 (1<<9) /* NOP removal */
#define ZEND_OPTIMIZER_PASS_11 (1<<10) /* Merge equal constants */
#define ZEND_OPTIMIZER_PASS_12 (1<<11) /* Adjust used stack */
#define ZEND_OPTIMIZER_PASS_13 (1<<12) /* Remove unused variables */
#define ZEND_OPTIMIZER_PASS_14 (1<<13) /* DCE (dead code elimination) */
#define ZEND_OPTIMIZER_PASS_15 (1<<14) /* (unsafe) Collect constants */
#define ZEND_OPTIMIZER_PASS_16 (1<<15) /* Inline functions */
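Since each pass is a single bit, an optimization level is just a bitwise OR of the passes to enable. The sketch below only illustrates that arithmetic with the bit positions quoted above; the function names are invented for the example and it does not read or change the real INI setting.

```php
<?php
// Bit for optimizer pass N, mirroring ZEND_OPTIMIZER_PASS_N = (1 << (N - 1)).
function optimizer_pass_bit(int $pass): int
{
    return 1 << ($pass - 1);
}

// True when the given pass is switched on in an optimization level mask.
function optimizer_pass_enabled(int $level, int $pass): bool
{
    return ($level & optimizer_pass_bit($pass)) !== 0;
}

// Enable pass 1 (CSE), pass 5 (CFG based) and pass 14 (DCE) only.
$level = optimizer_pass_bit(1) | optimizer_pass_bit(5) | optimizer_pass_bit(14);

printf("0x%X\n", $level);                     // prints 0x2011
var_dump(optimizer_pass_enabled($level, 5));  // bool(true)
var_dump(optimizer_pass_enabled($level, 2));  // bool(false)
```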
14
First easy example
<?php
print 'foo';
15
Compilation easy example
<?php
print 'foo';
lexing:
<ST_IN_SCRIPTING>"print" {
    return T_PRINT;
}
parsing:
T_PRINT expr { $$ = zend_ast_create(ZEND_AST_PRINT, $2); }
16
Compilation easy example
T_PRINT expr { $$ = zend_ast_create(ZEND_AST_PRINT, $2); }
compiling:
case ZEND_AST_PRINT:
    zend_compile_print(result, ast);
    return;
void zend_compile_print(znode *result, zend_ast *ast) /* {{{ */
{
    zend_op *opline;
    zend_ast *expr_ast = ast->child[0];
    znode expr_node;
    zend_compile_expr(&expr_node, expr_ast);
    opline = zend_emit_op(NULL, ZEND_ECHO, &expr_node, NULL);
    opline->extended_value = 1;
    result->op_type = IS_CONST;
    ZVAL_LONG(&result->u.constant, 1);
}
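The compiler function above emits a ZEND_ECHO and sets the expression's result to the constant 1, which matches what userland sees: `print` is an expression that always evaluates to 1. A minimal check, using output buffering to capture what was written; `print_and_capture()` is just an illustrative name.

```php
<?php
// Run print on a string and return [value of the print expression, captured output].
function print_and_capture(string $text): array
{
    ob_start();
    $result = print $text;   // print can be used inside an expression
    $output = ob_get_clean();
    return [$result, $output];
}

var_dump(print_and_capture('foo'));  // [1, "foo"]
```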
17
Execution
● Execute OPCodes
● Most complex part of Zend Engine
● VM executor
● zend_vm_execute.h
● Each OPCode
● Is run through a handler() function
● "zend_vm_handler"
● The executor runs the instructions in an infinite dispatch loop
● Branching is possible (loops, catch blocks, gotos, etc.)
[Diagram: Startup → zend_compile_file() → zend_execute() → Shutdown]
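To make the dispatch loop idea concrete, here is a toy interpreter written in PHP itself. It is only a sketch: the instruction layout `[opcode, op1, op2, result]` and the opcode names are invented for illustration and are far simpler than real Zend oplines and handlers.

```php
<?php
// Toy VM: one handler per opcode, and a loop that dispatches until RETURN.
// Each handler returns the index of the next instruction to run.
function run_toy_vm(array $oplines): string
{
    $vars = [];    // variable slots, indexed by name
    $output = '';  // everything ECHO wrote

    $handlers = [
        'ASSIGN' => function (array $op, array &$vars, string &$out, int $ip): int {
            $vars[$op[3]] = $op[1];
            return $ip + 1;
        },
        'ECHO' => function (array $op, array &$vars, string &$out, int $ip): int {
            $out .= $vars[$op[1]];
            return $ip + 1;
        },
        'ADD' => function (array $op, array &$vars, string &$out, int $ip): int {
            $vars[$op[3]] = $vars[$op[1]] + $op[2];
            return $ip + 1;
        },
        'IS_SMALLER' => function (array $op, array &$vars, string &$out, int $ip): int {
            $vars[$op[3]] = $vars[$op[1]] < $op[2];
            return $ip + 1;
        },
        'JMPNZ' => function (array $op, array &$vars, string &$out, int $ip): int {
            return $vars[$op[1]] ? $op[2] : $ip + 1;  // branch: jump or fall through
        },
    ];

    $ip = 0;
    while (true) {                       // the dispatch loop
        $op = $oplines[$ip];
        if ($op[0] === 'RETURN') {
            break;
        }
        $ip = $handlers[$op[0]]($op, $vars, $output, $ip);
    }
    return $output;
}

// Roughly: for ($i = 0; $i < 3; $i++) { echo $i; }
$program = [
    ['ASSIGN', 0, null, 'i'],
    ['ECHO', 'i', null, null],
    ['ADD', 'i', 1, 'i'],
    ['IS_SMALLER', 'i', 3, 't'],
    ['JMPNZ', 't', 1, null],
    ['RETURN', null, null, null],
];
echo run_toy_vm($program), "\n";  // prints 012
```

The real executor follows the same shape: fetch the current opline, call its handler, and let the handler decide which opline comes next.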
18
ZEND_ECHO
ZEND_VM_HANDLER(40, ZEND_ECHO, CONST|TMPVAR|CV, ANY)
{
    USE_OPLINE
    zend_free_op free_op1;
    zval *z;
    SAVE_OPLINE();
    z = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
    if (Z_TYPE_P(z) == IS_STRING) {
        zend_string *str = Z_STR_P(z);
        if (ZSTR_LEN(str) != 0) {
            zend_write(ZSTR_VAL(str), ZSTR_LEN(str));
        }
    } else {
        zend_string *str = _zval_get_string_func(z);
        if (ZSTR_LEN(str) != 0) {
            zend_write(ZSTR_VAL(str), ZSTR_LEN(str));
        } else if (OP1_TYPE == IS_CV && UNEXPECTED(Z_TYPE_P(z) == IS_UNDEF)) {
            GET_OP1_UNDEF_CV(z, BP_VAR_R);
        }
        zend_string_release(str);
    }
19
OPCode ?
● php -dzend_extension=opcache -dopcache.enable_cli=1 -dopcache.opt_debug_level=0x30000 /tmp/script.php
L0 (3): CV0($options) = RECV 1
L1 (5): CV1($invalid) = QM_ASSIGN array(...)
L2 (6): V4 = FE_RESET_R CV0($options) L18
L3 (6): T5 = FE_FETCH_R V4 CV2($value) L18
L4 (6): ASSIGN CV3($key) T5
...
20
Many OPCodes
● OPCodes are low level VM instructions
● Many of them, more and more as PHP evolves
● ~ 200 flavors in PHP 7
● See the list from Zend/zend_vm_opcodes.h
● ZEND_ADD
● ZEND_DECLARE_ANON_CLASS
● ZEND_FE_RESET
● ZEND_ADD_TRAIT
● ZEND_YIELD_FROM
● ...
21
OPCode handlers
● Each OPCode is handled by a handler (a function)
● It takes up to 3 arguments and produces exactly one result
● Arguments and result are "variables" as you know them
● ZEND_ADD(num1, num2) : result_num
● ZEND_DECLARE_ANON_CLASS(class) : result_bool
● ZEND_FE_RESET(array_or_object) : result
● ZEND_ADD_TRAIT(class, trait) : result_bool
● ZEND_YIELD_FROM(cur_gen, gen_from) : result
22
ZEND_CONCAT example
ZEND_VM_HANDLER(8, ZEND_CONCAT, CONST|TMPVAR|CV, CONST|TMPVAR|CV)
{
    USE_OPLINE
    zend_free_op free_op1, free_op2;
    zval *op1, *op2;
    op1 = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
    op2 = GET_OP2_ZVAL_PTR_UNDEF(BP_VAR_R);
    if ((OP1_TYPE == IS_CONST || EXPECTED(Z_TYPE_P(op1) == IS_STRING)) &&
        (OP2_TYPE == IS_CONST || EXPECTED(Z_TYPE_P(op2) == IS_STRING))) {
        zend_string *op1_str = Z_STR_P(op1);
        zend_string *op2_str = Z_STR_P(op2);
        zend_string *str;
        if (OP1_TYPE != IS_CONST && OP1_TYPE != IS_CV &&
            !ZSTR_IS_INTERNED(op1_str) && GC_REFCOUNT(op1_str) == 1) {
            size_t len = ZSTR_LEN(op1_str);
            str = zend_string_extend(op1_str, len + ZSTR_LEN(op2_str), 0);
            memcpy(ZSTR_VAL(str) + len, ZSTR_VAL(op2_str), ZSTR_LEN(op2_str)+1);
            ZVAL_NEW_STR(EX_VAR(opline->result.var), str);
            FREE_OP2();
        }
...
23
More complex example
function check_options(array $options)
{
    $invalid = [];
    foreach ($options as $key => $value) {
        if (array_key_exists($key, $this->options)) {
            $this->options[$key] = $value;
        } else {
            $invalid[] = $key;
        }
    }
}
L0 (4): CV0($options) = RECV 1
L1 (6): ASSIGN CV1($invalid) array(...)
L2 (7): V5 = FE_RESET_R CV0($options) L18
L3 (7): T6 = FE_FETCH_R V5 CV2($value) L18
L4 (7): ASSIGN CV3($key) T6
L5 (8): INIT_FCALL 2 112 string("array_key_exists")
L6 (8): SEND_VAR CV3($key) 1
L7 (8): T8 = FETCH_OBJ_R THIS string("options")
L8 (8): SEND_VAL T8 2
L9 (8): V9 = DO_ICALL
L10 (8): JMPZ V9 L15
L11 (9): V10 = FETCH_OBJ_W THIS string("options")
L12 (9): ASSIGN_DIM V10 CV3($key)
L13 (9): OP_DATA CV2($value)
L14 (9): JMP L17
L15 (11): ASSIGN_DIM CV1($invalid) NEXT
L16 (11): OP_DATA CV3($key)
L17 (7): JMP L3
L18 (7): FE_FREE V5
L19 (14): RETURN null
24
OPCode Cache
● First time
● Compile
● Cache to SHM or cache file
● Execute
● Then, if file did not change
● Load from SHM or cache file
● Execute
● Compilation is very heavy
● Optimization can be as well
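The "compile once, then reuse while the file is unchanged" flow can be sketched in a few lines of userland PHP. This is a toy under loud assumptions: `ToyOpcodeCache` is a hypothetical class, it keeps entries in a plain array rather than shared memory, it uses the file's modification time as the "did the file change" test, and `token_get_all()` merely stands in for the real compiler.

```php
<?php
// Toy cache: compile on first load, reuse the result while the file's mtime is unchanged.
final class ToyOpcodeCache
{
    /** @var array Map of path => ['mtime' => int, 'ops' => mixed] */
    private $entries = [];

    /** @var int Number of times the "compiler" callback actually ran */
    public $compilations = 0;

    public function load(string $path, callable $compile)
    {
        clearstatcache(true, $path);   // do not trust PHP's own stat cache
        $mtime = filemtime($path);

        if (isset($this->entries[$path]) && $this->entries[$path]['mtime'] === $mtime) {
            return $this->entries[$path]['ops'];   // cache hit: no compilation
        }

        $this->compilations++;                     // cache miss: compile and store
        $ops = $compile(file_get_contents($path));
        $this->entries[$path] = ['mtime' => $mtime, 'ops' => $ops];
        return $ops;
    }
}

$path = tempnam(sys_get_temp_dir(), 'toy');
file_put_contents($path, "<?php print 'foo';");

$cache = new ToyOpcodeCache();
$cache->load($path, 'token_get_all');  // first time: compiles
$cache->load($path, 'token_get_all');  // file unchanged: served from the cache
echo $cache->compilations, "\n";       // prints 1
unlink($path);
```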
25
Compilation / Execution
function foo()
{
    $data = file('/etc/fstab');
    sort($data);
    return $data;
}
for($i=0; $i<=$argv[1]; $i++)
{
    $a = foo();
    $a[] = range(0, $i);
    $result[] = $a;
}
var_dump($result);
argv = 1:
main()==>run_init::tmp/php.php//1 241
main()==>compile::tmp/php.php//1 89
argv = 10:
main()==>run_init::tmp/php.php//1 1731
main()==>compile::tmp/php.php//1 89
26
Other topics, quickly
27
CLI
● Although increasingly used this way, PHP was not designed to run long-running CLI scripts
● In long-running CLI scripts ("consumers"), the VM never stops
● PHP never stops, thus never reaches its memory cleanup step
● The "current request" memory is then never freed
● Even with GC on, the programmer has to take great care not to create "memory leaks"
● And for that, they have to master how PHP works internally
● Or use a low-level memory debugger, like valgrind/massif
● OPCode caches and optimizers are pretty useless for CLI
● Optimization can be worth it
● Avoiding recompilation gains little, since runtime dominates
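One common way to live with this in a long-running consumer is to watch memory inside the loop and stop cleanly before it grows too far, leaving the restart to an external supervisor. The sketch below assumes that approach; `run_consumer()`, the job callback and the "every 100 jobs" interval are hypothetical choices, while `gc_collect_cycles()` and `memory_get_usage()` are the real built-ins.

```php
<?php
// Process jobs until they run out or memory use crosses a limit.
// Returns the number of jobs handled.
function run_consumer(iterable $jobs, callable $handle, int $limitBytes): int
{
    $processed = 0;
    foreach ($jobs as $job) {
        $handle($job);
        $processed++;

        if ($processed % 100 === 0) {
            gc_collect_cycles();     // reclaim circular references now and then
        }
        if (memory_get_usage() > $limitBytes) {
            break;                   // stop; a supervisor can start a fresh process
        }
    }
    return $processed;
}

$handled = run_consumer(range(1, 250), function ($job) {
    // real work would go here
}, 256 * 1024 * 1024);
echo $handled, "\n";  // prints 250 as long as the 256 MB limit is not reached
```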
28
JIT ?
● JIT is a complex topic, and it is coming to PHP 8
● Still under development
● It should accelerate very CPU-intensive tasks
● That is, usually not web applications
● Unless you really process that much data per request, which you shouldn't do anyway with PHP
● CLI scripts will benefit most from JIT (Composer?)
● Beware: it won't accelerate I/O-intensive tasks
● And we tend to run some of those using PHP nowadays
● ("Event loops" and things like that)
29
PHP's memory consumption
● Know what you are talking about and what you're doing
● Know your OS and memory allocators
● memory_get_usage(): size used by your runtime code
● memory_get_usage(true): size allocated through the OS
● ZendMM caches blocks
● use gc_mem_caches() to reclaim them if needed
● Use your OS to be accurate
php> echo memory_get_usage();
625272
php> echo memory_get_usage(1);
786432
cat /proc/13399/status
Name:php
State: S (sleeping)
VmPeak: 154440 kB
VmSize: 133700 kB
VmRSS: 10304 kB
VmData: 4316 kB
VmStk: 136 kB
VmExe: 9876 kB
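The two memory_get_usage() modes and gc_mem_caches() can be compared side by side. A minimal sketch, assuming PHP 7 or later with the default Zend memory manager; `memory_snapshot()` is an illustrative helper, and the actual byte counts depend on the build and platform, so none are shown.

```php
<?php
// Snapshot both views of memory: what the script uses vs. what ZendMM took from the OS.
function memory_snapshot(): array
{
    return [
        'used' => memory_get_usage(),      // memory in use by the running code
        'real' => memory_get_usage(true),  // memory allocated through the OS
    ];
}

$before = memory_snapshot();
$data = range(1, 100000);                  // allocate a large array
$during = memory_snapshot();

unset($data);                              // blocks go back to ZendMM, not necessarily to the OS
$reclaimed = gc_mem_caches();              // ask ZendMM to release its cached blocks
$after = memory_snapshot();

printf("used: %d -> %d -> %d bytes\n", $before['used'], $during['used'], $after['used']);
printf("real: %d -> %d -> %d bytes\n", $before['real'], $during['real'], $after['real']);
printf("gc_mem_caches() reclaimed %d bytes\n", $reclaimed);
```

For figures that include the interpreter binary, shared libraries and stacks, the OS view (such as /proc/PID/status above) remains the reference.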
30
A software VM
● PHP internally works the same as
● Java
● Python
● Ruby
● Lua
● [... others ]
● But PHP's VM is not threaded: it runs a monolithic path
● PHP's VM compiler/optimizer/interpreter are merged into the PHP source code
● zend_compile.c
● ext/opcache/Optimizer/zend_optimizer.c
● zend_vm_def.h / zend_execute.c
31
Thank you for listening

