From Crash to Testcase: a Debugging Primer
(Or: how to get your bugs fixed really quickly and be loved by developers)
Roel Van de Paar
QA Lead, Percona
22 April 1:30pm - 4:30pm @ Ballroom D
From Crash to Testcase: a Debugging Primer Roel Van de Paar, Percona
Database Issues Cheat Sheet
Oops! My Server CRASHED!
Or did it?
Application Error or not?
● Application side: Signs that database server is alive?
● Server side: Check mysql CLI (Command Line Interface)
● Check Error Log (data_dir/host_name.err)
– Start at end and work your way up
– Happy day if the last line reads “Writing a core file”
● Check for cores (data_dir/core.pid, system locations)
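The "start at end and work your way up" advice can be sketched in Python. This is only an illustration, not part of the talk: the marker strings are taken from the log excerpts in this deck, while the function name `find_crash_markers` and the shortened sample lines are assumptions.

```python
# Marker strings taken from the error-log excerpts shown in this deck.
CRASH_MARKERS = ("got signal", "Assertion", "Writing a core file")


def find_crash_markers(lines, markers=CRASH_MARKERS):
    """Scan log lines from the end upwards; return (line_no, text) hits."""
    hits = []
    for line_no in range(len(lines), 0, -1):  # newest entries first
        text = lines[line_no - 1].rstrip("\n")
        if any(marker in text for marker in markers):
            hits.append((line_no, text))
    return hits


if __name__ == "__main__":
    # Shortened, hypothetical sample; a real run would read host_name.err.
    sample = [
        "2013-03-10 06:58:50 19481 [Note] mysqld-debug: ready for connections.",
        "protocol.cc:518: void Protocol::end_statement(): Assertion `0' failed.",
        "04:00:32 UTC - mysqld got signal 6 ;",
    ]
    for line_no, text in find_crash_markers(sample):
        print(line_no, text)
```

Reading the real file would be a matter of passing `open(path).readlines()`; the path depends on the installation.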
Application Errors: You're It!
● Bad News?
– You need to fix it
– “original developer” has “left the building”
● Good News?
– Your A+ developer incorporated app logs
“Pay me now or pay me later”
Debugging Your App
Related Misconfigurations:
Buffer & File Sizing, Communication Errors
● Comms Issue
– Hardware
– Comms buffer settings on server set too small etc.
● max_allowed_packet http://dev.mysql.com/doc/refman/5.6/en/packet-too-large.html
● Other non-comms Issues
– Buffer settings on server
● Example: [ERROR] mysqld: Sort aborted
– Can be due to a small sort_buffer_size
● Example: InnoDB: ERROR: the age of the last checkpoint is 724774680, InnoDB:
which exceeds the log group capacity 724770200.
– Due to innodb_log_file_size being too small
● Example: per-session var set too high thereby causing slowness, OOM etc.
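To see how far the checkpoint age overshoots the log group capacity in a message like the one above, the two numbers can be pulled out and subtracted. A minimal Python sketch; the function name `checkpoint_overshoot` is an assumption, and the message text is the one quoted on this slide.

```python
import re

# Matches the InnoDB checkpoint-age message quoted on this slide.
CHECKPOINT_RE = re.compile(
    r"age of the last checkpoint is (\d+),.*?log group capacity (\d+)",
    re.DOTALL,  # the message may be wrapped over two log lines
)


def checkpoint_overshoot(message):
    """Return (age, capacity, age - capacity), or None if no match."""
    match = CHECKPOINT_RE.search(message)
    if match is None:
        return None
    age, capacity = int(match.group(1)), int(match.group(2))
    return age, capacity, age - capacity


if __name__ == "__main__":
    msg = ("InnoDB: ERROR: the age of the last checkpoint is 724774680, "
           "InnoDB: which exceeds the log group capacity 724770200.")
    print(checkpoint_overshoot(msg))
```

For the slide's numbers the overshoot is 4480 bytes, small relative to the capacity, which fits the slide's reading that innodb_log_file_size is too small for the workload.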
Severity/Error Level
Server Crash?
Always check the Error Log 1st
Error Log Analysis?
● 2013-03-10 06:58:50 19481 [Note] /ssd/Percona-Server-5.6.8-alpha60.2-313-
debug.Linux.x86_64/bin/mysqld-debug: ready for connections.
Version: '5.6.10-alpha60.2-debug-log' socket: '/ssd/198649/current1_6/tmp/master.sock'
port: 13100 Percona Server with XtraDB (GPL), Release alpha60.2, Revision 313-debug
mysqld-debug: /ssd/ps56-univ-log-archive-qa/Percona-Server-5.6.8-
alpha60.2/sql/protocol.cc:518: void Protocol::end_statement(): Assertion `0' failed.
04:00:32 UTC - mysqld got signal 6 ;
● 2013-03-17 16:17:44 7f45c9e96700 InnoDB: Operating system error number 2 in a file
operation. InnoDB: The error means the system cannot find the path specified.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: File name
/tmp/1363526254145352487/ib_log_archive_0000000000045568
2013-03-17 16:17:44 7f45c9e96700 InnoDB: File operation call: 'open' returned OS error 71.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: Cannot continue operation.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: Assertion failure in thread 13993771698764813 in
file os0file.cc line 62
InnoDB: Failing assertion: 0
Error Log Analysis?
● Version: '5.6.10-alpha60.2-debug-log' socket: '/ssd/198649/current1_4/tmp/master.sock' port:
13060 Percona Server with XtraDB (GPL), Release alpha60.2, Revision 313-debug
2013-03-13 02:20:59 8466 [ERROR] InnoDB: Unable to lock
/ssd/tmp/ib_log_archive_0000000687614464, error: 11
2013-03-13 02:20:59 8466 [Note] InnoDB: Check that you do not already have another mysqld
process using the same InnoDB data or log files.
InnoDB: Cannot create or open archive log file /ssd/tmp/ib_log_archive_0000000687614464.
InnoDB: Cannot continue operation.
InnoDB: Check that the log archive directory exists,
InnoDB: you have access rights to it, and
InnoDB: there is space available.
● 2013-03-11 03:34:23 7f9866dca700 InnoDB: Assertion failure in thread 140292537493248 in file
row0purge.cc line 459
InnoDB: Failing assertion: 0x20UL & rec_get_info_bits( btr_cur_get_rec(btr_cur),
dict_table_is_comp(index->table))
Error Log Analysis: Initial Tips
● RRR++: Read! Read! Read! Read Again! (And once more)
– Why?
● What's the problem?
– Assert vs. Error vs. Crash vs. OS vs. OOM vs. Sigx vs.
Halt vs. Kill vs. Corruption vs. Deadlocks vs. Buffer &
File sizing vs. Communication Errors vs. SQL Errors vs.
Warnings vs. 3rd Party Messages vs. …
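A first pass at "What's the problem?" can be sketched as a small rule table over error-log lines. This is a rough Python illustration under stated assumptions: the category names, the rule order and the function name `classify` are made up for this example, the patterns are based on the log excerpts in this deck, and `[Warning]` is the tag MySQL 5.x error logs use for warnings. It does not replace reading the log.

```python
import re

# Ordered (category, pattern) pairs; the first matching rule wins.
RULES = [
    ("assert", re.compile(r"Assertion (failure|.* failed)")),
    ("signal", re.compile(r"got signal \d+")),
    ("oom", re.compile(r"Out of memory")),
    ("os", re.compile(r"Operating system error number \d+")),
    ("error", re.compile(r"\[ERROR\]|InnoDB: Error")),
    ("warning", re.compile(r"\[Warning\]")),
]


def classify(line):
    """Return a coarse category for one error-log line."""
    for category, pattern in RULES:
        if pattern.search(line):
            return category
    return "other"


if __name__ == "__main__":
    for line in (
        "InnoDB: Assertion failure in thread 140292537493248 in file row0purge.cc line 459",
        "04:00:32 UTC - mysqld got signal 6 ;",
        "130325 6:07:46 [ERROR] mysqld: Sort aborted: Query execution was interrupted",
    ):
        print(classify(line), "|", line)
```

Putting "oom" ahead of "error" is a design choice: an out-of-memory line also carries `[ERROR]`, and the more specific category is the more useful one.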
Analyzing the Error Log & all it contains
:$ / RRR++
Research++
Which Query caused trouble?
● Crashing query: In Error log:
– Query (3ff000002300): select f1 from t2 limit 5
● Faulting query:
– 130325 6:07:46 [ERROR] mysqld: Sort aborted:
Query execution was interrupted
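Pulling the crashing query out of an error log can be sketched as below. The `Query (<pointer>): <statement>` shape is the one shown on this slide; the function name `last_query` is an assumption made for illustration.

```python
import re

# "Query (3ff000002300): select f1 from t2 limit 5" as printed in the error log.
QUERY_RE = re.compile(r"Query \(([0-9a-fA-F]+)\): (.*)")


def last_query(lines):
    """Return the statement from the last 'Query (...)' line, or None."""
    for line in reversed(lines):  # the most recent crash is nearest the end
        match = QUERY_RE.search(line)
        if match:
            return match.group(2).strip()
    return None


if __name__ == "__main__":
    log = [
        "Trying to get some variables.",
        "Some pointers may be invalid and cause the dump to abort.",
        "Query (3ff000002300): select f1 from t2 limit 5",
    ]
    print(last_query(log))
```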
Resolving Stacks
[roel@localhost log]$ grep "mysqld(_" master.err | sed 's/^.*mysqld//'
(_ZN8Protocol13end_statementEv+0x1db)[0x525ce7]
(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1496)[0x5a2e6d]
(_Z10do_commandP3THD+0x284)[0x5a3702]
(_Z24do_handle_one_connectionP3THD+0x121)[0x648f1d]
[roel@localhost log]$ grep "mysqld(_" master.err | sed 's/^.*mysqld//' | c++filt
(Protocol::end_statement()+0x1db)[0x525ce7]
(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x1496)[0x5a2e6d]
(do_command(THD*)+0x284)[0x5a3702]
(do_handle_one_connection(THD*)+0x121)[0x648f1d]
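The same grep/sed/c++filt pipeline can be sketched in Python, for instance when the stack lines are already held in a script. The function names `extract_frames` and `demangle` are assumptions. `c++filt` is the same binutils tool used above, and the sketch returns the frames unchanged when it is not installed.

```python
import re
import shutil
import subprocess

# Keeps what follows "mysqld" on lines containing "mysqld(_", mirroring
# grep "mysqld(_" | sed 's/^.*mysqld//' for the stack lines shown above.
FRAME_RE = re.compile(r"mysqld(\(_\S+\)\[0x[0-9a-f]+\])")


def extract_frames(lines):
    """Return the '(mangled+offset)[address]' part of each mysqld frame."""
    return [m.group(1) for m in map(FRAME_RE.search, lines) if m]


def demangle(frames):
    """Pipe frames through c++filt if available; otherwise return a copy."""
    if not frames or shutil.which("c++filt") is None:
        return list(frames)
    result = subprocess.run(
        ["c++filt"],
        input="\n".join(frames) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    stack = [
        "/server/bin/mysqld(_Z10do_commandP3THD+0x284)[0x5a3702]",
        "/lib64/libpthread.so.0[0x333d007851]",
    ]
    for frame in demangle(extract_frames(stack)):
        print(frame)
```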
SideTour: Which Query caused trouble: LES
● LES: Last Executed Statement
– LES Error log Extraction
/server/bin/mysqld(handle_one_connection+0x52)[0x649010]
/lib64/libpthread.so.0[0x333d007851]
/lib64/libc.so.6(clone+0x6d)[0x333cce890d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fe000009188): select `c3`,`c4` from `qa07` limit 10
– LES gdb Extraction
Select 'do_command' frame in crashing thread using thread & frame,
then use: p thd->query_string.string.str
http://www.mysqlperformanceblog.com/2012/09/09/obtain-last-executed-statement-from-optimized-core-dump/
Demo: /ssd/Percona-Server-5.5.29-rel30.0--debug.Linux.x86_64/data4
Assertions: General
● Assert: “I, developer x, assert that at this point y, x=0 (as an example) should not be the case.”
– RRR++: SE, File, Line, Vars, Time
121204 7:45:06 InnoDB: Assertion failure in thread 1390 in file row0upd.c line 2023
InnoDB: Failing assertion: btr_pcur_restore_position(thr_get_trx(thr)->fake_changes ?
BTR_SEARCH_TREE : BTR_MODIFY_TREE, pcur, mtr)
– RRR++: Are you a dev?
130127 0:20:37 InnoDB: Assertion failure in thread 1396 in file row0sel.c line 115
InnoDB: Failing assertion: prefix_len >= sec_len
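The "File, Line" part of RRR++ can be extracted mechanically from an InnoDB assertion message, which helps when searching bug trackers for the same source location. A minimal Python sketch; the function name `parse_assertion` and the returned dictionary layout are assumptions for this example.

```python
import re

# Shape of the InnoDB assertion lines quoted on this slide.
ASSERT_RE = re.compile(
    r"InnoDB: Assertion failure in thread (?P<thread>\d+) "
    r"in file (?P<file>\S+) line (?P<line>\d+)"
)


def parse_assertion(text):
    """Return thread, source file and line of an InnoDB assertion, or None."""
    flat = " ".join(text.split())  # undo line wrapping inside the message
    match = ASSERT_RE.search(flat)
    if match is None:
        return None
    return {
        "thread": int(match.group("thread")),
        "file": match.group("file"),
        "line": int(match.group("line")),
    }


if __name__ == "__main__":
    print(parse_assertion(
        "121204 7:45:06 InnoDB: Assertion failure in thread 1390 "
        "in file row0upd.c line 2023"
    ))
```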
Assertions: Type
● Server Assertion
mysqld: /mysql-5.5/sql/sql_string.cc:37: bool
String::real_alloc(uint32): Assertion `arg_length > length' failed.
● InnoDB/XtraDB/Other SE Assertion (Seen most often)
InnoDB: Error: Waited for 600 secs for hash index ref_count (1)
to drop to 0. index: "c32" table: "test/#sql2-4b20-a"
121203 3:48:15 InnoDB: Assertion failure in thread 352803136 in
file dict0dict.c line 1883
Error Log Analysis Example
● 2013-03-10 06:58:50 19481 [Note] /ssd/Percona-Server-5.6.8-alpha60.2-313-
debug.Linux.x86_64/bin/mysqld-debug: ready for connections.
Version: '5.6.10-alpha60.2-debug-log' socket:
'/ssd/198649/current1_6/tmp/master.sock' port: 13100 Percona Server with XtraDB
(GPL), Release alpha60.2, Revision 313-debug
mysqld-debug: /ssd/ps56-univ-log-archive-qa/Percona-Server-5.6.8-
alpha60.2/sql/protocol.cc:518: void Protocol::end_statement(): Assertion `0' failed.
04:00:32 UTC - mysqld got signal 6 ;
● 2013-03-17 16:17:44 7f45c9e96700 InnoDB: Operating system error number 2 in a file
operation. InnoDB: The error means the system cannot find the path specified.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: File name
/tmp/1363526254145352487/ib_log_archive_0000000000045568
2013-03-17 16:17:44 7f45c9e96700 InnoDB: File operation call: 'open' returned OS
error 71.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: Cannot continue operation.
2013-03-17 16:17:44 7f45c9e96700 InnoDB: Assertion failure in thread
13993771698764813 in file os0file.cc line 62
InnoDB: Failing assertion: 0
Error Log Analysis Example
● Version: '5.6.10-alpha60.2-debug-log' socket: '/ssd/198649/current1_4/tmp/master.sock'
port: 13060 Percona Server with XtraDB (GPL), Release alpha60.2, Revision 313-debug
2013-03-13 02:20:59 8466 [ERROR] InnoDB: Unable to lock
/ssd/tmp/ib_log_archive_0000000687614464, error: 11
2013-03-13 02:20:59 8466 [Note] InnoDB: Check that you do not already have another
mysqld process using the same InnoDB data or log files.
InnoDB: Cannot create or open archive log file
/ssd/tmp/ib_log_archive_0000000687614464.
InnoDB: Cannot continue operation.
InnoDB: Check that the log archive directory exists,
InnoDB: you have access rights to it, and
InnoDB: there is space available.
● 2013-03-11 03:34:23 7f9866dca700 InnoDB: Assertion failure in thread 140292537493248
in file row0purge.cc line 459
InnoDB: Failing assertion: 0x20UL & rec_get_info_bits( btr_cur_get_rec(btr_cur),
dict_table_is_comp(index->table))
Errors
● 130325 5:54:07 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
130325 5:54:07 [ERROR] Fatal error: Can't open and lock privilege tables: Table 'mysql.host' doesn't exist
● 130223 12:44:05 InnoDB: Error: Write to file ./apr/fr1 failed at offset 13.
InnoDB: 49152 bytes should have been written, only 0 were written.
InnoDB: Operating system error number 9.
InnoDB: Check that your OS and file system support files of this size.
InnoDB: Check also that the disk is not full or a disk quota exceeded.
InnoDB: Error number 9 means 'Bad file descriptor'.
● 130325 5:36:11 [ERROR] /ssd/Server/bin/mysqld: Incorrect information in file: './test/v.frm'
● 130325 6:07:46 [ERROR] /ssd/Server/bin/mysqld: Sort aborted: Query execution was interrupted
● 130325 6:10:07 [ERROR] /ssd/Server/bin/mysqld: Sort aborted: Server shutdown in progress
Crashes
● thread 10 (LWP 23954):
+bt
#0 0x0000000004e3969c in pthread_kill () from /lib64/libpthread.so.0
#1 0x00000000007e2779 in my_write_core (sig=11) at /ssd/QA-16274-5.5/Percona-Server-
5.5.28-rel29.3/mysys/stacktrace.c:433
#2 0x00000000006ab0ea in handle_fatal_signal (sig=11) at /ssd/QA-16274-5.5/Percona-
Server-5.5.28-rel29.3/sql/signal_handler.cc:249
#3 <signal handler called>
#4 rbt_free_node (node=0x0, nil=0x1040f170) at /ssd/QA-16274-5.5/Percona-Server-5.5.28-
rel29.3/storage/innobase/ut/ut0rbt.c:731
#5 0x00000000009935e9 in rbt_free_node (node=0x1040f1e0, nil=0x1040f170) at /ssd/QA-
16274-5.5/Percona-Server-5.5.28-rel29.3/storage/innobase/ut/ut0rbt.c:731
● https://bugs.launchpad.net/percona-server/+bug/1111226 (Crash, Valgrind, Error Log)
OS/Hardware Related Message(s)
OS Related Issues
● https://bugs.launchpad.net/percona-server/+bug/806975
● OS errors: Perror
– <base_dir>/bin/perror
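When perror is not at hand, the plain OS error numbers that appear in the log (for example 2, 9 and 11 in the excerpts in this deck) can be looked up from Python's standard library. A small sketch; the function name `describe_os_error` is an assumption, the message text comes from the platform's C library, and MySQL-specific codes that perror also understands are not covered here.

```python
import errno
import os


def describe_os_error(number):
    """Return (symbolic name, message) for an OS error number."""
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


if __name__ == "__main__":
    for number in (2, 9, 11):
        name, message = describe_os_error(number)
        print(number, name, message)
```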
OOM
● CLI:
ERROR 5 (HY000): Out of memory (Needed 128992 bytes)
● Error Log:
110531 17:12:08 [ERROR] /home/philips/bzr/mysql-55-
eb/sql/mysqld: Out of memory (Needed 129872 bytes)
● Use Valgrind [Memcheck, Massif]!
● https://bugs.launchpad.net/percona-server/+bug/1042946
– Could cause OOM
– Valgrind [Massif] helps
Sig=x & Kill -x
● Signal Numbers
– Sig=x  Signal   Results  Note
– 4      SIGILL   Core     Illegal Instruction
– 6      SIGABRT  Core     Abort signal by abort()
– 8      SIGFPE   Core     Floating Point Exception
– 11     SIGSEGV  Core     Invalid Memory Reference
● Tip: you can use for example 'kill -11' to get a core dump at any given point, for example I've used this when seeing a memory allocation issue.
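The table above can be applied to a "mysqld got signal N" log line with a short lookup. This Python sketch takes the numbers from the local platform's `signal` module, so they match the slide on Linux. The names `SLIDE_SIGNALS`, `BY_NUMBER` and `signal_name` are assumptions for this example, and it only reads log text without sending any signal.

```python
import re
import signal

# The four signals listed on the slide; numbers come from the local platform.
SLIDE_SIGNALS = ("SIGILL", "SIGABRT", "SIGFPE", "SIGSEGV")
BY_NUMBER = {int(getattr(signal, name)): name for name in SLIDE_SIGNALS}

SIGNAL_RE = re.compile(r"mysqld got signal (\d+)")


def signal_name(log_line):
    """Map 'mysqld got signal N' to a name from the slide's table."""
    match = SIGNAL_RE.search(log_line)
    if match is None:
        return None
    number = int(match.group(1))
    return BY_NUMBER.get(number, "signal %d (not in the slide's table)" % number)


if __name__ == "__main__":
    print(signal_name("04:00:32 UTC - mysqld got signal 6 ;"))
```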
Halts
● Different from Sig=x.
– Server just “halts”
● Different from unplanned shutdown
– Server just “halts”
● Query logging
● Error log information
● (gdb breakpoints on exit functions)
Database Corruption
● Standard recovery (Not corruption):
130327 11:02:02 InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
130327 11:02:02 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
● Data Corruption:
120117 1:22:00 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5[...] 99
InnoDB: Apply batch completed
120117 1:22:02 InnoDB: Rolling back trx with id A01D1001, 13 rows to undo
InnoDB: Dropping table with id 54885 in recovery if it exists
InnoDB: Error: trying to load index PRIMARY for table nr92/#sql2-2e46-316ce0
InnoDB: but the index tree has been freed!
InnoDB: Rolling back of trx id A01D1001 completed
Deadlocks
● Deadlocks are fun
● User initiated vs actual server deadlock
– User Initiated:
mysql> select * from t1 where a = 2;
#with corresponding other session
ERROR 1205 (HY000): Lock wait timeout exceeded;
try restarting transaction
– Server deadlock
Programming deadlock
(3rd) Party Messages (RQG, Valgrind) vs.
● Other items which may write to the error log
- RQG
- Valgrind
- InnoDB status monitor
● Valgrind Example:
==12667== Thread 15:
==12667== Invalid read of size 8
==12667== at 0x93D473: lock_rec_block_validate (lock0lock.c:4969)
==12667== by 0x93D8D0: lock_print_info_all_transactions (lock0lock.c:5113)
==12667== by 0x862BAC: srv_printf_innodb_monitor (srv0srv.c:2263)
==12667== by 0x862DA5: srv_monitor_thread (srv0srv.c:2580)
==12667== by 0x4E34850: start_thread (in /lib64/libpthread-2.12.so)
==12667== by 0x19FCA6FF: ???
==12667== Address 0x16220c48 is 664 bytes inside a block of size 872 free'd
==12667== at 0x4C2695D: free (vg_replace_malloc.c:366)
==12667== by 0x952579: mem_area_free (mem0pool.c:519)
[...]
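When Valgrind output is mixed into the error log, the first error message and its top stack frame are usually what to search for. A rough Python sketch that handles only the `==pid==` line shape shown in the example above; the function name `summarize` and the result layout are assumptions.

```python
import re

PREFIX_RE = re.compile(r"^==(\d+)== ?")
# e.g. "at 0x93D473: lock_rec_block_validate (lock0lock.c:4969)"
FRAME_RE = re.compile(r"^(?:at|by) 0x[0-9A-Fa-f]+: (\S+) \((.+)\)$")


def summarize(lines):
    """Return pid, first error message and its top frame, or None."""
    error = None
    for raw in lines:
        prefix = PREFIX_RE.match(raw)
        if prefix is None:
            continue  # not a Valgrind line
        body = raw[prefix.end():].strip()
        if not body or body.startswith("Thread "):
            continue  # blank separator or thread banner
        frame = FRAME_RE.match(body)
        if error is None and frame is None:
            error = body  # first message line, e.g. "Invalid read of size 8"
        elif error is not None and frame is not None:
            return {
                "pid": int(prefix.group(1)),
                "error": error,
                "function": frame.group(1),
                "location": frame.group(2),
            }
    return None


if __name__ == "__main__":
    print(summarize([
        "==12667== Thread 15:",
        "==12667== Invalid read of size 8",
        "==12667== at 0x93D473: lock_rec_block_validate (lock0lock.c:4969)",
    ]))
```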
SideTour: Googling++
(Demo)
Considering InnoDB Status & Recovery
● This is for 'Production' (non-test) systems only
● innodb_force_recovery
http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
Crash Severity Summary
Core Dumps: Locations & Setup
Core Dumps: Analysis: using gdb
● LES example
● Google search example
● thread apply all bt
● Cheat Sheet
Core Dumps: Analysis: using WinDbg
● Demo
Valgrind: Introduction
RQG: Introduction
(Demo)
SideTour: Bash Scripting Fun!
(Demo)
SQL Errors
● Place?
– MySQL CLI
– Error Log
● Usually higher severity than in the CLI
– Application
Query Simplification
● Objective?
– Good testcase: Reduce a crashing/failing query to the
minimum length and complexity required to still obtain
the “desired” crash/error/issue (QA/Debugging)
– Optimizing the query: Reduce a query to obtain the
same result without altering its functionality or future
results with changed data (Support/Optimization)
Query Simplification: Clauses
mysql> SELECT * FROM t1 WHERE a>1 LIMIT 20;
mysql> SELECT * FROM t1;
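Dropping clauses one at a time can be sketched as a generator of candidate queries. This is a deliberately naive Python illustration, not a SQL parser: it assumes a single-level SELECT with no subqueries and no keywords inside string literals, and the names `TRAILING_CLAUSES` and `strip_clauses` are made up for this example. Each candidate still has to be re-run against the server to check that the issue reproduces.

```python
import re

# Trailing clauses to try dropping, outermost first (naive, keyword-based).
TRAILING_CLAUSES = ("LIMIT", "ORDER BY", "GROUP BY", "WHERE")


def strip_clauses(query):
    """Yield successively simpler queries by cutting trailing clauses."""
    current = query.strip().rstrip(";").strip()
    for keyword in TRAILING_CLAUSES:
        pattern = re.compile(
            r"\s+" + keyword.replace(" ", r"\s+") + r"\b.*$",
            re.IGNORECASE | re.DOTALL,
        )
        shorter = pattern.sub("", current)
        if shorter != current:
            current = shorter
            yield current + ";"


if __name__ == "__main__":
    for candidate in strip_clauses("SELECT * FROM t1 WHERE a>1 LIMIT 20;"):
        print(candidate)
```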
Query Simplification: Query Split
mysql> SELECT * FROM (SELECT a FROM t1) AS res1
WHERE a > 2;
mysql> SELECT a FROM t1;
mysql> SELECT a FROM t1 WHERE a > 2;
Query Simplification: Generalize Data
mysql> SELECT "There was once a little bug" INTO @a;
mysql> SELECT "a" INTO @a;
mysql> SELECT 1 INTO @a;
Query Simplification: Move Data into Query
● Often involves changes of results, but that matters “little”
(reproducibility factor may change) if the issue still
reproduces.
mysql> SELECT a FROM t1; a=column
mysql> SELECT "a" FROM t1; a=?
mysql> SELECT 1 FROM t1; 1=digit 1 (x rows)
mysql> SELECT 1;
Query Simplification: Limit nr of Fields
mysql> SELECT a,b,x,z FROM t1;
mysql> SELECT * FROM t1;
mysql> SELECT a FROM t1;
mysql> SELECT 1 FROM t1;
Query Simplification: Simplify Table
mysql> SELECT a FROM t1;
mysql> ALTER TABLE t1 DROP COLUMN b;
Test Case Production
● Strategies:
– Bring server up -> re-run crashing query
● mysqldump -> add query -> “Simplify the query”
– Use randgen or Gypsy with grammars based on
crashing query (or usually used queries if the
crashing query is not known)
– Run Valgrind for some time on the server to check
for programming errors
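The "mysqldump -> add query -> simplify" strategy amounts to removing statements from a testcase for as long as the issue still reproduces. The loop can be sketched in Python as a greedy reducer. The function name `reduce_testcase` and the `still_fails` callback are hypothetical: in practice the callback would replay the candidate statements against a disposable test server and report whether the crash or error still occurs, whereas here it is faked to keep the example self-contained.

```python
def reduce_testcase(statements, still_fails):
    """Greedily drop statements while still_fails(candidate) stays True."""
    kept = list(statements)
    index = 0
    while index < len(kept):
        candidate = kept[:index] + kept[index + 1:]
        if candidate and still_fails(candidate):
            kept = candidate  # this statement was not needed
        else:
            index += 1  # this statement is required; keep it
    return kept


if __name__ == "__main__":
    testcase = [
        "CREATE TABLE t1 (a INT);",
        "INSERT INTO t1 VALUES (1);",
        "SELECT 1;",
        "SELECT a FROM t1;",
    ]

    # Stand-in for "run against the server and check for the crash".
    def fake_still_fails(candidate):
        return ("CREATE TABLE t1 (a INT);" in candidate
                and "SELECT a FROM t1;" in candidate)

    for statement in reduce_testcase(testcase, fake_still_fails):
        print(statement)
```

A single greedy pass is the simplest choice. It can miss reductions that only work when several statements are removed together, so the result should be read as a smaller testcase, not necessarily the minimal one.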
SideTour: Logging Bugs++
SideTour: A note on threading & reproducibility
● 1 Thread?
– Usually easy to reproduce
● But not always: timing, OS slicing, SE dives etc.
● Impossible to 100% match timing
● Many threads?
– Usually hard to reproduce (exception: RQG/Gypsy)
● Dev core analysis is usually quickest way forward
Percona!
Connect++
Connect++
roel.vandepaar@percona.com

More Related Content

PDF
Hp0 a16 question answers
PDF
Oracle cluster installation with grid and nfs
PDF
What is new in BIND 9.11?
PDF
Riyaj real world performance issues rac focus
KEY
Varnish @ Velocity Ignite
PDF
A deep dive about VIP,HAIP, and SCAN
PDF
RIPE 71 and IETF 94 reports webinar
PDF
Advanced RAC troubleshooting: Network
Hp0 a16 question answers
Oracle cluster installation with grid and nfs
What is new in BIND 9.11?
Riyaj real world performance issues rac focus
Varnish @ Velocity Ignite
A deep dive about VIP,HAIP, and SCAN
RIPE 71 and IETF 94 reports webinar
Advanced RAC troubleshooting: Network

What's hot (20)

PDF
Oracle cluster installation with grid and iscsi
PDF
Debugging Ruby
PDF
Debugging Ruby Systems
DOCX
Data Guard on EBS R12 DB 10g
PDF
Ceph issue 해결 사례
PDF
Rac introduction
PDF
2 netcat enum-pub
PDF
Kernel Recipes 2017: Performance Analysis with BPF
PDF
3 scanning-ger paoctes-pub
PDF
Web Server Deathmatch 2009 Erlang Factory Joe Williams
PDF
Linux administration ii-parti
PDF
Performance tweaks and tools for Linux (Joe Damato)
PPT
3.4 use streams, pipes and redirects v2
TXT
PDF
MariaDB Replication manager and HAProxy (HAProxy Paris Meetup)
PPT
101 3.4 use streams, pipes and redirects v2
PDF
Unix executable buffer overflow
PPTX
[MathWorks] Versioning Infrastructure
PDF
Managing MariaDB Server operations with Percona Toolkit
TXT
Oracle cluster installation with grid and iscsi
Debugging Ruby
Debugging Ruby Systems
Data Guard on EBS R12 DB 10g
Ceph issue 해결 사례
Rac introduction
2 netcat enum-pub
Kernel Recipes 2017: Performance Analysis with BPF
3 scanning-ger paoctes-pub
Web Server Deathmatch 2009 Erlang Factory Joe Williams
Linux administration ii-parti
Performance tweaks and tools for Linux (Joe Damato)
3.4 use streams, pipes and redirects v2
MariaDB Replication manager and HAProxy (HAProxy Paris Meetup)
101 3.4 use streams, pipes and redirects v2
Unix executable buffer overflow
[MathWorks] Versioning Infrastructure
Managing MariaDB Server operations with Percona Toolkit
Ad

Viewers also liked (20)

PPTX
SQL Data Manipulation
PPT
4. SQL in DBMS
PDF
CBSE XII Database Concepts And MySQL Presentation
PPTX
Data Manipulation Language
PPTX
Commands
PPT
Sql Authorization
PPT
MYSQL Aggregate Functions
PPT
DBMS : Relational Algebra
PPTX
Data integrity Dbms presentation 12 cs 18
PDF
Nested Queries Lecture
PPTX
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
PDF
3 data modeling using the entity-relationship (er) model
PPTX
Data model and entity relationship
PPT
Dbms ii mca-ch5-ch6-relational algebra-2013
PDF
Enhanced Entity-Relationship (EER) Modeling
PPTX
T-SQL Overview
PPTX
trigger dbms
PPT
SQL Views
PPT
Er & eer to relational mapping
ODP
ER Model in DBMS
SQL Data Manipulation
4. SQL in DBMS
CBSE XII Database Concepts And MySQL Presentation
Data Manipulation Language
Commands
Sql Authorization
MYSQL Aggregate Functions
DBMS : Relational Algebra
Data integrity Dbms presentation 12 cs 18
Nested Queries Lecture
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
3 data modeling using the entity-relationship (er) model
Data model and entity relationship
Dbms ii mca-ch5-ch6-relational algebra-2013
Enhanced Entity-Relationship (EER) Modeling
T-SQL Overview
trigger dbms
SQL Views
Er & eer to relational mapping
ER Model in DBMS
Ad

Similar to From crash to testcase (20)

PDF
Basic MySQL Troubleshooting for Oracle Database Administrators
PDF
Performance schema in_my_sql_5.6_pluk2013
PDF
MariaDB/MySQL pitfalls - And how to come out again...
PDF
How to Monitor MySQL
ODP
Mastering InnoDB Diagnostics
ODP
Harrison fisk masteringinnodb-diagnostics
PDF
MariaDB / MySQL tripping hazard and how to get out again?
PDF
Reducing Risk When Upgrading MySQL
PPTX
Finding an unusual cause of max_user_connections in MySQL
PDF
MySQL 5.5 Guide to InnoDB Status
PPTX
Troubleshooting MySQL from a MySQL Developer Perspective
PDF
Basic MySQL Troubleshooting for Oracle DBAs
PDF
More on gdb for my sql db as (fosdem 2016)
PDF
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)
PDF
Gdb basics for my sql db as (percona live europe 2019)
DOCX
Upgrading mysql version 5.5.30 to 5.6.10
PDF
The Accidental DBA
PDF
Preventing and Resolving MySQL Downtime
PDF
OSDC 2017 | Lessons from database failures by Colin Charles
PDF
Perconalive feb-2011-share
Basic MySQL Troubleshooting for Oracle Database Administrators
Performance schema in_my_sql_5.6_pluk2013
MariaDB/MySQL pitfalls - And how to come out again...
How to Monitor MySQL
Mastering InnoDB Diagnostics
Harrison fisk masteringinnodb-diagnostics
MariaDB / MySQL tripping hazard and how to get out again?
Reducing Risk When Upgrading MySQL
Finding an unusual cause of max_user_connections in MySQL
MySQL 5.5 Guide to InnoDB Status
Troubleshooting MySQL from a MySQL Developer Perspective
Basic MySQL Troubleshooting for Oracle DBAs
More on gdb for my sql db as (fosdem 2016)
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)
Gdb basics for my sql db as (percona live europe 2019)
Upgrading mysql version 5.5.30 to 5.6.10
The Accidental DBA
Preventing and Resolving MySQL Downtime
OSDC 2017 | Lessons from database failures by Colin Charles
Perconalive feb-2011-share

Recently uploaded (20)

PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
Encapsulation theory and applications.pdf
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PPTX
Group 1 Presentation -Planning and Decision Making .pptx
PDF
gpt5_lecture_notes_comprehensive_20250812015547.pdf
PPT
Teaching material agriculture food technology
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
Accuracy of neural networks in brain wave diagnosis of schizophrenia
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PPTX
Programs and apps: productivity, graphics, security and other tools
PDF
Assigned Numbers - 2025 - Bluetooth® Document
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PPTX
Spectroscopy.pptx food analysis technology
PDF
Heart disease approach using modified random forest and particle swarm optimi...
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
A comparative analysis of optical character recognition models for extracting...
PPTX
1. Introduction to Computer Programming.pptx
PPTX
Machine Learning_overview_presentation.pptx
Diabetes mellitus diagnosis method based random forest with bat algorithm
Digital-Transformation-Roadmap-for-Companies.pptx
Encapsulation theory and applications.pdf
Per capita expenditure prediction using model stacking based on satellite ima...
Group 1 Presentation -Planning and Decision Making .pptx
gpt5_lecture_notes_comprehensive_20250812015547.pdf
Teaching material agriculture food technology
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Accuracy of neural networks in brain wave diagnosis of schizophrenia
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
Programs and apps: productivity, graphics, security and other tools
Assigned Numbers - 2025 - Bluetooth® Document
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Spectroscopy.pptx food analysis technology
Heart disease approach using modified random forest and particle swarm optimi...
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
A comparative analysis of optical character recognition models for extracting...
1. Introduction to Computer Programming.pptx
Machine Learning_overview_presentation.pptx

From crash to testcase

  • 1. From Crash to Testcase: a Debugging Primer (Or: how to get your bugs fixed really quickly and be loved by developers) Roel Van de Paar QA Lead, Percona 22 April 1:30pm - 4:30pm @ Ballroom D
  • 2. From Crash to Testcase: a Debugging Primer Roel Van de Paar, Percona 2 Database Issues Cheat SheetDatabase Issues Cheat Sheet
  • 3. From Crash to Testcase: a Debugging Primer Roel Van de Paar, Percona 3 Oops! My ServerOops! My Server CRASHED!CRASHED! Or did it?Or did it?
  • 4. From Crash to Testcase: a Debugging Primer Roel Van de Paar, Percona 4 Application Error or not?Application Error or not? ● Application side: Signs that database server is alive?Application side: Signs that database server is alive? ● Server side: Check mysql CLI (Command Line Interface)Server side: Check mysql CLI (Command Line Interface) ● Check Error Log (data_dir/host_name.err)Check Error Log (data_dir/host_name.err) – Start at end and work your way up – Happy day if the last line reads “Writing a core file” ● Check for cores (data_dir/core.pid, system locations)Check for cores (data_dir/core.pid, system locations)
  • 5. Application Errors: You're It!
    ● Bad News?
      – You need to fix it
      – “original developer” has “left the building”
    ● Good News?
      – Your A+ developer incorporated app logs
    “Pay me now or pay me later”
  • 6. Debugging Your App
  • 7. Related Misconfigurations: Buffer & File Sizing, Communication Errors
    ● Comms Issue
      – Hardware
      – Comms buffer settings on server set too small etc.
        ● max_allowed_packet http://dev.mysql.com/doc/refman/5.6/en/packet-too-large.html
    ● Other non-comms Issues
      – Buffer settings on server
        ● Example: [ERROR] mysqld: Sort aborted
          – Can be due to a small sort_buffer_size
        ● Example: InnoDB: ERROR: the age of the last checkpoint is 724774680, InnoDB: which exceeds the log group capacity 724770200.
          – Due to innodb_log_file_size being too small
        ● Example: per-session var set too high thereby causing slowness, OOM etc.
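The two error-log signatures quoted on this slide can be matched mechanically. The sketch below is an illustration only: `suggest_settings` is a hypothetical name, and the output is a list of settings to review, not a diagnosis. "Sort aborted", for instance, can have other causes such as an interrupted query.

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): maps the error-log signatures
# quoted on the slide to the server setting the slide associates with them.
suggest_settings() {
    log_file=$1

    # "Sort aborted" can be due to a small sort_buffer_size.
    if grep -q "Sort aborted" "$log_file"; then
        echo "sort_buffer_size"
    fi

    # Checkpoint age exceeding the log group capacity points at
    # innodb_log_file_size being too small.
    if grep -q "exceeds the log group capacity" "$log_file"; then
        echo "innodb_log_file_size"
    fi
}
```

Each printed name is only a candidate; the surrounding log lines still need to be read in full before changing anything.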
  • 8. Severity/Error Level
  • 9. Server Crash?
  • 10. Always check the Error Log 1st
  • 11. Error Log Analysis?
    ● 2013-03-10 06:58:50 19481 [Note] /ssd/Percona-Server-5.6.8-alpha60.2-313-debug.Linux.x86_64/bin/mysqld-debug: ready for connections.
      Version: '5.6.10-alpha60.2-debug-log' socket: '/ssd/198649/current1_6/tmp/master.sock' port: 13100 Percona Server with XtraDB (GPL), Release alpha60.2, Revision 313-debug
      mysqld-debug: /ssd/ps56-univ-log-archive-qa/Percona-Server-5.6.8-alpha60.2/sql/protocol.cc:518: void Protocol::end_statement(): Assertion `0' failed.
      04:00:32 UTC - mysqld got signal 6 ;
    ● 2013-03-17 16:17:44 7f45c9e96700 InnoDB: Operating system error number 2 in a file operation.
      InnoDB: The error means the system cannot find the path specified.
      2013-03-17 16:17:44 7f45c9e96700 InnoDB: File name /tmp/1363526254145352487/ib_log_archive_0000000000045568
      2013-03-17 16:17:44 7f45c9e96700 InnoDB: File operation call: 'open' returned OS error 71.
      2013-03-17 16:17:44 7f45c9e96700 InnoDB: Cannot continue operation.
      2013-03-17 16:17:44 7f45c9e96700 InnoDB: Assertion failure in thread 13993771698764813 in file os0file.cc line 62
      InnoDB: Failing assertion: 0
  • 12. Error Log Analysis?
    ● Version: '5.6.10-alpha60.2-debug-log' socket: '/ssd/198649/current1_4/tmp/master.sock' port: 13060 Percona Server with XtraDB (GPL), Release alpha60.2, Revision 313-debug
      2013-03-13 02:20:59 8466 [ERROR] InnoDB: Unable to lock /ssd/tmp/ib_log_archive_0000000687614464, error: 11
      2013-03-13 02:20:59 8466 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
      InnoDB: Cannot create or open archive log file /ssd/tmp/ib_log_archive_0000000687614464.
      InnoDB: Cannot continue operation.
      InnoDB: Check that the log archive directory exists,
      InnoDB: you have access rights to it, and
      InnoDB: there is space available.
    ● 2013-03-11 03:34:23 7f9866dca700 InnoDB: Assertion failure in thread 140292537493248 in file row0purge.cc line 459
      InnoDB: Failing assertion: 0x20UL & rec_get_info_bits( btr_cur_get_rec(btr_cur), dict_table_is_comp(index->table))
  • 13. Error Log Analysis: Initial Tips
    ● RRR++: Read! Read! Read! Read Again! (And once more)
      – Why?
    ● What's the problem?
      – Assert vs. Error vs. Crash vs. OS vs. OOM vs. Sigx vs. Halt vs. Kill vs. Corruption vs. Deadlocks vs. Buffer & File sizing vs. Communication Errors vs. SQL Errors vs. Warnings vs. 3rd Party Messages vs. …
  • 14. Analyzing the Error Log & all it contains :$ / RRR++ Research++
  • 15. Which Query caused trouble?
    ● Crashing query: In Error log:
      – Query (3ff000002300): select f1 from t2 limit 5
    ● Faulting query:
      – 130325 6:07:46 [ERROR] mysqld: Sort aborted: Query execution was interrupted
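Both kinds of line shown above can be pulled out of an error log with standard tools. This is a minimal sketch: the function names are hypothetical, and it assumes the `Query (<hex pointer>): <statement>` layout quoted on the slide.

```shell
#!/bin/sh
# Hypothetical helpers (not from the slides).

# Print the statement from each "Query (<pointer>): <statement>" line,
# which is how the crashing query shows up in the error log.
extract_crashing_queries() {
    sed -n 's/^.*Query ([0-9a-fA-F]*): //p' "$1"
}

# Print every [ERROR] line, where a faulting (non-crashing) query is reported.
extract_error_lines() {
    grep -F "[ERROR]" "$1"
}
```

For example, `extract_crashing_queries master.err` would print the bare statement, ready to be replayed against a test server.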
  • 16. Resolving Stacks
    [roel@localhost log]$ grep "mysqld(_" master.err | sed 's/^.*mysqld//'
    (_ZN8Protocol13end_statementEv+0x1db)[0x525ce7]
    (_Z16dispatch_command19enum_server_commandP3THDPcj+0x1496)[0x5a2e6d]
    (_Z10do_commandP3THD+0x284)[0x5a3702]
    (_Z24do_handle_one_connectionP3THD+0x121)[0x648f1d]
    [roel@localhost log]$ grep "mysqld(_" master.err | sed 's/^.*mysqld//' | c++filt
    (Protocol::end_statement()+0x1db)[0x525ce7]
    (dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x1496)[0x5a2e6d]
    (do_command(THD*)+0x284)[0x5a3702]
    (do_handle_one_connection(THD*)+0x121)[0x648f1d]
  • 17. SideTour: Which Query caused trouble: LES
    ● LES: Last Executed Statement
      – LES Error log Extraction
        /server/bin/mysqld(handle_one_connection+0x52)[0x649010]
        /lib64/libpthread.so.0[0x333d007851]
        /lib64/libc.so.6(clone+0x6d)[0x333cce890d]
        Trying to get some variables.
        Some pointers may be invalid and cause the dump to abort.
        Query (7fe000009188): select `c3`,`c4` from `qa07` limit 10
      – LES gdb Extraction
        Select 'do_command' frame in crashing thread using thread & frame, then use:
        p thd->query_string.string.str
        http://www.mysqlperformanceblog.com/2012/09/09/obtain-last-executed-statement-from-optimized-core-dump/
    Demo: /ssd/Percona-Server-5.5.29-rel30.0--debug.Linux.x86_64/data4
  • 18. Assertions: General
    ● Assert: “I, developer x, assert that at this point y, x=0 (as an example) should not be the case.”
      – RRR++: SE, File, Line, Vars, Time
        121204 7:45:06 InnoDB: Assertion failure in thread 1390 in file row0upd.c line 2023
        InnoDB: Failing assertion: btr_pcur_restore_position(thr_get_trx(thr)->fake_changes ? BTR_SEARCH_TREE : BTR_MODIFY_TREE, pcur, mtr)
      – RRR++: Are you a dev?
        130127 0:20:37 InnoDB: Assertion failure in thread 1396 in file row0sel.c line 115
        InnoDB: Failing assertion: prefix_len >= sec_len
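The file, line and thread that the slide asks you to read off an InnoDB assertion can also be extracted for searching or bug reports. This is a minimal sketch assuming the exact `Assertion failure in thread N in file F line L` wording shown above; `assertion_summary` is a hypothetical name, and `prefix` is whatever precedes `InnoDB:` on the line (a timestamp, plus a thread handle in newer logs).

```shell
#!/bin/sh
# Hypothetical helper (not from the slides): summarizes each InnoDB
# "Assertion failure" line as prefix / thread / source file / line number.
assertion_summary() {
    sed -n 's/^\(.*\) InnoDB: Assertion failure in thread \([0-9]*\) in file \([^ ]*\) line \([0-9]*\).*$/prefix=\1 thread=\2 file=\3 line=\4/p' "$1"
}
```

The `file=` and `line=` values are the most useful search terms, since they identify the assertion independently of the data that triggered it.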
  • 19. Assertions: Type
    ● Server Assertion
      mysqld: /mysql-5.5/sql/sql_string.cc:37: bool String::real_alloc(uint32): Assertion `arg_length > length' failed.
    ● InnoDB/XtraDB/Other SE Assertion (Seen most often)
      InnoDB: Error: Waited for 600 secs for hash index ref_count (1) to drop to 0. index: "c32" table: "test/#sql2-4b20-a"
      121203 3:48:15 InnoDB: Assertion failure in thread 352803136 in file dict0dict.c line 1883
  • 20. Error Log Analysis Example
    ● 2013-03-10 06:58:50 19481 [Note] /ssd/Percona-Server-5.6.8-alpha60.2-313-debug.Linux.x86_64/bin/mysqld-debug: ready for connections.
      Version: '5.6.10-alpha60.2-debug-log' socket: '/ssd/198649/current1_6/tmp/master.sock' port: 13100 Percona Server with XtraDB (GPL), Release alpha60.2, Revision 313-debug
      mysqld-debug: /ssd/ps56-univ-log-archive-qa/Percona-Server-5.6.8-alpha60.2/sql/protocol.cc:518: void Protocol::end_statement(): Assertion `0' failed.
      04:00:32 UTC - mysqld got signal 6 ;
    ● 2013-03-17 16:17:44 7f45c9e96700 InnoDB: Operating system error number 2 in a file operation.
      InnoDB: The error means the system cannot find the path specified.
      2013-03-17 16:17:44 7f45c9e96700 InnoDB: File name /tmp/1363526254145352487/ib_log_archive_0000000000045568
      2013-03-17 16:17:44 7f45c9e96700 InnoDB: File operation call: 'open' returned OS error 71.
      2013-03-17 16:17:44 7f45c9e96700 InnoDB: Cannot continue operation.
      2013-03-17 16:17:44 7f45c9e96700 InnoDB: Assertion failure in thread 13993771698764813 in file os0file.cc line 62
      InnoDB: Failing assertion: 0
  • 21. Error Log Analysis Example
    ● Version: '5.6.10-alpha60.2-debug-log' socket: '/ssd/198649/current1_4/tmp/master.sock' port: 13060 Percona Server with XtraDB (GPL), Release alpha60.2, Revision 313-debug
      2013-03-13 02:20:59 8466 [ERROR] InnoDB: Unable to lock /ssd/tmp/ib_log_archive_0000000687614464, error: 11
      2013-03-13 02:20:59 8466 [Note] InnoDB: Check that you do not already have another mysqld process using the same InnoDB data or log files.
      InnoDB: Cannot create or open archive log file /ssd/tmp/ib_log_archive_0000000687614464.
      InnoDB: Cannot continue operation.
      InnoDB: Check that the log archive directory exists,
      InnoDB: you have access rights to it, and
      InnoDB: there is space available.
    ● 2013-03-11 03:34:23 7f9866dca700 InnoDB: Assertion failure in thread 140292537493248 in file row0purge.cc line 459
      InnoDB: Failing assertion: 0x20UL & rec_get_info_bits( btr_cur_get_rec(btr_cur), dict_table_is_comp(index->table))
  • 22. Errors
    ● 130325 5:54:07 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
      130325 5:54:07 [ERROR] Fatal error: Can't open and lock privilege tables: Table 'mysql.host' doesn't exist
    ● 130223 12:44:05 InnoDB: Error: Write to file ./apr/fr1 failed at offset 13.
      InnoDB: 49152 bytes should have been written, only 0 were written.
      InnoDB: Operating system error number 9.
      InnoDB: Check that your OS and file system support files of this size.
      InnoDB: Check also that the disk is not full or a disk quota exceeded.
      InnoDB: Error number 9 means 'Bad file descriptor'.
    ● 130325 5:36:11 [ERROR] /ssd/Server/bin/mysqld: Incorrect information in file: './test/v.frm'
    ● 130325 6:07:46 [ERROR] /ssd/Server/bin/mysqld: Sort aborted: Query execution was interrupted
    ● 130325 6:10:07 [ERROR] /ssd/Server/bin/mysqld: Sort aborted: Server shutdown in progress
  • 23. Crashes
    ● thread 10 (LWP 23954): +bt
      #0 0x0000000004e3969c in pthread_kill () from /lib64/libpthread.so.0
      #1 0x00000000007e2779 in my_write_core (sig=11) at /ssd/QA-16274-5.5/Percona-Server-5.5.28-rel29.3/mysys/stacktrace.c:433
      #2 0x00000000006ab0ea in handle_fatal_signal (sig=11) at /ssd/QA-16274-5.5/Percona-Server-5.5.28-rel29.3/sql/signal_handler.cc:249
      #3 <signal handler called>
      #4 rbt_free_node (node=0x0, nil=0x1040f170) at /ssd/QA-16274-5.5/Percona-Server-5.5.28-rel29.3/storage/innobase/ut/ut0rbt.c:731
      #5 0x00000000009935e9 in rbt_free_node (node=0x1040f1e0, nil=0x1040f170) at /ssd/QA-16274-5.5/Percona-Server-5.5.28-rel29.3/storage/innobase/ut/ut0rbt.c:731
    ● https://bugs.launchpad.net/percona-server/+bug/1111226 (Crash, Valgrind, Error Log)
  • 24. OS/Hardware Related Message(s)
  • 25. OS Related Issues
    ● https://bugs.launchpad.net/percona-server/+bug/806975
    ● OS errors: Perror
      – <base_dir>/bin/perror
  • 26. OOM
    ● CLI: ERROR 5 (HY000): Out of memory (Needed 128992 bytes)
    ● Error Log: 110531 17:12:08 [ERROR] /home/philips/bzr/mysql-55-eb/sql/mysqld: Out of memory (Needed 129872 bytes)
    ● Use Valgrind [Memcheck, Massif]!
    ● https://bugs.launchpad.net/percona-server/+bug/1042946
      – Could cause OOM
      – Valgrind [Massif] helps
  • 27. Sig=x & Kill -x
    ● Signal Numbers
      Sig=x  Signal   Results  Note
      4      SIGILL   Core     Illegal Instruction
      6      SIGABRT  Core     Abort signal by abort()
      8      SIGFPE   Core     Floating Point Exception
      11     SIGSEGV  Core     Invalid Memory Reference
    ● Tip: you can use, for example, 'kill -11' to get a core dump at any given point; I've used this when seeing a memory allocation issue.
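The 'kill -11' tip can be tried safely on a throwaway process before using it on a server. The sketch below uses `sleep` as a stand-in; on a real system the target would be the mysqld PID, which is an assumption here. Whether a core file is actually written depends on the core size limit and the system's core settings.

```shell
#!/bin/sh
# Stand-in target process (assumption: a real run would target mysqld's PID).
sleep 60 &
pid=$!

# Signal 11 is SIGSEGV, per the table on the slide.
kill -11 "$pid"

# The shell reports death-by-signal as 128 + signal number.
status=0
wait "$pid" || status=$?
echo "exit status: $status (signal $((status - 128)))"

# A core is only written if the core size limit allows it.
echo "core size limit: $(ulimit -c)"
```

If the reported limit is 0, no core will be written even though the process dies from the signal.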
  • 28. Halts
    ● Different from Sig=x.
      – Server just “halts”
    ● Different from unplanned shutdown
      – Server just “halts”
    ● Query logging
    ● Error log information
    ● (gdb breakpoints on exit functions)
  • 29. Database Corruption
    ● Standard recovery (Not corruption):
      130327 11:02:02 InnoDB: highest supported file format is Barracuda.
      InnoDB: The log sequence number in ibdata files does not match
      InnoDB: the log sequence number in the ib_logfiles!
      130327 11:02:02 InnoDB: Database was not shut down normally!
      InnoDB: Starting crash recovery.
      InnoDB: Reading tablespace information from the .ibd files...
      InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
    ● Data Corruption:
      120117 1:22:00 InnoDB: Starting an apply batch of log records to the database...
      InnoDB: Progress in percents: 0 1 2 3 4 5[...] 99
      InnoDB: Apply batch completed
      120117 1:22:02 InnoDB: Rolling back trx with id A01D1001, 13 rows to undo
      InnoDB: Dropping table with id 54885 in recovery if it exists
      InnoDB: Error: trying to load index PRIMARY for table nr92/#sql2-2e46-316ce0
      InnoDB: but the index tree has been freed!
      InnoDB: Rolling back of trx id A01D1001 completed
  • 30. Deadlocks
    ● Deadlocks are fun
    ● User initiated vs actual server deadlock
      – User Initiated:
        mysql> select * from t1 where a = 2; #with corresponding other session
        ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
      – Server deadlock
        Programming deadlock
  • 31. (3rd) Party Messages (RQG, Valgrind) vs.
    ● Other items which may write to the error log
      - RQG
      - Valgrind
      - InnoDB status monitor
    ● Valgrind Example:
      ==12667== Thread 15:
      ==12667== Invalid read of size 8
      ==12667== at 0x93D473: lock_rec_block_validate (lock0lock.c:4969)
      ==12667== by 0x93D8D0: lock_print_info_all_transactions (lock0lock.c:5113)
      ==12667== by 0x862BAC: srv_printf_innodb_monitor (srv0srv.c:2263)
      ==12667== by 0x862DA5: srv_monitor_thread (srv0srv.c:2580)
      ==12667== by 0x4E34850: start_thread (in /lib64/libpthread-2.12.so)
      ==12667== by 0x19FCA6FF: ???
      ==12667== Address 0x16220c48 is 664 bytes inside a block of size 872 free'd
      ==12667== at 0x4C2695D: free (vg_replace_malloc.c:366)
      ==12667== by 0x952579: mem_area_free (mem0pool.c:519)
      [...]
  • 32. SideTour: Googling++ (Demo)
  • 33. Considering InnoDB Status & Recovery
    ● This is for 'Production' (non-test) systems only
    ● innodb_force_recovery http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
  • 34. Crash Severity Summary
  • 35. Core Dumps: Locations & Setup
  • 36. Core Dumps: Analysis: using gdb
    ● LES example
    ● Google search example
    ● thread apply all bt
    ● Cheat Sheet
  • 37. Core Dumps: Analysis: using WinDbg
    ● Demo
  • 38. Valgrind: Introduction
  • 39. RQG: Introduction (Demo)
  • 40. SideTour: Bash Scripting Fun! (Demo)
  • 41. SQL Errors
    ● Place?
      – MySQL CLI
      – Error Log
        ● Usually higher severity than in the CLI
      – Application
  • 42. Query Simplification
    ● Objective?
      – Good testcase: Reduce a crashing/failing query to the minimum length and complexity required to still obtain the “desired” crash/error/issue (QA/Debugging)
      – Optimizing the query: Reduce a query to obtain the same result without altering its functionality or future results with changed data (Support/Optimization)
  • 43. Query Simplification: Clauses
    mysql> SELECT * FROM t1 WHERE a>1 LIMIT 20;
    mysql> SELECT * FROM t1;
  • 44. Query Simplification: Query Split
    mysql> SELECT * FROM (SELECT a FROM t1) AS res1 WHERE a > 2;
    mysql> SELECT a FROM t1;
    mysql> SELECT a FROM t1 WHERE a > 2;
  • 45. Query Simplification: Generalize Data
    mysql> SELECT "There was once a little bug" INTO @a;
    mysql> SELECT "a" INTO @a;
    mysql> SELECT 1 INTO @a;
  • 46. Query Simplification: Move Data into Query
    ● Often involves changes of results, but that matters “little” (reproducibility factor may change) if the issue still reproduces.
    mysql> SELECT a FROM t1; a=column
    mysql> SELECT "a" FROM t1; a=?
    mysql> SELECT 1 FROM t1; 1=digit 1 (x rows)
    mysql> SELECT 1;
  • 47. Query Simplification: Limit nr of Fields
    mysql> SELECT a,b,x,z FROM t1;
    mysql> SELECT * FROM t1;
    mysql> SELECT a FROM t1;
    mysql> SELECT 1 FROM t1;
  • 48. Query Simplification: Simplify Table
    mysql> SELECT a FROM t1;
    mysql> ALTER TABLE t1 DROP COLUMN b;
  • 49. Test Case Production
    ● Strategies:
      – Bring server up -> re-run crashing query
        ● mysqldump -> add query -> “Simplify the query”
      – Use randgen or Gypsy with grammars based on crashing query (or usually used queries if the crashing query is not known)
      – Run Valgrind for some time on the server to check for programming errors
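The "mysqldump -> add query -> simplify" strategy can be partly automated once a testcase file exists. The sketch below is a generic line-by-line reducer, not the tooling used in the talk: it assumes one SQL statement per line, and the check command is a hypothetical script that replays a candidate file against a scratch server and exits 0 while the issue still reproduces.

```shell
#!/bin/sh
# reduce_testcase FILE CHECK_CMD...
# Hypothetical helper (not from the slides). CHECK_CMD is run with a candidate
# file appended as its last argument and must exit 0 while the issue still
# reproduces. Prints the reduced testcase on stdout.
reduce_testcase() {
    input=$1
    shift
    work=$(mktemp)
    candidate=$(mktemp)
    cp "$input" "$work"

    changed=1
    while [ "$changed" -eq 1 ]; do
        changed=0
        total=$(wc -l < "$work" | tr -d ' ')
        n=1
        while [ "$n" -le "$total" ]; do
            # Candidate = current testcase with line n removed.
            sed "${n}d" "$work" > "$candidate"
            if "$@" "$candidate" > /dev/null 2>&1; then
                # Still reproduces without line n: keep the shorter version.
                cp "$candidate" "$work"
                total=$((total - 1))
                changed=1
            else
                # Line n is needed: move on to the next line.
                n=$((n + 1))
            fi
        done
    done

    cat "$work"
    rm -f "$work" "$candidate"
}
```

A run might look like `reduce_testcase dump_plus_query.sql ./still_crashes.sh > minimal.sql`, where both file names are placeholders. Passes repeat until no single line can be removed, so the result is minimal only with respect to whole-line removal; simplifying within a statement (clauses, fields, data) remains manual work.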
  • 50. SideTour: Logging Bugs++
  • 51. SideTour: A note on threading & reproducibility
    ● 1 Thread?
      – Usually easy to reproduce
        ● But not always: timing, OS slicing, SE dives etc.
        ● Impossible to 100% match timing
    ● Many threads?
      – Usually hard to reproduce (exception: RQG/Gypsy)
        ● Dev core analysis is usually quickest way forward
  • 52. Percona!
  • 53. Connect++
  • 54. Connect++ [email protected]