MySQL Indexes
Why use indexes?
Most MySQL indexes (PRIMARY KEY, UNIQUE, INDEX, and FULLTEXT) are stored in B-trees
A B-tree is a self-balancing tree data structure that keeps data sorted and allows searches, sequential
access, insertions, and deletions in logarithmic time
B-tree
Time complexity:
Full table scan = O(n)
Using index = O(log(n))
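The gap between the two complexities can be illustrated with a rough sketch. It assumes a sorted Python list with binary search as a stand-in for a B-tree (a real B-tree keeps many keys per node, but the lookup cost grows logarithmically in the same way) and counts the steps each approach needs. The table size and the helper names are hypothetical.

```python
def scan_steps(rows, target):
    """Full scan: look at rows one by one until the target is found."""
    steps = 0
    for value in rows:
        steps += 1
        if value == target:
            break
    return steps


def index_steps(sorted_keys, target):
    """Binary search over sorted keys: halve the search range at every step."""
    lo, hi = 0, len(sorted_keys)
    steps = 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return steps


rows = list(range(1_000_000))      # hypothetical table with one million keys
target = 999_999                   # worst case for the scan: the last row

print("scan steps: ", scan_steps(rows, target))    # grows linearly with n
print("index steps:", index_steps(rows, target))   # grows with log2(n)
```

With one million keys the scan touches every row, while the sorted lookup needs at most about twenty steps.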
Selectivity
Selectivity is the ratio of unique values to the total number of rows in a certain column
The more unique the values, the higher the selectivity
The query engine likes highly selective key columns
The higher the selectivity, the faster the query engine can reduce the size of the
result set
Selectivity and Cardinality
Cardinality is the number of unique values in the index.
In simple words:
Max cardinality: all values are unique
Min cardinality: all values are the same
Selectivity of index = cardinality/(number of records) * 100%
Perfect selectivity is 100%. It can be reached by unique indexes on NOT NULL columns.
Query optimization
The main idea is not to try to tune your database, but to optimize
your query based on the data you have
Selectivity by example
Example:
Table of 10,000 rows with column `gender` (number of males ~ number of females)
Let’s calculate the selectivity of the `gender` column
Selectivity = 2/10000 * 100% = 0.02%, which is very low
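The formula and the `gender` example can be checked with a small sketch. The data is hypothetical (a 10,000-row table built in memory), and `selectivity` is an illustrative helper, not a MySQL function.

```python
def selectivity(values):
    """Selectivity in percent: cardinality divided by the number of records."""
    cardinality = len(set(values))
    return cardinality / len(values) * 100


# Hypothetical 10,000-row table: `gender` has two values, `id` is unique.
gender = ["M", "F"] * 5_000
ids = list(range(10_000))

print(f"gender: {selectivity(gender):.2f}%")
print(f"id:     {selectivity(ids):.2f}%")
```

The unique column reaches the perfect 100%, while the two-valued column lands at the 0.02% computed above.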
When selectivity can be neglected
Selectivity can be neglected when values are distributed unevenly
Example:
If our query selects rows with stat IN (0,1), we can still use the index.
As a general rule, we should create indexes on tables that are often queried for less than 15% of the
table's rows
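A sketch of why uneven distribution matters, under invented numbers: the `stat` values and their counts below are purely hypothetical and only illustrate the idea. The column has very low selectivity, yet a query for the rare values returns a small share of the table.

```python
from collections import Counter

# Hypothetical, deliberately skewed distribution of a `stat` column:
# most rows hold one value, and the values the query asks for are rare.
stat = [2] * 9_500 + [0] * 300 + [1] * 200

counts = Counter(stat)
total = len(stat)


def matched_fraction(wanted):
    """Share of the table (in percent) that `stat IN (wanted)` would return."""
    return sum(counts[value] for value in wanted) / total * 100


print(f"selectivity:    {len(counts) / total * 100:.2f}%")
print(f"stat IN (0, 1): {matched_fraction([0, 1]):.1f}% of rows")
print(f"stat IN (2):    {matched_fraction([2]):.1f}% of rows")
```

In this made-up case `stat IN (0, 1)` stays well under the 15% guideline, while a query for the dominant value would read almost the whole table and gain little from the index.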
How MySQL uses indexes
• Data Lookups
• Sorting
• Avoiding reading “data”
• Special Optimizations
Data Lookups
SELECT * FROM employees WHERE lastname='Smith'
The classical use of an index on (lastname)
Can use multiple-column indexes
SELECT * FROM employees WHERE lastname='Smith' AND dept='accounting'
Use cases
Index (a,b,c) - order of columns matters
Will use Index for lookup (all listed keyparts)
a>5
a=5 AND b>6
a=5 AND b=6 AND c=7
a=5 AND b IN (2,3) AND c>5
Will NOT use Index
b>5 - Leading column is not referenced
b=6 AND c=7 - Leading column is not referenced
Will use Part of the index
The thing with ranges
MySQL stops using key parts of a multi-part index as soon as
it meets a real range (<, >, BETWEEN). It is, however, able to
continue using key parts further to the right if an IN(…) range is
used
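The rule can be written down as a small model. This is only a sketch of the rule as stated on the slides, not of the actual MySQL optimizer; `usable_key_parts` and the "eq" / "in" / "range" labels are hypothetical names introduced for illustration.

```python
def usable_key_parts(index_columns, conditions):
    """Count how many leading key parts a lookup can use.

    `conditions` maps a column name to "eq", "in" or "range".
    Equality and IN let the lookup continue to the next key part;
    a range uses its own key part and then stops; a missing column stops.
    """
    used = 0
    for column in index_columns:
        kind = conditions.get(column)
        if kind is None:
            break
        used += 1
        if kind == "range":
            break
    return used


index = ("a", "b", "c")
cases = {
    "a>5": {"a": "range"},
    "a=5 AND b>6": {"a": "eq", "b": "range"},
    "a=5 AND b=6 AND c=7": {"a": "eq", "b": "eq", "c": "eq"},
    "a=5 AND b IN (2,3) AND c>5": {"a": "eq", "b": "in", "c": "range"},
    "b=6 AND c=7": {"b": "eq", "c": "eq"},
    "a>5 AND b=2": {"a": "range", "b": "eq"},
}
for where, conditions in cases.items():
    print(f"{where:30} -> {usable_key_parts(index, conditions)} key part(s)")
```

The last case shows a partial use of the index: the range on `a` consumes the first key part, so the condition on `b` cannot narrow the lookup further.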
Sorting
SELECT * FROM players ORDER BY score DESC LIMIT 10
Will use an index on the score column
Without an index MySQL will do a “filesort” (an extra sorting pass, possibly on disk), which can be very expensive
Often combined with using an index for lookup
SELECT * FROM players WHERE country='US' ORDER BY score DESC LIMIT 10
Best served by an index on (country, score)
Use Cases
It becomes even more restricted!
KEY(a,b)
Will use Index for Sorting
ORDER BY a - sorting by leading column
a=5 ORDER BY b - EQ filtering by 1st and sorting by 2nd
ORDER BY a DESC, b DESC - Sorting by 2 columns in same order
a>5 ORDER BY a - Range on the column, sorting on the same
Will NOT use Index for Sorting
Sorting rules
You can’t sort by 2 columns in different orders (unless, in MySQL 8.0 and later, the index is declared with matching ASC/DESC key parts)
You can only have equality comparisons (=) for columns which
are not part of ORDER BY
Not even IN() works in this case
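Why equality works for sorting while IN() does not can be seen with a small sketch. It assumes that a sorted Python list of (a, b) tuples behaves like the entries of KEY(a,b); the rows are hypothetical.

```python
# Hypothetical rows (a, b); an index on (a, b) keeps entries in this order.
rows = [(5, 3), (4, 9), (5, 1), (6, 2), (4, 7), (6, 8)]
index_ab = sorted(rows)


def read_in_index_order(a_values):
    """Return b for matching rows, in the order the index would deliver them."""
    return [b for a, b in index_ab if a in a_values]


# a = 5: the matching entries are adjacent and already ordered by b.
eq_result = read_in_index_order({5})
print(eq_result, "sorted by b:", eq_result == sorted(eq_result))

# a IN (4, 5): entries are ordered by b only within each value of a,
# so the combined stream is not ordered by b and needs an extra sort.
in_result = read_in_index_order({4, 5})
print(in_result, "sorted by b:", in_result == sorted(in_result))
```

With a single value of `a` the index order is already the requested order; with several values the per-value runs would have to be merged or sorted again.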
Avoid reading the data
“Covering Index”
Applies to how an index is used for a specific query; it is not a type of index.
Reading the index ONLY and not accessing the “data”
SELECT status FROM orders WHERE customer_id=123
KEY(customer_id, status)
Index is typically smaller than data
Access is a lot more sequential
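The covering-index idea can be tried out with a runnable sketch. Note the assumption: it uses SQLite through Python's `sqlite3` module as a stand-in, because it runs without a server; SQLite's planner and its plan wording differ from MySQL, where the same situation is reported as "Using index" in the Extra column of EXPLAIN. The `orders` table layout, the `note` column, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        status TEXT,
        note TEXT
    );
    CREATE INDEX idx_customer_status ON orders (customer_id, status);
""")


def plan(sql):
    """Return the textual query plan SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Only indexed columns are requested: the index alone can answer the query.
covered = plan("SELECT status FROM orders WHERE customer_id = 123")
# `note` is not in the index, so the table rows must be read as well.
not_covered = plan("SELECT note FROM orders WHERE customer_id = 123")

print(covered)
print(not_covered)
```

The first plan mentions a covering index, the second uses the same index only to locate rows and then reads the table.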
Aggregation functions
Indexes help the MIN()/MAX() aggregate functions
But only these
SELECT MAX(id) FROM tbl;
SELECT MAX(salary) FROM employee GROUP BY dept_id
Will benefit from a (dept_id, salary) index
“Using index for group-by”
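A sketch of why a (dept_id, salary) index helps MAX() per group, assuming a sorted list of tuples stands in for the index entries. The data is hypothetical, and the sketch walks every entry for simplicity, whereas MySQL can jump straight to the end of each group.

```python
from itertools import groupby

# Hypothetical (dept_id, salary) pairs, sorted the way an index on
# (dept_id, salary) would keep them.
index_entries = sorted([(1, 300), (2, 500), (1, 450), (3, 200), (2, 350)])


def max_salary_per_dept(entries):
    """Salaries are ascending inside each dept_id group, so MAX is the last entry."""
    result = {}
    for dept_id, group in groupby(entries, key=lambda entry: entry[0]):
        *_, last = group
        result[dept_id] = last[1]
    return result


print(max_salary_per_dept(index_entries))
```

No sorting or comparison of salaries is needed: the position inside the index already identifies the maximum of each group.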
Joins
MySQL performs joins as “nested loops”
SELECT * FROM posts p, comments c WHERE p.author='Peter' AND c.post_id=p.id
Scan table `posts` finding all posts which have Peter as an author
For every such post go to the `comments` table to fetch all comments
Very important to have all JOINs indexed
An index is only needed on the table which is being looked up
The index on posts.id is not needed for this query's performance
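The nested-loop strategy can be sketched in a few lines. This is an illustration with hypothetical in-memory data, where a dictionary keyed by `post_id` plays the role of the index on comments.post_id; it is not how MySQL is implemented internally.

```python
from collections import defaultdict

# Hypothetical data standing in for the `posts` and `comments` tables.
posts = [
    {"id": 1, "author": "Peter"},
    {"id": 2, "author": "Anna"},
    {"id": 3, "author": "Peter"},
]
comments = [
    {"id": 10, "post_id": 1},
    {"id": 11, "post_id": 2},
    {"id": 12, "post_id": 3},
    {"id": 13, "post_id": 1},
]

# "Index" on comments.post_id: maps a post id to its comments.
comments_by_post = defaultdict(list)
for comment in comments:
    comments_by_post[comment["post_id"]].append(comment)


def nested_loop_join(author):
    """Outer loop over posts, inner lookup of comments through the index."""
    result = []
    for post in posts:                                    # scan `posts`
        if post["author"] != author:
            continue
        for comment in comments_by_post.get(post["id"], []):   # indexed lookup
            result.append((post["id"], comment["id"]))
    return result


print(nested_loop_join("Peter"))
```

The inner lookup is the part that runs once per matching post, which is why the index on the looked-up table matters and an index on posts.id does not help this query.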
Multiple indexes
MySQL Can use More than one index
“Index Merge”
SELECT * FROM tbl WHERE a=5 AND b=6
Can often use indexes on (a) and (b) separately
An index on (a,b) is much better
SELECT * FROM tbl WHERE a=5 OR b=6
2 separate indexes are as good as it gets
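The idea behind “Index Merge” can be sketched with sets of row ids: each single-column index yields the ids matching its condition, and the results are intersected for AND or united for OR. The table contents and the `build_index` helper are hypothetical.

```python
# Hypothetical rows: row id -> (a, b)
table = {1: (5, 6), 2: (5, 1), 3: (2, 6), 4: (7, 7), 5: (5, 6)}


def build_index(position):
    """Map each value of one column to the set of row ids holding it."""
    index = {}
    for row_id, row in table.items():
        index.setdefault(row[position], set()).add(row_id)
    return index


index_a = build_index(0)
index_b = build_index(1)

ids_a = index_a.get(5, set())   # rows with a=5
ids_b = index_b.get(6, set())   # rows with b=6

and_ids = ids_a & ids_b         # a=5 AND b=6: intersection
or_ids = ids_a | ids_b          # a=5 OR b=6: union

print(sorted(and_ids))
print(sorted(or_ids))
```

For the AND case both id sets are built in full before intersecting, which is the extra work a single index on (a,b) avoids.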
String indexes
There is no difference… really
Sort order is defined for strings (collation)
'AAAA' < 'AAAB'
Prefix LIKE is a special type of range
LIKE 'ABC%' means
'ABC[LOWEST]' < KEY < 'ABC[HIGHEST]'
LIKE '%ABC' can’t be optimized by use of the index
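The prefix-as-range idea can be sketched with binary search over sorted strings. Assumptions: plain code-point ordering is used instead of a MySQL collation, the highest Unicode code point stands in for [HIGHEST], and the key list is hypothetical.

```python
import bisect

# Hypothetical indexed string column, kept sorted like index entries.
keys = sorted(["AB", "ABC", "ABCD", "ABCZ", "ABD", "XABC", "ZABC"])


def prefix_range(prefix):
    """LIKE 'prefix%' as a range: from the prefix up to prefix + highest character."""
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_left(keys, prefix + "\U0010ffff")
    return keys[lo:hi]


def suffix_scan(suffix):
    """LIKE '%suffix' gives no usable range: every key has to be checked."""
    return [key for key in keys if key.endswith(suffix)]


print(prefix_range("ABC"))
print(suffix_scan("ABC"))
```

The prefix search touches one contiguous slice of the sorted keys, while the matches for the suffix search are scattered across the whole list.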
Real case: Problem
Let's take an example from the real world (Voltu first page campaigns list)
Real case: Timing
Initially the query took about 1m 20s to run for the first time
After MySQL cached the response, it took about 20s
Real case: Query
SELECT wk2_campaign.*,
       wk2_campaignGroup.category_id AS group_category_id,
       wk2_campaignGroup.subcategory_id AS group_subcategory_id,
       wk2_campaignGroup.summary AS group_summary,
       IFNULL(wk2_campaign.category_id, wk2_campaignGroup.category_id) category_id
FROM `wk2_campaign`
LEFT JOIN wk2_resource_status ON (wk2_resource_status.id = wk2_campaign.CaID)
LEFT JOIN campaign_has_group ON (wk2_campaign.CaID = campaign_has_group.campaign_id)
LEFT JOIN wk2_campaignGroup ON (campaign_has_group.campaign_group_id = wk2_campaignGroup.GrID)
LEFT JOIN si_private_campaigns pc ON (pc.campaign_id = wk2_campaign.CaID)
WHERE (wk2_campaign.tracking_active = '1')
  AND (
    (IFNULL(wk2_campaign.category_id, wk2_campaignGroup.category_id) IS NOT NULL)
    AND (IFNULL(wk2_campaign.category_id, wk2_campaignGroup.category_id) NOT IN (
      SELECT id FROM campaign_categories WHERE name IN ('Mobile Content Subscription')
    ))
    AND (countries REGEXP 'US')
  )
  AND (
    (
      (wk2_campaign.stat IN ('0', '1'))
      AND (wk2_resource_status.resource_type = 'ca')
      AND (wk2_resource_status.status = '1')
      AND (wk2_campaign.access != '0')
      AND (wk2_campaign.external_id IS NULL)
      AND (wk2_campaign.name IS NOT NULL)
      AND (wk2_campaign.countries IS NOT NULL)
      AND (trim(wk2_campaign.countries) IS NOT NULL)
    )
    OR (pc.campaign_id IS NOT NULL)
  );
Steps to optimize
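1. Add missing indexes for the joined tables
2. Check the selectivity of different columns of the main table wk2_campaign
The `tracking_active` and `stat` columns have a low number of possible values, so by the
definition above their selectivity is low. They are still good index candidates here: as in the
stat IN (0,1) example earlier, the query filters on them directly, and an index on them can be
built fast and boosts query response time.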
Steps to optimize (continued)
3. Add an index on these columns:
ALTER TABLE wk2_campaign ADD INDEX(tracking_active, stat);
4. Rearrange some conditions so that they fit the index
Result of optimization
With these manipulations we made the query use only indexes
The explain select of this query:
Query run                       Before    After    Performance increase
First time                      1m 20s    0m 2s    4000%
Subsequent (cached by MySQL)    20s       0.26s    7692%
Another example with “or”
Before
SELECT `wk2_campaign`.*
FROM `wk2_campaign`
LEFT JOIN campaign_summary ON (campaign_summary.campaign_id = caid)
WHERE (name LIKE '%buscape%' OR caid LIKE 'buscape%')
   OR mobile_app_id LIKE '%buscape%'
   OR caid IN ('89630','89632');
130 rows in set (7.43 sec)
After
SELECT `wk2_campaign`.*
FROM `wk2_campaign`
LEFT JOIN campaign_summary ON (campaign_summary.campaign_id = caid)
WHERE (name LIKE '%buscape%' OR caid LIKE 'buscape%')
UNION
SELECT `wk2_campaign`.*
FROM `wk2_campaign`
LEFT JOIN campaign_summary ON (campaign_summary.campaign_id = caid)
WHERE mobile_app_id LIKE '%buscape%'
UNION
SELECT `wk2_campaign`.*
FROM `wk2_campaign`
LEFT JOIN campaign_summary ON (campaign_summary.campaign_id = caid)
WHERE caid IN ('89630','89632');
130 rows in set (4.12 sec)
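That the OR form and the UNION form return the same rows can be checked with a small sketch. Assumptions: SQLite via Python's `sqlite3` stands in for MySQL, the `campaign` table is a simplified, hypothetical version of `wk2_campaign` without the join, and the rows are invented. The sketch checks equivalence of results only; it says nothing about timing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaign (caid TEXT PRIMARY KEY, name TEXT, mobile_app_id TEXT);
    INSERT INTO campaign VALUES
        ('89630', 'alpha', 'app.one'),
        ('89632', 'buscape promo', 'app.two'),
        ('90001', 'gamma', 'com.buscape.app'),
        ('90002', 'delta', 'app.four');
""")

or_query = """
    SELECT * FROM campaign
    WHERE name LIKE '%buscape%'
       OR mobile_app_id LIKE '%buscape%'
       OR caid IN ('89630', '89632')
"""
union_query = """
    SELECT * FROM campaign WHERE name LIKE '%buscape%'
    UNION
    SELECT * FROM campaign WHERE mobile_app_id LIKE '%buscape%'
    UNION
    SELECT * FROM campaign WHERE caid IN ('89630', '89632')
"""

or_rows = set(conn.execute(or_query).fetchall())
union_rows = set(conn.execute(union_query).fetchall())

print(len(or_rows), len(union_rows), or_rows == union_rows)
```

UNION removes duplicate rows, so the two forms agree here because every row is distinct (the primary key guarantees it). Each branch of the UNION has a single condition, which gives the optimizer a chance to pick a suitable index per branch.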
> SELECT text
FROM questions
LIMIT 5;
> EXPLAIN

