Preamble
Profiling is an inexpensive way to get an accurate measurement of how long a query took to execute. First enable profiling, then call show profiles to see the execution time of each statement.
Query profiling
For example, we have the following statement that adds data. Suppose that User1 and Gallery1 have already been created:
INSERT INTO `homestead`.`images` (`id`, `gallery_id`, `original_filename`, `filename`, `description`) VALUES
(1, 1, 'me.jpg', 'me.jpg', 'A photo of me walking down the street'),
(2, 1, 'dog.jpg', 'dog.jpg', 'A photo of my dog on the street'),
(3, 1, 'cat.jpg', 'cat.jpg', 'A photo of my cat walking down the street'),
(4, 1, 'purr.jpg', 'purr.jpg', 'A photo of my cat purring');
Executing this statement will not cause any problems. But let’s consider the following query:
SELECT * FROM `homestead`.`images` AS i
WHERE i.description LIKE '%street%';
This query is a good example of what might cause problems in the future, once the table holds a large number of images: a pattern that starts with a wildcard cannot use an ordinary index, so every row has to be checked.
To get the exact time of execution of this query, you can use the following SQL code:
set profiling = 1;
SELECT * FROM `homestead`.`images` AS i
WHERE i.description LIKE '%street%';
show profiles;
Result:
Query_ID | Duration | Query |
1 | 0.00016950 | SHOW WARNINGS |
2 | 0.00039200 | SELECT * FROM homestead.images AS i WHERE i.description LIKE '%street%' LIMIT 0, 1000 |
3 | 0.00037600 | SHOW KEYS FROM homestead.images |
4 | 0.00034625 | SHOW DATABASES LIKE 'homestead' |
5 | 0.00027600 | SHOW TABLES FROM homestead LIKE 'images' |
6 | 0.00024950 | SELECT * FROM homestead.images WHERE 0=1 |
7 | 0.00104300 | SHOW FULL COLUMNS FROM homestead.images LIKE 'id' |
The show profiles command displays the execution time not only of the original query, but also of every other statement the client sent in the same session. In this way, you can accurately profile your queries.
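Once a statement appears in the show profiles list, its time can be broken down further by execution stage. The following is a minimal sketch; the Query_ID value 2 is taken from the sample output above and will differ in your session. Note that the MySQL manual marks SHOW PROFILE and SHOW PROFILES as deprecated in favour of the Performance Schema, so check what your server version supports.

```sql
-- Enable profiling for the current session only.
SET profiling = 1;

-- The statement we want to measure.
SELECT * FROM `homestead`.`images` AS i
WHERE i.description LIKE '%street%';

-- List the profiled statements and note the Query_ID of the SELECT.
SHOW PROFILES;

-- Per-stage timing for one statement (2 is an assumed Query_ID).
SHOW PROFILE FOR QUERY 2;

-- Optionally include CPU and block I/O figures for the same statement.
SHOW PROFILE CPU, BLOCK IO FOR QUERY 2;

-- Switch profiling off again when finished.
SET profiling = 0;
```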
Optimization
But how do you optimize them? MySQL provides the explain command, which lets you improve query performance based on actual information.
Explain returns the query execution plan, that is, the way MySQL will execute a query. This command works with SELECT, DELETE, INSERT, REPLACE and UPDATE statements. The official documentation describes the explain command as follows:
With the help of EXPLAIN, you can see where you should add indexes to tables so that the statement executes faster by using indexes to find rows. You can also use EXPLAIN to check whether the optimizer joins the tables in an optimal order.
As an example, we will look at the query that UserManager.php runs to find a user by email address:
SELECT * FROM `homestead`.`users` WHERE email = 'claudio.ribeiro@examplemail.com';
To use the explain command, put it in front of the SELECT statement:
EXPLAIN SELECT * FROM `homestead`.`users` WHERE email = 'claudio.ribeiro@examplemail.com';
Result:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
1 | SIMPLE | users | NULL | const | UNIQ_1483A5E9E7927C74 | UNIQ_1483A5E9E7927C74 | 182 | const | | 100.00 | NULL |
- id: the sequential identifier of each SELECT within the query.
- select_type: the type of SELECT. This field can take different values:
  - SIMPLE: a simple query without subqueries or unions;
  - PRIMARY: the outermost SELECT;
  - DERIVED: the SELECT is a derived table, that is, a subquery in the FROM clause;
  - SUBQUERY: the first SELECT in a subquery;
  - UNION: the second or a later SELECT in a UNION.
- table: the name (or alias) of the table the row refers to.
- type: the join type, that is, how MySQL accesses the table. The value can point to missing indexes and show how the query should be rewritten. Possible values for this field, from best to worst:
  - system: the table has only one row. This is a special case of const.
  - const: the table has at most one matching row, found by comparing a primary key or unique index with constant values. This is one of the fastest join types.
  - eq_ref: one row is read from this table for each combination of rows from the previous tables. All parts of a PRIMARY KEY or UNIQUE NOT NULL index are used by the join.
  - ref: all rows with matching index values are read from this table for each combination of rows from the previous tables. This join type is shown for indexed columns compared using the = or <=> operators.
  - fulltext: the join uses a FULLTEXT index.
  - ref_or_null: the same as ref, but MySQL additionally searches for rows that contain NULL values.
  - index_merge: the join uses several indexes and merges their results. The key column lists the indexes used.
  - unique_subquery: an IN subquery returns only one result from the table and uses the primary key.
  - range: the index is used to find the rows that fall within a certain range.
  - index: the entire index tree is scanned to find the matching rows.
  - ALL: the whole table is scanned to find the matching rows. This is the worst join type. It often indicates that the table has no suitable index.
- possible_keys: the indexes that MySQL could use to find the rows in the table.
- key: the index that MySQL actually uses. The DBMS always looks for the optimal key for a query. When joining many tables, it may choose a key that is not listed in possible_keys but is more efficient.
- key_len: the length of the key that the query optimizer has chosen.
- ref: the columns or constants that are compared to the index named in the key column.
- rows: the number of rows that MySQL expects to examine to produce the output. This is an important indicator; the fewer rows examined, the better.
- Extra: contains additional information. Values such as Using filesort or Using temporary in this column may indicate a problematic query.
Full documentation on the output format of explain can be found on the official MySQL page.
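Besides the tabular output described above, explain has other output modes. The sketch below reuses the email lookup from this article; availability depends on the server version (FORMAT=JSON needs MySQL 5.6 or later, EXPLAIN ANALYZE needs MySQL 8.0.18 or later), so treat it as an assumption about your environment rather than something guaranteed to work everywhere.

```sql
-- The same plan as the tabular output, returned as a JSON document
-- with additional detail about each step.
EXPLAIN FORMAT=JSON
SELECT * FROM `homestead`.`users`
WHERE email = 'claudio.ribeiro@examplemail.com';

-- MySQL 8.0.18+ only: this actually executes the query and reports
-- measured timings and row counts next to the optimizer's estimates.
EXPLAIN ANALYZE
SELECT * FROM `homestead`.`users`
WHERE email = 'claudio.ribeiro@examplemail.com';
```

Because EXPLAIN ANALYZE runs the statement, it is best used on queries that are safe and cheap enough to execute.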
Let us return to our query. It has the select_type SIMPLE and the join type const. This is the best possible combination. But what happens when more complex queries are executed?
For example, when you want to get all the images of a gallery, or display only the pictures whose description contains the word “dog”. Consider the following query:
SELECT gal.name, gal.description, img.filename, img.description FROM `homestead`.`users` AS users
LEFT JOIN `homestead`.`galleries` AS gal ON users.id = gal.user_id
LEFT JOIN `homestead`.`images` AS img on img.gallery_id = gal.id
WHERE img.description LIKE '%dog%';
In this case, we will have more information for analysis:
EXPLAIN SELECT gal.name, gal.description, img.filename, img.description FROM `homestead`.`users` AS users
LEFT JOIN `homestead`.`galleries` AS gal ON users.id = gal.user_id
LEFT JOIN `homestead`.`images` AS img on img.gallery_id = gal.id
WHERE img.description LIKE '%dog%';
The result of the query:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
1 | SIMPLE | users | NULL | index | PRIMARY,UNIQ_1483A5E9BF396750 | UNIQ_1483A5E9BF396750 | 108 | NULL | | 100.00 | Using index |
1 | SIMPLE | gal | NULL | ref | PRIMARY,UNIQ_F70E6EB7BF396750,IDX_F70E6EB7A76ED395 | UNIQ_1483A5E9BF396750 | 108 | homestead.users.id | | 100.00 | NULL |
1 | SIMPLE | img | NULL | ref | IDX_E01FBE6A4E7AF8F | IDX_E01FBE6A4E7AF8F | 109 | homestead.gal.id | | 25.00 | Using where |
The main columns to pay attention to are type and rows: the goal is to get the best possible value in the type column and the smallest possible number in the rows column.
The first row has the join type index, which is a poor result because the whole index is scanned. This means that we can optimize the query.
The query does not select anything from the users table. We could extend the query so that it actually uses users, or we can remove the users part of the query, because it only adds complexity and execution time.
SELECT gal.name, gal.description, img.filename, img.description FROM `homestead`.`galleries` AS gal
LEFT JOIN `homestead`.`images` AS img on img.gallery_id = gal.id
WHERE img.description LIKE '%dog%';
Let’s look at the output:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
1 | SIMPLE | gal | NULL | ALL | PRIMARY,UNIQ_1483A5E9BF396750 | NULL | NULL | NULL | | 100.00 | NULL |
1 | SIMPLE | img | NULL | ref | IDX_E01FBE6A4E7AF8F | IDX_E01FBE6A4E7AF8F | 109 | homestead.gal.id | | 25.00 | Using where |
We now have ALL in the type column. This is one of the worst join types, but sometimes it is the only possible one.
We need the images of every gallery, so MySQL has to read the entire galleries table. Indexes are good for finding specific rows in a table, but not for reading all of its rows.
The last thing we can do is add a FULLTEXT index to the description field. We can then replace LIKE with MATCH() ... AGAINST() and improve performance. More details about full-text indexes can be found in the MySQL documentation.
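The FULLTEXT approach could look like the sketch below. The index name ft_images_description is hypothetical, and the example assumes the images table uses a storage engine that supports FULLTEXT indexes (InnoDB from MySQL 5.6, or MyISAM). Keep in mind that full-text search matches whole words, so it is not an exact replacement for LIKE '%dog%', which also matches substrings such as "hotdog".

```sql
-- Add a full-text index on the column we search in
-- (the index name is a hypothetical choice).
ALTER TABLE `homestead`.`images`
    ADD FULLTEXT INDEX `ft_images_description` (`description`);

-- Rewrite the LIKE '%dog%' filter as a full-text search.
SELECT gal.name, gal.description, img.filename, img.description
FROM `homestead`.`galleries` AS gal
LEFT JOIN `homestead`.`images` AS img ON img.gallery_id = gal.id
WHERE MATCH(img.description) AGAINST ('dog' IN NATURAL LANGUAGE MODE);

-- Check the plan again: the img row should now report the fulltext join type
-- if the optimizer decides to use the new index.
EXPLAIN SELECT gal.name, gal.description, img.filename, img.description
FROM `homestead`.`galleries` AS gal
LEFT JOIN `homestead`.`images` AS img ON img.gallery_id = gal.id
WHERE MATCH(img.description) AGAINST ('dog' IN NATURAL LANGUAGE MODE);
```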
Let us return to the functionality of the application we are developing: newest and related. Both are used for galleries and rely on the following queries:
EXPLAIN SELECT * FROM `homestead`.`galleries` AS gal
LEFT JOIN `homestead`.`users` AS u ON u.id = gal.user_id
WHERE u.id = 1
ORDER BY gal.created_at DESC
LIMIT 5;
The query above selects the galleries of a single user.
EXPLAIN SELECT * FROM `homestead`.`galleries` AS gal
ORDER BY gal.created_at DESC
LIMIT 5;
The query above is for newest.
At first glance, these queries are fast because they use LIMIT. Unfortunately, in our application these queries also use the ORDER BY clause, so we lose the benefit of LIMIT: MySQL has to read and sort the matching rows before it can return the first five.
Combining ORDER BY with LIMIT can degrade performance. To verify this, let us look at the output of the explain command.
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
1 | SIMPLE | gal | NULL | ALL | IDX_F70E6EB7A76ED395 | NULL | NULL | NULL | | 100.00 | Using where; Using filesort |
1 | SIMPLE | u | NULL | eq_ref | PRIMARY,UNIQ_1483A5E9BF396750 | PRIMARY | 108 | homestead.gal.id | | 100.00 | NULL |
and
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
1 | SIMPLE | gal | NULL | ALL | NULL | NULL | NULL | NULL | | 100.00 | Using filesort |
As we can see, the galleries table is read with the worst join type, ALL, in both queries.
The combination of ORDER BY with LIMIT has often caused performance problems in MySQL. This combination of clauses is used in most interactive applications with large data sets.
Recommendations for solving this problem
Use indexes. In our case created_at is an excellent candidate. With an index on it, we can execute both ORDER BY and LIMIT without scanning and sorting the full result set.
Sort by a column of the leading table. If ORDER BY refers to a column from a table that is not the first in the join order, the index cannot be used.
Do not sort by expressions. Expressions and functions prevent MySQL from using an index for ORDER BY.
Beware of large LIMIT values. A large LIMIT forces ORDER BY to sort more rows, which affects performance.
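The first recommendation can be sketched as follows. The index names are hypothetical, and the composite index assumes that galleries.user_id is the column used to filter by user, as in the queries above. Whether the optimizer picks these indexes depends on the data and the server version, so re-run explain to confirm.

```sql
-- Index for the "newest" query: rows can be read in created_at order,
-- so MySQL can stop after the first five instead of sorting everything.
CREATE INDEX `idx_galleries_created_at`
    ON `homestead`.`galleries` (`created_at`);

-- Composite index for the per-user query: the equality column comes first,
-- the sort column second.
CREATE INDEX `idx_galleries_user_created_at`
    ON `homestead`.`galleries` (`user_id`, `created_at`);

-- Re-check the plan. Ideally the type column no longer shows ALL and
-- the Extra column no longer shows Using filesort.
EXPLAIN SELECT * FROM `homestead`.`galleries` AS gal
ORDER BY gal.created_at DESC
LIMIT 5;
```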
Conclusion
The explain command lets you identify problem queries at an early stage of application development and keep the application's performance high.