Preamble
MySQL indexes: Starting with MySQL 5.7, you can create indexes for expressions, or functional indexes using the generated columns. Basically you need to first use the generated column to define a functional expression and then index that column.
Very useful when working with JSON functions, you can find an example here and documentation there.
MySQL indexes
Starting with MySQL 8.0.13 we now have the easiest way to create functional indices.
Let’s see, this is a quick practical example:
- Using MySQL 8.0.15
- Request the test_db database
Below is the structure of my_my_salaries table:
mysql> SHOW CREATE TABLE mysalaries\G
******************************* 1. row ***********************
Table: my_salaries
Create Table: CREATE TABLE `my_salaries` (
`sal_no` int(11) NOT NULL,
`salary` int(11) NOT NULL,
`from_date` date NOT NULL,
`to_date` date NOT NULL,
PRIMARY KEY (`sal_no`, `from_date`),
CONSTRAINT `my_salaries_ibfk_1` FOREIGN KEY (`sal_no`) REFERENCES `employees` (`sal_no`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)
It contains some data
mysql> SELECT count(*) FROM my_salaries;
+----------+
| count(*) |
+----------+
| 356767 |
+----------+
mysql> SELECT * FROM my_salaries LIMIT 3;
+--------+--------+------------+------------+
| sal_no | salary | from_date | to_date |
+--------+--------+------------+------------+
| 10001 | 55000 | 2020-01-27 | 2020-02-27 |
| 10001 | 56000 | 2020-02-27 | 2020-03-27 |
| 10001 | 60000 | 2020-03-27 | 2020-04-27 |
+--------+--------+------------+------------+
Let’s focus on the next request:
mysql> SELECT * FROM my_salaries WHERE YEAR(to_date)=2019;
+--------+--------+------------+------------+
| sal_no | salary | from_date | to_date |
+--------+--------+------------+------------+
| 9564 | 48000 | 2019-01-27 | 2019-02-27 |
...snip...
| 9574 | 48000 | 2019-02-27 | 2019-03-27 |
+--------+--------+------------+------------+
89 rows in set (0.80 sec)
mysql> explain SELECT * FROM my_salaries WHERE YEAR(to_date)=2019\G.
******************************* 1. row ***********************
id: 1
select_type: SIMPLE
table: my_salaries
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 2838426
filtered: 100.00
Extra: Using where
We have a full table scan ( type: ALL), that is, the index is not used. Perhaps because there is no index for the to_date column.
So, let’s add an index for to_date!
mysql> ALTER TABLE my_salaries ADD INDEX idx_to_date (to_date);
Query OK, 0 rows affected (17,13 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> SHOW CREATE TABLE my_salaries\G
******************************* 1. row ***********************
Table: my_salaries
Create Table: CREATE TABLE `my_salaries` (
`sal_no` int(11) NOT NULL,
`salary` int(11) NOT NULL,
`from_date` date NOT NULL,
`to_date` date NOT NULL,
PRIMARY KEY (`sal_no`, `from_date`),
KEY `idx_to_date` (`to_date`),
CONSTRAINT `my_salaries_ibfk_1` FOREIGN KEY (`sal_no`) REFERENCES `employees` (`sal_no`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
And run the request again with the hope of a better execution plan
mysql> explain SELECT * FROM my_salaries WHERE YEAR(to_date)=2019\G.
******************************* 1. row ***********************
id: 1
select_type: SIMPLE
table: my_salaries
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 2838426
filtered: 100.00
Extra: Using where
Oh! Still have a full table scan!
The index cannot be used because of the function (YEAR ()) in the indexed column (to_date).
By the way, if you are really surprised, maybe you should read this.
This is the case when you need a functional index!
mysql> ALTER TABLE my_salaries ADD INDEX idx_year_to_date((YEAR(to_date)));
Query OK, 0 rows affected (15,12 sec)
Records: 0 Duplicates: 0 Warnings: 0
The syntax is very similar to creating an “ordinary” index. Although you should know about double brackets: ((<expression>))
Now we can see our new index with the name idx_year_to_date and the year of the indexed expression (to_date) :
mysql> SHOW CREATE TABLE mysalaries\G
******************************* 1. row ***********************
Table: my_salaries
Create Table: CREATE TABLE `my_salaries` (
`sal_no` int(11) NOT NULL,
`salary` int(11) NOT NULL,
`from_date` date NOT NULL,
`to_date` date NOT NULL,
PRIMARY KEY (`sal_no`, `from_date`),
KEY `idx_to_date` (`to_date`),
KEY `idx_year_to_date` ((year(`to_date`))),
CONSTRAINT `my_salaries_ibfk_1` FOREIGN KEY (`sal_no`) REFERENCES `employees` (`sal_no`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
mysql> SELECT INDEX_NAME, EXPRESSION
FROM INFORMATION_SCHEMA.STATISTICS
WHERE TABLE_SCHEMA='employees'.
AND TABLE_NAME = "my_salaries"
AND INDEX_NAME='idx_year_to_date';
+------------------+-----------------+
| INDEX_NAME |
+------------------+-----------------+
| idx_year_to_date | year(`to_date`) |
+------------------+-----------------+
Let’s check our request one more time
mysql> explain SELECT * FROM my_salaries WHERE YEAR(to_date)=2019\G.
******************************* 1. row ***********************
id: 1
select_type: SIMPLE
table: my_salaries
partitions: NULL
type: ref
possible_keys: idx_year_to_date
key: idx_year_to_date
key_len: 5
ref: const
rows: 89
filtered: 100.00
Extra: NULL
mysql> SELECT * FROM my_salaries WHERE YEAR(to_date)=2019;
+--------+--------+------------+------------+
| sal_no | salary | from_date | to_date |
+--------+--------+------------+------------+
| 9564 | 45000 | 2019-07-27 | 2019-08-08 |
...snip...
| 9574 | 45000 | 2019-09-27 | 2019-09-28 |
+--------+--------+------------+------------+
89 rows in set (0.00 sec)
There you go!
Now the request can use an index. And in this case, we will have a positive impact on the execution time.
It is also interesting to note that we can use idx_to_date, the first index created (non-functional), if we can rewrite the original query:
mysql> EXPLAIN SELECT *
FROM my_salaries
WHERE to_date BETWEEN '2019-01-01' AND '2019-12-31'\G
******************************* 1. row ***********************
id: 1
select_type: SIMPLE
table: my_salaries
partitions: NULL
type: range
possible_keys: idx_to_date
key: idx_to_date
key_len: 3
ref: NULL
rows: 89
filtered: 100.00
Extra: Using index condition
mysql> SELECT *
FROM my_salaries
WHERE to_date BETWEEN '2019-01-01' AND '2019-12-31'
+--------+--------+------------+------------+
| sal_no | salary | from_date | to_date |
+--------+--------+------------+------------+
| 9564 | 40000 | 2019-02-27 | 2019-03-27 |
...snip...
| 9564 | 45000 | 2019-08-27 | 2019-12-27 |
+--------+--------+------------+------------+
89 rows in set (0.00 sec)
This keeps the index, I mean less indexes for the engine. In addition, if we talk about the cost of maintenance, the cost of maintaining a functional index is higher than the usual.
On the other hand, the execution plan is less good (request cost is higher), and obviously, you should rewrite the request.
Requirements and limitations
The primary key cannot be a functional index:
mysql> CREATE TABLE t1 (i INT, PRIMARY KEY ((ABS(i)));
ERROR 3756 (HY000): The primary key cannot be a functional index.
You cannot index non-deterministic functions (RAND (), UNIX_TIMESTAMP (), NOW ()…).
mysql> CREATE TABLE t1 (i int, KEY ((RAND(i)));
ERROR 3758 (HY000): Expression of functional index 'functional_index' contains a disallowed function.
The SPATIAL and FULLTEXT indexes cannot have functional key parts.
Conclusion
Functional index is an interesting and relevant function that can be very useful for optimizing your requests without overwriting them, especially when working with JSON documents and other complex types.
Obviously, all the details you should know are in MySQL documentation: Functional key parts
If you are interested in high level architecture and low level design, please read the workload.
Learning MySQL; Using Indexes
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of clouds, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.
Are you interested in writing for Enteros’ Blog? Please send us a pitch!
RELATED POSTS
Revolutionizing Cost Allocation and Attribution in Real Estate with Enteros: Optimizing Database Performance for Better Financial Insights
- 24 December 2024
- Database Performance Management
In the fast-evolving world of finance, where banking and insurance sectors rely on massive data streams for real-time decisions, efficient anomaly man…
Enteros: Streamlining Root Cause Analysis and Shared Cost Optimization with Cloud FinOps in the Public Sector
In the fast-evolving world of finance, where banking and insurance sectors rely on massive data streams for real-time decisions, efficient anomaly man…
Optimizing Database Performance with Enteros: Leveraging Cloud FinOps and Observability for the Financial Sector
- 23 December 2024
- Database Performance Management
In the fast-evolving world of finance, where banking and insurance sectors rely on massive data streams for real-time decisions, efficient anomaly man…
Enteros: Enhancing Database Security Cost Management with RevOps and AIOps for the Insurance Sector
In the fast-evolving world of finance, where banking and insurance sectors rely on massive data streams for real-time decisions, efficient anomaly man…