How to determine the leading prefix part of a MySQL index

In MySQL, an index can be limited to a leading prefix of a column's values. A prefix length is optional for CHAR, VARCHAR, BINARY, and VARBINARY columns, while for BLOB and TEXT columns a prefix length must be given.

For example, the statement shown here creates an index using the first 10 characters of the mytext column:

CREATE INDEX part_of_mytext ON mytable (mytext(10));

The big question before creating a new index is: how long should the leading prefix be?
The answer depends on the data. Choose the shortest length at which the prefixes are nearly as unique as the complete column values.

Use this query to compare the number of distinct prefixes at several lengths with the number of distinct complete values (COUNT(DISTINCT mytext)):

SELECT
    COUNT(DISTINCT mytext)           AS distinct_full,
    COUNT(DISTINCT LEFT(mytext, 5))  AS distinct_5,
    COUNT(DISTINCT LEFT(mytext, 10)) AS distinct_10,
    COUNT(DISTINCT LEFT(mytext, 20)) AS distinct_20,
    COUNT(DISTINCT LEFT(mytext, 40)) AS distinct_40,
    COUNT(DISTINCT LEFT(mytext, 60)) AS distinct_60,
    COUNT(DISTINCT LEFT(mytext, 80)) AS distinct_80
FROM
    mytable;
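
The raw counts can be harder to compare than ratios. As a rough sketch, reusing the same mytable and mytext names, the following variation divides each prefix count by the full-column count; the ratio_* aliases are only illustrative names, not anything MySQL defines:

```sql
-- Share of the full column's distinct values that each prefix length keeps.
-- The ratio_* aliases are illustrative names chosen for this example.
-- COUNT(DISTINCT ...) ignores NULLs; on an empty table the ratios are NULL.
SELECT
    COUNT(DISTINCT LEFT(mytext, 5))  / COUNT(DISTINCT mytext) AS ratio_5,
    COUNT(DISTINCT LEFT(mytext, 10)) / COUNT(DISTINCT mytext) AS ratio_10,
    COUNT(DISTINCT LEFT(mytext, 20)) / COUNT(DISTINCT mytext) AS ratio_20,
    COUNT(DISTINCT LEFT(mytext, 40)) / COUNT(DISTINCT mytext) AS ratio_40
FROM
    mytable;
```

A ratio close to 1 means the prefix is almost as selective as the full column, so the shortest length whose ratio stops improving noticeably is a reasonable candidate for the index prefix.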

Notes:
1. Prefix lengths are given in characters for nonbinary string types (CHAR, VARCHAR, and TEXT) and in bytes for binary string types (BINARY, VARBINARY, and BLOB).
2. Using column prefixes for indexes can make the index file much smaller, which could save a lot of disk space and might also speed up INSERT operations.
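
To check the space saving from note 2 on a real table, one possible approach is to read the index sizes from InnoDB's statistics table. This is only a sketch: it assumes the table uses InnoDB with persistent statistics enabled, that your account can read the mysql schema, and that the table is the mytable from the examples above. The approx_size_bytes alias is an illustrative name.

```sql
-- Refresh the statistics so the sizes reflect the current indexes.
ANALYZE TABLE mytable;

-- For stat_name = 'size', stat_value is the number of pages in the index,
-- so multiplying by the page size gives an approximate size in bytes.
SELECT
    index_name,
    stat_value * @@innodb_page_size AS approx_size_bytes
FROM
    mysql.innodb_index_stats
WHERE
    database_name = DATABASE()
    AND table_name = 'mytable'
    AND stat_name = 'size';
```

Running this once for each candidate prefix length shows how much space each one actually uses.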
