Five ways to improve MySQL select speed (Part 1)

This is the second article in the MySQL optimization series. In this article, I’m going to cover how to speed up select speed and improve query performance and improve MySQL performance.

Some of the concepts here will be similar to the ones in the previous article, so we will summarize them and not go over them again.

This article covers only optimizing InnoDB. Although some of the suggestions here will work for MyISAM, this article assumes InnoDB usage.

Server and hardware selection

Measuring performance

Measuring MySQL performance was also covered in the previous article. Another option is to see the time reported by MySQL to perform each query, but you need to ensure that the results were not served from the query cache.

Another possible caching related “issue” that may affect the measurement is OS cache. If there’s enough memory, it can cache MySQL raw files and skew the measurement.

Create an index

An index is the heart of every SQL based database, create the right one, and performance will be good.

If you are a MySQL professional, you can skip this part, as you are probably aware of what an Index is and how it is used.

Full table scan

To understand why indexes are needed, we need to know how MySQL gets the data.

Let’s say we have a simple table:

We want to get the age of John, so we make a simple selection:

That would work great as long as we have a small amount of data, but the speed will decrease when our table grows.

The reason is that MySQL will need to scan every row in the table and compare Name to ‘John.’ It will only exit once it found it, or it didn’t find anything.

Checking every row in the table is called a full table scan, which has zero performance. Think of a table with 1,000,000,000 rows (we have some of those in our data store), and every MySQL query will have to scan the entirety of the data. That would be crazy slow.

For example, we have a table with 300 million keywords. For debugging purposes, we need to do a full table scan once in a while. It takes about 5 minutes to get our data.

Creating the index

Indexes allow for fast retrieval of data. Just like the index in a phone book (if you remember those), you’d search the Name, and it will point you to the page where that Name is.

There are several types of indexes in MySQL, and each has its use case and performance. It’s essential to choose the right one for what we need.

MySQL index types:

Every table should have a primary key (unless you know what you are doing). The primary key can span multiple columns, which guarantees that the data is unique.

So in our example, if we put a primary key on the column Name, we can only have one name ‘John,’ but we can also have ‘Beth’ and ‘Karen.’

The advantage of a primary key is that MySQL sorts the data in a way that matches the key, and it’s the fastest key in terms of select performance.

Spatial index

This index is meant for geometric values, and it’s used for map and GIS-related purposes. I have never worked in that area and never used that index.

Unique index

This index is just like the primary key in that it guarantees that the columns that are part of the index will be unique, but unlike the primary key, managing this index takes space and memory.

For large indexes, you may need a big enough memory to handle any operations. I once used a 16GB instance, and the index had 200m rows; MySQL crashed when I tried to do a copy from one table to another.

Regular index

The regular index allows for multiple values with the same value; it’s managed just like the unique index.

Full text

This index allows us to index substrings inside a string column. This is useful for wildcard string searches.

Descending indexes

These are regular indexes, but their sort order is reversed. This is supported by MySQL 8 (which you should use anyway) and is useful to process new data first.

MySQL table schema

In the next paragraphs, we are going to discuss a table schema, so here it is:

Our test table is populated with 300 million records.

Checking how your selects are behaving

Adding indexes and primary keys doesn’t mean that MySQL will use them, MySQL has its own logic of how to use indexes for a given query. For queries that are slow or are running many times, it’s a good practice to check what happens under the hood.

The way you do it is with the command ‘explain.’ You put it before your SQL command, and MySQL lets you know the execution plans for the query.

Explain on a primary key

For example, let’s make a simple selection on a table:

SearchID is the primary key; the reply we get is:

As we can see, MySQL uses the primary key to do the select and will only make use of one row.

Explain without primary key

Now let’s do a select on a row without a primary key:

The column results have no index or primary key, and the result is:

A full table scan would be very slow.

Explain for index

Our next process is to do a select on a row with an index:

And the result is:

MySQL will process 49082 rows, but let’s see how many rows we actually have:

And the result is:

There are fewer rows than what the explanation showed. This can be due to how MySQL manages indexes, but it’s still fast enough and not a full table scan.

Checking the where clause

The where clause is one of the most crucial parts of every select query because that’s the purpose of a database: get specific data and fast.

This means that you need to make sure that the columns checked with the where clause are indexed or are small so that a scan will be fast.

Good index selection of complex where clause

The ‘explain’ paragraph covers a few examples with an index in the where clause, but those were straightforward examples. Let’s show another example where we use both a primary key and an index:

The result is:

This is good; MySQL used both indexes.

No index selection with calculation on a column

Let’s try to mess it up:

In this example, we are doing a calculation on the column with the primary key; look what MySQL will do:

It performed a full table scan, even though we have a primary key! The way to solve it is to add a precalculated field, index it, and use that index field in the where clause.

The wrong type of index

Let’s do a wildcard select on an indexed column:

We are searching for any keyword that has the letter ‘a’ inside, and MySQL does this:

For a full table scan, in this case, we need to add a full-text index to the column ‘Keyword.’

Order by and index

Here we do an order by using our primary key, which is in the where clause as well:

The result is:

Which means MySQL used the primary key, now let’s use the index as the sort factor:

This time we have a little surprise:

MySQL didn’t use our index; it sorted the data manually, which will slow down our query.

No index on table join

When doing a join between one or more tables (all type of joins: inner join, left join, right join, etc.), make sure that the join column you join on is indexed. Otherwise, MySQL will need to do a full table scan to do a proper join.

How can you know if this is the case? Use EXPLAIN. You should be an expert on using it by now. 😊

Check your MySQL configuration

MySQL out of the box configuration is modest and designed to support operations for web sites like WordPress.

To achieve the performance you’d expect from a modern database, you will need to tweak the default configuration.

The settings reside in the file my.cnf. In case of centos, it’s under /etc/my.cnf

innodb_buffer_pool_size

Personally, I believe this is the most critical MySQL setting, and it sets the amount of memory MySQL uses to cache table and index data. The settings should be 75% of memory because the size is not set in stone, and MySQL may be allocated 10% more of the set size.

Also, you may have other applications running, so you need to reserve memory for them.

I helped a friend (the same one I wrote about), and the first thing I checked even before the table structure is the memory allocated to MySQL. It was on default, which was 2GB on his machine. Changing it to 48GB increased performance drastically.

innodb-buffer-pool-instances

The number of instances the buffer pool is divided into, this is to allow for better concurrency with threads. It’s only relevant if the buffer pool is over 1.3GB.

On Windows 32bit, the default is the buffer pool size divided by 128MB. On all other systems, the default is 8.

With the new AMD processors, you can have 64 or 128 cores, and you can experiment with this value to see if you get better performance.

join_buffer_size

This setting tells MySQL how much memory to allocate to the join buffer, which is used while creating joins.

If the join data is too large, MySQL will resort to using a temporary table on the hard drive.

The default value is 256KB; keep in mind that increasing this value will increase the value for every MySQL thread.

This value is also relatively small because joins should be small, so before raising this value, check if the design is correct, and maybe a design change is a better option.

Another option is to “hint” the optimizer before a query with a large join.

In this guide we started to cover some aspects of how to optimize MySQL query speed and performance, in the next guide we will cover more topic like Table optimization, memory settings and more.

0
0
vote

Five ways to improve MySQL select speed (Part 1) – SEO Explorer’s Blog