意见箱
恒创运营部门将仔细参阅您的意见和建议,必要时将通过预留邮箱与您保持联络。感谢您的支持!
意见/建议
提交建议

2022-07-20 mysql-hashJoin说明

来源:恒创科技 编辑:恒创科技编辑部
2023-12-21 16:38:59


abstract:

hash join describe

hash join:

Hash Join


2022-07-20 mysql-hashJoin说明

The hash join algorithm is a very recent addition to MySQL and is supported in MySQL

8.0.18 and later. It marks a significant break with the tradition of nested loop joins

including the block nested loop variant. It is particularly useful for large joins without

indexes but can in some cases even outperform an index join.

MySQL implements a hybrid between a classic in-memory hash join and the on-disk

GRACE hash join algorithm. 2 If it is possible to store all the hashes in memory, then the

pure in-memory implementation is used. The join buffer is used for the in-memory part,

so the amount of memory that can be used for the hashes is limited by join_buffer_

size . When the join does not fit in memory, the join spills over to disk, but the actual

join operations are still performed in memory.




The in-memory hash join algorithm consists of two steps:

1. One of the tables in the join is chosen as the build table . The hash

is calculated for the columns required for the join and loaded into

memory. This is known as the build phase .

2. The other table in the join is the probe input . For this table, rows

are read one at a time, and the hash is calculated. Then a hash

key lookup is performed on the hashes calculated from the build

table, and the result of the join is generated from the matching

rows. This is known as the probe phase .




When the hashes of the build table do not fit into memory, MySQL automatically

switches to use the on-disk implementation (based on the GRACE hash join algorithm).

The switch from the in-memory to the on-disk algorithm happens if the join buffer

becomes full during the build phase. The on-disk algorithm consists of three steps:

1. Calculate the hashes of all rows in both the build and probe tables

and store them on disk in several small files partitioned by the

hash. The number of partitions is chosen to make each partition

of the probe table fit into the join buffer but with a limit of at most

128 partitions.

2. Load the first partition of the build table into memory and iterate

over the hashes from the probe table in the same way as for the

probe phase for the in-memory algorithm. Since the partitioning

in step 1 uses the same hash function for both the build and probe

tables, it is only necessary to iterate over the first partition of the

probe table.

3. Clear the in-memory buffer and continue with the rest of the

partitions one by one.

Both the in-memory and on-disk algorithms use the xxHash64 hash function which

is known as being fast while still providing hashes of good quality (reducing the number

of hash collisions). For the optimal performance, the join buffer needs to be large

enough to fit all the hashes from the build table. That said, the same considerations for

join_buffer_size exist for hash joins as for block nested loop joins.





MySQL will use the hash join whenever the block nested loop would otherwise be

chosen, and the hash join algorithm is supported for the query. At the time of writing,

the following requirements exist for the hash join algorithm to be used:

• The join must be an inner join.

• The join cannot be performed using an index, either because there is

no available index or because the indexes have been disabled for the

query.

• All joins in the query must have at least one equi-join condition

between the two tables in the join, and only columns from the two

tables as well as constants are referenced in the condition.

• As of 8.0.20, anti, semi, and outer joins are also supported. 3 If you join

the two tables t1 and t2 , then examples of join conditions that are

supported for hash join include

• t1.t1_val = t2.t2_val

• t1.t1_val = t2.t2_val + 2

• t1.t1_val1 = t2.t2_val AND t1.t1_val2 > 100

• MONTH(t1.t1_val) = MONTH(t2.t2_val)

If you consider the recurring example query for this section, you can execute it using

a hash join by ignoring the indexes on the tables that can be used for the join:

SELECT CountryCode, country.Name AS Country,

city.Name AS City, city.District

FROM world.country IGNORE INDEX (Primary)

INNER JOIN world.city IGNORE INDEX (CountryCode)

ON city.CountryCode = country.Code

WHERE Continent = 'Asia';

The pseudo code for performing this join is similar to that of a block nested loop

except that the columns needed for the join are hashed and that there is support for

overflowing to disk.





result = []
join_buffer = []
partitions = 0
on_disk = False
for country_row in country:
if country_row.Continent == 'Asia':
hash = xxHash64(country_row.Code)
if not on_disk:
join_buffer.append(hash)
if is_full(join_buffer):
# Create partitions on disk
on_disk = True
partitions = write_buffer_to_disk(join_buffer)
join_buffer = []
else
write_hash_to_disk(hash)
if not on_disk:
for city_row in city:
hash = xxHash64(city_row.CountryCode)
if hash in join_buffer:
country_row = get_row(hash)
city_row = get_row(hash)
result.append(join_rows(country_row, city_row))
else:
for city_row in city:
hash = xxHash64(city_row.CountryCode)
write_hash_to_disk(hash)
for partition in range(partitions):
join_buffer = load_build_from_disk(partition)
for hash in load_hash_from_disk(partition):
if hash in join_buffer:
country_row = get_row(hash)
city_row = get_row(hash)
result.append(join_rows(country_row, city_row))
join_buffer = []



The pseudo code starts out reading the rows from the country table and calculates

the hash for the Code column and stores it in the join buffer. If the buffer becomes full,

then the code switches to the on-disk algorithm and writes out the hashes from the

buffer. This is also where the number of partitions is determined. After this the rest of the

country table is hashed.

In the next part, for the in-memory algorithm, there is a simple loop over the rows in

the city table comparing the hashes to those in the buffer. For the on-disk algorithm, the

hashes of the city table are first calculated and stored on disk; then the partitions are

handled one by one.





2022-07-20 mysql-hashJoin说明_查询优化器


The values of the Code column for the matching rows from the country table are

hashed and stored in the join buffer. Then a table scan is executed for the city table with

the hash calculated of CountryCode for each row, and the result is constructed from the

matching rows.

You can see the query performs very well with the hash join examining the same

number of rows as the block nested loop, but it is faster than an index join. This is not a

mistake: in some cases, a hash join can outperform even an index join. You can use the

following rules to estimate how the hash join algorithm will perform compared to index

and block nested loop joins:

• For a join without using an index, the hash join will usually be much

faster than a block nested join unless a LIMIT clause has been added.

Improvements of more than a factor of 1000 have been observed. 4

• For a join without an index where there is a LIMIT clause, a block

nested loop can exit when enough rows have been found, whereas

a hash join will complete the entire join (but can skip fetching the

rows). If the number of rows included due to the LIMIT clause is small

compared to the total number of rows found by the join, a block

nested loop may be faster.

• For joins supporting an index, the hash join algorithm can be faster if

the index has a low selectivity.

The biggest benefit using the hash joins is by far for joins without an index and

without a LIMIT clause. In the end, only testing can prove which join strategy is the

optimal for your queries.

You can enable and disable support for hash joins using the hash_join optimizer

switch. Additionally, the block_nested_loop optimizer switch must be enabled. Both are

enabled by default. If you want to configure the use of hash joins for specific joins, you

can use the HASH_JOIN() and NO_HASH_JOIN() optimizer hints.

That concludes the discussion about the three high-level join strategies supported in

MySQL.There are some lower-level optimizations as well that are worth considering.

上一篇: 2022-08-01 mysql/stoonedb慢SQL-Q18分析 下一篇: btree索引无法使用处理方式