Improving search by compound indexes.#3915
Conversation
katzyn
left a comment
There was a problem hiding this comment.
Thank you for your contribution!
For the following table:
CREATE TABLE TEST(A INTEGER, B INTEGER, C INTEGER) AS
SELECT A, B, C FROM (VALUES 1, 2) T1(A) JOIN (VALUES 3, 4) T2(B) JOIN (VALUES 5, 6) T3(C);Index conditions aren't listed in execution plan any more, so something is wrong with them:
CREATE INDEX TEST_A_IDX ON TEST(A);
EXPLAIN SELECT * FROM TEST WHERE (A, B) IN ((1, 3), (2, 4));
was: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_A_IDX: A IN(1, 2) */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))
now: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_A_IDX */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))The same problem here:
DROP INDEX TEST_A_IDX;
CREATE INDEX TEST_B_IDX ON TEST(B);
EXPLAIN SELECT * FROM TEST WHERE (A, B) IN ((1, 3), (2, 4));
was: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_B_IDX: B IN(3, 4) */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))
now: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_B_IDX */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))And here
CREATE INDEX TEST_C_A_IDX ON TEST(C, A);
EXPLAIN SELECT * FROM TEST WHERE (A, B) IN ((1, 3), (2, 4));
was: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_B_IDX: B IN(3, 4) */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))
now: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_B_IDX */
WHERE ROW ("A", "B") IN(ROW (1, 3), ROW (2, 4))Here they are listed with a wrong order of values, it must be (5, 1), (6, 2)):
EXPLAIN SELECT * FROM TEST WHERE (A, C) IN ((1, 5), (2, 6));
was: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_C_A_IDX: A IN(1, 2) AND C IN(5, 6) */
WHERE ROW ("A", "C") IN(ROW (1, 5), ROW (2, 6))
now: SELECT "PUBLIC"."TEST"."A", "PUBLIC"."TEST"."B", "PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST" /* PUBLIC.TEST_C_A_IDX: IN(ROW (1, 5), ROW (2, 6)) */
WHERE ROW ("A", "C") IN(ROW (1, 5), ROW (2, 6))There problems need to be resolved and I think we need more tests for these conditions.
| * Contains a {@link Column} or {@code Column[]} depending on the condition type. | ||
| * @see #isCompoundColumns() | ||
| */ | ||
| private final Object column; |
There was a problem hiding this comment.
It will be better to use different fields here.
| */ | ||
| private IndexCondition(int compareType, ExpressionList columns, Expression expression) { | ||
| this.compareType = compareType; | ||
| if (columns == null) |
There was a problem hiding this comment.
Please, use this code style:
if (…) {
…
} else {
…
}| * | ||
| * @see Column#convert(CastDataProvider, Value) | ||
| */ | ||
| public ValueRow convert(CastDataProvider provider, Column[] columns) { |
There was a problem hiding this comment.
You need to move this code to some other place. Value classes may not depend on Column and other database objects.
There was a problem hiding this comment.
I created a static method in the Column class instead.
| return column; | ||
| if (column instanceof Column) | ||
| return (Column) column; | ||
| throw new IllegalStateException("The getColumn() method cannot be with multiple columns."); |
There was a problem hiding this comment.
If something goes horribly wrong, use throw DbException.getInternalError().
| } | ||
|
|
||
| private boolean canUseIndexForIn(Column[] columns) { | ||
| if ( inColumn != null ) { |
There was a problem hiding this comment.
(inColumn != null) (without spaces inside parentheses).
It must support them in absence of a better index, because we can't accept a PR with this regression. Actually your implementation still somehow chooses a proper index, but |
Hi @katzyn, Thank you for your feedback. I put back the previous I adjusted some tests to make them pass the new expectations. I need to fix the regular compound comparisons too (e.g: Please let me know if you have any suggestions. Thanks. |
|
Hi @katzyn, I hope I was able to solve every issue. I have run the tests, and it looks OK to me. Could you please review my pull request again? Thanks. |
| ExpressionVisitor visitor = ExpressionVisitor.getNotFromResolverVisitor(filter); | ||
| for (Expression e : valueList) { | ||
| if (!e.isEverything(visitor)) | ||
| return; |
There was a problem hiding this comment.
if (!e.isEverything(visitor)) {
return;
}| ExpressionVisitor visitor = ExpressionVisitor.getNotFromResolverVisitor(filter); | ||
| for (Expression e : valueList) { | ||
| if (!e.isEverything(visitor)) | ||
| return; |
| builder.append(" IN("); | ||
| for (int i = 0, s = expressionList.size(); i < s; i++) { | ||
| if (i > 0) | ||
| builder.append(", "); |
| Column[] columns = getColumns(); | ||
| for (int i = columns.length; --i >= 0; ) { | ||
| if (TableType.TABLE != columns[i].getTable().getTableType()) | ||
| return 0; |
| if (!isCompoundColumns()) { | ||
| builder.append("column=").append(column); | ||
| } else { | ||
| builder.append("columns=").append(columns); |
There was a problem hiding this comment.
Column[].toString() doesn't return anything useful here, use Column.writeColumns(builder, columns, TRACE_SQL_FLAGS) instead.
| for (int i = 0; i < cols.length; i++) { | ||
| IndexColumn idxCol = cols[i]; | ||
| if (idxCol != null && idxCol.column != columns[i]) | ||
| return false; |
h2/src/main/org/h2/table/Column.java
Outdated
| copy[i] = nv; | ||
| } | ||
| } | ||
| return copy == null ? valueRow : ValueRow.get(valueRow.getType(), copy); |
There was a problem hiding this comment.
I guess valueRow.getType() can return wrong data type information here, so TypeInfo.getTypeInfo(Value.ROW, 0, 0, new ExtTypeInfoRow(column)) should be used instead.
| Column col = columns[j]; | ||
| indexed = col.getColumnId() >= 0 && index.getColumnIndex(col) >= 0; | ||
| if (!indexed) | ||
| break; |
| if (col.getColumnId() >= 0) { | ||
| int columnIndex = index.getColumnIndex(col); | ||
| if (columnIndex == 0) // The first column of the index always matches. | ||
| continue; |
|
|
||
| EXPLAIN SELECT * FROM TEST2COL WHERE (A, B)=(0, 0); | ||
| >> SELECT "PUBLIC"."TEST2COL"."A", "PUBLIC"."TEST2COL"."B", "PUBLIC"."TEST2COL"."C" FROM "PUBLIC"."TEST2COL" /* PUBLIC.PRIMARY_KEY_E: A = 0 AND B = 0 */ WHERE ROW ("A", "B") = ROW (0, 0) | ||
|
|
There was a problem hiding this comment.
Please, don't add any new tests to this legacy script. You can add them to indexes.sql instead.
I think we need more tests including multi-column joins with other tables with both EXPLAIN and actual test of query results.
There was a problem hiding this comment.
Ok, I deleted the new lines.
I created a new test class: TestCompoundIndexSearch
|
Hi @katzyn, I fixed your previous findings. Could you please check the pull request again? Thanks. |
|
Something is wrong with execution plan in the following example: CREATE TABLE TEST(A INTEGER, B INTEGER, C INTEGER) AS
SELECT A, B, C FROM (VALUES 1, 2) T1(A) JOIN (VALUES 3, 4) T2(B) JOIN (VALUES 5, 6) T3(C);
CREATE INDEX TEST_C_A_IDX ON TEST(C, A);
EXPLAIN ANALYZE SELECT * FROM TEST WHERE (A, C) IN ((1, 5), (2, 6));SELECT
"PUBLIC"."TEST"."A",
"PUBLIC"."TEST"."B",
"PUBLIC"."TEST"."C"
FROM "PUBLIC"."TEST"
/* PUBLIC.TEST_C_A_IDX: IN(ROW (1, 5), ROW (2, 6))
AND C IN(5, 6)
*/
/* scanCount: 9 */
WHERE ROW ("A", "C") IN(ROW (1, 5), ROW (2, 6))Correct plan should look like that: /* PUBLIC.TEST_C_A_IDX: IN(ROW (5, 1), ROW (6, 2))
*/Columns need to be specified in correct order and
|
…sed in the right order.
|
Hi @katzyn, I fixed the index condition error although, I am not sure this is what you wanted to see. If the query uses the indexed columns in a wrong order, I re-create the index condition with a correct column order. It looks a bit ugly to me, but it seems to work. |
katzyn
left a comment
There was a problem hiding this comment.
Please, send a license statement as described here
https://h2database.com/html/build.html#providing_patches
to our mailing list (Google group)
https://groups.google.com/g/h2-database
(This group is partially pre-moderated, your post may not appear immediately.)
* Improving search by compound indexes. * Fixing review findings. * Fixing the index condition handling. * Adjusting test cases. * Fixing TableFilter.prepare(). * Adjusting test cases. * Refactoring the constructors of the IndexCondition class. * Removing unnecessary TODOs. * Fixing review findings. * Introducing the TestCompoundIndexSearch class. * Preparing IndexCondition to deal with queries where the columns not used in the right order. * Fixing test errors.
Hi,
As I can see, searching by compound indexes does not work properly.
I created table with some data inside to test the query execution. See: https://github.com/kiss034/h2database/blob/test/sql/company.sql
I tested with the following SQL:
Before my changes, the engine was able to find the matching compound index while preparing, but later it was deleted from the index conditions, so a full table scan was executed.
I prepared the IndexCondition and IndexCursor classes to deal with multiple columns. After these changes the indexed search was executed properly.
As I can see, the previous implementation used only the first indexed component of the compound IN condition. I do not know whether it was intentional. My implementation does not support single indexes in such queries.
Furthermore, currently the following test are fail.
Could you please let me know if my implementation or the test should be fixed? Thanks.
Best regards
János Áron Kiss