[WIP] Tests for debugging Dart and MSQ in Quidem #17861

Open
wants to merge 3 commits into
base: master

weishiuntsai
Contributor

Add 2 tests for debugging Dart and MSQ in Quidem. Do not merge.

Add 2 tests for debugging Dart and Msq in Quidem.
@kgyrtkirk
Member

Downloaded the heap dump. The issue was caused by huge exceptions, about 5M each; surefire tried to hold all of them in a map, which led to an OOM.

The underlying exception is a diff-related exception, but it contains the following exception(s):

> WHERE time_floor(__time, 'PT1H') BETWEEN timestamp '2019-08-25 00:00:00' AND timestamp '2019-08-25 06:00:00'": Remote driver error: QueryInterruptedException: Leaks happened, each suppressed exception represents one code path that checked out an object and didn't return it. -> RuntimeException: Leaks happened, each suppressed exception represents one code path that checked out an object and didn't return it.
>       at org.apache.calcite.avatica.Helper.createException(Helper.java:54)
>       at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
[...]
>       Suppressed: org.apache.druid.collections.StupidPool$LeakedException: Originally checked out by thread [Test-runner-processing-pool-msq-worker[00000000-0000-0000-0000-000000000000_2_0]]
>               at org.apache.druid.collections.StupidPool.take(StupidPool.java:162)
>               at org.apache.druid.segment.CompressedPools.getByteBuf(CompressedPools.java:110)
>               at org.apache.druid.segment.data.DecompressingByteBufferObjectStrategy.fromByteBuffer(DecompressingByteBufferObjectStrategy.java:70)
>               at org.apache.druid.segment.data.DecompressingByteBufferObjectStrategy.fromByteBuffer(DecompressingByteBufferObjectStrategy.java:30)
>               at org.apache.druid.segment.data.GenericIndexed$BufferIndexed.get(GenericIndexed.java:598)
>               at org.apache.druid.segment.data.BlockLayoutColumnarLongsSupplier$1.loadBuffer(BlockLayoutColumnarLongsSupplier.java:103)
>               at org.apache.druid.segment.data.BlockLayoutColumnarLongsSupplier$1.get(BlockLayoutColumnarLongsSupplier.java:90)
>               at org.apache.druid.segment.column.LongsColumn.getLongSingleValueRow(LongsColumn.java:77)
>               at org.apache.druid.segment.QueryableIndexCursorHolder.asCursor(QueryableIndexCursorHolder.java:185)
>               at org.apache.druid.msq.querykit.scan.ScanQueryFrameProcessor.runWithSegment(ScanQueryFrameProcessor.java:271)
>               at org.apache.druid.msq.querykit.BaseLeafFrameProcessor.runIncrementally(BaseLeafFrameProcessor.java:88)
>               at org.apache.druid.msq.querykit.scan.ScanQueryFrameProcessor.runIncrementally(ScanQueryFrameProcessor.java:157)
>               at org.apache.druid.msq.counters.CpuTimeAccumulatingFrameProcessor.runIncrementally(CpuTimeAccumulatingFrameProcessor.java:66)
>               at org.apache.druid.frame.processor.FrameProcessors$1FrameProcessorWithBaggage.runIncrementally(FrameProcessors.java:72)
>               at org.apache.druid.frame.processor.FrameProcessorExecutor$1ExecutorRunnable.runProcessorNow(FrameProcessorExecutor.java:239)
>               at org.apache.druid.frame.processor.FrameProcessorExecutor$1ExecutorRunnable.run(FrameProcessorExecutor.java:141)
>               at org.apache.druid.msq.exec.WorkerImpl$2$2.run(WorkerImpl.java:900)
>               ... 3 more
[this repeats around 40 times]
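
To illustrate why roughly 40 suppressed stack traces per failure add up so quickly, here is a minimal sketch of a helper that reports a throwable in bounded form: it truncates the message and counts suppressed exceptions by class instead of rendering each trace. The class `SuppressedSummary` and its `summarize` method are hypothetical names made up for this example; nothing like this is confirmed to exist in surefire or Druid.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Druid or surefire.
public class SuppressedSummary {

    /** Builds a short, bounded description of a throwable and its suppressed exceptions. */
    public static String summarize(Throwable t, int maxMessageChars) {
        String message = String.valueOf(t.getMessage());
        if (message.length() > maxMessageChars) {
            // Cut long messages so a single failure cannot grow without bound.
            message = message.substring(0, maxMessageChars) + "...";
        }

        // Count suppressed exceptions per class rather than printing every stack trace.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Throwable suppressed : t.getSuppressed()) {
            counts.merge(suppressed.getClass().getName(), 1, Integer::sum);
        }

        StringBuilder sb = new StringBuilder();
        sb.append(t.getClass().getName()).append(": ").append(message);
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            sb.append("\n  suppressed ").append(e.getValue()).append(" x ").append(e.getKey());
        }
        return sb.toString();
    }
}
```

With a summary like this, 40 `StupidPool$LeakedException` entries would collapse into a single line with a count, at the cost of losing the individual checkout stack traces.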

@kgyrtkirk
Member

kgyrtkirk commented Apr 11, 2025

Running the same failing query multiple times leads to the same issue:

tt.iq
!set dartQueryId 00000000-0000-0000-0000-000000000000
!set useApproximateCountDistinct false
!use druidtest://?componentSupplier=DartComponentSupplier&datasets=sql/src/test/quidem/qatests/qaArray/sql&numMergeBuffers=3
!set outputformat mysql


# TESTCASE: test_subquery_with_where TEST_ID: A2_B46_C5 TYPE: NEGATIVE TEST
SELECT a_int
FROM
  (SELECT *
   FROM test_array)
WHERE a_int NOT IN
    (SELECT a_int
     FROM test_array);
ARRAY
!error

# TESTCASE: test_subquery_with_where TEST_ID: A2_B46_C5 TYPE: NEGATIVE TEST
SELECT a_int
FROM
  (SELECT *
   FROM test_array)
WHERE a_int NOT IN
    (SELECT a_int
     FROM test_array);
ARRAY
!error


# TESTCASE: test_subquery_with_where TEST_ID: A2_B46_C5 TYPE: NEGATIVE TEST
SELECT a_int
FROM
  (SELECT *
   FROM test_array)
WHERE a_int NOT IN
    (SELECT a_int
     FROM test_array);
ARRAY
!error

# TESTCASE: test_subquery_with_where TEST_ID: A2_B46_C5 TYPE: NEGATIVE TEST
SELECT a_int
FROM
  (SELECT *
   FROM test_array)
WHERE a_int NOT IN
    (SELECT a_int
     FROM test_array);
ARRAY
!error

# TESTCASE: test_subquery_with_where TEST_ID: A2_B46_C5 TYPE: NEGATIVE TEST
SELECT a_int
FROM
  (SELECT *
   FROM test_array)
WHERE a_int NOT IN
    (SELECT a_int
     FROM test_array);
ARRAY
!error

# TESTCASE: test_subquery_with_where TEST_ID: A2_B46_C5 TYPE: NEGATIVE TEST
SELECT a_int
FROM
  (SELECT *
   FROM test_array)
WHERE a_int NOT IN
    (SELECT a_int
     FROM test_array);
ARRAY
!error

# TESTCASE: test_subquery_with_where TEST_ID: A2_B46_C5 TYPE: NEGATIVE TEST
SELECT a_int
FROM
  (SELECT *
   FROM test_array)
WHERE a_int NOT IN
    (SELECT a_int
     FROM test_array);
ARRAY
!error

# TESTCASE: test_subquery_with_where TEST_ID: A2_B46_C5 TYPE: NEGATIVE TEST
SELECT a_int
FROM
  (SELECT *
   FROM test_array)
WHERE a_int NOT IN
    (SELECT a_int
     FROM test_array);
ARRAY
!error

# TESTCASE: test_subquery_with_where TEST_ID: A2_B46_C5 TYPE: NEGATIVE TEST
SELECT a_int
FROM
  (SELECT *
   FROM test_array)
WHERE a_int NOT IN
    (SELECT a_int
     FROM test_array);
ARRAY
!error

# TESTCASE: test_subquery_with_where TEST_ID: A2_B46_C5 TYPE: NEGATIVE TEST
SELECT a_int
FROM
  (SELECT *
   FROM test_array)
WHERE a_int NOT IN
    (SELECT a_int
     FROM test_array);
ARRAY
!error

# TESTCASE: test_subquery_with_where TEST_ID: A2_B46_C5 TYPE: NEGATIVE TEST
SELECT a_int
FROM
  (SELECT *
   FROM test_array)
WHERE a_int NOT IN
    (SELECT a_int
     FROM test_array);
ARRAY
!error

edit: copied the wrong test file the first time; the one above should be the right one
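
Since the reproduction depends on repeating the same failing case many times, a small generator can help vary the repetition count while narrowing down the problem. This is only a sketch: `RepeatedCaseGenerator` and `build` are hypothetical names, and the header and test case text are passed in by the caller rather than assumed.

```java
// Hypothetical helper for producing a repro file; not part of the Druid test tooling.
public class RepeatedCaseGenerator {

    /**
     * Returns the header followed by the given test case repeated the requested
     * number of times, with each copy preceded by a blank line.
     */
    public static String build(String header, String testCase, int repetitions) {
        StringBuilder sb = new StringBuilder(header);
        for (int i = 0; i < repetitions; i++) {
            sb.append('\n').append(testCase).append('\n');
        }
        return sb.toString();
    }
}
```

The result can be written to a file such as tt.iq with `java.nio.file.Files.writeString`.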

@kgyrtkirk
Member

Furthermore, I forgot to mention that the same issue didn't appear in my IDE, only in a Maven run, which is also odd.

Even connecting a remote debugger (-Dmaven.surefire.debug) makes the problem disappear: the test just passes.

This PR splits the iq files into smaller files at the testcase level. The files are grouped into subdirectories at the testsuite level.

This version also makes the tests run for all 3 engines. The file name of an iq file indicates which engine(s) the test uses:

If all 3 engines expect the same results:
*.all.iq: AllDruidEnginesComponentSupplier

If one of the engines expects different results:
*.std.iq: StandardComponentSupplier
*.dart.iq: DartComponentSupplier
*.msq.iq: StandardMSQComponentSupplier
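
The naming convention above can be sketched as a simple suffix lookup. This is an illustration only: the class `IqFileEngines` and the method `supplierFor` are hypothetical, and the supplier names are returned as plain strings taken from the list above rather than resolved to the actual classes.

```java
import java.util.Map;

// Hypothetical illustration of the file naming convention; not code from this PR.
public class IqFileEngines {

    // File name suffix -> component supplier name, as listed in the description.
    private static final Map<String, String> SUFFIX_TO_SUPPLIER = Map.of(
        ".all.iq", "AllDruidEnginesComponentSupplier",
        ".std.iq", "StandardComponentSupplier",
        ".dart.iq", "DartComponentSupplier",
        ".msq.iq", "StandardMSQComponentSupplier");

    /** Returns the supplier name implied by the file name, or throws if the suffix is unknown. */
    public static String supplierFor(String fileName) {
        for (Map.Entry<String, String> e : SUFFIX_TO_SUPPLIER.entrySet()) {
            if (fileName.endsWith(e.getKey())) {
                return e.getValue();
            }
        }
        throw new IllegalArgumentException("Unrecognized iq file name: " + fileName);
    }
}
```

For example, `IqFileEngines.supplierFor("test001.dart.iq")` yields `DartComponentSupplier`, while a plain `test001.iq` is rejected because it carries no engine suffix.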
@gianm
Contributor

gianm commented Apr 15, 2025

I think #17915 might fix the issue you are seeing.

2)
OR s_int IS NULL)
AND (ip4_match(c, c) IS NOT NULL);
No match found
Member

What does No match found mean?
Since it is supposed to be part of an exception, could we have more details about the problem?

!ok

#-------------------------------------------------------------------------
# TESTCASE: test001 TEST_ID: A02 TYPE: POSITIVE TEST
Member

I think this TYPE: part is redundant: it has no real value, as the test already has an expectation like !ok, which documents the expected outcome.

!ok

#-------------------------------------------------------------------------
# Total query count 283 Positive tests: 135 Negative tests: 148
Member

Can we have more cases and fewer queries per file?

SELECT c
FROM test_unnest,
unnest(s_int) AS u(c);
Cannot apply
Member

What does Cannot apply mean here? It should be clearer why this doesn't work.

!ok

#-------------------------------------------------------------------------
# Total query count 43 Positive tests: 37 Negative tests: 6
Member

I don't think these headers are useful; instead, they could cause confusion:
what happens if tomorrow one more NEGATIVE test is supported? Should these numbers be adjusted as well?

Member

Large files should not arrive like this! See the other comment.

Comment on lines +7 to +9
"type" : "local",
"baseDir" : "sql/src/test/quidem/qatests/kttm_nested/data",
"filter" : "kttm-nested-v2-2019-08-25.json.gz"
Member

We should find a way to access the kttm-nested-v2-2019-08-25.json that is already accessible on the classpath, and that should be the way to get larger data files.

Contributor Author

What is the equivalent of !use druidtest:///?componentSupplier=KttmNestedComponentSupplier when using this dataset with AllDruidEnginesComponentSupplier, DartComponentSupplier, StandardMSQComponentSupplier, and StandardComponentSupplier? These tests originally used !use druidtest:///?componentSupplier=KttmNestedComponentSupplier. We need to make them run for all 3 engines, but I couldn't find a way to do so. That's why the dataset was added.

@weishiuntsai
Contributor Author

I think #17915 might fix the issue you are seeing.

Tried the latest code. It still OOMed.

3 participants