Skip to content

[#21878] extensive test for multi-dictionary column group bys#22888

Open
Rich-T-kid wants to merge 3 commits into
apache:mainfrom
Rich-T-kid:rich-T-kid/multi-column-dictionary-test
Open

[#21878] extensive test for multi-dictionary column group bys#22888
Rich-T-kid wants to merge 3 commits into
apache:mainfrom
Rich-T-kid:rich-T-kid/multi-column-dictionary-test

Conversation

@Rich-T-kid

@Rich-T-kid Rich-T-kid commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Which issue does this PR close?

works towards closing #21878. But also adds test coverage for existing code that wasnt there before. There were no test for dictionaries specifically for groupValueRows

Rationale for this change

There was a lack of test that focused on dictionary arrays in GroupValueRows. I also would like to keep the test separate from the implementation (for #21878) so that its easier to review.

What changes are included in this PR?

  • Adds a dedicated test module (multi_group_by/dictionary.rs) covering correctness of multi-column dictionary GROUP BY using the GroupValuesRows path
  • Tests are organized into three modules: general_test (uniqueness, null semantics, multi-type composition), dictionary_encoding (key-index shuffling, bloated/non-normalized dictionaries, mixed key types), and streaming (EmitTo::First(n) renumbering and EmitTo::All reingest)
  • Adds two SQL logic tests to dictionary.slt covering multi-column dictionary GROUP BY with mixed key/value types (Dict<Int8,Utf8>, Dict<Int32,LargeUtf8>, Dict<Int8,Utf8View>). these are the first SLT tests where the grouping keys themselves are dictionary columns

Are these changes tested?

The changes are test 😆

Are there any user-facing changes?

No

@github-actions github-actions Bot added the physical-plan Changes to the physical-plan crate label Jun 10, 2026
@github-actions github-actions Bot added the sqllogictest SQL Logic Tests (.slt) label Jun 10, 2026
@Rich-T-kid Rich-T-kid force-pushed the rich-T-kid/multi-column-dictionary-test branch from 777a7d5 to a6c003c Compare June 10, 2026 21:00
@Rich-T-kid Rich-T-kid marked this pull request as ready for review June 10, 2026 21:00
@Rich-T-kid Rich-T-kid changed the title [Draft][#21878] extensive test for multi-dictionary column group bys [#21878] extensive test for multi-dictionary column group bys Jun 10, 2026
@Rich-T-kid

Copy link
Copy Markdown
Contributor Author

I tried to make each test function as declarative as possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

physical-plan Changes to the physical-plan crate sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant