Support sharded Parquet file querying and conversion#7610

Open
SungJin1212 wants to merge 2 commits into
cortexproject:masterfrom
SungJin1212:parquet-shard

Conversation

@SungJin1212 SungJin1212 commented Jun 9, 2026

Member

This PR adds support for querying sharded Parquet files within a bucket store and enables sharded Parquet file conversion.

Benchmark Results

Currently, the concurrency is hard-coded to 4.

GOROOT=/usr/local/opt/go/libexec #gosetup
GOPATH=/Users/kakao_ent/go #gosetup
/usr/local/opt/go/libexec/bin/go test -c -tags=slicelabels -o /Users/kakao_ent/Library/Caches/JetBrains/GoLand2026.1/tmp/GoLand/___1BenchmarkParquetBucketStore_MultiShard_in_github_com_cortexproject_cortex_pkg_storegateway.test github.com/cortexproject/cortex/pkg/storegateway #gosetup
/Users/kakao_ent/Library/Caches/JetBrains/GoLand2026.1/tmp/GoLand/___1BenchmarkParquetBucketStore_MultiShard_in_github_com_cortexproject_cortex_pkg_storegateway.test -test.v -test.paniconexit0 -test.bench ^\QBenchmarkParquetBucketStore_MultiShard\E$ -test.run ^$ #gosetup
goos: darwin
goarch: amd64
pkg: github.com/cortexproject/cortex/pkg/storegateway
cpu: VirtualApple @ 2.50GHz
BenchmarkParquetBucketStore_MultiShard
BenchmarkParquetBucketStore_MultiShard/shards=1
BenchmarkParquetBucketStore_MultiShard/shards=1-14         	      72	  15630539 ns/op	36701543 B/op	  282624 allocs/op
BenchmarkParquetBucketStore_MultiShard/shards=2
BenchmarkParquetBucketStore_MultiShard/shards=2-14         	     100	  11494683 ns/op	38358405 B/op	  284007 allocs/op
BenchmarkParquetBucketStore_MultiShard/shards=4
BenchmarkParquetBucketStore_MultiShard/shards=4-14         	     100	  10774228 ns/op	38830028 B/op	  286728 allocs/op
BenchmarkParquetBucketStore_MultiShard/shards=8
BenchmarkParquetBucketStore_MultiShard/shards=8-14         	     100	  11819611 ns/op	38578193 B/op	  291999 allocs/op
PASS

Process finished with the exit code 0
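Reading the ns/op column above relative to shards=1, the speedups work out to roughly 1.36x for 2 shards, 1.45x for 4 shards, and 1.32x for 8 shards, while B/op and allocs/op stay within a few percent. These ratios come from the single run pasted here, so treat them as indicative only. A small sketch of the arithmetic (the `benchResult` type and `speedups` helper are illustrative names, not part of the PR):

```go
package main

import "fmt"

// benchResult holds one row of the benchmark output above.
type benchResult struct {
	shards  int
	nsPerOp float64
}

// reported copies the ns/op values from the run pasted in this PR.
var reported = []benchResult{
	{shards: 1, nsPerOp: 15630539},
	{shards: 2, nsPerOp: 11494683},
	{shards: 4, nsPerOp: 10774228},
	{shards: 8, nsPerOp: 11819611},
}

// speedups returns baseline ns/op divided by each row's ns/op,
// keyed by shard count. The first row is treated as the baseline.
func speedups(results []benchResult) map[int]float64 {
	out := make(map[int]float64, len(results))
	if len(results) == 0 {
		return out
	}
	base := results[0].nsPerOp
	for _, r := range results {
		out[r.shards] = base / r.nsPerOp
	}
	return out
}

func main() {
	s := speedups(reported)
	for _, r := range reported {
		fmt.Printf("shards=%d speedup=%.2fx\n", r.shards, s[r.shards])
	}
}
```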

Which issue(s) this PR fixes:
Fixes #7176 #7174

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
  • docs/configuration/v1-guarantees.md updated if this PR introduces experimental flags

@dosubot dosubot Bot added component/store-gateway go Pull requests that update Go code storage/blocks Blocks storage engine type/feature labels Jun 9, 2026
@SungJin1212 SungJin1212 force-pushed the parquet-shard branch 2 times, most recently from 635d72e to 524e917 Compare June 9, 2026 11:03
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Labels

component/store-gateway go Pull requests that update Go code size/L storage/blocks Blocks storage engine type/feature

Development

Successfully merging this pull request may close these issues.

[Parquet] Support sharded parquet file conversion