encoding_stats not present in Parquet generated by parquet-rewrite #7616

@JigaoLuo

Describe the bug

I recently tested the encoding_stats feature I requested in #7341 for the Parquet writer (thank you for PR #7354!) and noticed an issue when using the parquet-rewrite tool.

Rewritten files do not include encoding_stats in the footer. I am not sure whether this is a configuration issue I overlooked or a bug. If this is not the expected behavior, please treat this issue as a bug report; if I have missed something obvious, please correct me. Thank you.

To Reproduce

$ parquet-rewrite ... # I rewrite a parquet file
$ parquet footer generated.parquet | grep "encodingStats"
      "encodingStats" : null,
....
      "encodingStats" : null,

The parquet tool I used is parquet-cli from parquet-java: https://github.com/apache/parquet-java/tree/master/parquet-cli
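
To make the check above easier to repeat, here is a minimal sketch that counts how many encodingStats entries in a footer dump are null. It assumes the footer text looks like the grep output shown above (one `"encodingStats" : ...` entry per line); the function name `count_encoding_stats` is hypothetical and not part of parquet-cli or the parquet crate.

```python
import re

# Assumption: each column chunk's entry appears on its own line, e.g.
#       "encodingStats" : null,
ANY_STATS = re.compile(r'"encodingStats"\s*:')
NULL_STATS = re.compile(r'"encodingStats"\s*:\s*null\b')


def count_encoding_stats(footer_text):
    """Return (total, null) counts of encodingStats entries in a footer dump."""
    total = 0
    null = 0
    for line in footer_text.splitlines():
        if ANY_STATS.search(line):
            total += 1
            if NULL_STATS.search(line):
                null += 1
    return total, null
```

Passing the captured output of `parquet footer generated.parquet` to this function should show whether every column chunk is missing its stats (total equals null) or only some of them.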

Expected behavior

encodingStats should not be null in the footer of the rewritten file.

Labels: bug, parquet