@geruh commented May 8, 2024

This PR adds the manifests metadata table to the existing inspect logic for Iceberg tables, as listed in #511. The manifests metadata table in Iceberg shows the current file manifests for a given table.

Java implementation: https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/ManifestsTable.java

Usage

table.inspect.manifests() 
pyarrow.Table
content: int8 not null
path: string not null
length: int64 not null
partition_spec_id: int32 not null
added_snapshot_id: int64 not null
added_data_files_count: int32 not null
existing_data_files_count: int32 not null
deleted_data_files_count: int32 not null
added_delete_files_count: int32 not null
existing_delete_files_count: int32 not null
deleted_delete_files_count: int32 not null
partition_summaries: list<item: struct<contains_null: bool not null, contains_nan: bool, lower_bound: string, upper_bound: string>> not null
  child 0, item: struct<contains_null: bool not null, contains_nan: bool, lower_bound: string, upper_bound: string>
      child 0, contains_null: bool not null
      child 1, contains_nan: bool
      child 2, lower_bound: string
      child 3, upper_bound: string
----
content: [[0]]
path: [["s3://warehouse/default/table_metadata_manifests/metadata/3bf5b4c6-a7a4-4b43-a6ce-ca2b4887945a-m0.avro"]]
length: [[6886]]
partition_spec_id: [[0]]
added_snapshot_id: [[3815834705531553721]]
added_data_files_count: [[1]]
existing_data_files_count: [[0]]
deleted_data_files_count: [[0]]
added_delete_files_count: [[0]]
existing_delete_files_count: [[0]]
deleted_delete_files_count: [[0]]
partition_summaries: [[    -- is_valid: all not null
    -- child 0 type: bool
[false]
    -- child 1 type: bool
[false]
    -- child 2 type: string
["test"]
    -- child 3 type: string
["test"]]]
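
As a rough illustration of how the returned table might be consumed, the sketch below totals the per-manifest file counts. It assumes the rows are plain dictionaries, such as those produced by calling pyarrow's `to_pylist()` on the result of `table.inspect.manifests()`. The helper `summarize_manifests` and the `sample_rows` list are hypothetical names introduced here, not part of this PR; the sample row only copies values from the output above.

```python
from collections import Counter
from typing import Any, Dict, Iterable

# Count columns taken from the schema printed above.
COUNT_COLUMNS = (
    "added_data_files_count",
    "existing_data_files_count",
    "deleted_data_files_count",
    "added_delete_files_count",
    "existing_delete_files_count",
    "deleted_delete_files_count",
)


def summarize_manifests(rows: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Hypothetical helper: sum file counts and manifest sizes across rows."""
    totals: Counter = Counter()
    for row in rows:
        totals["manifests"] += 1
        totals["total_length"] += row["length"]
        for column in COUNT_COLUMNS:
            totals[column] += row[column]
    return dict(totals)


# With a real table (assumption: `table` is an already loaded Iceberg table):
#   rows = table.inspect.manifests().to_pylist()
# Here, a single row copied from the sample output above is used instead.
sample_rows = [
    {
        "content": 0,
        "path": "s3://warehouse/default/table_metadata_manifests/metadata/"
        "3bf5b4c6-a7a4-4b43-a6ce-ca2b4887945a-m0.avro",
        "length": 6886,
        "partition_spec_id": 0,
        "added_snapshot_id": 3815834705531553721,
        "added_data_files_count": 1,
        "existing_data_files_count": 0,
        "deleted_data_files_count": 0,
        "added_delete_files_count": 0,
        "existing_delete_files_count": 0,
        "deleted_delete_files_count": 0,
    }
]

summary = summarize_manifests(sample_rows)
print(summary)
```

Working on plain dictionaries keeps the sketch independent of a catalog or object store; the same function should apply unchanged to rows read from a real table.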

@Fokko mentioned this pull request May 13, 2024
@kevinjqliu mentioned this pull request May 14, 2024

@HonahX left a comment

LGTM! Thanks for adding this @geruh

@Fokko left a comment

This looks great @geruh, thanks for working on this. I know quite a few folks are waiting for this. Thanks @HonahX for the review 👍

@Fokko merged commit eba4bee into apache:main May 23, 2024
@geruh deleted the inspect-manifests branch July 1, 2024 18:10