Skip to main content
Version: Latest-3.4

Iceberg Metadata Tables

This topic describes how to inspect the metadata information of Iceberg tables in StarRocks.

Overview

From V3.4.1 onwards, StarRocks supports Iceberg metadata tables. These metadata tables contain a variety of information about Iceberg tables, such as table change history, snapshots, and manifests. You can query each metadata table by appending the metadata table name to the original table name.

Currently, StarRocks supports the following Iceberg metadata tables:

Metadata tableDescription
historyShows a log of metadata changes made to the table.
metadata_log_entriesShows the metadata log entries for the table.
snapshotsShows details about the table snapshots.
manifestsShows an overview of the manifests associated with the snapshots in the table’s log.
partitionsShows details about the partitions in the table.
filesShows details about the data files and delete files in the current snapshot of the table.
refsShows details about the Iceberg references, including branches and tags.

history table

Usage:

SELECT * FROM [<catalog>.][<database>.]table$history;

Output:

FieldDescription
made_current_atThe time when the snapshot became the current snapshot.
snapshot_idThe ID of the snapshot.
parent_idThe ID of the parent snapshot.
is_current_ancestorWhether this snapshot is an ancestor of the current snapshot.

metadata_log_entries table

Usage:

SELECT * FROM [<catalog>.][<database>.]table$metadata_log_entries;

Output:

FieldDescription
timestampThe time when the metadata was recorded.
fileThe location of the metadata file.
latest_snapshot_idThe ID of the latest snapshot when the metadata was updated.
latest_schema_idThe ID of the latest schema when the metadata was updated.
latest_sequence_numberThe data sequence number of the metadata file.

snapshots table

Usage:

SELECT * FROM [<catalog>.][<database>.]table$snapshots;

Output:

FieldDescription
committed_atThe time when the snapshot was committed.
snapshot_idThe ID for the snapshot.
parent_idThe ID for the parent snapshot.
operationThe type of operation performed on the Iceberg table. Valid values:
  • append: New data is appended.
  • replace: Files are removed and replaced without changing the data in the table.
  • overwrite: Old data is overwritten by new data.
  • delete: Data is deleted from the table.
manifest_listThe list of Avro manifest files that contain detailed information about snapshot changes.
summaryA summary of the changes made from the previous snapshot to the current snapshot.

manifests table

Usage:

SELECT * FROM [<catalog>.][<database>.]table$manifests;

Output:

FieldDescription
pathThe location of the manifest file.
lengthThe length of the manifest file.
partition_spec_idThe ID for the partition specification that is used to write the manifest file.
added_snapshot_idThe ID of the snapshot during which this manifest entry has been added.
added_data_files_countThe number of data files with status ADDED in the manifest file.
added_rows_countThe total number of rows in all data files with status ADDED in the manifest file.
existing_data_files_countThe number of data files with status EXISTING in the manifest file.
existing_rows_countThe total number of rows in all data files with status EXISTING in the manifest file.
deleted_data_files_countThe number of data files with status DELETED in the manifest file.
deleted_rows_countThe total number of rows in all data files with status DELETED in the manifest file.
partition_summariesPartition range metadata.

partitions table

Usage:

SELECT * FROM [<catalog>.][<database>.]table$partitions;

Output:

FieldDescription
partition_valueThe mapping of the partition column names to the partition column values.
spec_idThe partition Spec ID of files.
record_countThe number of records in the partition.
file_countThe number of files mapped in the partition.
total_data_file_size_in_bytesThe size of all the data files in the partition.
position_delete_record_countThe total row count of Position Delete files in the partition.
position_delete_file_countThe number of Position Delete files in the partition.
equality_delete_record_countThe total row count of Equality Delete files in the partition.
equality_delete_file_countThe number of Position Equality files in the partition.
last_updated_atThe time when the partition was updated most recently.

files table

Usage:

SELECT * FROM [<catalog>.][<database>.]table$files;

Output:

FieldDescription
contentThe type of content stored in the file. Valid values: DATA(0), POSITION_DELETES(1), and EQUALITY_DELETES(2).
file_pathThe location of the data file.
file_formatThe format of the data file.
spec_idThe Spec ID that is used to track the file containing a row.
record_countThe number of entries contained in the data file.
file_size_in_bytesThe size of the data file.
column_sizesThe mapping between the Iceberg column ID and its corresponding size in the file.
value_countsThe mapping between the Iceberg column ID and its corresponding count of entries in the file.
null_value_countsThe mapping between the Iceberg column ID and its corresponding count of NULL values in the file.
nan_value_countsThe mapping between the Iceberg column ID and its corresponding count of non- numerical values in the file.
lower_boundsThe mapping between the Iceberg column ID and its corresponding lower bound in the file.
upper_boundsThe mapping between the Iceberg column ID and its corresponding upper bound in the file.
split_offsetsThe list of recommended split locations.
sort_idThe ID representing sort order for this file.
equality_idsThe set of field IDs used for equality comparison in equality delete files.
key_metadataThe metadata about the encryption key that is used to encrypt this file, if applicable.

refs table

Usage:

SELECT * FROM [<catalog>.][<database>.]table$refs;

Output:

FieldDescription
nameThe name of the reference.
typeThe type of the reference. Valid values: BRANCH or TAG.
snapshot_idThe snapshot ID of the reference.
max_reference_age_in_msThe maximum age of the reference before it could be expired.
min_snapshots_to_keepFor branch only, the minimum number of snapshots to keep in a branch.
max_snapshot_age_in_msFor branch only, the max snapshot age allowed in a branch. Older snapshots in the branch will be expired.