Information Schema

The StarRocks `information_schema` is a database within each StarRocks instance. `information_schema` contains several read-only, system-defined tables that store extensive metadata about all the objects the StarRocks instance maintains.

View metadata via Information Schema

You can view the metadata within a StarRocks instance by querying the tables in `information_schema`.

The following example views the metadata of a table named `sr_member` by querying the table `tables`.
```sql
mysql> SELECT * FROM information_schema.tables WHERE TABLE_NAME like 'sr_member'\G
*************************** 1. row ***************************
  TABLE_CATALOG: def
   TABLE_SCHEMA: sr_hub
     TABLE_NAME: sr_member
     TABLE_TYPE: BASE TABLE
         ENGINE: StarRocks
        VERSION: NULL
     ROW_FORMAT: NULL
     TABLE_ROWS: 6
 AVG_ROW_LENGTH: 542
    DATA_LENGTH: 3255
MAX_DATA_LENGTH: NULL
   INDEX_LENGTH: NULL
      DATA_FREE: NULL
 AUTO_INCREMENT: NULL
    CREATE_TIME: 2022-11-17 14:32:30
    UPDATE_TIME: 2022-11-17 14:32:55
     CHECK_TIME: NULL
TABLE_COLLATION: utf8_general_ci
       CHECKSUM: NULL
 CREATE_OPTIONS: NULL
  TABLE_COMMENT: OLAP
1 row in set (1.04 sec)
```
Information Schema tables

StarRocks has optimized the metadata provided by the tables `tables`, `tables_config`, and `load_tracking_logs`, and provides the `loads` table from v3.1 onwards in `information_schema`:

| Information Schema table name | Description |
| --- | --- |
| `tables` | Provides general metadata of tables. |
| `tables_config` | Provides additional table metadata that is unique to StarRocks. |
| `load_tracking_logs` | Provides error information (if any) of load jobs. |
| `loads` | Provides the results of load jobs. This table is supported from v3.1 onwards. Currently, you can only view the results of Broker Load and INSERT jobs from this table. |
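
If you are not sure which of these views exist in your StarRocks version, you can list the contents of `information_schema` directly. A minimal sketch:

```sql
-- List all system-defined tables available in information_schema.
SHOW TABLES FROM information_schema;

-- Equivalently, query the tables view itself.
SELECT TABLE_NAME
FROM information_schema.tables
WHERE TABLE_SCHEMA = 'information_schema';
```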
loads

The following fields are provided in `loads`:

| Field | Description |
| --- | --- |
| JOB_ID | The unique ID assigned by StarRocks to identify the load job. |
| LABEL | The label of the load job. |
| DATABASE_NAME | The name of the database to which the destination StarRocks tables belong. |
| STATE | The state of the load job. Valid values include `PENDING`, `LOADING`, `FINISHED`, and `CANCELLED`. |
| PROGRESS | The progress of the ETL stage and LOADING stage of the load job. |
| TYPE | The type of the load job. For Broker Load, the return value is `BROKER`. For INSERT, the return value is `INSERT`. |
| PRIORITY | The priority of the load job. Valid values: `HIGHEST`, `HIGH`, `NORMAL`, `LOW`, and `LOWEST`. |
| SCAN_ROWS | The number of data rows that are scanned. |
| FILTERED_ROWS | The number of data rows that are filtered out due to inadequate data quality. |
| UNSELECTED_ROWS | The number of data rows that are filtered out due to the conditions specified in the WHERE clause. |
| SINK_ROWS | The number of data rows that are loaded. |
| ETL_INFO | The ETL details of the load job. A non-empty value is returned only for Spark Load. For any other type of load job, an empty value is returned. |
| TASK_INFO | The task execution details of the load job, such as the `timeout` and `max_filter_ratio` settings. |
| CREATE_TIME | The time at which the load job was created. Format: `yyyy-MM-dd HH:mm:ss`. Example: `2023-07-24 14:58:58`. |
| ETL_START_TIME | The start time of the ETL stage of the load job. Format: `yyyy-MM-dd HH:mm:ss`. Example: `2023-07-24 14:58:58`. |
| ETL_FINISH_TIME | The end time of the ETL stage of the load job. Format: `yyyy-MM-dd HH:mm:ss`. Example: `2023-07-24 14:58:58`. |
| LOAD_START_TIME | The start time of the LOADING stage of the load job. Format: `yyyy-MM-dd HH:mm:ss`. Example: `2023-07-24 14:58:58`. |
| LOAD_FINISH_TIME | The end time of the LOADING stage of the load job. Format: `yyyy-MM-dd HH:mm:ss`. Example: `2023-07-24 14:58:58`. |
| JOB_DETAILS | The details about the data loaded, such as the number of bytes and the number of files. |
| ERROR_MSG | The error message of the load job. If the load job did not encounter any error, `NULL` is returned. |
| TRACKING_URL | The URL from which you can access the unqualified data row samples detected in the load job. You can use the `curl` or `wget` command to access the URL and obtain the unqualified data row samples. If no unqualified data is detected, `NULL` is returned. |
| TRACKING_SQL | The SQL statement that can be used to query the tracking log of the load job. A SQL statement is returned only when the load job involves unqualified data rows. If the load job does not involve any unqualified data rows, `NULL` is returned. |
| REJECTED_RECORD_PATH | The path from which you can access all the unqualified data rows that are filtered out in the load job. The number of unqualified data rows logged is determined by the `log_rejected_record_num` parameter configured in the load job. You can use the `wget` command to access the path. If the load job does not involve any unqualified data rows, `NULL` is returned. |
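
As a quick sketch of how these fields are typically used, the following query lists the most recent load jobs in a database together with their states and error messages. The database name `sr_hub` is taken from the earlier example; replace it with your own database.

```sql
-- List the ten most recent load jobs in the sr_hub database (example name)
-- together with their state, progress, and any error message.
SELECT JOB_ID, LABEL, STATE, PROGRESS, TYPE, ERROR_MSG
FROM information_schema.loads
WHERE DATABASE_NAME = 'sr_hub'
ORDER BY CREATE_TIME DESC
LIMIT 10;
```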
tables

The following fields are provided in `tables`:

| Field | Description |
| --- | --- |
| TABLE_CATALOG | Name of the catalog that stores the table. |
| TABLE_SCHEMA | Name of the database that stores the table. |
| TABLE_NAME | Name of the table. |
| TABLE_TYPE | Type of the table. Valid values: "BASE TABLE" or "VIEW". |
| ENGINE | Engine type of the table. Valid values: "StarRocks", "MySQL", "MEMORY", or an empty string. |
| VERSION | Applies to a feature not available in StarRocks. |
| ROW_FORMAT | Applies to a feature not available in StarRocks. |
| TABLE_ROWS | Row count of the table. |
| AVG_ROW_LENGTH | Average row length (size) of the table, equivalent to `DATA_LENGTH` / `TABLE_ROWS`. Unit: Byte. |
| DATA_LENGTH | Data length (size) of the table. Unit: Byte. |
| MAX_DATA_LENGTH | Applies to a feature not available in StarRocks. |
| INDEX_LENGTH | Applies to a feature not available in StarRocks. |
| DATA_FREE | Applies to a feature not available in StarRocks. |
| AUTO_INCREMENT | Applies to a feature not available in StarRocks. |
| CREATE_TIME | The time when the table was created. |
| UPDATE_TIME | The last time when the table was updated. |
| CHECK_TIME | The last time when a consistency check was performed on the table. |
| TABLE_COLLATION | The default collation of the table. |
| CHECKSUM | Applies to a feature not available in StarRocks. |
| CREATE_OPTIONS | Applies to a feature not available in StarRocks. |
| TABLE_COMMENT | Comment on the table. |
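
For example, `tables` gives you a rough view of row counts and data sizes per table. The sketch below assumes the example database `sr_hub`.

```sql
-- Rank the tables in the sr_hub database (example name) by data size in bytes.
SELECT TABLE_NAME, TABLE_ROWS, DATA_LENGTH, AVG_ROW_LENGTH
FROM information_schema.tables
WHERE TABLE_SCHEMA = 'sr_hub'
ORDER BY DATA_LENGTH DESC;
```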
tables_config

The following fields are provided in `tables_config`:

| Field | Description |
| --- | --- |
| TABLE_SCHEMA | Name of the database that stores the table. |
| TABLE_NAME | Name of the table. |
| TABLE_ENGINE | Engine type of the table. |
| TABLE_MODEL | Table type. Valid values: "DUP_KEYS", "AGG_KEYS", "UNQ_KEYS", or "PRI_KEYS". |
| PRIMARY_KEY | The primary key of a Primary Key table or a Unique Key table. An empty string is returned if the table is not a Primary Key table or a Unique Key table. |
| PARTITION_KEY | The partitioning columns of the table. |
| DISTRIBUTE_KEY | The bucketing columns of the table. |
| DISTRIBUTE_TYPE | The data distribution method of the table. |
| DISTRIBUTE_BUCKET | Number of buckets in the table. |
| SORT_KEY | Sort keys of the table. |
| PROPERTIES | Properties of the table. |
| TABLE_ID | ID of the table. |
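
The following sketch uses `tables_config` to review the table type, keys, and bucketing configuration of the tables in a database, again assuming the example database `sr_hub`.

```sql
-- Review the table type, key columns, and bucketing settings of tables in sr_hub.
SELECT TABLE_NAME, TABLE_MODEL, PRIMARY_KEY, PARTITION_KEY,
       DISTRIBUTE_KEY, DISTRIBUTE_TYPE, DISTRIBUTE_BUCKET, SORT_KEY
FROM information_schema.tables_config
WHERE TABLE_SCHEMA = 'sr_hub';
```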
load_tracking_logs

This feature is supported from StarRocks v3.0 onwards.

The following fields are provided in `load_tracking_logs`:

| Field | Description |
| --- | --- |
| JOB_ID | The ID of the load job. |
| LABEL | The label of the load job. |
| DATABASE_NAME | The database that the load job belongs to. |
| TRACKING_LOG | Error logs (if any) of the load job. |
| TYPE | The type of the load job. Valid values: `BROKER`, `INSERT`, `ROUTINE_LOAD`, and `STREAM_LOAD`. |
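
A common pattern is to look up the tracking log of a failed load job by its job ID or label. The label `insert_load_test` below is only a placeholder; use the `JOB_ID` or label reported for your own load job.

```sql
-- Retrieve the error log of a load job by its label (placeholder label).
SELECT TRACKING_LOG
FROM information_schema.load_tracking_logs
WHERE LABEL = 'insert_load_test';
```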
materialized_views

The following fields are provided in `materialized_views`:

| Field | Description |
| --- | --- |
| MATERIALIZED_VIEW_ID | ID of the materialized view. |
| TABLE_SCHEMA | Database in which the materialized view resides. |
| TABLE_NAME | Name of the materialized view. |
| REFRESH_TYPE | Refresh type of the materialized view. Valid values: `ROLLUP`, `ASYNC`, and `MANUAL`. |
| IS_ACTIVE | Indicates whether the materialized view is active. Inactive materialized views cannot be refreshed or queried. |
| INACTIVE_REASON | The reason that the materialized view is inactive. |
| PARTITION_TYPE | Type of partitioning strategy for the materialized view. |
| TASK_ID | ID of the task responsible for refreshing the materialized view. |
| TASK_NAME | Name of the task responsible for refreshing the materialized view. |
| LAST_REFRESH_START_TIME | Start time of the most recent refresh task. |
| LAST_REFRESH_FINISHED_TIME | End time of the most recent refresh task. |
| LAST_REFRESH_DURATION | Duration of the most recent refresh task. |
| LAST_REFRESH_STATE | State of the most recent refresh task. |
| LAST_REFRESH_FORCE_REFRESH | Indicates whether the most recent refresh task was a force refresh. |
| LAST_REFRESH_START_PARTITION | Starting partition of the most recent refresh task. |
| LAST_REFRESH_END_PARTITION | Ending partition of the most recent refresh task. |
| LAST_REFRESH_BASE_REFRESH_PARTITIONS | Base table partitions involved in the most recent refresh task. |
| LAST_REFRESH_MV_REFRESH_PARTITIONS | Materialized view partitions refreshed in the most recent refresh task. |
| LAST_REFRESH_ERROR_CODE | Error code of the most recent refresh task. |
| LAST_REFRESH_ERROR_MESSAGE | Error message of the most recent refresh task. |
| TABLE_ROWS | Number of data rows in the materialized view, based on approximate background statistics. |
| MATERIALIZED_VIEW_DEFINITION | SQL definition of the materialized view. |
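
As a final sketch, the query below checks whether any materialized views in a database are inactive or reported an error in their most recent refresh; `sr_hub` is again only the example database name.

```sql
-- Inspect the activity status and last refresh outcome of materialized views in sr_hub.
SELECT TABLE_NAME, IS_ACTIVE, INACTIVE_REASON,
       LAST_REFRESH_STATE, LAST_REFRESH_ERROR_MESSAGE
FROM information_schema.materialized_views
WHERE TABLE_SCHEMA = 'sr_hub';
```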