- StarRocks
- Introduction to StarRocks
- Quick Start
- Table Design
- Data Loading
- Concepts
- Overview of data loading
- Load data from a local file system or a streaming data source using HTTP PUT
- Load data from HDFS or cloud storage
- Continuously load data from Apache Kafka®
- Bulk load using Apache Sparkâ„¢
- Load data using INSERT
- Synchronize data from MySQL in real time
- Continuously load data from Apache Flink®
- Change data through loading
- Transform data at loading
- Data Unloading
- Query Data Sources
- Query Acceleration
- Administration
- Deployment
- Management
- Data Recovery
- User Privilege and Authentication
- Performance Tuning
- Reference
- SQL Reference
- User Account Management
- Cluster Management
- ADD SQLBLACKLIST
- ADMIN CANCEL REPAIR TABLE
- ADMIN CHECK TABLET
- ADMIN REPAIR TABLE
- ADMIN SET CONFIG
- ADMIN SET REPLICA STATUS
- ADMIN SHOW CONFIG
- ADMIN SHOW REPLICA DISTRIBUTION
- ADMIN SHOW REPLICA STATUS
- ALTER RESOURCE GROUP
- ALTER SYSTEM
- CANCEL DECOMMISSION
- CREATE FILE
- CREATE RESOURCE GROUP
- DELETE SQLBLACKLIST
- DROP FILE
- DROP RESOURCE GROUP
- EXPLAIN
- INSTALL PLUGIN
- KILL
- SET
- SHOW BACKENDS
- SHOW BROKER
- SHOW FILE
- SHOW FRONTENDS
- SHOW FULL COLUMNS
- SHOW INDEX
- SHOW PLUGINS
- SHOW PROC
- SHOW PROCESSLIST
- SHOW RESOURCE GROUP
- SHOW SQLBLACKLIST
- SHOW TABLE STATUS
- SHOW VARIABLES
- UNINSTALL PLUGIN
- DDL
- ALTER DATABASE
- ALTER TABLE
- ALTER VIEW
- ALTER RESOURCE
- BACKUP
- CANCEL BACKUP
- CANCEL RESTORE
- CREATE DATABASE
- CREATE INDEX
- CREATE MATERIALIZED VIEW
- CREATE REPOSITORY
- CREATE RESOURCE
- CREATE TABLE AS SELECT
- CREATE TABLE LIKE
- CREATE TABLE
- CREATE VIEW
- CREATE FUNCTION
- DROP DATABASE
- DROP INDEX
- DROP MATERIALIZED VIEW
- DROP REPOSITORY
- DROP RESOURCE
- DROP TABLE
- DROP VIEW
- DROP FUNCTION
- HLL
- RECOVER
- RESTORE
- SHOW RESOURCES
- SHOW FUNCTION
- TRUNCATE TABLE
- USE
- DML
- ALTER ROUTINE LOAD
- BROKER LOAD
- CANCEL LOAD
- CANCEL EXPORT
- CANCEL REFRESH MATERIALIZED VIEW
- DELETE
- EXPORT
- GROUP BY
- INSERT
- PAUSE ROUTINE LOAD
- RESUME ROUTINE LOAD
- ROUTINE LOAD
- SELECT
- SHOW ALTER TABLE
- SHOW BACKUP
- SHOW CREATE TABLE
- SHOW CREATE VIEW
- SHOW DATA
- SHOW DATABASES
- SHOW DELETE
- SHOW DYNAMIC PARTITION TABLES
- SHOW EXPORT
- SHOW LOAD
- SHOW PARTITIONS
- SHOW PROPERTY
- SHOW REPOSITORIES
- SHOW RESTORE
- SHOW ROUTINE LOAD
- SHOW ROUTINE LOAD TASK
- SHOW SNAPSHOT
- SHOW TABLES
- SHOW TABLET
- SHOW TRANSACTION
- SPARK LOAD
- STOP ROUTINE LOAD
- STREAM LOAD
- Auxiliary Commands
- Data Types
- Function Reference
- Java UDFs
- Window functions
- Aggregate Functions
- Array Functions
- Bit Functions
- Bitmap Functions
- base64_to_bitmap
- bitmap_agg
- bitmap_and
- bitmap_andnot
- bitmap_contains
- bitmap_count
- bitmap_from_string
- bitmap_empty
- bitmap_has_any
- bitmap_hash
- bitmap_intersect
- bitmap_max
- bitmap_min
- bitmap_or
- bitmap_remove
- bitmap_to_array
- bitmap_to_string
- bitmap_union
- bitmap_union_count
- bitmap_union_int
- bitmap_xor
- intersect_count
- to_bitmap
- Conditional Functions
- Cryptographic Functions
- Date Functions
- add_months
- adddate
- convert_tz
- current_date
- current_time
- current_timestamp
- date
- date_add
- date_format
- date_sub, subdate
- date_trunc
- datediff
- day
- dayname
- dayofmonth
- dayofweek
- dayofyear
- days_add
- days_diff
- days_sub
- from_days
- from_unixtime
- hour
- hours_add
- hours_diff
- hours_sub
- microseconds_add
- microseconds_sub
- minute
- minutes_add
- minutes_diff
- minutes_sub
- month
- monthname
- months_add
- months_diff
- months_sub
- now
- quarter
- second
- seconds_add
- seconds_diff
- seconds_sub
- str_to_date
- str2date
- time_slice
- time_to_sec
- timediff
- timestamp
- timestampadd
- timestampdiff
- to_date
- to_days
- unix_timestamp
- utc_timestamp
- week
- weekofyear
- weeks_add
- weeks_diff
- weeks_sub
- year
- years_add
- years_diff
- years_sub
- Geographic Functions
- JSON Functions
- Overview of JSON functions and operators
- JSON operators
- JSON constructor functions
- JSON query and processing functions
- Math Functions
- String Functions
- Pattern Matching Functions
- Percentile Functions
- Scalar Functions
- Utility Functions
- cast function
- hash function
- System variables
- Error code
- System limits
- SQL Reference
- FAQ
- Benchmark
- Developers
- Contribute to StarRocks
- Code Style Guides
- Use the debuginfo file for debugging
- Development Environment
- Trace Tools
CREATE MATERIALIZED VIEW
Description
This statement is used to create materialized views.
NOTE
CREATE MATERIALIZED VIEW is an asynchronous statement. Executing this statement successfully only indicates that the task to create the materialized view is successfully submitted. You can check the progress by using SHOW ALTER TABLE ROLLUP statement. After the progress is FINISHED, you can use the DESC table_name ALL statement to check the schema of the materialized view.
Syntax
CREATE MATERIALIZED VIEW materialized_view_name AS query
Parameters
materialized_view_name
Name of the materialized view. Required.
Materialized view names in the same table cannot be duplicated.
query
The query used to construct the materialized view. The result of the query is the data of the materialized view. Currently, the supported query format is:
SELECT select_expr[, select_expr ...] FROM [base_view_name] GROUP BY column_name[, column_name ...] ORDER BY column_name[, column_name ...]
The syntax is the same as the query syntax.
select_expr: All columns in the materialized view's schema.
- Only single columns and aggregate columns without expression calculation are supported.
- The aggregate function currently only supports SUM, MIN, MAX, and the parameters of the aggregate function can only be a single column without expression calculation.
- Contains at least one single column.
- All involved columns can only appear once.
base_view_name: The original table name of the materialized view. Required.
- Must be a single table and not a subquery
GROUP BY: Grouped column of materialized view. Optional.
- The data not filled will not be grouped.
ORDER BY: sort order of materialized view. Optional.
The order of the column sort must be the same as the column declaration order in select_expr.
If ORDER BY is not declared, sort columns are automatically supplemented by rules.
If the materialized view is of an aggregate type, all grouped columns are automatically supplemented with sort columns.
If the materialized view is of a non-aggregated type, the first 36 bytes are automatically supplemented as a sort column.
If the number of sorts automatically supplemented is less than three, the first three are sorted.
If the query contains a grouped column, the sort order must be the same as the grouped column.
Usage notes
- The current version of StarRocks does not support creating multiple materialized views at the same time. A new materialized view can only be created when the one before is completed.
- Synchronous materialized views only support aggregate functions on a single column. Query statements in the form of
sum(a+b)
are not supported. - Synchronous materialized views support only one aggregate function for each column of the base table. Query statements such as
select sum(a), min(a) from table
are not supported. - When creating a synchronous materialized view with an aggregate function, you must specify the GROUP BY clause, and specify at least one GROUP BY column in SELECT.
- Synchronous materialized views do not support clauses such as JOIN, WHERE, and the HAVING clause of GROUP BY.
- When using ALTER TABLE DROP COLUMN to drop a specific column in a base table, you must ensure that all synchronous materialized views of the base table do not contain the dropped column, otherwise the drop operation will fail. Before you drop the column, you must first drop all synchronous materialized views that contain the column.
- Creating too many synchronous materialized views for a table will affect the data load efficiency. When data is being loaded to the base table, the data in synchronous materialized view and base table will be updated synchronously. If a base table contains
n
synchronous materialized views, the efficiency of loading data into the base table is about the same as the efficiency of loading data inton
tables.
Example
Base table structure is:
mysql> desc duplicate_table;
+-------+--------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------+------+------+---------+-------+
| k1 | INT | Yes | true | N/A | |
| k2 | INT | Yes | true | N/A | |
| k3 | BIGINT | Yes | true | N/A | |
| k4 | BIGINT | Yes | true | N/A | |
+-------+--------+------+------+---------+-------+
Create a materialized view containing only the columns of the original table (k1, k2).
create materialized view k1_k2 as select k1, k2 from duplicate_table;
The materialized view contains only two columns k1, k2 without any aggregation.
+-----------------+-------+--------+------+------+---------+-------+ | IndexName | Field | Type | Null | Key | Default | Extra | +-----------------+-------+--------+------+------+---------+-------+ | k1_k2 | k1 | INT | Yes | true | N/A | | | | k2 | INT | Yes | true | N/A | | +-----------------+-------+--------+------+------+---------+-------+
Create a materialized view sorted by k2.
create materialized view k2_order as select k2, k1 from duplicate_table order by k2;
The materialized view's schema is shown below. The materialized view contains only two columns k2, k1, where column k2 is a sort column without any aggregation.
+-----------------+-------+--------+------+-------+---------+-------+ | IndexName | Field | Type | Null | Key | Default | Extra | +-----------------+-------+--------+------+-------+---------+-------+ | k2_order | k2 | INT | Yes | true | N/A | | | | k1 | INT | Yes | false | N/A | NONE | +-----------------+-------+--------+------+-------+---------+-------+
Create a materialized view grouped by k1, k2 with k3 as the SUM aggregate.
create materialized view k1_k2_sumk3 as select k1, k2, sum(k3) from duplicate_table group by k1, k2;
The materialized view's schema is shown below. The materialized view contains two columns k1, k2 and sum (k3), where k1, k2 are grouped columns, and sum (k3) is the sum of the k3 columns grouped according to k1, k2.
Because the materialized view does not declare a sort column, and the materialized view has aggregate data, the system supplements the grouped columns k1 and k2 by default.
+-----------------+-------+--------+------+-------+---------+-------+ | IndexName | Field | Type | Null | Key | Default | Extra | +-----------------+-------+--------+------+-------+---------+-------+ | k1_k2_sumk3 | k1 | INT | Yes | true | N/A | | | | k2 | INT | Yes | true | N/A | | | | k3 | BIGINT | Yes | false | N/A | SUM | +-----------------+-------+--------+------+-------+---------+-------+
Create a materialized view to remove duplicate rows.
create materialized view deduplicate as select k1, k2, k3, k4 from duplicate_table group by k1, k2, k3, k4;
The materialized view schema is shown below. The materialized view contains k1, k2, k3, and k4 columns, and there are no duplicate rows.
+-----------------+-------+--------+------+-------+---------+-------+ | IndexName | Field | Type | Null | Key | Default | Extra | +-----------------+-------+--------+------+-------+---------+-------+ | deduplicate | k1 | INT | Yes | true | N/A | | | | k2 | INT | Yes | true | N/A | | | | k3 | BIGINT | Yes | true | N/A | | | | k4 | BIGINT | Yes | true | N/A | | +-----------------+-------+--------+------+-------+---------+-------+
Create a non-aggregated materialized view that does not declare a sort column.
The schema of all_type_table is shown below:
+-------+--------------+------+-------+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+--------------+------+-------+---------+-------+ | k1 | TINYINT | Yes | true | N/A | | | k2 | SMALLINT | Yes | true | N/A | | | k3 | INT | Yes | true | N/A | | | k4 | BIGINT | Yes | true | N/A | | | k5 | DECIMAL(9,0) | Yes | true | N/A | | | k6 | DOUBLE | Yes | false | N/A | NONE | | k7 | VARCHAR(20) | Yes | false | N/A | NONE | +-------+--------------+------+-------+---------+-------+
The materialized view contains k3, k4, k5, k6, k7 columns, and no sort column is declared. The creation statement is as follows:
create materialized view mv_1 as select k3, k4, k5, k6, k7 from all_type_table;
The system's default supplementary sort columns are k3, k4, and k5. The sum of the number of bytes for these three column types is 4 (INT) + 8 (BIGINT) + 16 (DECIMAL) = 28 <36. So these three columns are added as sort columns.
The materialized view's schema is as follows. You can see that the key fields of the k3, k4, and k5 columns are true, which is the sort order. The key field of the k6, k7 columns is false, which is the non-sort order.
+----------------+-------+--------------+------+-------+---------+-------+ | IndexName | Field | Type | Null | Key | Default | Extra | +----------------+-------+--------------+------+-------+---------+-------+ | mv_1 | k3 | INT | Yes | true | N/A | | | | k4 | BIGINT | Yes | true | N/A | | | | k5 | DECIMAL(9,0) | Yes | true | N/A | | | | k6 | DOUBLE | Yes | false | N/A | NONE | | | k7 | VARCHAR(20) | Yes | false | N/A | NONE | +----------------+-------+--------------+------+-------+---------+-------+
keyword
CREATE, MATERIALIZED, VIEW