- Introduction
- Quick Start
- Table Design
- Data Loading
- Overview of data loading
- Load data from a local file system or a streaming data source using HTTP push
- Load data from HDFS or cloud storage
- Routine Load
- Spark Load
- Insert Into
- Change data through loading
- Transform data at loading
- Json Loading
- Synchronize data from MySQL
- Load data by using flink-connector-starrocks
- DataX Writer
- Data Export
- Using StarRocks
- Reference
- SQL Reference
- User Account Management
- Cluster Management
- ADMIN CANCEL REPAIR
- ADMIN CHECK TABLET
- ADMIN REPAIR
- ADMIN SET CONFIG
- ADMIN SET REPLICA STATUS
- ADMIN SHOW CONFIG
- ADMIN SHOW REPLICA DISTRIBUTION
- ADMIN SHOW REPLICA STATUS
- ALTER SYSTEM
- CANCEL DECOMMISSION
- CREATE FILE
- DROP FILE
- INSTALL PLUGIN
- SHOW BACKENDS
- SHOW BROKER
- SHOW FILE
- SHOW FRONTENDS
- SHOW FULL COLUMNS
- SHOW INDEX
- SHOW PLUGINS
- SHOW TABLE STATUS
- UNINSTALL PLUGIN
- DDL
- ALTER DATABASE
- ALTER TABLE
- ALTER VIEW
- BACKUP
- CANCEL BACKUP
- CANCEL RESTORE
- CREATE DATABASE
- CREATE INDEX
- CREATE MATERIALIZED VIEW
- CREATE REPOSITORY
- CREATE RESOURCE
- CREATE TABLE AS SELECT
- CREATE TABLE LIKE
- CREATE TABLE
- CREATE VIEW
- CREATE FUNCTION
- DROP DATABASE
- DROP INDEX
- DROP MATERIALIZED VIEW
- DROP REPOSITORY
- DROP RESOURCE
- DROP TABLE
- DROP VIEW
- DROP FUNCTION
- HLL
- RECOVER
- RESTORE
- SHOW RESOURCES
- SHOW FUNCTION
- TRUNCATE TABLE
- DML
- ALTER ROUTINE LOAD
- BROKER LOAD
- CANCEL LOAD
- DELETE
- EXPORT
- GROUP BY
- INSERT
- PAUSE ROUTINE LOAD
- RESUME ROUTINE LOAD
- ROUTINE LOAD
- SELECT
- SHOW ALTER
- SHOW BACKUP
- SHOW DATA
- SHOW DATABASES
- SHOW DELETE
- SHOW DYNAMIC PARTITION TABLES
- SHOW EXPORT
- SHOW LOAD
- SHOW PARTITIONS
- SHOW PROPERTY
- SHOW REPOSITORIES
- SHOW RESTORE
- SHOW ROUTINE LOAD
- SHOW ROUTINE LOAD TASK
- SHOW SNAPSHOT
- SHOW TABLES
- SHOW TABLET
- SHOW TRANSACTION
- SPARK LOAD
- STOP ROUTINE LOAD
- STREAM LOAD
- Data Types
- Auxiliary Commands
- Function Reference
- Java UDFs
- Window Function
- Date Functions
- convert_tz
- curdate
- current_timestamp
- curtime
- datediff
- date_add
- date_format
- date_sub
- date_trunc
- day
- dayname
- dayofmonth
- dayofweek
- dayofyear
- from_days
- from_unixtime
- hour
- minute
- month
- monthname
- now
- quarter
- second
- str_to_date
- timediff
- timestampadd
- timestampdiff
- to_date
- to_days
- unix_timestamp
- utc_timestamp
- weekofyear
- year
- hours_diff
- minutes_diff
- months_diff
- seconds_diff
- weeks_diff
- years_diff
- Aggregate Functions
- Geographic Functions
- String Functions
- JSON Functions
- Overview of JSON functions and operators
- JSON constructor functions
- JSON query and processing functions
- JSON operators
- Aggregate Functions
- Bitmap Functions
- Array Functions
- cast function
- hash function
- Cryptographic Functions
- Math Functions
- Utility Functions
- System variables
- Error code
- System limits
- SQL Reference
- Administration
- FAQ
- Deploy
- Data Migration
- SQL
- Other FAQs
- Benchmark
- Developers
- Contribute to StarRocks
- Code Style Guides
- Use the debuginfo file for debugging
- Development Environment
- Trace Tools
System variables
Variable Settings and Viewing
This section describes the variables supported by the StarRocks system. They can be viewed with the SHOW VARIABLES
command. These variables can take effect globally on the system or only on the current session.
The variables in StarRocks refer to the variable sets in MySQL, but some variables are only compatible with the MySQL client protocol and do not function on the MySQL database.
View
You can view all or some variables by SHOW VARIABLES [LIKE 'xxx']
;. For example:
SHOW VARIABLES;
SHOW VARIABLES LIKE '%time_zone%';
Settings
Variables can generally be set to take effect globally or only on the current session. When set to global, a new value will be used in subsequent new sessions without affecting the current session. When set to “current session only”, the variable will only take effect on the current session.
A variable set by SET var_name=xxx;
only takes effect for the current session. For example:
SET exec_mem_limit = 137438953472;
SET forward_to_master = true;
SET time_zone = "Asia/Shanghai";
A variable set by the SET GLOBAL var_name=xxx;
statement takes effect globally. For example:
SET GLOBAL exec_mem_limit = 137438953472;
Note: Only ADMIN users can set variables to be globally effective. Globally effective variables do not affect the current session, only subsequent new sessions.
Variables that can be set both globally or partially effective include:
- batch_size
- disable_streaming_preaggregations
- exec_mem_limit
- force_streaming_aggregate
- is_report_success
- hash_join_push_down_right_table
- parallel_fragment_exec_instance_num
- parallel_exchange_instance_num
- query_timeout
- sql_mode
- time_zone
- vectorized_engine_enable
- wait_timeout
Variables that can only be set globally effective include:
- default_rowset_type
In addition, variable settings also support constant expressions, such as:
SET exec_mem_limit = 10 * 1024 * 1024 * 1024;
SET forward_to_master = concat('tr', 'u', 'e');
Setting variables in query statements
In some scenarios, we may need to set variables specifically for certain queries. By using SET_VAR
, it is possible to set session variables that will only take effect within a single statement. For example:
SELECT /*+ SET_VAR(exec_mem_limit = 8589934592) */ name FROM people ORDER BY name;
SELECT /*+ SET_VAR(query_timeout = 1) */ sleep(3);
Note: It must start with
/*+
and can only be followed by theSELECT
keyword.
Supported Variables
- SQL_AUTO_IS_NULL
Used for compatibility with the JDBC connection pool C3P0. No practical usage.
- auto_increment_increment
Used for MySQL client compatibility. No practical usage.
- autocommit
Used for MySQL client compatibility. No practical usage.
- batch_size
Used to specify the number of rows of a single packet transmitted by each node during query execution. The default is 1024, i.e., every 1024 rows of data generated by the source node is packaged and sent to the destination node. A larger number of rows will improve the query throughput in large data volume scenarios, but may increase the query latency in small data volume scenarios. Also, it may increase the memory overhead of the query. We recommend to set batch_size
between 1024 to 4096.
- default_rowset_type
Used to set the default storage format used by the storage engine of the computing node. The currently supported storage formats are alpha
and beta
.
- disable_colocate_join
Used to control whether the Colocation Join is enabled. The default value is false
, meaning the feature is enabled. When this feature is disabled, query planning will not attempt to execute Colocation Join.
- disable_streaming_preaggregations
Used to enable the streaming pre-aggregations. The default value is false
, meaning it is enabled.
- div_precision_increment
Used for MySQL client compatibility. No practical usage.
- enable_insert_strict
Used to enable the strict mode when importing data using the INSERT statement. The default value is false, meaning strict mode is not enabled. For more information, see "Data Import" (... /loading/Loading_intro.md)".
- enable_spilling
Used to enable large data volume drop sorting. The default value is false, meaning it is not enabled. It is enabled when the user does not specify a LIMIT
condition in the ORDER BY
clause and sets enable_spilling
to true. When enabled, the BE data directory starrocks-scratch/
is used to store temporary spilling data that will be cleared after the query is completed.
This function is mainly used for sorting operations with large data volume using limited memory.
- event_scheduler
Used for MySQL client compatibility. No practical usage.
- exec_mem_limit
Used to set the memory limit that can be used by a single query plan instance. The default value is 2GB, and the default unit is B. B/K/KB/M/MB/G/GB/T/TB/P/PB
are supported.
There may be multiple instances of a query plan, and a BE node may execute one or more instances. Therefore, this parameter does not accurately limit the memory usage of a query across the cluster, nor does it accurately limit the memory usage of a query on a single BE node. It needs to be evaluated based on the generated query plan.
Usually, more memory is consumed on blocking nodes (e.g. Sort Node, Aggregate Node, Join Node). Unblocking nodes (e.g. Scan Node) transmit data in stream, which does not take up too much memory.
When there is a Memory Exceed Limit
error, try to increase this parameter.
- force_streaming_aggregate
Used to control whether the aggregation node enables streaming aggregation for computing. The default value is false, meaning the feature is not enabled.
- forward_to_master
Used to set whether some commands will be forwarded to the leader FE node for execution. The default value is false, meaning no forwarding. There are multiple FE nodes in StarRocks, one of which is the leader node. Normally, users can connect to any FE node for full-featured operation. However, some information is only available from the leader FE node.
For example, if the SHOW BACKENDS;
commend is not forwarded to the leader FE node, only basic information (e.g., whether the node is alive or not) can be seen. Forwarding to the leader FE can get more detailed information including the node start time, last heartbeat time, etc.
The commands affected by this parameter are as follows:
- SHOW FRONTENDS;
Forwarding to leader allows users to view the last heartbeat message.
- SHOW BACKENDS;
Forwarding to leader allows users to view the boot time, last heartbeat information, and disk capacity information.
- SHOW BROKER;
Forwarding to leader allows users to view the boot time and last heartbeat information.
- SHOW TABLET;
- ADMIN SHOW REPLICA DISTRIBUTION;
- ADMIN SHOW REPLICA STATUS;
Forwarding to leader allows users to view the tablet information stored in the leader FE metadata. Normally, the tablet information in the different FE metadata should be the same. In case of a problem, you can use this method to compare the differences in metadata of the current FE and the leader FE.
- Show PROC;
Forwarding to leader allows users to view the PROC information stored in the metadata. This is mainly used for metadata comparison.
- hash_join_push_down_right_table
Used to control whether the data of the left table can be filtered by using the filter condition against the right table in the Join query. If so, it can reduce the amount of data that needs to be processed during the query.
If “true”, indicating the operation is allowed and the system will decide whether the left table can be filtered. If “false”, indicating the operation is disabled. The default value is true.
- init_connect
Used for MySQL client compatibility. No practical usage.
- interactive_timeout
Used for MySQL client compatibility. No practical usage.
- is_report_success
Used to set whether the profile of the query needs to be viewed. The default is false, meaning no profile is required.
By default, a profile will only be sent to the FE when a query error occurs in the BEt. Profile sending causes network overhead and therefore affects high concurrency.
When there is a need to analyze the profile of a query, users can set this variable to true and send a query to deep dive. After the query is completed, the profile can be viewed on the web page of the currently connected FE (address: fe_host:fe_http_port/query
). This page displays the profiles of the last 100 queries with is_report_success
turned on.
- language
Used for MySQL client compatibility. No practical usage.
- license
Displays the license of StarRocks.
- load_mem_limit
Specifies the memory limit for the import operation. The default value is 0, meaning that this variable is not used and exec_mem_limit
is used instead.
This variable is only used for the INSERT
operation which involves both query and import. If the user does not set this variable, the memory limit for both query and import will be set as exec_mem_limit
. Otherwise, the memory limit for query will be set as exec_mem_limit
and the memory limit for import will be as load_mem_limit
.
Other import methods such as BROKER LOAD
, STREAM LOAD
still use exec_mem_limit
for memory limit.
- lower_case_table_names
Used for MySQL client compatibility. No practical usage. Table names in StarRocks are case-sensitive.
- max_allowed_packet
Used for compatibility with the JDBC connection pool C3P0. No practical usage.
- max_pushdown_conditions_per_column
This variable is set to -1
by default, indicating that the value configured in be.conf
is used. If this variable is set to be greater than 0, queries in the current session will use this value and ignore the configured value in be.conf
.
- net_buffer_length
Used for MySQL client compatibility. No practical usage.
- net_read_timeout
Used for MySQL client compatibility. No practical usage.
- net_write_timeout
Used for MySQL client compatibility. No practical usage.
- parallel_exchange_instance_num
Used to set the number of exchange nodes that an upper-level node uses to receive data from a lower-level node in the execution plan. The default value is -1, meaning the number of exchange nodes is equal to the number of execution instances of the lower-level node. When this variable is set to be greater than 0 but smaller than the number of execution instances of the lower-level node, the number of exchange nodes equals the set value.
In a distributed query execution plan, the upper-level node usually has one or more exchange nodes to receive data from the execution instances of the lower-level node on different BEs. Usually the number of exchange nodes is equal to the number of execution instances of the lower-level node.
In some aggregation query scenarios where the amount of data decreases drastically after aggregation, you can try to modify this variable to a smaller value to reduce the resource overhead. An example would be running aggregation queries using theDUPLICATE model.
- parallel_fragment_exec_instance_num
Used to set the number of instances used to scan nodes on each BE. The default value is 1.
A query plan typically produces a set of scan ranges. This data is distributed across multiple BE nodes. A BE node will have one or more scan ranges, and by default, each BE node's set of scan ranges is processed by only one execution instance. When machine resources suffice, you can increase this variable to allow more execution instances to process a scan range simultaneously for efficiency purposes.
The number of scan instances determines the number of other execution nodes in the upper level, such as aggregation nodes and join nodes. Therefore, it increases the concurrency of the entire query plan execution. Modifying this parameter will help improve efficiency, but larger values will consume more machine resources, such as CPU, memory, and disk IO.
- performance_schema
Used for compatibility with MySQL JDBC versions 8.0.16 and above. No practical usage.
- query_cache_size
Used for MySQL client compatibility. No practical use.
- query_cache_type
Used for compatibility with JDBC connection pool C3P0. No practical use.
- query_timeout
Used to set the query timeout in "seconds". This variable will act on all query statements in the current connection, as well as INSERT statements. The default value is 300 seconds.
- rewrite_count_distinct_to_bitmap_hll
Used to decide whether to rewrite count distinct queries to bitmap_union_count and hll_union_agg.
- sql_mode
Used to specify the SQL mode to accommodate certain SQL dialects.
- sql_safe_updates
Used for MySQL client compatibility. No practical usage.
- sql_select_limit
Used for MySQL client compatibility. No practical usage.
storage_engine
The types of engines supported by StarRocks:
- olap: StarRocks system-owned engine.
- mysql: Use MySQL external tables.
- broker: Access external tables through a broker program.
- elasticsearch or es: Use Elasticsearch external tables.
- hive: Use Hive external tables.
system_time_zone
Used to display the time zone of the current system. Cannot be changed.
- time_zone
Used to set the time zone of the current session. The time zone can affect the results of certain time functions.
- tx_isolation
Used for MySQL client compatibility. No practical usage.
- use_v2_rollup
Used to control the query to fetch data using the rollup index of the segment v2 storage format. This variable is used for validation when going online with segment v2. It is not recommended for other cases.
- vectorized_engine_enable
Used to control whether the vectorized engine is used to execute queries. A value of true
indicates that the vectorized engine is used, otherwise the non-vectorized engine is used. The default is true
.
- version
Used for MySQL client compatibility. No practical usage.
- version_comment
Used to display the version of StarRocks. Cannot be changed.
- wait_timeout
Used to set the connection timeout for idle connections. When an idle connection does not interact with StarRocks for that length of time, StarRocks will actively disconnect the link. The default value is 8 hours, in seconds.