- StarRocks
- Introduction to StarRocks
- Quick Start
- Deployment
- Deployment overview
- Prepare
- Deploy
- Deploy classic StarRocks
- Deploy and use shared-data StarRocks
- Manage
- Table Design
- Data Loading
- Concepts
- Overview of data loading
- Load data from a local file system or a streaming data source using HTTP PUT
- Load data from HDFS or cloud storage
- Continuously load data from Apache Kafka®
- Bulk load using Apache Sparkâ„¢
- Load data using INSERT
- Load data using Stream Load transaction interface
- Realtime synchronization from MySQL
- Continuously load data from Apache Flink®
- Change data through loading
- Transform data at loading
- Data Unloading
- Query Data Sources
- Query Acceleration
- Gather CBO statistics
- Synchronous materialized view
- Asynchronous materialized view
- Colocate Join
- Lateral Join
- Query Cache
- Index
- Computing the Number of Distinct Values
- Sorted streaming aggregate
- Administration
- Management
- Data recovery
- User Privilege and Authentication
- Performance Tuning
- Reference
- SQL Reference
- User Account Management
- Cluster Management
- ADD SQLBLACKLIST
- ADMIN CANCEL REPAIR TABLE
- ADMIN CHECK TABLET
- ADMIN REPAIR TABLE
- ADMIN SET CONFIG
- ADMIN SET REPLICA STATUS
- ADMIN SHOW CONFIG
- ADMIN SHOW REPLICA DISTRIBUTION
- ADMIN SHOW REPLICA STATUS
- ALTER RESOURCE GROUP
- ALTER SYSTEM
- CANCEL DECOMMISSION
- CREATE FILE
- CREATE RESOURCE GROUP
- DELETE SQLBLACKLIST
- DROP FILE
- DROP RESOURCE GROUP
- EXPLAIN
- INSTALL PLUGIN
- KILL
- SET
- SHOW BACKENDS
- SHOW BROKER
- SHOW COMPUTE NODES
- SHOW FILE
- SHOW FRONTENDS
- SHOW FULL COLUMNS
- SHOW INDEX
- SHOW PLUGINS
- SHOW PROC
- SHOW PROCESSLIST
- SHOW RESOURCE GROUP
- SHOW SQLBLACKLIST
- SHOW TABLE STATUS
- SHOW VARIABLES
- UNINSTALL PLUGIN
- DDL
- ALTER DATABASE
- ALTER MATERIALIZED VIEW
- ALTER TABLE
- ALTER VIEW
- ALTER RESOURCE
- ANALYZE TABLE
- BACKUP
- CANCEL ALTER TABLE
- CANCEL BACKUP
- CANCEL RESTORE
- CREATE ANALYZE
- CREATE DATABASE
- CREATE EXTERNAL CATALOG
- CREATE INDEX
- CREATE MATERIALIZED VIEW
- CREATE REPOSITORY
- CREATE RESOURCE
- CREATE TABLE AS SELECT
- CREATE TABLE LIKE
- CREATE TABLE
- CREATE VIEW
- CREATE FUNCTION
- DROP ANALYZE
- DROP STATS
- DROP CATALOG
- DROP DATABASE
- DROP INDEX
- DROP MATERIALIZED VIEW
- DROP REPOSITORY
- DROP RESOURCE
- DROP TABLE
- DROP VIEW
- DROP FUNCTION
- HLL
- KILL ANALYZE
- RECOVER
- REFRESH EXTERNAL TABLE
- RESTORE
- SET CATALOG
- SHOW ANALYZE JOB
- SHOW ANALYZE STATUS
- SHOW META
- SHOW RESOURCES
- SHOW FUNCTION
- TRUNCATE TABLE
- USE
- DML
- ALTER LOAD
- ALTER ROUTINE LOAD
- BROKER LOAD
- CANCEL LOAD
- CANCEL EXPORT
- CANCEL REFRESH MATERIALIZED VIEW
- CREATE ROUTINE LOAD
- DELETE
- EXPORT
- GROUP BY
- INSERT
- PAUSE ROUTINE LOAD
- REFRESH MATERIALIZED VIEW
- RESUME ROUTINE LOAD
- SELECT
- SHOW ALTER TABLE
- SHOW ALTER MATERIALIZED VIEW
- SHOW BACKUP
- SHOW CATALOGS
- SHOW CREATE CATALOG
- SHOW CREATE MATERIALIZED VIEW
- SHOW CREATE TABLE
- SHOW CREATE VIEW
- SHOW DATA
- SHOW DATABASES
- SHOW DELETE
- SHOW DYNAMIC PARTITION TABLES
- SHOW EXPORT
- SHOW LOAD
- SHOW MATERIALIZED VIEWS
- SHOW PARTITIONS
- SHOW PROPERTY
- SHOW REPOSITORIES
- SHOW RESTORE
- SHOW ROUTINE LOAD
- SHOW ROUTINE LOAD TASK
- SHOW SNAPSHOT
- SHOW TABLES
- SHOW TABLET
- SHOW TRANSACTION
- SPARK LOAD
- STOP ROUTINE LOAD
- STREAM LOAD
- SUBMIT TASK
- UPDATE
- Auxiliary Commands
- Data Types
- Keywords
- AUTO_INCREMENT
- Function Reference
- Java UDFs
- Window functions
- Lambda expression
- Aggregate Functions
- array_agg
- avg
- any_value
- approx_count_distinct
- bitmap
- bitmap_agg
- count
- grouping
- grouping_id
- hll_empty
- hll_hash
- hll_raw_agg
- hll_union
- hll_union_agg
- max
- max_by
- min
- multi_distinct_sum
- multi_distinct_count
- percentile_approx
- percentile_cont
- percentile_disc
- retention
- stddev
- stddev_samp
- sum
- variance, variance_pop, var_pop
- var_samp
- window_funnel
- Array Functions
- array_agg
- array_append
- array_avg
- array_concat
- array_contains
- array_contains_all
- array_cum_sum
- array_difference
- array_distinct
- array_filter
- array_intersect
- array_join
- array_length
- array_map
- array_max
- array_min
- array_position
- array_remove
- array_slice
- array_sort
- array_sortby
- array_sum
- arrays_overlap
- array_to_bitmap
- cardinality
- element_at
- reverse
- unnest
- Bit Functions
- Bitmap Functions
- base64_to_bitmap
- bitmap_agg
- bitmap_and
- bitmap_andnot
- bitmap_contains
- bitmap_count
- bitmap_from_string
- bitmap_empty
- bitmap_has_any
- bitmap_hash
- bitmap_intersect
- bitmap_max
- bitmap_min
- bitmap_or
- bitmap_remove
- bitmap_to_array
- bitmap_to_base64
- bitmap_to_string
- bitmap_union
- bitmap_union_count
- bitmap_union_int
- bitmap_xor
- intersect_count
- sub_bitmap
- to_bitmap
- JSON Functions
- Overview of JSON functions and operators
- JSON operators
- JSON constructor functions
- JSON query and processing functions
- Map Functions
- Binary Functions
- Conditional Functions
- Cryptographic Functions
- Date Functions
- add_months
- adddate
- convert_tz
- current_date
- current_time
- current_timestamp
- date
- date_add
- date_format
- date_slice
- date_sub, subdate
- date_trunc
- datediff
- day
- dayname
- dayofmonth
- dayofweek
- dayofyear
- days_add
- days_diff
- days_sub
- from_days
- from_unixtime
- hour
- hours_add
- hours_diff
- hours_sub
- microseconds_add
- microseconds_sub
- minute
- minutes_add
- minutes_diff
- minutes_sub
- month
- monthname
- months_add
- months_diff
- months_sub
- now
- quarter
- second
- seconds_add
- seconds_diff
- seconds_sub
- str_to_date
- str2date
- time_slice
- time_to_sec
- timediff
- timestamp
- timestampadd
- timestampdiff
- to_date
- to_days
- unix_timestamp
- utc_timestamp
- week
- week_iso
- weekofyear
- weeks_add
- weeks_diff
- weeks_sub
- year
- years_add
- years_diff
- years_sub
- Geographic Functions
- Math Functions
- String Functions
- append_trailing_char_if_absent
- ascii
- char
- char_length
- character_length
- concat
- concat_ws
- ends_with
- find_in_set
- group_concat
- hex
- hex_decode_binary
- hex_decode_string
- instr
- lcase
- left
- length
- locate
- lower
- lpad
- ltrim
- money_format
- null_or_empty
- parse_url
- repeat
- replace
- reverse
- right
- rpad
- rtrim
- space
- split
- split_part
- starts_with
- strleft
- strright
- substring
- trim
- ucase
- unhex
- upper
- Pattern Matching Functions
- Percentile Functions
- Scalar Functions
- Utility Functions
- cast function
- hash function
- System variables
- User-defined variables
- Error code
- System limits
- SQL Reference
- FAQ
- Benchmark
- Developers
- Contribute to StarRocks
- Code Style Guides
- Use the debuginfo file for debugging
- Development Environment
- Trace Tools
- Integration
Other FAQs
This topic provides answers to some general questions.
Do VARCHAR (32) and STRING occupy the same storage space?
Both are variable-length data types. When you store data of the same length, VARCHAR (32) and STRING occupy the same storage space.
Do VARCHAR (32) and STRING perform the same for the data query?
Yes.
Why do TXT files imported from Oracle still appear garbled after I set the character set to UTF-8?
To solve this problem, perform the following steps:
For example, there is a file named original, whose text is garbled. The character set of this file is ISO-8859-1. Run the following code to obtain the character set of the file.
file --mime-encoding origin.txt origin.txt: iso-8859-1
Run the
iconv
command to convert the character set of this file into UTF-8.iconv -f iso-8859-1 -t utf-8 origin.txt > origin_utf-8.txt
After the conversion, the text of this file still appears garbled. You can then regrade the character set of this file as GBK and convert the character set into UTF-8 again.
iconv -f gbk -t utf-8 origin.txt > origin_utf-8.txt
Is the length of STRING defined by MySQL the same as that defined by StarRocks?
For VARCHAR(n), StarRocks defines "n" by bytes and MySQL defines "n" by characters. According to UTF-8, one Chinese character is equal to three bytes. When StarRocks and MySQL define "n" as the same number, MySQL saves three times as many characters as StarRocks.
Can the data type of partitioned fields of a table be FLOAT, DOUBLE, or DECIMAL?
No, only DATE, DATETIME, and INT are supported.
How to check the storage space that is occupied by the data in a table?
Execute the SHOW DATA statement to see the corresponding storage space. You can also see the data volume, the number of copies, and the number of rows.
Note: There is a time delay in data statistics.
How to request a quota increase for the StarRocks database?
To request a quota increase, run the following code:
ALTER DATABASE example_db SET DATA QUOTA 10T;
Does StarRocks support updating particular fields in a table by executing the UPSERT statement?
StarRocks 2.2 and later support updating specific fields in a table by using the Primary Key table. StarRocks 1.9 and later support updating all fields in a table by using the Primary Key table. For more information, see Primary Key table in StarRocks 2.2.
How to swap the data between two tables or two partitions?
Execute the SWAP WITH statement to swap the data between two tables or two partitions. The SWAP WITH statement is more secure than the INSERT OVERWRITE statement. Before you swap the data, check the data first and then see whether the data after the swapping is consistent with the data before the swapping.
Swap two tables: For example, there is a table named table 1. If you want to replace table 1 with another one, perform the following steps:
Create a new table named table 2.
create table2 like table1;
Use Stream Load, Broker Load, or Insert Into to load data from table 1 into table 2.
Replace table 1 with table 2.
ALTER TABLE table1 SWAP WITH table2;
By doing so, the data is loaded accurately into table 1.
Swap two partitions: For example, there is a table named table 1. If you want to replace the partition data in table 1, perform the following steps:
Create a temporary partition.
ALTER TABLE table1 ADD TEMPORARY PARTITION tp1 VALUES LESS THAN("2020-02-01");
Load the partition data from table 1 into the temporary partition.
Replace the partition of table 1 with the temporary partition.
ALTER TABLE table1 REPLACE PARTITION (p1) WITH TEMPORARY PARTITION (tp1);
This error "error to open replicated environment, will exit" occurs when I restart a frontend (FE)
This error occurs due to BDBJE's bug. To solve this problem, update the BDBJE version to 1.17 or later.
This error "Broker list path exception" occurs when I query data from a new Apache Hive table
Problem description
msg:Broker list path exception
path=hdfs://172.31.3.136:9000/user/hive/warehouse/zltest.db/student_info/*, broker=TNetworkAddress(hostname:172.31.4.233, port:8000)
Solution
Contact the StarRocks technical support and check whether the address and port of the namenode are correct and whether you have permission to access the address and port of the namenode.
This error "get hive partition metadata failed" occurs when I query data from a new Apache Hive table
Problem description
msg:get hive partition meta data failed: java.net.UnknownHostException: emr-header-1.cluster-242
Solution
Ensure that the network is connected and upload the host file to each backend (BE) in your StarRocks cluster.
This error "do_open failed. reason = Invalid ORC postscript length" occurs when I access ORC external table in Apache Hive
Problem description
The metadata of the Apache Hive is cached in the FEs. But there is a two-hours time lag for StarRocks to update the metadata. Before StarRocks finishes the update, If you insert new data or update data in the Apache Hive table, the data in HDFS scanned by the BEs and the data obtained by the FEs are different. Therefore, this error occurs.
MySQL [bdp_dim]> select * from dim_page_func_s limit 1;
ERROR 1064 (HY000): HdfsOrcScanner::do_open failed. reason = Invalid ORC postscript length
Solution
To solve this problem, perform one of the following operations:
- Upgrade your current version to StarRocks 2.2 or later.
- Manually refresh your Apache Hive table. For more information, see Metadata caching strategy.
This error "caching_sha2_password cannot be loaded" occurs when I connect external tables of MySQL
Problem description
The default authentication plugin of MySQL 8.0 is caching_sha2_password. The default authentication plugin of MySQL 5.7 is mysql_native_password. This error occurs because you use the wrong authentication plugin.
Solution
To solve this problem, perform one of the following operations:
- Connect to the StarRocks.
ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'yourpassword';
- Modify the
my.cnf
file.
vim my.cnf
[mysqld]
default_authentication_plugin=mysql_native_password
How to release disk space immediately after deleting a table?
If you execute the DROP TABLE statement to delete a table, StarRocks takes a while to release the allocated disk space. To release the allocated disk space immediately, execute the DROP TABLE FORCE statement to delete a table. When you execute the DROP TABLE FORCE statement, the StarRocks deletes the table directly without checking whether there are unfinished events in it. We recommend that you execute the DROP TABLE FORCE statement with caution. Because once the table is deleted, you cannot restore it.
How to view the current version of StarRocks?
Run the select current_version();
command or the CLI command ./bin/show_fe_version.sh
to view the current version.
How to set the memory size of an FE?
FEs are used to store metadata. You can set the memory size of an FE based on the number of tablets. In StarRocks, you can set the memory size of an FE to 20 GB at most. 10 million tablets occupy about 20 GB of an FE memory.
How does StarRocks calculate its query time?
StarRocks supports querying data by using multiple threads. Query time refers to the time used by multiple threads to query data.
Does StarRocks support setting the path when I export data locally?
No.
What are the concurrency upper limits of StarRocks?
You can test the concurrency limitations based on the actual business scenarios or simulated business scenarios. According to the feedback of some users, maximum of 20,000 QPS or 30,000 QPS can be achieved.
Why is the first-time SSB test performance of StarRocks slower than that done the second time?
The speed to read disks for the first query relates to the performance of disks. After the first query, the page cache is generated for the subsequent queries, so the query is faster than before.
How many BEs need to be configured at least for a cluster?
StarRocks supports single node deployment, so you need to configure at least one BE. BEs need to be run with AVX2, so we recommend that you deploy BEs on machines with 8-core and 16GB or higher configurations.
How to set data permissions when I use Apache Superset to visualize the data in StarRocks?
You can create a new user account and then set the data permission by granting permissions on the table query to the user.
Why does the profile fail to display after I set enable_profile
to true
?
The report is only submitted to the leader FE for access.
How to check field annotations in the tables of StarRocks?
Run the show create table xxx
command.
When I create a table, how to specify the default value for the NOW() function?
Only StarRocks 2.1 or later version supports specifying the default value for a function. For versions earlier than StarRocks 2.1, you can only specify a constant for a function.
How can I release the storage space of BE nodes?
You can remove the directory trash
using rm -rf
command. If you have already restored your data from snapshot, you can remove the directory snapshot
.
Can add extra disks to BE nodes?
Yes. You can add the disks to the directory specified by the BE configuration item storage_root_path
.
- Other FAQs
- Do VARCHAR (32) and STRING occupy the same storage space?
- Do VARCHAR (32) and STRING perform the same for the data query?
- Why do TXT files imported from Oracle still appear garbled after I set the character set to UTF-8?
- Is the length of STRING defined by MySQL the same as that defined by StarRocks?
- Can the data type of partitioned fields of a table be FLOAT, DOUBLE, or DECIMAL?
- How to check the storage space that is occupied by the data in a table?
- How to request a quota increase for the StarRocks database?
- Does StarRocks support updating particular fields in a table by executing the UPSERT statement?
- How to swap the data between two tables or two partitions?
- This error "error to open replicated environment, will exit" occurs when I restart a frontend (FE)
- This error "Broker list path exception" occurs when I query data from a new Apache Hive table
- This error "get hive partition metadata failed" occurs when I query data from a new Apache Hive table
- This error "do_open failed. reason = Invalid ORC postscript length" occurs when I access ORC external table in Apache Hive
- This error "caching_sha2_password cannot be loaded" occurs when I connect external tables of MySQL
- How to release disk space immediately after deleting a table?
- How to view the current version of StarRocks?
- How to set the memory size of an FE?
- How does StarRocks calculate its query time?
- Does StarRocks support setting the path when I export data locally?
- What are the concurrency upper limits of StarRocks?
- Why is the first-time SSB test performance of StarRocks slower than that done the second time?
- How many BEs need to be configured at least for a cluster?
- How to set data permissions when I use Apache Superset to visualize the data in StarRocks?
- Why does the profile fail to display after I set enable_profile to true?
- How to check field annotations in the tables of StarRocks?
- When I create a table, how to specify the default value for the NOW() function?
- How can I release the storage space of BE nodes?
- Can add extra disks to BE nodes?