Deploy StarRocks
This QuickStart tutorial guides you through the procedures to deploy a simple StarRocks cluster. Before getting started, you can read StarRocks Architecture for more conceptual details.
Following these steps, you can deploy a StarRocks instance with only one frontend (FE) node and one backend (BE) node on your local machine. This instance helps you complete the upcoming QuickStart tutorials on creating a table and loading and querying data, thereby acquainting you with the basic operations of StarRocks.
CAUTION
- To guarantee high availability and performance in a production environment, we recommend that you deploy at least three FE nodes and three BE nodes in your StarRocks cluster.
- You can deploy an FE node and a BE node on one machine. However, deploying multiple nodes of the same kind on one machine is not allowed, because nodes of the same kind cannot share the same IP address.
Prerequisites
Before deploying StarRocks, make sure the following requirements are satisfied.
- Hardware: You can follow these steps on relatively modest hardware, such as a machine with 4 CPU cores and 16 GB of RAM. The CPU MUST support the AVX2 instruction set.
NOTE
You can run cat /proc/cpuinfo | grep avx2 in your terminal to check whether the CPU supports the AVX2 instruction set.
- Software: Your machine MUST be running an OS with Linux kernel 3.10 or later. In addition, you must have JDK 1.8 or later and a MySQL client 5.5 or later installed on your machine. Note that JRE is not supported.
- Environment variable: StarRocks relies on the system environment variable JAVA_HOME to locate the Java dependency on the machine. Set this environment variable to the directory in which Java is installed, for example, /opt/jdk1.8.0_301.
NOTE
You can run echo $JAVA_HOME in your terminal to check whether you have set JAVA_HOME. If you have not set it, see How to Set and List Environment Variables in Linux for detailed instructions.
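If JAVA_HOME is not set yet, you can set it for the current shell session before deploying StarRocks. A minimal sketch, assuming Java is installed under the example path /opt/jdk1.8.0_301 used above:
# point StarRocks at the JDK and put its binaries on the PATH
export JAVA_HOME=/opt/jdk1.8.0_301
export PATH=$JAVA_HOME/bin:$PATH
To make the setting permanent, append the same two lines to your shell profile, for example ~/.bashrc.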
Step 1: Download and install StarRocks
After all the prerequisites are met, you can download the StarRocks software package to install it on your machine.
Launch a terminal, navigate to a local directory to which you have both read and write access, and run the following command to create a dedicated directory for StarRocks deployment.
mkdir -p HelloStarRocks
NOTE
You can remove the instance cleanly at any time by deleting the directory.
Download the StarRocks software package to this directory.
cd HelloStarRocks
wget https://releases.starrocks.io/starrocks/StarRocks-2.3.12.tar.gz
Extract the files in the software package to install StarRocks on your machine.
tar -xzvf StarRocks-2.3.12.tar.gz --strip-components 1
The software package includes the working directories of the FE (fe), BE (be), Broker (apache_hdfs_broker), and user-defined functions (udf), as well as the LICENSE and NOTICE files.
Step 2: Start the FE node
Having installed StarRocks, you need to start the FE node. The FE is the front layer of StarRocks. It manages system metadata, client connections, query planning, and query scheduling.
Create a directory for FE metadata storage under the working directory of FE.
mkdir -p fe/meta
NOTE
It is recommended to create a separate directory for FE metadata storage in a production environment, because the operation of a StarRocks cluster depends heavily on its metadata.
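If you keep FE metadata on a dedicated volume in production, you also need to point the FE at that location. A minimal sketch, assuming the meta_dir item in fe/conf/fe.conf (check the FE configuration reference for your StarRocks version) and a hypothetical path /data/starrocks/fe/meta:
# create the metadata directory on the dedicated volume (hypothetical path)
mkdir -p /data/starrocks/fe/meta
# tell the FE where its metadata lives (assumes the meta_dir configuration item)
echo "meta_dir = /data/starrocks/fe/meta" >> fe/conf/fe.conf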
Check the IP addresses of the machine.
ifconfig
If your machine has multiple IP addresses, for example, eth0 and eth1, you must specify a dedicated IP address for the FE service when configuring the property priority_networks of the FE node in the following sub-step.
Check the connection status of the following ports.
- FE HTTP server port (http_port, Default: 8030)
- FE thrift server port (rpc_port, Default: 9020)
- FE MySQL server port (query_port, Default: 9030)
- FE internal communication port (edit_log_port, Default: 9010)
netstat -tunlp | grep 8030
netstat -tunlp | grep 9020
netstat -tunlp | grep 9030
netstat -tunlp | grep 9010
If any of the above ports are occupied, you must find an alternative and specify it when configuring the corresponding ports of the FE node in the following sub-step.
Modify the FE configuration file fe/conf/fe.conf.
If your machine has multiple IP addresses, you must add the configuration item priority_networks to the FE configuration file and assign a dedicated IP address to the FE node.
priority_networks = x.x.x.x
If any of the above FE ports are occupied, you must assign a valid alternative in the FE configuration file.
http_port = aaaa
rpc_port = bbbb
query_port = cccc
edit_log_port = dddd
If you want to set a different JAVA_HOME for StarRocks when you have multiple Java installations on your machine, you must specify it in the FE configuration file.
JAVA_HOME = /path/to/your/java
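For reference, a minimal sketch of appending such settings from the command line, with placeholder values (add only the items you actually need to change for your machine):
cat >> fe/conf/fe.conf <<'EOF'
# bind the FE to one address when the machine has several (placeholder value)
priority_networks = 192.168.1.10
# point StarRocks at a specific JDK when several are installed (placeholder path)
JAVA_HOME = /opt/jdk1.8.0_301
EOF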
Start the FE node.
./fe/bin/start_fe.sh --daemon
Verify whether the FE node has started successfully.
cat fe/log/fe.log | grep thrift
A log record like "2022-08-10 16:12:29,911 INFO (UNKNOWN x.x.x.x_9010_1660119137253(-1)|1) [FeServer.start():52] thrift server started with port 9020." indicates that the FE node has started properly.
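If the log record has not appeared yet, you can also confirm that the FE is listening on its ports (default values shown; substitute any alternatives you configured):
# the FE MySQL server port (query_port) and HTTP server port (http_port) should both be listening
netstat -tunlp | grep 9030
netstat -tunlp | grep 8030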
Step 3: Start the BE node
After the FE node is started, you need to start the BE node. The BE is the execution layer of StarRocks. It stores data and executes queries.
Create a directory for data storage under the working directory of BE.
mkdir -p be/storage
NOTE
It is recommended to create a separate directory for BE data storage in a production environment to ensure data safety.
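If you keep BE data on a dedicated volume in production, you also need to point the BE at that location. A minimal sketch, assuming the storage_root_path item in be/conf/be.conf (check the BE configuration reference for your StarRocks version) and a hypothetical path /data/starrocks/be/storage:
# create the data directory on the dedicated volume (hypothetical path)
mkdir -p /data/starrocks/be/storage
# tell the BE where to store its data (assumes the storage_root_path configuration item)
echo "storage_root_path = /data/starrocks/be/storage" >> be/conf/be.conf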
Check the connection status of the following ports.
- BE thrift server port (be_port, Default: 9060)
- BE HTTP server port (webserver_port, Default: 8040)
- heartbeat service port (heartbeat_service_port, Default: 9050)
- BE BRPC port (brpc_port, Default: 8060)
netstat -tunlp | grep 9060
netstat -tunlp | grep 8040
netstat -tunlp | grep 9050
netstat -tunlp | grep 8060
If any of the above ports are occupied, you must find an alternative and specify it when configuring the BE node in the following sub-step.
Modify the BE configuration file be/conf/be.conf.
If your machine has multiple IP addresses, you must add the configuration item priority_networks to the BE configuration file and assign a dedicated IP address to the BE node.
NOTE
The BE node can have the same IP address as the FE node if they are installed on the same machine.
priority_networks = x.x.x.x
If any of the above BE ports are occupied, you must assign a valid alternative in the BE configuration file.
be_port = vvvv
webserver_port = xxxx
heartbeat_service_port = yyyy
brpc_port = zzzz
If you want to set a different JAVA_HOME for StarRocks when you have multiple Java installations on your machine, you must specify it in the BE configuration file.
JAVA_HOME = /path/to/your/java
Start the BE node.
./be/bin/start_be.sh --daemon
Verify whether the BE node has started successfully.
cat be/log/be.INFO | grep heartbeat
A log record like "I0810 16:18:44.487284 3310141 task_worker_pool.cpp:1387] Waiting to receive first heartbeat from frontend" indicates that the BE node has started properly.
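As with the FE, you can also confirm that the BE is listening on its ports (default values shown; substitute any alternatives you configured):
# the BE thrift server port (be_port) and HTTP server port (webserver_port) should both be listening
netstat -tunlp | grep 9060
netstat -tunlp | grep 8040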
Step 4: Set up a StarRocks cluster
After the FE node and BE node are started properly, you can set up the StarRocks cluster.
Log in to StarRocks via your MySQL client. You can log in with the default user root, and the password is empty by default.
mysql -h <fe_ip> -P<fe_query_port> -uroot
NOTE
- Change the -P value accordingly if you have assigned a different FE MySQL server port (query_port, Default: 9030).
- Change the -h value accordingly if you have specified the configuration item priority_networks in the FE configuration file.
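For the single-machine deployment in this tutorial with all defaults kept, the command typically looks like the following. The address 127.0.0.1 is an assumption that works when the client runs on the same machine as the FE; if you set priority_networks, use that address instead.
mysql -h 127.0.0.1 -P 9030 -uroot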
Check the status of the FE node by running the following SQL in the MySQL client.
SHOW PROC '/frontends'\G
Example:
MySQL [(none)]> SHOW PROC '/frontends'\G
*************************** 1. row ***************************
Name: x.x.x.x_9010_1660119137253
IP: x.x.x.x
EditLogPort: 9010
HttpPort: 8030
QueryPort: 9030
RpcPort: 9020
Role: FOLLOWER
IsMaster: true
ClusterId: 58958864
Join: true
Alive: true
ReplayedJournalId: 30602
LastHeartbeat: 2022-08-11 20:34:26
IsHelper: true
ErrMsg:
StartTime: 2022-08-10 16:12:29
Version: 2.3.0-a9bdb09
1 row in set (0.01 sec)
- If the field Alive is true, this FE node is properly started and added to the cluster.
- If the field Role is FOLLOWER, this FE node is eligible to be elected as the Leader node.
- If the field Role is LEADER, this FE node is the Leader node.
Add the BE node to the cluster.
ALTER SYSTEM ADD BACKEND "<be_ip>:<heartbeat_service_port>";
NOTE
- If your machine has multiple IP addresses, be_ip must match the priority_networks you have specified in the BE configuration file.
- heartbeat_service_port must match the BE heartbeat service port (heartbeat_service_port, Default: 9050) you have specified in the BE configuration file.
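For example, if the BE runs on the same machine as the FE and keeps the default heartbeat port, the statement might look like the following (127.0.0.1 and 9050 are example values; use the IP address and port that match your BE configuration):
ALTER SYSTEM ADD BACKEND "127.0.0.1:9050";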
Check the status of the BE node.
SHOW PROC '/backends'\G
Example:
MySQL [(none)]> SHOW PROC '/backends'\G
*************************** 1. row ***************************
BackendId: 10036
Cluster: default_cluster
IP: x.x.x.x
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2022-08-10 17:39:01
LastHeartbeat: 2022-08-11 20:34:31
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 0
DataUsedCapacity: .000
AvailCapacity: 1.000 B
TotalCapacity: .000
UsedPct: 0.00 %
MaxDiskUsedPct: 0.00 %
ErrMsg:
Version: 2.3.0-a9bdb09
Status: {"lastSuccessReportTabletsTime":"N/A"}
DataTotalCapacity: .000
DataUsedPct: 0.00 %
CpuCores: 16
If the field Alive is true, this BE node is properly started and added to the cluster.
Stop the StarRocks cluster
You can stop the StarRocks cluster by running the following commands.
Stop the FE node.
./fe/bin/stop_fe.sh --daemon
Stop the BE node.
./be/bin/stop_be.sh --daemon
Troubleshooting
Try the following steps to identify the errors that occur when you start the FE or BE node:
- If the FE node is not started properly, you can identify the problem by checking the log in fe/log/fe.warn.log.
cat fe/log/fe.warn.log
Having identified and resolved the problem, you must first terminate the existing FE process, delete the FE meta directory, create a new metadata storage directory, and then restart the FE node with the correct configuration, as shown in the sketch after these steps.
- If the BE node is not started properly, you can identify the problem by checking the log in be/log/be.WARNING.
cat be/log/be.WARNING
Having identified and resolved the problem, you must first terminate the existing BE process, delete the BE storage directory, create a new data storage directory, and then restart the BE node with the correct configuration.
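A minimal sketch of the reset sequences described above, reusing the scripts and directories from this tutorial (adjust the paths and fix the configuration before restarting if you changed them):
# reset and restart the FE
./fe/bin/stop_fe.sh --daemon
rm -rf fe/meta
mkdir -p fe/meta
./fe/bin/start_fe.sh --daemon
# reset and restart the BE
./be/bin/stop_be.sh --daemon
rm -rf be/storage
mkdir -p be/storage
./be/bin/start_be.sh --daemon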
What to do next
Having deployed StarRocks, you can continue the QuickStart tutorials on creating a table and loading and querying data.
You can also: