Backup and Recovery
StarRocks supports backing up the current data as files to a remote storage system via a broker (a broker is an optional process in a StarRocks cluster, mainly used to help StarRocks read and write files and directories on remote storage; see the Broker Load documentation). The backed-up data can be restored from the remote storage system to any StarRocks cluster with the restore command. This feature supports periodic snapshot backups of data and also allows migrating data between clusters.
To use this feature, you need to deploy a broker that corresponds to the remote storage system, such as HDFS. You can check the broker information with SHOW BROKER.
Principle Description
Backup
A backup uploads the data of a specified table or partition, in the form of the files stored by StarRocks, to a remote repository. When a user submits a Backup request, the system executes the following operations internally:
- Snapshot and Snapshot Upload
- First, the system takes a snapshot of the data in the specified table or partition, and the backup is performed on that snapshot. After the snapshot is taken, changes to the table no longer affect the backup result. A snapshot only creates hard links to the current data files, so it completes quickly. Once the snapshot is complete, the snapshot files are uploaded one by one; the upload is done concurrently by each Backend.
- Metadata Preparation and Upload
- After the snapshot upload is completed, the Frontend writes the corresponding metadata to a local file and then uploads this file to the remote repository through the broker.
Restore
The restore operation requires specifying an existing backup in the remote repository and restores the content of that backup to the local cluster. When a user submits a Restore request, the system executes the following operations internally:
Create the corresponding metadata locally
- This step first creates the structures that correspond to the restore request, such as tables and partitions, in the local cluster. Once created, these tables are visible but not yet accessible.
Local snapshot
- This step takes a snapshot of the tables created in the previous step. The snapshot is empty because the newly created tables contain no data. Its purpose is to generate the corresponding snapshot directories on the Backends, which later receive the snapshot files downloaded from the remote repository.
Download the snapshot
- The snapshot files in the remote repository are downloaded into the snapshot directories generated in the previous step. This step is done concurrently by each Backend.
Ready to use
- After the snapshots are downloaded, each snapshot is mapped to the metadata of the current local table. The snapshots are then reloaded to become effective, which completes the recovery operation.
Overall Process
First, create a cloud repository with the same REPOSITORY name for both the old and the new cluster (the BROKER name can be checked in each cluster). Second, in the old cluster, prepare the tables that need to be migrated and backed up. Third, back up those tables to the cloud repository, and then restore them from the cloud repository to the new cluster. The tables to be backed up and restored do not need to be created in advance in the new cluster, because they are created automatically when the Restore operation is performed.
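The following is a minimal sketch of this flow, assuming a repository named hdfs_repo has already been created with the same name and path on both clusters (see CREATE REPOSITORY under Related Commands); example_db, example_tbl, and the label and timestamp values are placeholders:

```sql
-- On the old cluster: back up the table to the shared repository.
BACKUP SNAPSHOT example_db.snapshot_label_1
TO `hdfs_repo`
ON (example_tbl)
PROPERTIES ("type" = "full");

-- On the old cluster: poll until the job State becomes FINISHED.
SHOW BACKUP FROM example_db;

-- On the new cluster: confirm the snapshot is visible in the repository
-- and note its Timestamp value.
SHOW SNAPSHOT ON `hdfs_repo`;

-- On the new cluster: restore the table from the snapshot.
-- The backup_timestamp below is a placeholder; use the value reported by SHOW SNAPSHOT.
RESTORE SNAPSHOT example_db.snapshot_label_1
FROM `hdfs_repo`
ON (example_tbl)
PROPERTIES ("backup_timestamp" = "2022-01-01-12-00-00");
```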
Instructions
Data Backup
Currently StarRocks supports full backups at the partition level. If you need to perform regular backups, plan the partitioning and bucketing wisely when creating tables. Later you can perform data backups regularly for each partition.
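For example, a range-partitioned table can be backed up one partition at a time. A minimal sketch, in which the repository, database, table, and partition names are placeholders:

```sql
-- Back up a single partition of a table to the remote repository.
-- hdfs_repo, example_db, example_tbl, and p20220101 are placeholder names.
BACKUP SNAPSHOT example_db.backup_example_tbl_p20220101
TO `hdfs_repo`
ON (example_tbl PARTITION (p20220101))
PROPERTIES ("type" = "full", "timeout" = "3600");
```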
Data Migration
To complete data migration, users can back up data to a remote repository and then restore the data from that repository to another cluster. Because the backup is taken from a snapshot, any data imported after the snapshot phase is not included in the backup. Therefore, data imported into the original cluster between the snapshot and the completion of the migration needs to be imported into the new cluster as well. It is recommended to import data into both clusters in parallel for a period of time after the migration is complete, verify the data and the business, and then move the business to the new cluster.
Notes
- Only ADMIN users can perform backup and recovery operations.
- Only one active backup or recovery job can run in a Database.
- Both backup and recovery can be performed at the partition level. For large volume data, it is recommended to perform the operations separately by partition to reduce retries after failures.
- Backup and recovery operations are performed on the actual data files. When a table has many tablets, or a tablet has many small file versions, it takes a long time to back up or restore the table even if the total data volume is small. Users can estimate the job execution time by viewing the number of tablets in each partition and the number of file versions of each tablet with SHOW PARTITIONS FROM table_name; and SHOW TABLET FROM table_name; respectively (see the sketch after this list). The number of files has a great impact on the job execution time, so it is recommended to plan the partitioning and bucketing wisely.
- When checking the job status with SHOW BACKUP or SHOW RESTORE, you may see an error message in the TaskErrMsg column. However, as long as the State column is not CANCELLED, the job is still running and some tasks may be retried successfully. Some task errors, however, cause the job to fail directly.
- If the recovery job is an overwrite operation, i.e., it restores data to an existing table or partition, then starting from the COMMIT phase of the job, the overwritten data on the current cluster may no longer be recoverable. If the recovery job fails or is canceled at this point, the previous data may be corrupted and inaccessible, and the only way to recover it is to perform the recovery operation again and wait for the job to complete. Therefore, do not use overwrite to recover data unless you have confirmed that the current data is no longer in use. The overwrite operation checks whether the metadata of the snapshot is the same as that of the existing table or partition; if they differ, the recovery operation cannot be executed.
- The snapshot data currently cannot be deleted directly through StarRocks. Users can manually delete the backed-up snapshot paths in the remote storage system.
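As a rough sketch (with placeholder names), the following statements can be used to inspect partition, tablet, and version counts before starting a job and to monitor the job afterwards:

```sql
-- Inspect the partitions of the table to be backed up or restored.
SHOW PARTITIONS FROM example_db.example_tbl;

-- Inspect the tablets of the table; the number of tablets multiplied by the
-- number of versions roughly determines how many files the job must handle.
SHOW TABLET FROM example_db.example_tbl;

-- Monitor running jobs; a job is still running as long as State is not CANCELLED.
SHOW BACKUP FROM example_db;
SHOW RESTORE FROM example_db;
```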
Related Commands
The following commands are related to the backup and recovery operations.
CREATE REPOSITORY
- Create a remote repository path for backup or recovery. This command accesses the remote storage system via a broker. Different brokers require different parameters; see the [Broker documentation](../loading/BrokerLoad.md). The backup cluster and the recovery cluster need to create the repository with the same repository path and repository name, which allows the recovery cluster to read the backup snapshots written by the backup cluster.
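A minimal sketch for an HDFS broker is shown below; the broker name, location, and credentials are placeholders, and the required PROPERTIES depend on the broker and storage system you use. Run the same statement on both the backup cluster and the recovery cluster.

```sql
CREATE REPOSITORY `hdfs_repo`
WITH BROKER `hdfs_broker`
ON LOCATION "hdfs://hadoop-namenode:9000/starrocks/backups/"
PROPERTIES
(
    "username" = "hdfs_user",
    "password" = "hdfs_password"
);
```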
BACKUP
- Perform a backup operation.
SHOW BACKUP
View the execution status of the last backup job, including:
- JobId: The id of this backup job.
- SnapshotName: The user-specified name (Label) of this backup job.
- DbName: The Database for the backup job.
- State: The current phase of the backup job.
- PENDING: The initial state of the job.
- SNAPSHOTING: The snapshot operation is in progress.
- UPLOAD_SNAPSHOT: The snapshot is ready to be uploaded.
- UPLOADING: The snapshot is being uploaded.
- SAVE_META: Metadata file is being generated locally.
- UPLOAD_INFO: Uploading metadata file and job information.
- FINISHED: Backup is complete.
- CANCELLED: Backup failed or was cancelled.
- BackupObjs: A list of the tables and partitions involved in this backup.
- CreateTime: The time when the job was created.
- SnapshotFinishedTime: The completion time of the snapshot.
- UploadFinishedTime: The time when the snapshot was uploaded successfully.
- FinishedTime: The completion time of this job.
- UnfinishedTasks: In phases such as SNAPSHOTING and UPLOADING, multiple subtasks run concurrently; this column shows the task ids of the subtasks that have not finished.
- TaskErrMsg: If there is a subtask execution error, the error message will be displayed here.
- Status: The status that may occur when the job is running.
- Timeout: The timeout of the job, in seconds.
SHOW SNAPSHOT
View the backups that already exist in the remote repository.
- Snapshot: The name (Label) of this backup specified at backup time.
- Timestamp: The timestamp of the backup.
- Status: Whether the backup is normal or not.
If you specify a where clause after SHOW SNAPSHOT (see the sketch after this list), more detailed backup information is displayed:
- Database: The corresponding Database.
- Details: The complete data directory of this backup.
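For example, with placeholder repository, snapshot, and timestamp values:

```sql
-- List all backups in the repository.
SHOW SNAPSHOT ON `hdfs_repo`;

-- Narrow the output to a single backup; adding TIMESTAMP returns the more
-- detailed Database and Details columns for that snapshot.
SHOW SNAPSHOT ON `hdfs_repo`
WHERE SNAPSHOT = "snapshot_label_1" AND TIMESTAMP = "2022-01-01-12-00-00";
```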
RESTORE
Perform a restore operation.
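A minimal sketch with placeholder names; backup_timestamp must match the Timestamp reported by SHOW SNAPSHOT for the chosen backup:

```sql
-- Restore the backed-up table into this cluster under a new name.
RESTORE SNAPSHOT example_db.snapshot_label_1
FROM `hdfs_repo`
ON (example_tbl AS example_tbl_restored)
PROPERTIES
(
    "backup_timestamp" = "2022-01-01-12-00-00",
    "replication_num" = "3"
);
```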
SHOW RESTORE
View the execution status of the most recent recovery job, including:
- JobId: The id of this recovery job.
- Label: The name (Label) of the backup in the repository.
- Timestamp: The timestamp of the backup in the repository.
- DbName: The Database corresponding to the recovery job.
- State: The current phase of the recovery job.
- PENDING: The initial state of the job.
- SNAPSHOTTING: Snapshot is being performed.
- DOWNLOAD: Snapshot downloading task is being sent.
- DOWNLOADING: Snapshot is being downloaded.
- COMMIT: Preparing to validate a downloaded snapshot.
- COMMITTING: The downloaded snapshot is being validated.
- FINISHED: Recovery is complete.
- CANCELLED: Recovery failed or was canceled.
- AllowLoad: Whether import is allowed during recovery.
- ReplicationNum: The specified number of replicas to restore.
- RestoreObjs: A list of the tables and partitions involved in this recovery job.
- CreateTime: The time when the job was created.
- MetaPreparedTime: The time when the local metadata generation is completed.
- SnapshotFinishedTime: The completion time of the local snapshot.
- DownloadFinishedTime: The completion time of the remote snapshot download.
- FinishedTime: The completion time of this job.
- UnfinishedTasks: In phases such as SNAPSHOTTING, DOWNLOADING, and COMMITTING, multiple subtasks run concurrently; this column shows the task ids of the subtasks that have not finished.
- TaskErrMsg: If there is a subtask execution error, the error message will be displayed here.
- Status: The status that may occur when the job is running.
- Timeout: The timeout of the job, in seconds.
CANCEL BACKUP
Cancel the running backup job.
CANCEL RESTORE
Cancel the running recovery job.
DROP REPOSITORY
Delete a remote repository that has been created. Deleting a repository only removes the mapping of that repository in StarRocks, not the actual repository data.