- Introduction
- Quick Start
- Table Design
- Data Loading
- Overview of data loading
- Load data from a local file system or a streaming data source using HTTP push
- Load data from HDFS or cloud storage
- Routine Load
- Spark Load
- Insert Into
- Change data through loading
- Transform data at loading
- Json Loading
- Synchronize data from MySQL
- Load data by using flink-connector-starrocks
- DataX Writer
- Data Export
- Using StarRocks
- Reference
- SQL Reference
- User Account Management
- Cluster Management
- ADMIN CANCEL REPAIR
- ADMIN CHECK TABLET
- ADMIN REPAIR
- ADMIN SET CONFIG
- ADMIN SET REPLICA STATUS
- ADMIN SHOW CONFIG
- ADMIN SHOW REPLICA DISTRIBUTION
- ADMIN SHOW REPLICA STATUS
- ALTER SYSTEM
- CANCEL DECOMMISSION
- CREATE FILE
- DROP FILE
- INSTALL PLUGIN
- SHOW BACKENDS
- SHOW BROKER
- SHOW FILE
- SHOW FRONTENDS
- SHOW FULL COLUMNS
- SHOW INDEX
- SHOW PLUGINS
- SHOW TABLE STATUS
- UNINSTALL PLUGIN
- DDL
- ALTER DATABASE
- ALTER TABLE
- ALTER VIEW
- BACKUP
- CANCEL BACKUP
- CANCEL RESTORE
- CREATE DATABASE
- CREATE INDEX
- CREATE MATERIALIZED VIEW
- CREATE REPOSITORY
- CREATE RESOURCE
- CREATE TABLE AS SELECT
- CREATE TABLE LIKE
- CREATE TABLE
- CREATE VIEW
- CREATE FUNCTION
- DROP DATABASE
- DROP INDEX
- DROP MATERIALIZED VIEW
- DROP REPOSITORY
- DROP RESOURCE
- DROP TABLE
- DROP VIEW
- DROP FUNCTION
- HLL
- RECOVER
- RESTORE
- SHOW RESOURCES
- SHOW FUNCTION
- TRUNCATE TABLE
- DML
- ALTER ROUTINE LOAD
- BROKER LOAD
- CANCEL LOAD
- DELETE
- EXPORT
- GROUP BY
- INSERT
- PAUSE ROUTINE LOAD
- RESUME ROUTINE LOAD
- ROUTINE LOAD
- SELECT
- SHOW ALTER
- SHOW BACKUP
- SHOW DATA
- SHOW DATABASES
- SHOW DELETE
- SHOW DYNAMIC PARTITION TABLES
- SHOW EXPORT
- SHOW LOAD
- SHOW PARTITIONS
- SHOW PROPERTY
- SHOW REPOSITORIES
- SHOW RESTORE
- SHOW ROUTINE LOAD
- SHOW ROUTINE LOAD TASK
- SHOW SNAPSHOT
- SHOW TABLES
- SHOW TABLET
- SHOW TRANSACTION
- SPARK LOAD
- STOP ROUTINE LOAD
- STREAM LOAD
- Data Types
- Auxiliary Commands
- Function Reference
- Java UDFs
- Window Function
- Date Functions
- convert_tz
- curdate
- current_timestamp
- curtime
- datediff
- date_add
- date_format
- date_sub
- date_trunc
- day
- dayname
- dayofmonth
- dayofweek
- dayofyear
- from_days
- from_unixtime
- hour
- minute
- month
- monthname
- now
- quarter
- second
- str_to_date
- timediff
- timestampadd
- timestampdiff
- to_date
- to_days
- unix_timestamp
- utc_timestamp
- weekofyear
- year
- hours_diff
- minutes_diff
- months_diff
- seconds_diff
- weeks_diff
- years_diff
- Aggregate Functions
- Geographic Functions
- String Functions
- JSON Functions
- Overview of JSON functions and operators
- JSON constructor functions
- JSON query and processing functions
- JSON operators
- Aggregate Functions
- Bitmap Functions
- Array Functions
- cast function
- hash function
- Cryptographic Functions
- Math Functions
- Utility Functions
- System variables
- Error code
- System limits
- SQL Reference
- Administration
- FAQ
- Deploy
- Data Migration
- SQL
- Other FAQs
- Benchmark
- Developers
- Contribute to StarRocks
- Code Style Guides
- Use the debuginfo file for debugging
- Development Environment
- Trace Tools
ALTER ROUTINE LOAD
Description
This command is used to modify a Routine Load job that has been created. The job to be modified must be in the PAUSED state. You can run the [PASUME](PAUSE ROUTINE LOAD) command to pause a loading job and then perform the ALTER ROUTINE LOAD
operation on the job.
Syntax
Note: You do not need to specify the content enclosed in square brackets [].
ALTER ROUTINE LOAD FOR [db.]job_name
[load_properties]
[job_properties]
FROM data_source
[data_source_properties]
Parameters
- [db.] job_name
The name of the job you want to modify.
- load_properties
The properties of the data to be imported.
Syntax:
[COLUMNS TERMINATED BY '<terminator>'],
[COLUMNS ([<column_name> [, ...] ] [, column_assignment [, ...] ] )],
[WHERE <expr>],
[PARTITION ([ <partition_name> [, ...] ])]
column_assignment:
<column_name> = column_expression
- Specify column separators.
You can specify column separators for the CSV data you want to import. For example, you can use commas (,) as column separators.
COLUMNS TERMINATED BY ","
The default separator is \t
.
- Specify column mapping.
Specify the mapping of columns in the source and destination tables, and define how derived columns are generated.
- Mapped columns
Specify which columns in the source table correspond to which columns in the destination table in sequence. If you want to skip a column, you can specify a column name that does not exist. For example, the destination table has three columns k1, k2, and v1. The source table has four columns, of which the first, second, and fourth columns correspond to k2, k1, and v1. You can write the code as follows.
COLUMNS (k2, k1, xxx, v1)
xxx
is the column that does not exist. It is used to skip the third column in the source table.
- Derived columns
Columns expressed in col_name = expr
are derived columns. These columns are generated by using expr
. Derived columns are usually placed after mapped columns. Although this is not a mandatory rule, StarRocks always parses mapped columns prior to derived columns. Assume that the destination table has a fourth column v2, which is generated by adding up k1 and k2. You can write the code as follows.
COLUMNS (k2, k1, xxx, v1, v2 = k1 + k2);
For CSV data, the number of mapped columns in COLUMNS must match the number of columns in the CSV file.
- Specify filter conditions.
You can specify filter conditions to filter out unwanted columns. The filter columns can be mapped columns or derived columns. For example, if you need to import data from columns whose k1 value is greater than 100 and k2 value equals 1000, you can write the code as follows.
WHERE k1 > 100 and k2 = 1000
- Specify the partitions into which you want to import data.
If no partitions are specified, data will be automatically imported into StarRocks partitions based on the partition key values in the CSV data. Example:
PARTITION(p1, p2, p3)
- job_properties
The job parameters you want to modify. Currently, you can modify the following parameters:
desired_concurrent_number
max_error_number
max_batch_interval
max_batch_rows
max_batch_size
jsonpaths
json_root
strip_outer_array
strict_mode
timezone
- data_source
The type of the data source. Currently, only Kafka data source is supported.
- data_source_properties
The properties of the data source. The following properties are supported:
kafka_partitions
You can only modify Kafka partitions that have been consumed.
kafka_offsets
You can only modify partition offsets that have not been consumed.
- Custom properties such as
property.group.id
andproperty.group.id
You can only specify Kafka partitions that have been consumed in
kafka_partitions
. You can only specify partition offsets that have not been consumed inkafka_offsets
.
Examples
Example 1: Change the value of desired_concurrent_number
to 1. This parameter specifies the parallelism of jobs used to consume Kafka data.
ALTER ROUTINE LOAD FOR db1.label1
PROPERTIES
(
"desired_concurrent_number" = "1"
);
Example 2: Change the value of desired_concurrent_number
to 10 and modify the partition offset and group ID.
ALTER ROUTINE LOAD FOR db1.label1
PROPERTIES
(
"desired_concurrent_number" = "10"
)
FROM kafka
(
"kafka_partitions" = "0, 1, 2",
"kafka_offsets" = "100, 200, 100",
"property.group.id" = "new_group"
);
Example 2: Change the filter condition to a > 0
and set the destination partition to p1
.
ALTER ROUTINE LOAD FOR db1.label1
WHERE a > 0
PARTITION (p1)
Keywords
ALTER, ROUTINE, LOAD