Stream Load
Does Stream Load recognize the first row of a text file as column names? Can I specify that the first row should not be read?
Stream Load does not treat the first row as column names, because that row cannot be distinguished from the rest of the data. Nor can Stream Load currently be configured to skip the first row. If the first row of the file contains column names, you can handle it in one of the following ways:
- Change the settings of the export tool and re-export the text files without the column-name row.
- Delete the first row of each text file with a command such as `sed -i '1d' filename`.
- Use `-H "where: column_name != 'column name'"` in the Stream Load command to filter out the header row, that is, filter out rows in which a column's value equals its own column name. If the strings in the first row cannot be converted to the target column types, they are parsed as NULL, so this approach requires that the table columns are not defined as NOT NULL.
- Add `-H "max_filter_ratio:0.01"` to the Stream Load command to allow a fault tolerance ratio of 1% or lower, depending on the data volume, so that the error caused by the first row is tolerated. With this ratio set, the returned ErrorURL still reports the error, but the overall job succeeds. Do not set the ratio too high, or other data quality issues may go unnoticed. A curl sketch showing where these last two headers fit follows this list.
The data for the partition key column is not a standard date or int; it is in a format such as 202106.00. How should the data be converted when it is loaded into StarRocks with Stream Load?
StarRocks supports data conversion during loading. For details, see the section "4.7 data conversion during imports" in the enterprise edition documentation.
Take Stream Load as an example. Suppose the table TEST has four columns, NO, DATE, VERSION, and PRICE, and the DATE field in the exported CSV data file is in the non-standard format 202106.00. If DATE is to be used as the partition column in StarRocks, first create the table in StarRocks and define the DATE column as date, datetime, or int. Then, in the Stream Load command, use the following header to convert the column:
-H "columns: NO,DATE_1, VERSION, PRICE, DATE=LEFT(DATE_1,6)"
DATE_1 can be thought of simply as a placeholder column that receives the raw value; the function then converts it into the corresponding DATE column in StarRocks. Note in particular that all the columns in the CSV data file must be listed before the function-based conversion is applied. Common functions can all be used here.
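For context, a complete Stream Load request using this header might look like the following sketch. The FE address (127.0.0.1:8030), credentials, database name `example_db`, label, and file name `test.csv` are assumptions and not part of the original example.

```bash
# Hypothetical end-to-end Stream Load request for the TEST example above.
# The FE address, credentials, database name, label, and file name are placeholders.
# DATE_1 only names the raw 202106.00 value from the CSV; LEFT(DATE_1, 6)
# derives the DATE column (e.g. 202106) from it.
curl --location-trusted -u root:password \
    -H "label:test_date_convert" \
    -H "column_separator:," \
    -H "columns: NO, DATE_1, VERSION, PRICE, DATE=LEFT(DATE_1,6)" \
    -T test.csv \
    http://127.0.0.1:8030/api/example_db/TEST/_stream_load
```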