CREATE STORAGE VOLUME
Description
Creates a storage volume for a remote storage system. This feature is supported from v3.1.
A storage volume consists of the properties and credential information of the remote data storage. You can reference a storage volume when you create databases and cloud-native tables in a shared-data StarRocks cluster.
CAUTION
Only users with the CREATE STORAGE VOLUME privilege on the SYSTEM level can perform this operation.
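For context, a created and enabled storage volume is referenced through the storage_volume property when you create databases and cloud-native tables. The following is a minimal sketch assuming that property is available in your shared-data cluster; the database, table, and volume names are hypothetical:
-- Hypothetical names; "storage_volume" points the database or table at the volume
CREATE DATABASE sales_db
PROPERTIES ("storage_volume" = "my_s3_volume");

CREATE TABLE sales_db.orders
(
    order_id BIGINT,
    order_date DATE
)
DUPLICATE KEY(order_id)
DISTRIBUTED BY HASH(order_id)
PROPERTIES ("storage_volume" = "my_s3_volume");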
Syntax
CREATE STORAGE VOLUME [IF NOT EXISTS] <storage_volume_name>
TYPE = { S3 | AZBLOB }
LOCATIONS = ('<remote_storage_path>')
[ COMMENT '<comment_string>' ]
PROPERTIES
("key" = "value",...)
Parameters
Parameter | Description |
---|---|
storage_volume_name | The name of the storage volume. Note that you cannot create a storage volume named builtin_storage_volume, because this name is reserved for the built-in storage volume. |
TYPE | The type of the remote storage system. Valid values: S3 and AZBLOB. S3 indicates AWS S3 or S3-compatible storage systems. AZBLOB indicates Azure Blob Storage (supported from v3.1.1 onwards). |
LOCATIONS | The storage locations. For AWS S3 or S3-compatible storage systems, the format is s3://<s3_path>. For Azure Blob Storage, the format is azblob://<azblob_path>. |
COMMENT | The comment on the storage volume. |
PROPERTIES | Parameters in "key" = "value" pairs used to specify the properties and credential information needed to access the remote storage system. For detailed information, see PROPERTIES. |
PROPERTIES
If you use AWS S3:
If you use the default authentication credentials of the AWS SDK to access S3, set the following properties:
"enabled" = "{ true | false }", "aws.s3.region" = "<region>", "aws.s3.endpoint" = "<endpoint_url>", "aws.s3.use_aws_sdk_default_behavior" = "true"
If you use IAM user-based credentials (Access Key and Secret Key) to access S3, set the following properties:
"enabled" = "{ true | false }", "aws.s3.region" = "<region>", "aws.s3.endpoint" = "<endpoint_url>", "aws.s3.use_aws_sdk_default_behavior" = "false", "aws.s3.use_instance_profile" = "false", "aws.s3.access_key" = "<access_key>", "aws.s3.secret_key" = "<secrete_key>"
If you use Instance Profile to access S3, set the following properties:
"enabled" = "{ true | false }", "aws.s3.region" = "<region>", "aws.s3.endpoint" = "<endpoint_url>", "aws.s3.use_aws_sdk_default_behavior" = "false", "aws.s3.use_instance_profile" = "true"
If you use Assumed Role to access S3, set the following properties:
"enabled" = "{ true | false }", "aws.s3.region" = "<region>", "aws.s3.endpoint" = "<endpoint_url>", "aws.s3.use_aws_sdk_default_behavior" = "false", "aws.s3.use_instance_profile" = "true", "aws.s3.iam_role_arn" = "<role_arn>"
If you use Assumed Role to access S3 from an external AWS account, set the following properties:
"enabled" = "{ true | false }", "aws.s3.region" = "<region>", "aws.s3.endpoint" = "<endpoint_url>", "aws.s3.use_aws_sdk_default_behavior" = "false", "aws.s3.use_instance_profile" = "true", "aws.s3.iam_role_arn" = "<role_arn>", "aws.s3.external_id" = "<external_id>"
If you use GCP Cloud Storage, set the following properties:
"enabled" = "{ true | false }", -- For example: us-east-1 "aws.s3.region" = "<region>", -- For example: https://storage.googleapis.com "aws.s3.endpoint" = "<endpoint_url>", "aws.s3.access_key" = "<access_key>", "aws.s3.secret_key" = "<secrete_key>"
If you use MinIO, set the following properties:
"enabled" = "{ true | false }", -- For example: us-east-1 "aws.s3.region" = "<region>", -- For example: http://172.26.xx.xxx:39000 "aws.s3.endpoint" = "<endpoint_url>", "aws.s3.access_key" = "<access_key>", "aws.s3.secret_key" = "<secrete_key>"
Property | Description |
---|---|
enabled | Whether to enable this storage volume. Default: false. A disabled storage volume cannot be referenced. |
aws.s3.region | The region in which your S3 bucket resides, for example, us-west-2. |
aws.s3.endpoint | The endpoint URL used to access your S3 bucket, for example, https://s3.us-west-2.amazonaws.com. |
aws.s3.use_aws_sdk_default_behavior | Whether to use the default authentication credentials of the AWS SDK. Valid values: true and false (Default). |
aws.s3.use_instance_profile | Whether to use Instance Profile and Assumed Role as credential methods for accessing S3. Valid values: true and false (Default). If you use IAM user-based credentials (Access Key and Secret Key) to access S3, you must specify this item as false and specify aws.s3.access_key and aws.s3.secret_key. If you use Instance Profile to access S3, you must specify this item as true. If you use Assumed Role to access S3, you must specify this item as true and specify aws.s3.iam_role_arn. If you additionally use an external AWS account, you must specify this item as true and specify both aws.s3.iam_role_arn and aws.s3.external_id. |
aws.s3.access_key | The Access Key ID used to access your S3 bucket. |
aws.s3.secret_key | The Secret Access Key used to access your S3 bucket. |
aws.s3.iam_role_arn | The ARN of the IAM role that has privileges on the S3 bucket in which your data files are stored. |
aws.s3.external_id | The external ID of the AWS account that is used for cross-account access to your S3 bucket. |
If you use Azure Blob Storage (supported from v3.1.1 onwards):
If you use Shared Key to access Azure Blob Storage, set the following properties:
"enabled" = "{ true | false }", "azure.blob.endpoint" = "<endpoint_url>", "azure.blob.shared_key" = "<shared_key>"
If you use shared access signatures (SAS) to access Azure Blob Storage, set the following properties:
"enabled" = "{ true | false }", "azure.blob.endpoint" = "<endpoint_url>", "azure.blob.sas_token" = "<sas_token>"
CAUTION
The hierarchical namespace must be disabled when you create the Azure Blob Storage Account.
Property | Description |
---|---|
enabled | Whether to enable this storage volume. Default: false. A disabled storage volume cannot be referenced. |
azure.blob.endpoint | The endpoint of your Azure Blob Storage Account, for example, https://test.blob.core.windows.net. |
azure.blob.shared_key | The Shared Key used to authorize requests for your Azure Blob Storage. |
azure.blob.sas_token | The shared access signature (SAS) used to authorize requests for your Azure Blob Storage. |
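The Azure properties combine into a complete statement the same way as the S3 ones. A minimal sketch assuming SAS-based authorization; the volume name and container path are hypothetical:
CREATE STORAGE VOLUME my_azblob_volume
TYPE = AZBLOB
LOCATIONS = ("azblob://mycontainer/starrocks/")
PROPERTIES
(
    "enabled" = "true",
    "azure.blob.endpoint" = "https://test.blob.core.windows.net",
    -- Supply a valid SAS token for your Storage Account
    "azure.blob.sas_token" = "<sas_token>"
);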
Examples
Example 1: Create a storage volume my_s3_volume for the AWS S3 bucket defaultbucket, use IAM user-based credentials (Access Key and Secret Key) to access S3, and enable it.
MySQL > CREATE STORAGE VOLUME my_s3_volume
-> TYPE = S3
-> LOCATIONS = ("s3://defaultbucket/test/")
-> PROPERTIES
-> (
-> "enabled" = "true",
-> "aws.s3.region" = "us-west-2",
-> "aws.s3.endpoint" = "https://s3.us-west-2.amazonaws.com",
-> "aws.s3.use_aws_sdk_default_behavior" = "false",
-> "aws.s3.use_instance_profile" = "false",
-> "aws.s3.access_key" = "xxxxxxxxxx",
-> "aws.s3.secret_key" = "yyyyyyyyyy"
-> );
Query OK, 0 rows affected (0.05 sec)
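After creation, you can inspect the new volume with SHOW STORAGE VOLUMES and DESC STORAGE VOLUME, and designate it as the cluster default with SET DEFAULT STORAGE VOLUME (see the respective reference pages for the exact syntax). For example:
MySQL > SHOW STORAGE VOLUMES;
MySQL > DESC STORAGE VOLUME my_s3_volume;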