Use catalogs to manage internal and external data
Catalogs are used to manage data. StarRocks 2.3 and later provide the following two types of catalogs:
- Internal catalog: stores the databases and tables of StarRocks. You can use an internal catalog to manage internal data. For example, if you execute the CREATE DATABASE and CREATE TABLE statements to create a database and a table, they are stored in the internal catalog. Each StarRocks cluster has a default internal catalog named `default_catalog`. Currently, you cannot rename the internal catalog or create a new internal catalog.
- External catalog: manages data in external sources. When you create an external catalog for an external source, you need to specify the access information of that source. After the catalog is created, you can use it to query external data without creating external tables.
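As a minimal sketch of the distinction (the database name `sales_db` is hypothetical), DDL statements operate on the internal catalog, while external catalogs are simply registered and queried:

```sql
-- Internal data: CREATE DATABASE / CREATE TABLE store objects in the
-- built-in default_catalog (sales_db is a hypothetical name).
CREATE DATABASE sales_db;

-- List all catalogs in the cluster: default_catalog plus any external
-- catalogs that have been created.
SHOW CATALOGS;
```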
Create an external catalog
Syntax
```sql
CREATE EXTERNAL CATALOG <catalog_name> PROPERTIES ("key"="value", ...);
```
When you create an external catalog, you need to add configuration items in `PROPERTIES` based on the external data source. The required configuration items vary with the data source. StarRocks 2.3 only supports creating an external catalog for Apache Hive™. After the catalog is created, you can directly query all data in a Hive cluster without using external tables. Note that external catalogs are read-only: StarRocks only allows you to query tables in an external catalog, not write data to them.
Examples
Create an external catalog named `hive_catalog0`.

```sql
CREATE EXTERNAL CATALOG hive_catalog0
PROPERTIES(
    "type"="hive",
    "hive.metastore.uris"="thrift://127.0.0.1:9083"
);
```
The following table describes the related parameters.
| Parameter | Description |
| --- | --- |
| type | The resource type. Set the value to `hive`. |
| hive.metastore.uris | The URI of the Hive metastore, in the format `thrift://<IP address of the Hive metastore>:<port number>`. The port number defaults to 9083. |
Query data
Query internal data
1. After you log in to StarRocks from the MySQL client, you are connected to `default_catalog` by default.
2. Run the `SHOW DATABASES` command or the `SHOW DATABASES FROM default_catalog` command to view all internal databases in the current cluster.
3. Query data in `default_catalog` by specifying a database name and a table name.
Query external data
1. After you log in to StarRocks from the MySQL client, you are connected to `default_catalog` by default.
2. Run the `SHOW CATALOGS` command to view all existing catalogs and find the target external catalog. Then, run the `SHOW DATABASES FROM external_catalog` command to view all databases in that external catalog. For example, to view all databases in `hive_catalog`, run `SHOW DATABASES FROM hive_catalog`.
3. Run the `USE external_catalog.database` command to switch the current session to the target external catalog and database.
4. Query data in the database by specifying a table name.
Cross-catalog query
If you want to perform a cross-catalog query, use the `catalog_name.database_name` or `catalog_name.database_name.table_name` format to reference data outside the catalog of your current session.
Examples
Query internal data
For example, to query data in `olap_db.olap_table`, perform the following steps:

1. After you log in to StarRocks from the MySQL client, check all databases in the internal catalog of the current cluster.

   ```sql
   SHOW DATABASES;
   ```

   or

   ```sql
   SHOW DATABASES FROM default_catalog;
   ```

2. Use `olap_db` as the current database.

   ```sql
   USE olap_db;
   ```

   or

   ```sql
   USE default_catalog.olap_db;
   ```

3. Query data from `olap_table`.

   ```sql
   SELECT * FROM olap_table LIMIT 1;
   ```
Query external data
For example, to query data in `hive_catalog.hive_db.hive_table`, perform the following steps:

1. After you log in to StarRocks from the MySQL client, check all catalogs in the current cluster.

   ```sql
   SHOW CATALOGS;
   ```

2. View databases in `hive_catalog`.

   ```sql
   SHOW DATABASES FROM hive_catalog;
   ```

3. Switch the current session to `hive_catalog.hive_db`.

   ```sql
   USE hive_catalog.hive_db;
   ```

4. Query data from `hive_table`.

   ```sql
   SELECT * FROM hive_table LIMIT 1;
   ```
Cross-catalog query
- Query `hive_table` in `hive_catalog` when the current session is `default_catalog.olap_db`.

  ```sql
  SELECT * FROM hive_catalog.hive_db.hive_table;
  ```

- Query `olap_table` in `default_catalog` when the current session is `hive_catalog.hive_db`.

  ```sql
  SELECT * FROM default_catalog.olap_db.olap_table;
  ```

- Perform a JOIN query on `hive_table` in `hive_catalog` and `olap_table` in `default_catalog` when the current session is `hive_catalog.hive_db`.

  ```sql
  SELECT * FROM hive_table h JOIN default_catalog.olap_db.olap_table o ON h.id = o.id;
  ```

- Perform a JOIN query on `hive_table` in `hive_catalog` and `olap_table` in `default_catalog` when the current session is in another catalog.

  ```sql
  SELECT * FROM hive_catalog.hive_db.hive_table h JOIN default_catalog.olap_db.olap_table o ON h.id = o.id;
  ```
Drop an external catalog
Syntax
```sql
DROP EXTERNAL CATALOG <catalog_name>;
```
Examples
Drop an external catalog named `hive_catalog`.

```sql
DROP EXTERNAL CATALOG hive_catalog;
```
Update the metadata of the tables in an external catalog
The metadata of the tables in external catalogs (such as table schemas and partition files) is cached on the FEs, so it follows the same refresh rules as the metadata caching strategy. You can run the `REFRESH EXTERNAL TABLE catalog.db.table` command to manually refresh the cached metadata.
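As a sketch, reusing the `hive_catalog.hive_db.hive_table` names from the examples above, a manual refresh looks like this:

```sql
-- Manually refresh the cached metadata (table schema and partition files)
-- of one table in an external catalog. The catalog, database, and table
-- names below are the example names used earlier on this page.
REFRESH EXTERNAL TABLE hive_catalog.hive_db.hive_table;
```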