StarRocks version 4.0
4.0.0-RC
Release date: September 9, 2025
Data Lake Analytics
- Unified Page Cache and Data Cache for BE metadata, and adopted an adaptive strategy for scaling. #61640
- Optimized metadata file parsing for Iceberg statistics to avoid repetitive parsing. #59955
- Optimized COUNT/MIN/MAX queries against Iceberg metadata by efficiently skipping over data file scans, significantly improving aggregation query performance on large partitioned tables and reducing resource consumption. #60385
- Supports compaction for Iceberg tables via the procedure rewrite_data_files.
- Supports Iceberg tables with hidden partitions, including creating, writing, and reading the tables. #58914
- Supports the TIME data type in the Paimon catalog. #58292
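
The COUNT/MIN/MAX optimization listed above relies on the fact that table metadata already carries per-file statistics, so some aggregates can be answered without opening any data file. The following is a minimal Python sketch of that idea, not StarRocks code: the `DataFileStats` class and `aggregate_from_metadata` function are hypothetical names, and the sketch assumes exact integer bounds, no delete files, and no filter predicate, all of which a real engine would have to check.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class DataFileStats:
    """Hypothetical per-file statistics, loosely modeled on manifest entries."""
    record_count: int
    lower_bound: int  # smallest value of the column in this file
    upper_bound: int  # largest value of the column in this file


def aggregate_from_metadata(
    files: Iterable[DataFileStats],
) -> Tuple[int, Optional[int], Optional[int]]:
    """Return (COUNT(*), MIN(col), MAX(col)) from file-level statistics only."""
    count = 0
    lo: Optional[int] = None
    hi: Optional[int] = None
    for f in files:
        if f.record_count == 0:
            # An empty file contributes no rows and has no meaningful bounds.
            continue
        count += f.record_count
        lo = f.lower_bound if lo is None else min(lo, f.lower_bound)
        hi = f.upper_bound if hi is None else max(hi, f.upper_bound)
    return count, lo, hi


# Three files described only by their statistics; no row data is read.
manifest = [
    DataFileStats(record_count=1000, lower_bound=5, upper_bound=90),
    DataFileStats(record_count=250, lower_bound=-3, upper_bound=40),
    DataFileStats(record_count=0, lower_bound=0, upper_bound=0),
]
result = aggregate_from_metadata(manifest)
print(result)  # (1250, -3, 90)
```

The cost of this approach grows with the number of files rather than the number of rows, which is why the benefit is largest on big partitioned tables.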
Security and Authentication
- In scenarios where JWT authentication and the Iceberg REST Catalog are used, StarRocks supports the passthrough of user login information to Iceberg via the REST Session Catalog for subsequent data access authentication. #59611 #58850
- Supports vended credentials for the Iceberg catalog.
Storage Optimization and Cluster Management
- Introduced the File Bundling optimization for cloud-native tables in shared-data clusters. It automatically bundles the data files generated by loading, Compaction, or Publish operations, thereby reducing the API cost caused by high-frequency access to the external storage system. #58316
- Supports Kafka 4.0 for Routine Load.
- Supports full-text inverted indexes on Primary Key tables in shared-nothing clusters.
- Supports enabling case-insensitive processing on names of catalogs, databases, tables, views, and materialized views. #61136
- Supports blacklisting Compute Nodes in shared-data clusters. #60830
- Supports global connection ID. #57256
Query and Performance Improvement
- Supports the DECIMAL256 data type, expanding the upper limit of precision from 38 to 76 digits. Its 256-bit storage is better suited to high-precision financial and scientific computing scenarios, and it mitigates DECIMAL128's precision overflow problem in very large aggregations and high-order operations. #59645
- Optimized the performance of the JOIN and AGG operators. #61691
- [Preview] Introduced SQL Plan Manager, which allows users to bind a query plan to a query. This prevents the plan from changing when the system state changes (mainly data updates and statistics updates), which stabilizes query performance. #56310
- Introduced Partition-wise Spillable Aggregate/Distinct operators to replace the original Spill implementation based on sorted aggregation, significantly improving aggregation performance and reducing read/write overhead in complex and high-cardinality GROUP BY scenarios. #60216
- Flat JSON V2:
  - Supports configuring Flat JSON on the table level. #57379
  - Enhanced JSON columnar storage by retaining the V1 mechanism while adding page- and segment-level indexes (ZoneMaps, Bloom filters), predicate pushdown with late materialization, dictionary encoding, and integration of a low-cardinality global dictionary to significantly boost execution efficiency. #60953
- Supports an adaptive ZoneMap index creation strategy for the STRING data type. #61960
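
The precision limits of 38 and 76 quoted for DECIMAL128 and DECIMAL256 above follow from the storage width. The short Python sketch below works through that arithmetic and shows how a large sum can overflow the smaller limit. It assumes the decimal is stored as an unscaled signed integer; `max_decimal_digits` and `fits_in_precision` are hypothetical helper names for illustration, not StarRocks functions.

```python
def max_decimal_digits(storage_bits: int) -> int:
    """Largest p such that every p-digit integer fits in a signed integer of this width."""
    max_value = 2 ** (storage_bits - 1) - 1
    # max_value has one more digit than can be fully covered: its leading
    # digits stop short of 99...9, so only len - 1 digits are always safe.
    return len(str(max_value)) - 1


def fits_in_precision(unscaled_value: int, precision: int) -> bool:
    """True if the unscaled integer needs at most `precision` decimal digits."""
    return abs(unscaled_value) < 10 ** precision


print(max_decimal_digits(128))  # 38
print(max_decimal_digits(256))  # 76

# Summing 1000 values that each sit at the 38-digit maximum.
largest_38_digit = 10 ** 38 - 1
total = 1000 * largest_38_digit

print(fits_in_precision(total, 38))  # False
print(fits_in_precision(total, 76))  # True
```

The same effect applies to multiplication: the product of two 38-digit values can need up to 76 digits, which matches the wider limit.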
Functions and SQL Syntax
- Added the following functions:
- Provides the following syntactic extensions: