StarRocks version 3.5

v3.5.0-RC01

Shared-data clusters support generated columns. #53526
Cloud-native Primary Key tables in shared-data clusters support rebuilding specific indexes. The performance of the indexes is also optimized. #53971 #54178
Optimized the execution logic of large-scale data loading to avoid generating too many small files in a Rowset because of memory limitations. During loading, the system merges temporary data blocks to reduce the number of small files. This improves query performance after loading and reduces subsequent Compaction operations, which in turn improves system resource utilization. #53954
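The merging of temporary data blocks can be pictured as a k-way merge of sorted runs. The Python sketch below is only an illustration under stated assumptions: the block layout (sorted lists of integer keys), the function name merge_temp_blocks, and the max_rows_per_file threshold are all hypothetical and do not describe StarRocks internals.

```python
import heapq
from typing import Iterable, Iterator, List


def merge_temp_blocks(blocks: Iterable[List[int]], max_rows_per_file: int) -> Iterator[List[int]]:
    """Merge sorted temporary blocks into fewer, larger sorted output files.

    Hypothetical model: each block is a sorted list of keys that was flushed
    when the in-memory buffer filled up during loading.
    """
    current: List[int] = []
    # heapq.merge lazily performs a k-way merge over already-sorted inputs.
    for key in heapq.merge(*blocks):
        current.append(key)
        if len(current) == max_rows_per_file:
            yield current
            current = []
    if current:
        # Emit the final, possibly smaller, file.
        yield current


if __name__ == "__main__":
    small_blocks = [[1, 4, 9], [2, 3, 10], [5, 6], [7, 8]]
    files = list(merge_temp_blocks(small_blocks, max_rows_per_file=5))
    print(len(small_blocks), "temporary blocks merged into", len(files), "files")
```

In this toy model, four small blocks become two larger sorted files, which is the kind of reduction that lowers later Compaction work.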

[Beta] Supports creating Iceberg views in the Iceberg Catalog with Hive Metastore integration, and adding or modifying the dialect of an Iceberg view using the ALTER VIEW statement for better syntax compatibility with external systems. #56120
Supports nested namespaces for Iceberg REST Catalog. #58016
Supports using IcebergAwsClientFactory to create AWS clients in Iceberg REST Catalog to offer vended credentials. #58296
Parquet Reader supports filtering data with Bloom Filter. #56445
Supports automatically creating global dictionaries for low-cardinality columns in Parquet-formatted Hive/Iceberg tables during queries. #55167
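As a rough illustration of why a dictionary helps low-cardinality columns, the Python sketch below replaces each distinct string with a small integer code. It shows the general dictionary-encoding idea only; the function names are hypothetical, and how StarRocks builds and applies its global dictionaries is not described here.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[Dict[str, int], List[int]]:
    """Replace each distinct string with a small integer code."""
    dictionary: Dict[str, int] = {}
    codes: List[int] = []
    for value in values:
        # setdefault assigns the next free code the first time a value is seen.
        codes.append(dictionary.setdefault(value, len(dictionary)))
    return dictionary, codes


def dictionary_decode(dictionary: Dict[str, int], codes: List[int]) -> List[str]:
    """Map integer codes back to the original strings."""
    reverse = {code: value for value, code in dictionary.items()}
    return [reverse[code] for code in codes]


if __name__ == "__main__":
    column = ["cn", "us", "cn", "de", "us"]
    dictionary, codes = dictionary_encode(column)
    print(dictionary, codes)
```

Once a column is encoded this way, comparisons and grouping can work on integers instead of strings.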

Statistics optimization:
- Supports Table Sample, which improves statistics accuracy and query performance by sampling data blocks in physical files. #52787
- Supports recording the predicate columns in queries for targeted statistics collection. #53204
- Supports partition-level cardinality estimation. The system reuses the system-defined view _statistics_.column_statistics to record the NDV of each partition. #51513
- Supports multi-column Joint NDV collection to optimize the query plan generated by CBO when columns are correlated with each other. #56481 #56715 #56766 #56836
- Supports using histograms to estimate the Join node cardinality and in_predicate selectivity, thus improving the estimation accuracy on skewed data. #57874 #57639
- Supports Query Feedback. Queries with an identical structure but different parameters are categorized as the same type and share the same tuning guide for plan execution optimization. #58306
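To see why a joint NDV matters for correlated columns, consider the small Python sketch below. It compares the estimate obtained by multiplying per-column NDVs (an independence assumption) with the actual number of distinct value pairs. The sample data and function names are made up for illustration and say nothing about how the CBO consumes these statistics.

```python
from typing import Hashable, Iterable, List, Tuple


def ndv(values: Iterable[Hashable]) -> int:
    """Exact number of distinct values."""
    return len(set(values))


def compare_ndv(rows: List[Tuple[str, str]]) -> Tuple[int, int]:
    """Return (independence-based estimate, actual joint NDV) for two columns."""
    first_ndv = ndv(first for first, _ in rows)
    second_ndv = ndv(second for _, second in rows)
    # Multiplying per-column NDVs assumes the columns are independent.
    independent_estimate = first_ndv * second_ndv
    joint_ndv = ndv(rows)
    return independent_estimate, joint_ndv


if __name__ == "__main__":
    # city determines country, so the two columns are strongly correlated.
    rows = [
        ("Beijing", "CN"), ("Shanghai", "CN"), ("Paris", "FR"),
        ("Lyon", "FR"), ("Paris", "FR"), ("Beijing", "CN"),
    ]
    print(compare_ndv(rows))
```

Because each city maps to exactly one country, the independence-based estimate is twice the true number of distinct (city, country) pairs in this sample.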
Supports Runtime Bitset Filter as an alternative to Bloom Filter for optimization in specific scenarios. #57157
Supports pushing down Join Runtime Filter to the storage layer. #55124
Supports Pipeline Event Scheduler. #54259
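The Python sketch below gives a feel for how a bitset can stand in for a Bloom filter. It assumes one plausible scenario, integer join keys in a narrow range, which these notes do not confirm as the targeted case. The class BitsetFilter is hypothetical and is not the StarRocks implementation.

```python
from typing import Iterable


class BitsetFilter:
    """Exact membership filter for integer keys within [min_key, max_key]."""

    def __init__(self, build_keys: Iterable[int]) -> None:
        keys = list(build_keys)
        if not keys:
            raise ValueError("build side must contain at least one key")
        self.min_key = min(keys)
        self.max_key = max(keys)
        # One bit per possible key in the range, rounded up to whole bytes.
        self.bits = bytearray((self.max_key - self.min_key) // 8 + 1)
        for key in keys:
            offset = key - self.min_key
            self.bits[offset >> 3] |= 1 << (offset & 7)

    def contains(self, key: int) -> bool:
        """Return True only if the key was present on the build side."""
        if key < self.min_key or key > self.max_key:
            return False
        offset = key - self.min_key
        return bool(self.bits[offset >> 3] & (1 << (offset & 7)))


if __name__ == "__main__":
    runtime_filter = BitsetFilter([100, 103, 110])
    probe_keys = [99, 100, 104, 110, 111]
    print([key for key in probe_keys if runtime_filter.contains(key)])
```

Unlike a Bloom filter, a bitset over a dense key range has no false positives and needs no hashing, at the cost of memory proportional to the key range.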

Supports using ALTER TABLE to merge expression partitions based on time functions for optimized storage efficiency and query performance. #56840
Supports partition Time-to-live (TTL) for List-partitioned tables and materialized views. The property partition_retention_condition in tables and materialized views allows users to set data retention strategies for list partitions, enabling more flexible partition deletion. #53117
Supports using ALTER TABLE to delete partitions specified by common partition expressions, allowing users to flexibly delete partitions in batches. #53118

Upgraded FE compile target from Java 11 to Java 17 for better system stability and performance. #53617 #57030

Supports SSL-encrypted connections over the MySQL protocol. #54877
Enhanced authentication using external systems:
- Supports creating StarRocks users with OAuth 2.0 and JSON Web Token (JWT).
- Supports Security Integration to simplify the authentication process with external systems. Security Integration supports LDAP, OAuth 2.0, and JWT. #55846
Supports Group Provider to obtain the user group information from external authentication services. The group information can then be used in authentication and authorization. Group Provider supports acquiring group information from LDAP, operating systems, or files. Users can query the user group they belong to using the function current_group(). #56670

Supports specifying multiple partition columns or expressions to allow users to partition the data with a more flexible strategy. #52576
Supports setting query_rewrite_consistency to force_mv to force the system to use the materialized view for query rewrite, trading some data timeliness for stable performance. #53819

Supports pausing Routine Load jobs on JSON parse errors by setting the property pause_on_json_parse_error to true. #56062
[Experimental] Supports transactions with multiple SQL statements (currently, only INSERT is supported). Users can start, apply, or undo a transaction to guarantee the ACID (atomicity, consistency, isolation, and durability) properties of multiple loading operations. #53978

Introduced the system variable lower_upper_support_utf8 at the session and global levels, enhancing the support for UTF-8 strings (especially non-ASCII characters) in case conversion functions such as upper() and lower(). #56192
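The difference between ASCII-only and UTF-8-aware case conversion can be illustrated with the Python sketch below. It shows the general distinction only; the helper names are hypothetical, and the exact behavior of upper() and lower() with the variable on or off should be checked in the StarRocks documentation.

```python
def ascii_upper(text: str) -> str:
    """Upper-case only the ASCII letters a-z and leave other characters untouched."""
    return "".join(chr(ord(ch) - 32) if "a" <= ch <= "z" else ch for ch in text)


def unicode_upper(text: str) -> str:
    """Upper-case using full Unicode case mapping."""
    return text.upper()


if __name__ == "__main__":
    word = "\u00e9lan"  # "elan" with an acute accent on the first letter
    print(ascii_upper(word))    # the accented letter stays lower-case
    print(unicode_upper(word))  # the accented letter is converted as well
```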
Added new functions:
- field() #5533
- ds_theta_count_distinct() #56960
- array_flatten() #50080
- inet_aton() #51883
- percentile_approx_weight() #57410
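For orientation, the Python sketch below models what three of these functions are commonly understood to compute. It assumes that inet_aton() and field() mirror the like-named MySQL functions and that array_flatten() flattens a single level of nesting. These notes do not spell out the semantics, so treat the code as an approximation and consult the function reference for exact behavior (NULL handling, type rules, and so on).

```python
from typing import Any, List


def inet_aton(address: str) -> int:
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError(f"not a dotted-quad IPv4 address: {address!r}")
    result = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError(f"octet out of range: {part!r}")
        # Shift previous octets left and append the new one.
        result = (result << 8) | octet
    return result


def array_flatten(nested: List[List[Any]]) -> List[Any]:
    """Flatten exactly one level of nesting (assumed semantics)."""
    return [item for inner in nested for item in inner]


def field(needle: Any, *candidates: Any) -> int:
    """Return the 1-based position of needle in candidates, or 0 if absent."""
    for position, candidate in enumerate(candidates, start=1):
        if candidate == needle:
            return position
    return 0


if __name__ == "__main__":
    print(inet_aton("192.168.1.1"))
    print(array_flatten([[1, 2], [3, [4]]]))
    print(field("b", "a", "b", "c"))
```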

JDK 17 or later is required from StarRocks v3.5.0 onwards. To upgrade a cluster from v3.4 or earlier, you must upgrade the JDK that StarRocks depends on, and remove the options that are incompatible with JDK 17 from the configuration item JAVA_OPTS in the FE configuration file fe.conf, for example, options that involve CMS and GC. In addition, as of v3.5.0, StarRocks no longer provides JVM configurations for specific JDK versions. All JDK versions use JAVA_OPTS.
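When reviewing JAVA_OPTS before the upgrade, a small helper such as the Python sketch below can flag candidate options for removal. The marker list is an illustrative, non-exhaustive assumption based on well-known legacy HotSpot flags (CMS collector and old GC logging flags); it is not an official StarRocks list, and each flagged option should be verified against the JDK 17 documentation before it is removed.

```python
import shlex
from typing import List, Tuple

# Assumption: substrings of legacy HotSpot flags that are no longer supported
# by recent JDKs. Illustrative only; extend or trim after checking JDK 17 docs.
LEGACY_MARKERS = ("ConcMarkSweep", "CMS", "ParNew", "PrintGCDateStamps")


def split_java_opts(java_opts: str) -> Tuple[List[str], List[str]]:
    """Split a JAVA_OPTS string into (kept, flagged) option lists."""
    kept: List[str] = []
    flagged: List[str] = []
    # shlex.split honors shell-style quoting inside the option string.
    for option in shlex.split(java_opts):
        if any(marker in option for marker in LEGACY_MARKERS):
            flagged.append(option)
        else:
            kept.append(option)
    return kept, flagged


if __name__ == "__main__":
    opts = (
        "-Xmx8192m -XX:+UseConcMarkSweepGC "
        "-XX:CMSInitiatingOccupancyFraction=80 -XX:+PrintGCDateStamps"
    )
    kept, flagged = split_java_opts(opts)
    print("keep:   ", " ".join(kept))
    print("review: ", " ".join(flagged))
```

The substring match is deliberately coarse, so an unrelated option that happens to contain one of the markers would also be flagged. Treat the output as a review list, not as an automatic rewrite of fe.conf.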