Skip to main content
Version: Latest-3.5

Query Profile Metrics

Authoritative reference for raw metrics emitted by StarRocks Query Profile, grouped by operator.
Use it as a glossary; for troubleshooting guidance jump to query_profile_tuning_recipes.md.

Summary Metrics

Basic information about the query execution:

MetricDescription
TotalThe total time consumed by the query, including Planning, Executing, and Profiling phase durations.
Query StateQuery state, possible states include Finished, Error, and Running.
Query IDUnique identifier for the query.
Start TimeTimestamp when the query started.
End TimeTimestamp when the query ended.
TotalTotal duration of the query.
Query TypeType of the query.
Query StateCurrent state of the query.
StarRocks VersionVersion of StarRocks used.
UserUser who executed the query.
Default DbDefault database used for the query.
Sql StatementSQL statement executed.
VariablesImportant variables used for the query.
NonDefaultSessionVariablesNon-default session variables used for the query.
Collect Profile TimeTime taken to collect the profile.
IsProfileAsyncIndicates if the profile collection was asynchronous.

Planner Metrics

It provides a comprehensive overview of the planner. Typically, if the total time spent on the planner is less than 10ms, it is not a cause for concern.

In certain scenarios, the planner may require more time:

  1. Complex queries may necessitate additional time for parsing and optimization to ensure an optimal execution plan.
  2. The presence of numerous materialized views can increase the time required for query rewriting.
  3. When multiple concurrent queries exhaust system resources and the query queue is utilized, the Pending time may be prolonged.
  4. Queries involving external tables may incur additional time for communication with the external metadata server.

Example:

     - -- Parser[1] 0
- -- Total[1] 3ms
- -- Analyzer[1] 0
- -- Lock[1] 0
- -- AnalyzeDatabase[1] 0
- -- AnalyzeTemporaryTable[1] 0
- -- AnalyzeTable[1] 0
- -- Transformer[1] 0
- -- Optimizer[1] 1ms
- -- MVPreprocess[1] 0
- -- MVTextRewrite[1] 0
- -- RuleBaseOptimize[1] 0
- -- CostBaseOptimize[1] 0
- -- PhysicalRewrite[1] 0
- -- DynamicRewrite[1] 0
- -- PlanValidate[1] 0
- -- InputDependenciesChecker[1] 0
- -- TypeChecker[1] 0
- -- CTEUniqueChecker[1] 0
- -- ColumnReuseChecker[1] 0
- -- ExecPlanBuild[1] 0
- -- Pending[1] 0
- -- Prepare[1] 0
- -- Deploy[1] 2ms
- -- DeployLockInternalTime[1] 2ms
- -- DeploySerializeConcurrencyTime[2] 0
- -- DeployStageByStageTime[6] 0
- -- DeployWaitTime[6] 1ms
- -- DeployAsyncSendTime[2] 0
- DeployDataSize: 10916
Reason:

Execution Overview Metrics

High-level execution statistics:

MetricDescriptionRule of Thumb
FrontendProfileMergeTimeFE-side profile processing time< 10ms normal
QueryAllocatedMemoryUsageTotal allocated memory across nodes
QueryDeallocatedMemoryUsageTotal deallocated memory across nodes
QueryPeakMemoryUsagePerNodeMaximum peak memory per node< 80% capacity normal
QuerySumMemoryUsageTotal peak memory across nodes
QueryExecutionWallTimeWall time of execution
QueryCumulativeCpuTimeTotal CPU time across nodesCompare with walltime * totalCpuCores
QueryCumulativeOperatorTimeTotal operator execution timeDenominator for operator time percentages
QueryCumulativeNetworkTimeTotal Exchange node network time
QueryCumulativeScanTimeTotal Scan node IO time
QueryPeakScheduleTimeMaximum Pipeline ScheduleTime< 1s normal for simple queries
QuerySpillBytesData spilled to disk< 1GB normal

Fragment Metrics

Fragment-level execution details:

MetricDescription
InstanceNumNumber of FragmentInstances
InstanceIdsIDs of all FragmentInstances
BackendNumNumber of participating BEs
BackendAddressesBE addresses
FragmentInstancePrepareTimeFragment Prepare phase duration
InstanceAllocatedMemoryUsageTotal allocated memory for instances
InstanceDeallocatedMemoryUsageTotal deallocated memory for instances
InstancePeakMemoryUsagePeak memory across instances

Pipeline Metrics

Pipeline execution details and relationships:

profile_pipeline_time_relationship

Key relationships:

  • DriverTotalTime = ActiveTime + PendingTime + ScheduleTime
  • ActiveTime = ∑ OperatorTotalTime + OverheadTime
  • PendingTime = InputEmptyTime + OutputFullTime + PreconditionBlockTime + PendingFinishTime
  • InputEmptyTime = FirstInputEmptyTime + FollowupInputEmptyTime
MetricDescription
DegreeOfParallelismDegree of pipeline execution parallelism.
TotalDegreeOfParallelismSum of degrees of parallelism. Since the same Pipeline may execute on multiple machines, this item aggregates all values.
DriverPrepareTimeTime taken by the Prepare phase. This metric is not included in DriverTotalTime.
DriverTotalTimeTotal execution time of the Pipeline, excluding the time spent in the Prepare phase.
ActiveTimeExecution time of the Pipeline, including the execution time of each operator and the overall framework overhead, such as time spent in invoking methods like has_output, need_input, etc.
PendingTimeTime the Pipeline is blocked from being scheduled for various reasons.
InputEmptyTimeTime the Pipeline is blocked due to an empty input queue.
FirstInputEmptyTimeTime the Pipeline is first blocked due to an empty input queue. The first blocking time is separately calculated because the first blocking is mainly caused by Pipeline dependencies.
FollowupInputEmptyTimeTime the Pipeline is subsequently blocked due to an empty input queue.
OutputFullTimeTime the Pipeline is blocked due to a full output queue.
PreconditionBlockTimeTime the Pipeline is blocked due to unmet dependencies.
PendingFinishTimeTime the Pipeline is blocked waiting for asynchronous tasks to finish.
ScheduleTimeScheduling time of the Pipeline, from entering the ready queue to being scheduled for execution.
BlockByInputEmptyNumber of times the pipeline is blocked due to InputEmpty.
BlockByOutputFullNumber of times the pipeline is blocked due to OutputFull.
BlockByPreconditionNumber of times the pipeline is blocked due to unmet preconditions.

Operator Metrics

MetricDescription
PrepareTimeTime spent on preparation.
OperatorTotalTimeTotal time consumed by the Operator. It satisfies the equation: OperatorTotalTime = PullTotalTime + PushTotalTime + SetFinishingTime + SetFinishedTime + CloseTime. It excludes time spent on preparation.
PullTotalTimeTotal time the Operator spends executing push_chunk.
PushTotalTimeTotal time the Operator spends executing pull_chunk.
SetFinishingTimeTotal time the Operator spends executing set_finishing.
SetFinishedTimeTotal time the Operator spends executing set_finished.
PushRowNumCumulative number of input rows for the Operator.
PullRowNumCumulative number of output rows for the Operator.
JoinRuntimeFilterEvaluateNumber of times Join Runtime Filter is evaluated.
JoinRuntimeFilterHashTimeTime spent computing hash for Join Runtime Filter.
JoinRuntimeFilterInputRowsNumber of input rows for Join Runtime Filter.
JoinRuntimeFilterOutputRowsNumber of output rows for Join Runtime Filter.
JoinRuntimeFilterTimeTime spent on Join Runtime Filter.

Scan Operator

OLAP Scan Operator

The OLAP_SCAN Operator is responsible for reading data from StarRocks native tables.

MetricDescription
TableTable name.
RollupMaterialized view name. If no materialized view is hit, it is equivalent to the table name.
SharedScanWhether the enable_shared_scan session variable is enabled.
TabletCountNumber of tablets.
MorselsCountNumber of morsels, which is the basic IO execution unit.
PushdownPredicatesNumber of pushdown predicates.
PredicatesPredicate expressions.
BytesReadSize of data read.
CompressedBytesReadSize of compressed data read from disk.
UncompressedBytesReadSize of uncompressed data read from disk.
RowsReadNumber of rows read (after predicate filtering).
RawRowsReadNumber of raw rows read (before predicate filtering).
ReadPagesNumNumber of pages read.
CachedPagesNumNumber of cached pages.
ChunkBufferCapacityCapacity of the Chunk Buffer.
DefaultChunkBufferCapacityDefault capacity of the Chunk Buffer.
PeakChunkBufferMemoryUsagePeak memory usage of the Chunk Buffer.
PeakChunkBufferSizePeak size of the Chunk Buffer.
PrepareChunkSourceTimeTime spent preparing the Chunk Source.
ScanTimeCumulative scan time. Scan operations are completed in an asynchronous I/O thread pool.
IOTaskExecTimeExecution time of IO tasks.
IOTaskWaitTimeWaiting time from successful submission to scheduled execution of IO tasks.
SubmitTaskCountNumber of times IO tasks are submitted.
SubmitTaskTimeTime spent on task submission.
PeakIOTasksPeak number of IO tasks.
PeakScanTaskQueueSizePeak size of the IO task queue.

Connector Scan Operator

It's similar to OLAP_SCAN operator but used for scan external tables like Iceberg/Hive/Hudi/Detal.

MetricDescription
DataSourceTypeData source type, can be HiveDataSource, ESDataSource, and so on.
TableTable name.
TabletCountNumber of tablets.
MorselsCountNumber of morsels.
PredicatesPredicate expression.
PredicatesPartitionPredicate expression applied to partitions.
SharedScanWhether the enable_shared_scan Session variable is enabled.
ChunkBufferCapacityCapacity of the Chunk Buffer.
DefaultChunkBufferCapacityDefault capacity of the Chunk Buffer.
PeakChunkBufferMemoryUsagePeak memory usage of the Chunk Buffer.
PeakChunkBufferSizePeak size of the Chunk Buffer.
PrepareChunkSourceTimeTime taken to prepare the Chunk Source.
ScanTimeCumulative time for scanning. Scan operation is completed in the asynchronous I/O thread pool.
IOTaskExecTimeExecution time of I/O tasks.
IOTaskWaitTimeWaiting time from successful submission to scheduled execution of IO tasks.
SubmitTaskCountNumber of times IO tasks are submitted.
SubmitTaskTimeTime taken to submit tasks.
PeakIOTasksPeak number of IO tasks.
PeakScanTaskQueueSizePeak size of the IO task queue.

Exchange Operator

Exchange Operator is responsible for transmitting data between BE nodes. There can be several kinds of exchange operations: GATHER/BROADCAST/SHUFFLE.

Typical scenarios that can make Exchange Operator the bottleneck of a query:

  1. Broadcast Join: This is a suitable method for a small table. However, in exceptional cases when the optimizer chooses a suboptimal query plan, it can lead to a significant increase in network bandwidth.
  2. Shuffle Aggregation/Join: Shuffling a large table can result in a significant increase in network bandwidth.

Exchange Sink Operator

MetricDescription
ChannelNumNumber of channels. Generally, the number of channels is equal to the number of receivers.
DestFragmentsList of destination FragmentInstance IDs.
DestIDDestination node ID.
PartTypeData distribution mode, including: UNPARTITIONED, RANDOM, HASH_PARTITIONED, and BUCKET_SHUFFLE_HASH_PARTITIONED.
SerializeChunkTimeTime taken to serialize chunks.
SerializedBytesSize of serialized data.
ShuffleChunkAppendCounterNumber of Chunk Append operations when PartType is HASH_PARTITIONED or BUCKET_SHUFFLE_HASH_PARTITIONED.
ShuffleChunkAppendTimeTime taken for Chunk Append operations when PartType is HASH_PARTITIONED or BUCKET_SHUFFLE_HASH_PARTITIONED.
ShuffleHashTimeTime taken to calculate hash when PartType is HASH_PARTITIONED or BUCKET_SHUFFLE_HASH_PARTITIONED.
RequestSentNumber of data packets sent.
RequestUnsentNumber of unsent data packets. This metric is non-zero when there is a short-circuit logic; otherwise, it is zero.
BytesSentSize of sent data.
BytesUnsentSize of unsent data. This metric is non-zero when there is a short-circuit logic; otherwise, it is zero.
BytesPassThroughIf the destination node is the current node, data will not be transmitted over the network, which is called passthrough data. This metric indicates the size of such passthrough data. Passthrough is controlled by enable_exchange_pass_through.
PassThroughBufferPeakMemoryUsagePeak memory usage of the PassThrough Buffer.
CompressTimeCompression time.
CompressedBytesSize of compressed data.
OverallThroughputThroughput rate.
NetworkTimeTime taken for data packet transmission (excluding post-reception processing time).
NetworkBandwidthEstimated network bandwidth.
WaitTimeWaiting time due to a full sender queue.
OverallTimeTotal time for the entire transmission process, i.e., from sending the first data packet to confirming the correct reception of the last data packet.
RpcAvgTimeAverage time for RPC.
RpcCountTotal number of RPCs.

Exchange Source Operator

MetricDescription
RequestReceivedSize of received data packets.
BytesReceivedSize of received data.
DecompressChunkTimeTime taken to decompress chunks.
DeserializeChunkTimeTime taken to deserialize chunks.
ClosureBlockCountNumber of blocked Closures.
ClosureBlockTimeBlocked time for Closures.
ReceiverProcessTotalTimeTotal time taken for receiver-side processing.
WaitLockTimeLock waiting time.

Aggregate Operator

Metrics List

MetricDescription
GroupingKeysGROUP BY columns.
AggregateFunctionsTime taken for aggregate function calculations.
AggComputeTimeTime for AggregateFunctions + Group By.
ChunkBufferPeakMemPeak memory usage of the Chunk Buffer.
ChunkBufferPeakSizePeak size of the Chunk Buffer.
ExprComputeTimeTime for expression computation.
ExprReleaseTimeTime for expression release.
GetResultsTimeTime to extract aggregate results.
HashTableSizeSize of the Hash Table.
HashTableMemoryUsageMemory size of the Hash Table.
InputRowCountNumber of input rows.
PassThroughRowCountIn Auto mode, the number of data rows processed in streaming mode after low aggregation leads to degradation to streaming mode.
ResultAggAppendTimeTime taken to append aggregate result columns.
ResultGroupByAppendTimeTime taken to append Group By columns.
ResultIteratorTimeTime to iterate over the Hash Table.
StreamingTimeProcessing time in streaming mode.

Join Operator

Metrics List

MetricDescription
DistributionModeDistribution type, including: BROADCAST, PARTITIONED, COLOCATE, etc.
JoinPredicatesJoin predicates.
JoinTypeJoin type.
BuildBucketsNumber of buckets in the Hash Table.
BuildKeysPerBucketNumber of keys per bucket in the Hash Table.
BuildConjunctEvaluateTimeTime taken for conjunct evaluation during build phase.
BuildHashTableTimeTime taken to build the Hash Table.
ProbeConjunctEvaluateTimeTime taken for conjunct evaluation during probe phase.
SearchHashTableTimerTime taken to search the Hash Table.
CopyRightTableChunkTimeTime taken to copy chunks from the right table.
OutputBuildColumnTimeTime taken to output the column of build side.
OutputProbeColumnTimeTime taken to output the column of probe side.
HashTableMemoryUsageMemory usage of the Hash Table.
RuntimeFilterBuildTimeTime taken to build runtime filters.
RuntimeFilterNumNumber of runtime filters.

Window Function Operator

MetricDescription
ProcessModeExecution mode, including two parts: the first part includes Materializing and Streaming; the second part includes Cumulative, RemovableCumulative, ByDefinition.
ComputeTimeTime taken for window function calculations.
PartitionKeysPartition columns.
AggregateFunctionsAggregate functions.
ColumnResizeTimeTime taken for column resizing.
PartitionSearchTimeTime taken to search partition boundaries.
PeerGroupSearchTimeTime taken to search Peer Group boundaries. Meaningful only when the window type is RANGE.
PeakBufferedRowsPeak number of rows in the buffer.
RemoveUnusedRowsCountNumber of times unused buffers are removed.
RemoveUnusedTotalRowsTotal number of rows removed from unused buffers.

Sort Operator

MetricDescription
SortKeysSorting keys.
SortTypeQuery result sorting method: full sorting or sorting the top N results.
MaxBufferedBytesPeak size of buffered data.
MaxBufferedRowsPeak number of buffered rows.
NumSortedRunsNumber of sorted runs.
BuildingTimeTime taken to maintain internal data structures during sorting.
MergingTimeTime taken to merge sorted runs during sorting.
SortingTimeTime taken for sorting.
OutputTimeTime taken to build the output sorted sequence.

Merge Operator

MetricDescriptionLevel
LimitLimit.Primary
OffsetOffset.Primary
StreamingBatchSizeSize of data processed per Merge operation when Merge is performed in Streaming modePrimary
LateMaterializationMaxBufferChunkNumMaximum number of chunks in the buffer when late materialization is enabled.Primary
OverallStageCountTotal execution count of all stages.Primary
OverallStageTimeTotal execution time for each stage.Primary
1-InitStageCountExecution count of the Init stage.Secondary
2-PrepareStageCountExecution count of the Prepare stage.Secondary
3-ProcessStageCountExecution count of the Process stage.Secondary
4-SplitChunkStageCountExecution count of the SplitChunk stage.Secondary
5-FetchChunkStageCountExecution count of the FetchChunk stage.Secondary
6-PendingStageCountExecution count of the Pending stage.Secondary
7-FinishedStageCountExecution count of the Finished stage.Secondary
1-InitStageTimeExecution time for the Init stage.Secondary
2-PrepareStageTimeExecution time for the Prepare stage.Secondary
3-ProcessStageTimeExecution time for the Process stage.Secondary
4-SplitChunkStageTimeTime taken for the Split stage.Secondary
5-FetchChunkStageTimeTime taken for the Fetch stage.Secondary
6-PendingStageTimeTime taken for the Pending stage.Secondary
7-FinishedStageTimeTime taken for the Finished stage.Secondary
LateMaterializationGenerateOrdinalTimeTime taken for generating ordinal columns during late materialization.Tertiary
SortedRunProviderTimeTime taken to retrieve data from the provider during the Process stage.Tertiary

TableFunction Operator

MetricDescription
TableFunctionExecTimeComputation time for the Table Function.
TableFunctionExecCountNumber of executions for the Table Function.

Project Operator

Project Operator is responsible for performing SELECT <expr>. If there're some expensive expressions in the query, this operator can take significant time.

MetricDescription
ExprComputeTimeComputation time for expressions.
CommonSubExprComputeTimeComputation time for common sub-expressions.

LocalExchange Operator

MetricDescription
TypeType of Local Exchange, including: Passthrough, Partition, and Broadcast.
ShuffleNumNumber of shuffles. This metric is only valid when Type is Partition.
LocalExchangePeakMemoryUsagePeak memory usage.
LocalExchangePeakBufferSizePeak size of the buffer.
LocalExchangePeakBufferMemoryUsagePeak memory usage of the buffer.
LocalExchangePeakBufferChunkNumPeak number of chunks in the buffer.
LocalExchangePeakBufferRowNumPeak number of rows in the buffer.
LocalExchangePeakBufferBytesPeak size of data in the buffer.
LocalExchangePeakBufferChunkSizePeak size of chunks in the buffer.
LocalExchangePeakBufferChunkRowNumPeak number of rows per chunk in the buffer.
LocalExchangePeakBufferChunkBytesPeak size of data per chunk in the buffer.

OlapTableSink Operator

OlapTableSink Operator is responsible for performing the INSERT INTO <table> operation.

tip
  • An excessive difference between the Max and Min values of the PushChunkNum metric of OlapTableSink indicates data skew in the upstream operators, which may lead to a bottleneck in loading performance.
  • RpcClientSideTime equals RpcServerSideTime plus network transmission time plus RPC framework processing time. If there is a significant difference between RpcClientSideTime and RpcServerSideTime, consider enabling compression to reduce transmission time.
MetricDescription
IndexNumNumber of the synchronous materialized views created for the destination table.
ReplicatedStorageWhether Single Leader Replication is enabled.
TxnIDID of the loading transaction.
RowsReadNumber of rows read from upstream operators.
RowsFilteredNumber of rows filtered out due to inadequate data quality.
RowsReturnedNumber of rows written to the destination table.
RpcClientSideTimeTotal RPC time consumption for loading recorded by the client side.
RpcServerSideTimeTotal RPC time consumption for loading recorded by the server side.
PrepareDataTimeTotal time consumption for the data preparation phase, including data format conversion and data quality check.
SendDataTimeLocal time consumption for sending the data, including time for serializing and compressing data, and for submitting tasks to the sender queue.