diff --git a/docs/content/maintenance/metrics.md b/docs/content/maintenance/metrics.md
new file mode 100644
index 000000000000..6a8542f29100
--- /dev/null
+++ b/docs/content/maintenance/metrics.md
@@ -0,0 +1,399 @@
+---
+title: "Metrics"
+weight: 9
+type: docs
+aliases:
+- /maintenance/metrics.html
+---
+
+# Paimon Metrics
+
+Paimon provides a metrics system that measures read and write behavior, such as how many manifest files were scanned in the last planning, how long the last commit took, and how many files were deleted in the last compaction.
+
+In Paimon's metrics system, metrics are updated and reported at different levels of granularity. Currently, two levels are provided, **table** and **bucket**, so you can get metrics per table or per bucket.
+
+The Paimon metrics system provides three types of metrics: `Gauge`, `Counter`, and `Histogram`.
+- `Gauge`: Provides a value of any type at a point in time.
+- `Counter`: Used to count values by incrementing and decrementing.
+- `Histogram`: Measures the statistical distribution of a set of values, including the min, max, mean, standard deviation, and percentiles.
+
+Paimon provides built-in metrics for **commits**, **scans**, and **compactions**, which can be bridged to any computing engine that supports them, such as Flink or Spark.
+
+## Metrics List
+
+Below are the lists of Paimon built-in metrics, grouped into three categories: scan metrics, commit metrics, and compaction metrics.
+
+### Scan Metrics
+
+| Metrics Name | Level | Type | Description |
+|---|---|---|---|
+| lastScanDuration | Table | Gauge | The time it took to complete the last scan. |
+| scanDuration | Table | Histogram | Distribution of the time taken by the last few scans. |
+| lastScannedManifests | Table | Gauge | Number of manifest files scanned in the last scan. |
+| lastSkippedByPartitionAndStats | Table | Gauge | Number of table files skipped by the partition filter and value/key stats in the last scan. |
+| lastSkippedByBucketAndLevelFilter | Table | Gauge | Number of table files skipped by the bucket, bucket key, and level filters in the last scan. |
+| lastSkippedByWholeBucketFilesFilter | Table | Gauge | Number of table files skipped by the bucket-level value filter (primary key tables only) in the last scan. |
+| lastScanSkippedTableFiles | Table | Gauge | Total number of table files skipped in the last scan. |
+| lastScanResultedTableFiles | Table | Gauge | Number of table files resulting from the last scan. |
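As a rough illustration of how the scan gauges might be read together, the sketch below computes the fraction of table files pruned in the last scan. It assumes that `lastScanSkippedTableFiles` and `lastScanResultedTableFiles` together account for every file the scan considered, which this page does not state. `ScanPruning` and `skippedRatio` are hypothetical names, not part of Paimon, and the gauge readings are made up.

```java
/** Hypothetical helper, not part of Paimon: derives a pruning ratio from two scan gauges. */
final class ScanPruning {

    /**
     * Fraction of considered files that the last scan skipped.
     * Assumption: skipped + resulted equals the number of files considered.
     */
    static double skippedRatio(long lastScanSkippedTableFiles, long lastScanResultedTableFiles) {
        if (lastScanSkippedTableFiles < 0 || lastScanResultedTableFiles < 0) {
            throw new IllegalArgumentException("file counts must be non-negative");
        }
        long considered = lastScanSkippedTableFiles + lastScanResultedTableFiles;
        // An empty scan has nothing to prune, so report 0 instead of dividing by zero.
        return considered == 0 ? 0.0 : (double) lastScanSkippedTableFiles / considered;
    }

    public static void main(String[] args) {
        // Made-up gauge readings: 75 files skipped, 25 files kept.
        System.out.println(skippedRatio(75, 25));
    }
}
```

A ratio close to 1 would suggest that the partition, bucket, and stats filters are pruning most files before they are read.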
+### Commit Metrics
+
+| Metrics Name | Level | Type | Description |
+|---|---|---|---|
+| lastCommitDuration | Table | Gauge | The time it took to complete the last commit. |
+| commitDuration | Table | Histogram | Distribution of the time taken by the last few commits. |
+| lastCommitAttempts | Table | Gauge | Number of attempts made by the last commit. |
+| lastTableFilesAdded | Table | Gauge | Number of table files added in the last commit, including newly created data files and compact-after files. |
+| lastTableFilesDeleted | Table | Gauge | Number of table files deleted in the last commit, that is, the compact-before files. |
+| lastTableFilesAppended | Table | Gauge | Number of table files appended in the last commit, that is, the newly created data files. |
+| lastTableFilesCommitCompacted | Table | Gauge | Number of table files compacted in the last commit, including both compact-before and compact-after files. |
+| lastChangelogFilesAppended | Table | Gauge | Number of changelog files appended in the last commit. |
+| lastChangelogFileCommitCompacted | Table | Gauge | Number of changelog files compacted in the last commit. |
+| lastGeneratedSnapshots | Table | Gauge | Number of snapshot files generated in the last commit, either 1 or 2. |
+| lastDeltaRecordsAppended | Table | Gauge | Delta record count in the last commit with the APPEND commit kind. |
+| lastChangelogRecordsAppended | Table | Gauge | Changelog record count in the last commit with the APPEND commit kind. |
+| lastDeltaRecordsCommitCompacted | Table | Gauge | Delta record count in the last commit with the COMPACT commit kind. |
+| lastChangelogRecordsCommitCompacted | Table | Gauge | Changelog record count in the last commit with the COMPACT commit kind. |
+| lastPartitionsWritten | Table | Gauge | Number of partitions written in the last commit. |
+| lastBucketsWritten | Table | Gauge | Number of buckets written in the last commit. |
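The descriptions of the file-count gauges imply some simple relationships: added files are appended files plus compact-after files, deleted files are the compact-before files, and commit-compacted files are compact-before plus compact-after files. The sketch below works through that arithmetic with made-up counts. `CommitFileCounts` and its fields are hypothetical names for illustration, not Paimon classes, and the relationships are inferred from the descriptions rather than taken from the implementation.

```java
/** Hypothetical model of one commit's file counts, used only to illustrate the gauge arithmetic. */
final class CommitFileCounts {
    private final long appended;       // newly created data files
    private final long compactBefore;  // files consumed (deleted) by compaction
    private final long compactAfter;   // files produced by compaction

    CommitFileCounts(long appended, long compactBefore, long compactAfter) {
        this.appended = appended;
        this.compactBefore = compactBefore;
        this.compactAfter = compactAfter;
    }

    long lastTableFilesAppended() {
        return appended;
    }

    // Added = newly created data files + compact-after files.
    long lastTableFilesAdded() {
        return appended + compactAfter;
    }

    // Deleted = compact-before files.
    long lastTableFilesDeleted() {
        return compactBefore;
    }

    // Commit-compacted = compact-before + compact-after files.
    long lastTableFilesCommitCompacted() {
        return compactBefore + compactAfter;
    }

    public static void main(String[] args) {
        // Made-up commit: 3 new data files, and a compaction that merged 5 files into 1.
        CommitFileCounts commit = new CommitFileCounts(3, 5, 1);
        System.out.println("added=" + commit.lastTableFilesAdded());
        System.out.println("deleted=" + commit.lastTableFilesDeleted());
        System.out.println("compacted=" + commit.lastTableFilesCommitCompacted());
    }
}
```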
+### Compaction Metrics
+
+| Metrics Name | Level | Type | Description |
+|---|---|---|---|
+| lastCompactionDuration | Bucket | Gauge | The time it took to complete the last compaction. |
+| compactionDuration | Bucket | Histogram | Distribution of the time taken by the last few compactions. |
+| lastTableFilesCompactedBefore | Bucket | Gauge | Number of files deleted in the last compaction. |
+| lastTableFilesCompactedAfter | Bucket | Gauge | Number of files added in the last compaction. |
+| lastChangelogFilesCompacted | Bucket | Gauge | Number of changelog files compacted in the last compaction. |
+| lastRewriteInputFileSize | Bucket | Gauge | Size of the files deleted in the last compaction. |
+| lastRewriteOutputFileSize | Bucket | Gauge | Size of the files added in the last compaction. |
+| lastRewriteChangelogFileSize | Bucket | Gauge | Size of the changelog files compacted in the last compaction. |
+| | Scope | Infix |
+|---|---|---|
+| Scan Metrics | `<host>.jobmanager.<job_name>` | `<source_operator_name>.coordinator.enumerator.paimon.table.<table_name>.scan` |
+| Commit Metrics | `<host>.taskmanager.<tm_id>.<job_name>.<committer_operator_name>.<subtask_index>` | `paimon.table.<table_name>.commit` |
+| Compaction Metrics | `<host>.taskmanager.<tm_id>.<job_name>.<writer_operator_name>.<subtask_index>` | `paimon.table.<table_name>.partition.<partition_string>.bucket.<bucket_index>.compaction` |
+| Flink Source Metrics | `<host>.taskmanager.<tm_id>.<job_name>.<source_operator_name>.<subtask_index>` | - |
+| Flink Sink Metrics | `<host>.taskmanager.<tm_id>.<job_name>.<committer_operator_name>.<subtask_index>` | - |
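To make the scope and infix columns concrete, the sketch below joins a scope, an infix, and a metric name into one identifier. It assumes the full identifier has the form `<scope>.<infix>.<metric_name>` with `.` as the delimiter. The host, task manager, job, operator, and table values are made up, and `MetricIdentifier` is a hypothetical helper rather than a Paimon or Flink API.

```java
/** Hypothetical helper, not a Paimon or Flink API: builds a dotted metric identifier. */
final class MetricIdentifier {

    /** Joins scope, infix, and metric name, assuming the form scope.infix.metricName. */
    static String join(String scope, String infix, String metricName) {
        // Flink source and sink metrics have no infix ("-" in the table), so skip it when empty.
        if (infix == null || infix.isEmpty()) {
            return scope + "." + metricName;
        }
        return scope + "." + infix + "." + metricName;
    }

    /** Infix for commit metrics: paimon.table.<table_name>.commit */
    static String commitInfix(String tableName) {
        return "paimon.table." + tableName + ".commit";
    }

    public static void main(String[] args) {
        // Made-up values for <host>, <tm_id>, <job_name>, <committer_operator_name>, <subtask_index>.
        String scope = String.join(".", "localhost", "taskmanager", "tm-1", "myJob", "Committer", "0");
        System.out.println(join(scope, commitInfix("orders"), "lastCommitDuration"));
        System.out.println(join(scope, "", "numRecordsOut"));
    }
}
```

The actual scope depends on the scope format configured in Flink, so treat the values above as placeholders.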
+### Flink Source Metrics
+
+| Metrics Name | Level | Type | Description |
+|---|---|---|---|
+| currentFetchEventTimeLag | Flink Source Operator | Gauge | Time difference between when the data file is read and when it was created. |
+### Flink Sink Metrics
+
+| Metrics Name | Level | Type | Description |
+|---|---|---|---|
+| numBytesOut | Table | Counter | The total number of output bytes. |
+| numBytesOutPerSecond | Table | Meter | The output bytes per second. |
+| numRecordsOut | Table | Counter | The total number of output records. |
+| numRecordsOutPerSecond | Table | Meter | The output records per second. |
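The `PerSecond` meters report a rate, while the counters report a running total. As a rough sketch of how the two relate, the code below derives an average rate from two samples of a counter. This is plain arithmetic for illustration, not how Flink's `Meter` is implemented, and `OutputRate` and `ratePerSecond` are hypothetical names.

```java
/** Hypothetical helper: average rate between two samples of a running total. */
final class OutputRate {

    /**
     * Average increase per second between two counter samples.
     *
     * @param earlierCount counter value at the first sample
     * @param laterCount counter value at the second sample
     * @param elapsedMillis time between the two samples, in milliseconds
     */
    static double ratePerSecond(long earlierCount, long laterCount, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        // Multiply by 1000.0 first so the division is done in floating point.
        return (laterCount - earlierCount) * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        // Made-up samples of numRecordsOut taken 10 seconds apart.
        System.out.println(ratePerSecond(1_000, 6_000, 10_000));
    }
}
```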