From b541076c8b08a706fba91566d1d69421498619e1 Mon Sep 17 00:00:00 2001 From: 0xPoe Date: Mon, 30 Mar 2026 14:37:59 +0200 Subject: [PATCH 01/10] statistics: document analyze v1 removal fix: correct the job info Signed-off-by: 0xPoe fix Signed-off-by: 0xPoe --- .../information-schema-analyze-status.md | 4 +-- .../sql-statement-show-analyze-status.md | 33 ++++--------------- statistics.md | 29 ++++++++-------- system-variables.md | 13 +++++--- 4 files changed, 30 insertions(+), 49 deletions(-) diff --git a/information-schema/information-schema-analyze-status.md b/information-schema/information-schema-analyze-status.md index 4a57f1433ed7d..7ada4cea9a8e9 100644 --- a/information-schema/information-schema-analyze-status.md +++ b/information-schema/information-schema-analyze-status.md @@ -64,7 +64,7 @@ Fields in the `ANALYZE_STATUS` table are described as follows: * `TABLE_SCHEMA`: The name of the database to which the table belongs. * `TABLE_NAME`: The name of the table. * `PARTITION_NAME`: The name of the partitioned table. -* `JOB_INFO`: The information of the `ANALYZE` task. If an index is analyzed, this information will include the index name. When `tidb_analyze_version = 2`, this information will include configuration items such as sample rate. +* `JOB_INFO`: A brief description of the `ANALYZE` subtask. It shows the `ANALYZE` scope, such as columns, indexes, or global statistics merge, and might include the effective options used, such as `buckets`, `topn`, `samplerate`, or `samples`. * `PROCESSED_ROWS`: The number of rows that have been processed. * `START_TIME`: The start time of the `ANALYZE` task. * `END_TIME`: The end time of the `ANALYZE` task. 
@@ -79,4 +79,4 @@ Fields in the `ANALYZE_STATUS` table are described as follows: ## See also - [`ANALYZE TABLE`](/sql-statements/sql-statement-analyze-table.md) -- [`SHOW ANALYZE STATUS`](/sql-statements/sql-statement-show-analyze-status.md) \ No newline at end of file +- [`SHOW ANALYZE STATUS`](/sql-statements/sql-statement-show-analyze-status.md) diff --git a/sql-statements/sql-statement-show-analyze-status.md b/sql-statements/sql-statement-show-analyze-status.md index ea82ae75fcbd2..09a97923e829b 100644 --- a/sql-statements/sql-statement-show-analyze-status.md +++ b/sql-statements/sql-statement-show-analyze-status.md @@ -21,7 +21,7 @@ Currently, the `SHOW ANALYZE STATUS` statement returns the following columns: | `Table_schema` | The database name | | `Table_name` | The table name | | `Partition_name` | The partition name | -| `Job_info` | The task information. If an index is analyzed, this information will include the index name. When `tidb_analyze_version =2`, this information will include configuration items such as sample rate. | +| `Job_info` | A brief description of the `ANALYZE` subtask. It shows the `ANALYZE` scope, such as columns, indexes, or global statistics merge, and might include the effective options used, such as `buckets`, `topn`, `samplerate`, or `samples`. | | `Processed_rows` | The number of rows that have been analyzed | | `Start_time` | The time at which the task starts | | `State` | The state of a task, including `pending`, `running`, `finished`, and `failed` | @@ -39,31 +39,14 @@ ShowLikeOrWhereOpt ::= 'LIKE' SimpleExpr | 'WHERE' Expression ## Examples +> **Note:** +> +> Statistics Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. The following example shows the current `ANALYZE` behavior with Statistics Version 2. 
+ ```sql mysql> create table t(x int, index idx(x)) partition by hash(x) partitions 2; Query OK, 0 rows affected (0.69 sec) -mysql> set @@tidb_analyze_version = 1; -Query OK, 0 rows affected (0.00 sec) - -mysql> analyze table t; -Query OK, 0 rows affected (0.20 sec) - -mysql> show analyze status; -+--------------+------------+----------------+-------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+------------------+----------+---------------------+ -| Table_schema | Table_name | Partition_name | Job_info | Processed_rows | Start_time | End_time | State | Fail_reason | Instance | Process_ID | Remaining_seconds| Progress | Estimated_total_rows| -+--------------+------------+----------------+-------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+------------------+----------+---------------------+ -| test | t | p1 | analyze index idx | 0 | 2022-05-27 11:29:46 | 2022-05-27 11:29:46 | finished | NULL | 127.0.0.1:4000 | NULL | NULL | NULL | NULL | -| test | t | p0 | analyze index idx | 0 | 2022-05-27 11:29:46 | 2022-05-27 11:29:46 | finished | NULL | 127.0.0.1:4000 | NULL | NULL | NULL | NULL | -| test | t | p1 | analyze columns | 0 | 2022-05-27 11:29:46 | 2022-05-27 11:29:46 | finished | NULL | 127.0.0.1:4000 | NULL | NULL | NULL | NULL | -| test | t | p0 | analyze columns | 0 | 2022-05-27 11:29:46 | 2022-05-27 11:29:46 | finished | NULL | 127.0.0.1:4000 | NULL | NULL | NULL | NULL | -| test | t1 | p0 | analyze columns | 28523259 | 2022-05-27 11:29:46 | 2022-05-27 11:29:46 | running | NULL | 127.0.0.1:4000 | 690208308 | 0s | 0.9843 | 28978290 | -+--------------+------------+----------------+-------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+------------------+----------+---------------------+ -4 rows in set (0.01 sec) - -mysql> 
set @@tidb_analyze_version = 2; -Query OK, 0 rows affected (0.00 sec) - mysql> analyze table t; Query OK, 0 rows affected, 2 warnings (0.03 sec) @@ -73,12 +56,8 @@ mysql> show analyze status; +--------------+------------+----------------+--------------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+--------------------+----------+----------------------+ | test | t | p1 | analyze table all columns with 256 buckets, 500 topn, 1 samplerate | 0 | 2022-05-27 11:30:12 | 2022-05-27 11:30:12 | finished | NULL | 127.0.0.1:4000 | NULL | NULL | NULL | NULL | | test | t | p0 | analyze table all columns with 256 buckets, 500 topn, 1 samplerate | 0 | 2022-05-27 11:30:12 | 2022-05-27 11:30:12 | finished | NULL | 127.0.0.1:4000 | NULL | NULL | NULL | NULL | -| test | t | p1 | analyze index idx | 0 | 2022-05-27 11:29:46 | 2022-05-27 11:29:46 | finished | NULL | 127.0.0.1:4000 | NULL | NULL | NULL | NULL | -| test | t | p0 | analyze index idx | 0 | 2022-05-27 11:29:46 | 2022-05-27 11:29:46 | finished | NULL | 127.0.0.1:4000 | NULL | NULL | NULL | NULL | -| test | t | p1 | analyze columns | 0 | 2022-05-27 11:29:46 | 2022-05-27 11:29:46 | finished | NULL | 127.0.0.1:4000 | NULL | NULL | NULL | NULL | -| test | t | p0 | analyze columns | 0 | 2022-05-27 11:29:46 | 2022-05-27 11:29:46 | finished | NULL | 127.0.0.1:4000 | NULL | NULL | NULL | NULL | +--------------+------------+----------------+--------------------------------------------------------------------+----------------+---------------------+---------------------+----------+-------------+----------------+------------+--------------------+----------+----------------------+ -6 rows in set (0.00 sec) +2 rows in set (0.00 sec) ``` ## MySQL compatibility diff --git a/statistics.md b/statistics.md index 30265ab448f73..0fc16a35cccbc 100644 --- a/statistics.md +++ b/statistics.md @@ -300,7 +300,7 @@ TiDB will 
overwrite the previously recorded persistent configuration using the n ### Disable ANALYZE configuration persistence -To disable the `ANALYZE` configuration persistence feature, set the `tidb_persist_analyze_options` system variable to `OFF`. Because the `ANALYZE` configuration persistence feature is not applicable to `tidb_analyze_version = 1`, setting `tidb_analyze_version = 1` can also disable the feature. +To disable the `ANALYZE` configuration persistence feature, set the `tidb_persist_analyze_options` system variable to `OFF`. After disabling the `ANALYZE` configuration persistence feature, TiDB does not clear the persisted configuration records. Therefore, if you enable this feature again, TiDB continues to collect statistics using the previously recorded persistent configurations. @@ -356,13 +356,17 @@ WHERE db_name = 'test' AND table_name = 't' AND last_analyzed_at IS NOT NULL; ## Versions of statistics -The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, two versions of statistics are supported: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`. +> **Warning:** +> +> Statistics Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. TiDB keeps reading existing Version 1 statistics for upgrade compatibility, but all new `ANALYZE` operations use Statistics Version 2 (`tidb_analyze_version = 2`). It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions). + +The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, TiDB supports two statistics versions: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`. 
- For TiDB Self-Managed, the default value of this variable changes from `1` to `2` starting from v5.3.0. - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0. -- If your cluster is upgraded from an earlier version, the default value of `tidb_analyze_version` does not change after the upgrade. +- When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade. -Version 2 is preferred, and will continue to be enhanced to ultimately replace Version 1 completely. Compared to Version 1, Version 2 improves the accuracy of many of the statistics collected for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation, and also supporting automated collection only on selected columns (see [Collecting statistics on some columns](#collect-statistics-on-some-columns)). +Version 2 is the recommended statistics version. Compared to Version 1, Version 2 improves the accuracy of many statistics for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation, and it supports automated collection only on selected columns (see [Collecting statistics on some columns](#collect-statistics-on-some-columns)). For new statistics collection, Version 2 is the only supported statistics version. The following table lists the information collected by each version for usage in the optimizer estimates: @@ -377,11 +381,11 @@ The following table lists the information collected by each version for usage in ### Switch between statistics versions -It is recommended to ensure that all tables/indexes (and partitions) utilize statistics collection from the same version. 
Version 2 is recommended, however, it is not recommended to switch from one version to another without a justifiable reason such as an issue experienced with the version in use. A switch between versions might take a period of time when no statistics are available until all tables have been analyzed with the new version, which might negatively affect the optimizer plan choices if statistics are not available. +It is recommended that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Until Version 2 statistics are collected for an object (such as a table, an index, or a partition), TiDB continues to use the existing Version 1 statistics for that object. -Examples of justifications to switch might include - with Version 1, there could be inaccuracies in equal/IN predicate estimation due to hash collisions when collecting Count-Min sketch statistics. Solutions are listed in the [Count-Min Sketch](#count-min-sketch) section. Alternatively, setting `tidb_analyze_version = 2` and rerunning `ANALYZE` on all objects is also a solution. In the early release of Version 2, there was a risk of memory overflow after `ANALYZE`. This issue is resolved, but initially, one solution was to set `tidb_analyze_version = 1` and rerun `ANALYZE` on all objects. +One major reason to migrate is that Version 1 might produce inaccurate estimates for equal/IN predicates because the Count-Min sketch can have hash collisions. For more information, see [Count-Min Sketch](#count-min-sketch). To avoid this issue, set `tidb_analyze_version = 2` and rerun `ANALYZE` on all objects. -To prepare `ANALYZE` for switching between versions: +To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Version 2: - If the `ANALYZE` statement is executed manually, manually analyze every table to be analyzed. 
@@ -389,17 +393,10 @@ To prepare `ANALYZE` for switching between versions: SELECT DISTINCT(CONCAT('ANALYZE TABLE ', table_schema, '.', table_name, ';')) FROM information_schema.tables JOIN mysql.stats_histograms ON table_id = tidb_table_id - WHERE stats_ver = 2; + WHERE stats_ver = 1; ``` -- If TiDB automatically executes the `ANALYZE` statement because the auto-analysis has been enabled, execute the following statement that generates the [`DROP STATS`](/sql-statements/sql-statement-drop-stats.md) statement: - - ```sql - SELECT DISTINCT(CONCAT('DROP STATS ', table_schema, '.', table_name, ';')) - FROM information_schema.tables JOIN mysql.stats_histograms - ON table_id = tidb_table_id - WHERE stats_ver = 2; - ``` +- If TiDB automatically executes the `ANALYZE` statement because auto-analysis is enabled, after you set `tidb_analyze_version = 2`, TiDB gradually refreshes statistics to Version 2 through subsequent auto-analysis. Before Version 2 statistics are collected for an object, TiDB can continue to use its existing Version 1 statistics. To speed up the migration for important objects, run `ANALYZE` on them manually. - If the result of the preceding statement is too long to copy and paste, you can export the result to a temporary text file and then perform execution from the file like this: diff --git a/system-variables.md b/system-variables.md index 4d56f6afa0371..ccd6cc27e3982 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1181,16 +1181,21 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a ### tidb_analyze_version New in v5.1.0 +> **Warning:** +> +> Statistics Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. TiDB keeps reading existing Version 1 statistics for upgrade compatibility, but all new `ANALYZE` operations use Statistics Version 2 (`tidb_analyze_version = 2`). It is recommended that you use `tidb_analyze_version = 2`. 
+ - Scope: SESSION | GLOBAL - Persists to cluster: Yes - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No - Type: Integer - Default value: `2` -- Range: `[1, 2]` +- Range: `[1, 2]`. Only `2` is supported for new statistics collection. - Controls how TiDB collects statistics. - - For TiDB Self-Managed, the default value of this variable changes from `1` to `2` starting from v5.3.0. - - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0. - - If your cluster is upgraded from an earlier version, the default value of `tidb_analyze_version` does not change after the upgrade. + - If you try to set this variable to `1`, TiDB returns an error. + - For TiDB Self-Managed, the default value of this variable changed from `1` to `2` starting from v5.3.0. + - For TiDB Cloud, the default value of this variable changed from `1` to `2` starting from v6.5.0. + - When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade. - For detailed introduction about this variable, see [Introduction to Statistics](/statistics.md). 
### tidb_analyze_skip_column_types New in v7.2.0 From b82e6fbd9f30d9c6f833062fdb98f94a829c1af0 Mon Sep 17 00:00:00 2001 From: Dongpo Liu Date: Tue, 28 Apr 2026 10:33:54 +0200 Subject: [PATCH 02/10] Update information-schema/information-schema-analyze-status.md Co-authored-by: Grace Cai --- information-schema/information-schema-analyze-status.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/information-schema/information-schema-analyze-status.md b/information-schema/information-schema-analyze-status.md index 7ada4cea9a8e9..875cccb3f5bb1 100644 --- a/information-schema/information-schema-analyze-status.md +++ b/information-schema/information-schema-analyze-status.md @@ -64,7 +64,7 @@ Fields in the `ANALYZE_STATUS` table are described as follows: * `TABLE_SCHEMA`: The name of the database to which the table belongs. * `TABLE_NAME`: The name of the table. * `PARTITION_NAME`: The name of the partitioned table. -* `JOB_INFO`: A brief description of the `ANALYZE` subtask. It shows the `ANALYZE` scope, such as columns, indexes, or global statistics merge, and might include the effective options used, such as `buckets`, `topn`, `samplerate`, or `samples`. +* `JOB_INFO`: A brief description of the `ANALYZE` subtask. It shows the `ANALYZE` scope, such as columns, indexes, or global statistics merging, and might include the effective options used, such as `buckets`, `topn`, `samplerate`, or `samples`. * `PROCESSED_ROWS`: The number of rows that have been processed. * `START_TIME`: The start time of the `ANALYZE` task. * `END_TIME`: The end time of the `ANALYZE` task. 
From 0482877856fdf772020cca7d527bd182c03b6435 Mon Sep 17 00:00:00 2001 From: Dongpo Liu Date: Tue, 28 Apr 2026 10:34:18 +0200 Subject: [PATCH 03/10] Update statistics.md Co-authored-by: Grace Cai --- statistics.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/statistics.md b/statistics.md index 0fc16a35cccbc..27a651512dfe4 100644 --- a/statistics.md +++ b/statistics.md @@ -383,7 +383,7 @@ The following table lists the information collected by each version for usage in It is recommended that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Until Version 2 statistics are collected for an object (such as a table, an index, or a partition), TiDB continues to use the existing Version 1 statistics for that object. -One major reason to migrate is that Version 1 might produce inaccurate estimates for equal/IN predicates because the Count-Min sketch can have hash collisions. For more information, see [Count-Min Sketch](#count-min-sketch). To avoid this issue, set `tidb_analyze_version = 2` and rerun `ANALYZE` on all objects. +One major reason to migrate is that Version 1 might produce inaccurate estimates for equality and `IN` predicates because the Count-Min sketch can have hash collisions. For more information, see [Count-Min Sketch](#count-min-sketch). To avoid this issue, set `tidb_analyze_version = 2` and rerun `ANALYZE` on all objects. 
To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Version 2: From 903d4b2b9605431bfd7799aceedd3cbd7024a8ac Mon Sep 17 00:00:00 2001 From: Dongpo Liu Date: Tue, 28 Apr 2026 10:34:53 +0200 Subject: [PATCH 04/10] Update statistics.md Co-authored-by: Grace Cai --- statistics.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/statistics.md b/statistics.md index 27a651512dfe4..3682d4e9078c0 100644 --- a/statistics.md +++ b/statistics.md @@ -356,11 +356,14 @@ WHERE db_name = 'test' AND table_name = 't' AND last_analyzed_at IS NOT NULL; ## Versions of statistics +The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls how TiDB collects statistics. + +Before v9.0.0, TiDB supported two statistics versions for new statistics collection: Version 1 (`tidb_analyze_version = 1`) and Version 2 (`tidb_analyze_version = 2`). Starting from v9.0.0, Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. TiDB only supports Version 2 (`tidb_analyze_version = 2`) for collecting new statistics. + > **Warning:** > -> Statistics Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. TiDB keeps reading existing Version 1 statistics for upgrade compatibility, but all new `ANALYZE` operations use Statistics Version 2 (`tidb_analyze_version = 2`). It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions). +> If your TiDB cluster is upgraded from an earlier version and still has existing Version 1 statistics, TiDB can continue to read these Version 1 statistics for upgrade compatibility. However, TiDB can no longer collect new statistics using Version 1. 
It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions). -The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, TiDB supports two statistics versions: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`. - For TiDB Self-Managed, the default value of this variable changes from `1` to `2` starting from v5.3.0. - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0. From c3c7ee8e8bb38ec0c9386023806655f2f2bd3d76 Mon Sep 17 00:00:00 2001 From: Dongpo Liu Date: Tue, 28 Apr 2026 10:35:08 +0200 Subject: [PATCH 05/10] Update system-variables.md Co-authored-by: Grace Cai --- system-variables.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/system-variables.md b/system-variables.md index ccd6cc27e3982..e80e76a86b0e0 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1190,7 +1190,7 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No - Type: Integer - Default value: `2` -- Range: `[1, 2]`. Only `2` is supported for new statistics collection. +- Range: `[2]`. - Controls how TiDB collects statistics. - If you try to set this variable to `1`, TiDB returns an error. - For TiDB Self-Managed, the default value of this variable changed from `1` to `2` starting from v5.3.0. 
From a3c919ae7f8674c2e31993c6bf3b9c07909843fc Mon Sep 17 00:00:00 2001 From: Dongpo Liu Date: Tue, 28 Apr 2026 10:35:25 +0200 Subject: [PATCH 06/10] Update system-variables.md Co-authored-by: Grace Cai --- system-variables.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/system-variables.md b/system-variables.md index e80e76a86b0e0..d32d2c06dad2c 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1192,7 +1192,7 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a - Default value: `2` - Range: `[2]`. - Controls how TiDB collects statistics. - - If you try to set this variable to `1`, TiDB returns an error. + - Starting from v9.0.0, Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. If you try to set this variable to `1`, TiDB returns an error. For more information, see [Versions of statistics](/statistics.md#versions-of-statistics). - For TiDB Self-Managed, the default value of this variable changed from `1` to `2` starting from v5.3.0. - For TiDB Cloud, the default value of this variable changed from `1` to `2` starting from v6.5.0. - When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade. From b99377c0e0a962c93869cfa996669fd5cca782b7 Mon Sep 17 00:00:00 2001 From: Dongpo Liu Date: Tue, 28 Apr 2026 10:35:54 +0200 Subject: [PATCH 07/10] Update system-variables.md Co-authored-by: Grace Cai --- system-variables.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/system-variables.md b/system-variables.md index d32d2c06dad2c..476264229688f 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1195,7 +1195,7 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a - Starting from v9.0.0, Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. 
If you try to set this variable to `1`, TiDB returns an error. For more information, see [Versions of statistics](/statistics.md#versions-of-statistics). - For TiDB Self-Managed, the default value of this variable changed from `1` to `2` starting from v5.3.0. - For TiDB Cloud, the default value of this variable changed from `1` to `2` starting from v6.5.0. - - When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade. + - When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade. Note that after the upgrade, the existing Version 1 statistics are not converted to Version 2 statistics automatically. It is recommended that you [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions). - For detailed introduction about this variable, see [Introduction to Statistics](/statistics.md). ### tidb_analyze_skip_column_types New in v7.2.0 From d78459f5667988df669b2dcaddf25f06f43560d5 Mon Sep 17 00:00:00 2001 From: Dongpo Liu Date: Tue, 28 Apr 2026 10:38:09 +0200 Subject: [PATCH 08/10] Update system-variables.md --- system-variables.md | 1 - 1 file changed, 1 deletion(-) diff --git a/system-variables.md b/system-variables.md index 476264229688f..21a9d00667666 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1190,7 +1190,6 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No - Type: Integer - Default value: `2` -- Range: `[2]`. - Controls how TiDB collects statistics. - Starting from v9.0.0, Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. If you try to set this variable to `1`, TiDB returns an error. For more information, see [Versions of statistics](/statistics.md#versions-of-statistics). 
- For TiDB Self-Managed, the default value of this variable changed from `1` to `2` starting from v5.3.0. From 5b9d8550ab80e098fd296e54c55ee050a2ded64d Mon Sep 17 00:00:00 2001 From: 0xPoe Date: Tue, 28 Apr 2026 10:55:02 +0200 Subject: [PATCH 09/10] fix: correct analyze v1 doc links --- statistics.md | 1 - system-variables.md | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/statistics.md b/statistics.md index 3682d4e9078c0..71a921aa7c0eb 100644 --- a/statistics.md +++ b/statistics.md @@ -364,7 +364,6 @@ Before v9.0.0, TiDB supported two statistics versions for new statistics collect > > If your TiDB cluster is upgraded from an earlier version and still has existing Version 1 statistics, TiDB can continue to read these Version 1 statistics for upgrade compatibility. However, TiDB can no longer collect new statistics using Version 1. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions). - - For TiDB Self-Managed, the default value of this variable changes from `1` to `2` starting from v5.3.0. - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0. - When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade. diff --git a/system-variables.md b/system-variables.md index 21a9d00667666..ce311ad7e1366 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1194,7 +1194,7 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a - Starting from v9.0.0, Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. If you try to set this variable to `1`, TiDB returns an error. For more information, see [Versions of statistics](/statistics.md#versions-of-statistics). 
- For TiDB Self-Managed, the default value of this variable changed from `1` to `2` starting from v5.3.0. - For TiDB Cloud, the default value of this variable changed from `1` to `2` starting from v6.5.0. - - When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade. Note that after the upgrade, the existing Version 1 statistics are not converted to Version 2 statistics automatically. It is recommended that you [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions). + - When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade. Note that after the upgrade, the existing Version 1 statistics are not converted to Version 2 statistics automatically. It is recommended that you [migrate existing objects that use Statistics Version 1 to Version 2](/statistics.md#switch-between-statistics-versions). - For detailed introduction about this variable, see [Introduction to Statistics](/statistics.md). ### tidb_analyze_skip_column_types New in v7.2.0 From aa61fddf4a95f05ddbf7cd076f98ee043f8a112e Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Wed, 29 Apr 2026 18:15:22 +0800 Subject: [PATCH 10/10] minor wording updates --- sql-statements/sql-statement-show-analyze-status.md | 2 +- statistics.md | 2 +- system-variables.md | 3 +-- 3 files changed, 3 insertions(+), 4 deletions(-) diff --git a/sql-statements/sql-statement-show-analyze-status.md b/sql-statements/sql-statement-show-analyze-status.md index 09a97923e829b..bd47e39a547e2 100644 --- a/sql-statements/sql-statement-show-analyze-status.md +++ b/sql-statements/sql-statement-show-analyze-status.md @@ -41,7 +41,7 @@ ShowLikeOrWhereOpt ::= 'LIKE' SimpleExpr | 'WHERE' Expression > **Note:** > -> Statistics Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. 
The following example shows the current `ANALYZE` behavior with Statistics Version 2. +> Starting from v9.0.0, TiDB no longer supports using Statistics Version 1 (`tidb_analyze_version = 1`) for new statistics collection. The following example shows the current `ANALYZE` behavior with Statistics Version 2. ```sql mysql> create table t(x int, index idx(x)) partition by hash(x) partitions 2; diff --git a/statistics.md b/statistics.md index 71a921aa7c0eb..d1a3b54f9dd4a 100644 --- a/statistics.md +++ b/statistics.md @@ -387,7 +387,7 @@ It is recommended that all tables, indexes, and partitions use the same statisti One major reason to migrate is that Version 1 might produce inaccurate estimates for equality and `IN` predicates because the Count-Min sketch can have hash collisions. For more information, see [Count-Min Sketch](#count-min-sketch). To avoid this issue, set `tidb_analyze_version = 2` and rerun `ANALYZE` on all objects. -To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Version 2: +To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Version 2, do the following: - If the `ANALYZE` statement is executed manually, manually analyze every table to be analyzed. diff --git a/system-variables.md b/system-variables.md index ce311ad7e1366..0dc0a73186ae1 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1183,7 +1183,7 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a > **Warning:** > -> Statistics Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. TiDB keeps reading existing Version 1 statistics for upgrade compatibility, but all new `ANALYZE` operations use Statistics Version 2 (`tidb_analyze_version = 2`). It is recommended that you use `tidb_analyze_version = 2`. +> Starting from v9.0.0, TiDB no longer supports using Statistics Version 1 (`tidb_analyze_version = 1`) for new statistics collection. 
If you try to set this variable to `1`, TiDB returns an error. For more information, see [Versions of statistics](/statistics.md#versions-of-statistics). TiDB still supports reading existing Version 1 statistics for upgrade compatibility, but all new `ANALYZE` operations use Statistics Version 2 (`tidb_analyze_version = 2`). It is recommended that you use `tidb_analyze_version = 2`. - Scope: SESSION | GLOBAL - Persists to cluster: Yes @@ -1191,7 +1191,6 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a - Type: Integer - Default value: `2` - Controls how TiDB collects statistics. - - Starting from v9.0.0, Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. If you try to set this variable to `1`, TiDB returns an error. For more information, see [Versions of statistics](/statistics.md#versions-of-statistics). - For TiDB Self-Managed, the default value of this variable changed from `1` to `2` starting from v5.3.0. - For TiDB Cloud, the default value of this variable changed from `1` to `2` starting from v6.5.0. - When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade. Note that after the upgrade, the existing Version 1 statistics are not converted to Version 2 statistics automatically. It is recommended that you [migrate existing objects that use Statistics Version 1 to Version 2](/statistics.md#switch-between-statistics-versions).