tidb: document analyze v1 removal #22655
Open
0xPoe wants to merge 10 commits into pingcap:master from 0xPoe:poe-patch-analyze-v1-master-docs
Changes from 9 commits
Commits
- b541076 statistics: document analyze v1 removal (0xPoe)
- b82e6fb Update information-schema/information-schema-analyze-status.md (0xPoe)
- 0482877 Update statistics.md (0xPoe)
- 903d4b2 Update statistics.md (0xPoe)
- c3c7ee8 Update system-variables.md (0xPoe)
- a3c919a Update system-variables.md (0xPoe)
- b99377c Update system-variables.md (0xPoe)
- d78459f Update system-variables.md (0xPoe)
- 5b9d855 fix: correct analyze v1 doc links (0xPoe)
- aa61fdd minor wording updates (qiancai)
@@ -300,7 +300,7 @@

 ### Disable ANALYZE configuration persistence

-To disable the `ANALYZE` configuration persistence feature, set the `tidb_persist_analyze_options` system variable to `OFF`. Because the `ANALYZE` configuration persistence feature is not applicable to `tidb_analyze_version = 1`, setting `tidb_analyze_version = 1` can also disable the feature.
+To disable the `ANALYZE` configuration persistence feature, set the `tidb_persist_analyze_options` system variable to `OFF`.

 After disabling the `ANALYZE` configuration persistence feature, TiDB does not clear the persisted configuration records. Therefore, if you enable this feature again, TiDB continues to collect statistics using the previously recorded persistent configurations.

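A minimal SQL sketch of the step this hunk documents, for reviewers who want to try it. It is an illustration rather than text from the PR, and it assumes a session with the privileges needed to change global variables.

```sql
-- Disable ANALYZE configuration persistence for the cluster.
SET GLOBAL tidb_persist_analyze_options = OFF;

-- Confirm the value that is now in effect globally.
SHOW GLOBAL VARIABLES LIKE 'tidb_persist_analyze_options';
```
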
@@ -356,13 +356,19 @@

 ## Versions of statistics

-The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, two versions of statistics are supported: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`.
+The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls how TiDB collects statistics.
+
+Before v9.0.0, TiDB supported two statistics versions for new statistics collection: Version 1 (`tidb_analyze_version = 1`) and Version 2 (`tidb_analyze_version = 2`). Starting from v9.0.0, Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. TiDB only supports Version 2 (`tidb_analyze_version = 2`) for collecting new statistics.
+
+> **Warning:**
+>
+> If your TiDB cluster is upgraded from an earlier version and still has existing Version 1 statistics, TiDB can continue to read these Version 1 statistics for upgrade compatibility. However, TiDB can no longer collect new statistics using Version 1. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions).
Collaborator
Suggested change

 - For TiDB Self-Managed, the default value of this variable changes from `1` to `2` starting from v5.3.0.
 - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0.
-- If your cluster is upgraded from an earlier version, the default value of `tidb_analyze_version` does not change after the upgrade.
+- When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade.

-Version 2 is preferred, and will continue to be enhanced to ultimately replace Version 1 completely. Compared to Version 1, Version 2 improves the accuracy of many of the statistics collected for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation, and also supporting automated collection only on selected columns (see [Collecting statistics on some columns](#collect-statistics-on-some-columns)).
+Version 2 is the recommended statistics version. Compared to Version 1, Version 2 improves the accuracy of many statistics for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation, and it supports automated collection only on selected columns (see [Collecting statistics on some columns](#collect-statistics-on-some-columns)). For new statistics collection, Version 2 is the only supported statistics version.
Check warning on line 371 in statistics.md

 The following table lists the information collected by each version for usage in the optimizer estimates:

@@ -377,29 +383,22 @@

 ### Switch between statistics versions

-It is recommended to ensure that all tables/indexes (and partitions) utilize statistics collection from the same version. Version 2 is recommended, however, it is not recommended to switch from one version to another without a justifiable reason such as an issue experienced with the version in use. A switch between versions might take a period of time when no statistics are available until all tables have been analyzed with the new version, which might negatively affect the optimizer plan choices if statistics are not available.
+It is recommended that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Until Version 2 statistics are collected for an object (such as a table, an index, or a partition), TiDB continues to use the existing Version 1 statistics for that object.

-Examples of justifications to switch might include - with Version 1, there could be inaccuracies in equal/IN predicate estimation due to hash collisions when collecting Count-Min sketch statistics. Solutions are listed in the [Count-Min Sketch](#count-min-sketch) section. Alternatively, setting `tidb_analyze_version = 2` and rerunning `ANALYZE` on all objects is also a solution. In the early release of Version 2, there was a risk of memory overflow after `ANALYZE`. This issue is resolved, but initially, one solution was to set `tidb_analyze_version = 1` and rerun `ANALYZE` on all objects.
+One major reason to migrate is that Version 1 might produce inaccurate estimates for equality and `IN` predicates because the Count-Min sketch can have hash collisions. For more information, see [Count-Min Sketch](#count-min-sketch). To avoid this issue, set `tidb_analyze_version = 2` and rerun `ANALYZE` on all objects.

-To prepare `ANALYZE` for switching between versions:
+To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Version 2:
qiancai marked this conversation as resolved.
Outdated

 - If the `ANALYZE` statement is executed manually, manually analyze every table to be analyzed.

 ```sql
 SELECT DISTINCT(CONCAT('ANALYZE TABLE ', table_schema, '.', table_name, ';'))
 FROM information_schema.tables JOIN mysql.stats_histograms
 ON table_id = tidb_table_id
-WHERE stats_ver = 2;
+WHERE stats_ver = 1;
 ```

-- If TiDB automatically executes the `ANALYZE` statement because the auto-analysis has been enabled, execute the following statement that generates the [`DROP STATS`](/sql-statements/sql-statement-drop-stats.md) statement:
-
-```sql
-SELECT DISTINCT(CONCAT('DROP STATS ', table_schema, '.', table_name, ';'))
-FROM information_schema.tables JOIN mysql.stats_histograms
-ON table_id = tidb_table_id
-WHERE stats_ver = 2;
-```
+- If TiDB automatically executes the `ANALYZE` statement because auto-analysis is enabled, after you set `tidb_analyze_version = 2`, TiDB gradually refreshes statistics to Version 2 through subsequent auto-analysis. Before Version 2 statistics are collected for an object, TiDB can continue to use its existing Version 1 statistics. To speed up the migration for important objects, run `ANALYZE` on them manually.

 - If the result of the preceding statement is too long to copy and paste, you can export the result to a temporary text file and then perform execution from the file like this:

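The migration path described in the hunk above can be sketched in SQL as follows. This is a reviewer-side illustration, not text from the PR: it assumes a session with the privileges needed to change global variables, and `test.t` is a hypothetical table name.

```sql
-- Use Version 2 for new statistics collection, both for the cluster
-- and for the current session (manual ANALYZE reads the session value).
SET GLOBAL tidb_analyze_version = 2;
SET SESSION tidb_analyze_version = 2;

-- Count tables per statistics version to see how much Version 1 data remains.
SELECT stats_ver, COUNT(DISTINCT table_id) AS table_count
FROM mysql.stats_histograms
GROUP BY stats_ver;

-- Re-collect statistics for an important table instead of waiting for auto-analysis.
-- test.t is a placeholder; substitute a real schema and table.
ANALYZE TABLE test.t;
```

Rerunning the `GROUP BY` query after each batch of `ANALYZE` statements gives a rough view of how many tables still carry Version 1 statistics.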