Follow Techotopia on Twitter

On-line Guides
All Guides
eBook Store
iOS / Android
Linux for Beginners
Office Productivity
Linux Installation
Linux Security
Linux Utilities
Linux Virtualization
Linux Kernel
System/Network Admin
Programming
Scripting Languages
Development Tools
Web Development
GUI Toolkits/Desktop
Databases
Mail Systems
openSolaris
Eclipse Documentation
Techotopia.com
Virtuatopia.com
Answertopia.com

How To Guides
Virtualization
General System Admin
Linux Security
Linux Filesystems
Web Servers
Graphics & Desktop
PC Hardware
Windows
Problem Solutions
Privacy Policy

  




 

 

17.6. Query Planning

17.6.1. Planner Method Configuration

These configuration parameters provide a crude method of influencing the query plans chosen by the query optimizer. If the default plan chosen by the optimizer for a particular query is not optimal, a temporary solution may be found by using one of these configuration parameters to force the optimizer to choose a different plan. Turning one of these settings off permanently is seldom a good idea, however. Better ways to improve the quality of the plans chosen by the optimizer include adjusting the Planner Cost Constants , running ANALYZE more frequently, increasing the value of the default_statistics_target configuration parameter, and increasing the amount of statistics collected for specific columns using ALTER TABLE SET STATISTICS.

enable_bitmapscan (boolean)

Enables or disables the query planner's use of bitmap-scan plan types. The default is on.

enable_hashagg (boolean)

Enables or disables the query planner's use of hashed aggregation plan types. The default is on.

enable_hashjoin (boolean)

Enables or disables the query planner's use of hash-join plan types. The default is on.

enable_indexscan (boolean)

Enables or disables the query planner's use of index-scan plan types. The default is on.

enable_mergejoin (boolean)

Enables or disables the query planner's use of merge-join plan types. The default is on.

enable_nestloop (boolean)

Enables or disables the query planner's use of nested-loop join plans. It's not possible to suppress nested-loop joins entirely, but turning this variable off discourages the planner from using one if there are other methods available. The default is on.

enable_seqscan (boolean)

Enables or disables the query planner's use of sequential scan plan types. It's not possible to suppress sequential scans entirely, but turning this variable off discourages the planner from using one if there are other methods available. The default is on.

enable_sort (boolean)

Enables or disables the query planner's use of explicit sort steps. It's not possible to suppress explicit sorts entirely, but turning this variable off discourages the planner from using one if there are other methods available. The default is on.

enable_tidscan (boolean)

Enables or disables the query planner's use of TID scan plan types. The default is on.

17.6.2. Planner Cost Constants

Note: Unfortunately, there is no well-defined method for determining ideal values for the family of "cost" variables that appear below. You are encouraged to experiment and share your findings.

effective_cache_size (floating point)

Sets the planner's assumption about the effective size of the disk cache that is available to a single index scan. This is factored into estimates of the cost of using an index; a higher value makes it more likely index scans will be used, a lower value makes it more likely sequential scans will be used. When setting this parameter you should consider both PostgreSQL's shared buffers and the portion of the kernel's disk cache that will be used for PostgreSQL data files. Also, take into account the expected number of concurrent queries using different indexes, since they will have to share the available space. This parameter has no effect on the size of shared memory allocated by PostgreSQL, nor does it reserve kernel disk cache; it is used only for estimation purposes. The value is measured in disk pages, which are normally 8192 bytes each. The default is 1000.

random_page_cost (floating point)

Sets the planner's estimate of the cost of a nonsequentially fetched disk page. This is measured as a multiple of the cost of a sequential page fetch. A higher value makes it more likely a sequential scan will be used, a lower value makes it more likely an index scan will be used. The default is four.

cpu_tuple_cost (floating point)

Sets the planner's estimate of the cost of processing each row during a query. This is measured as a fraction of the cost of a sequential page fetch. The default is 0.01.

cpu_index_tuple_cost (floating point)

Sets the planner's estimate of the cost of processing each index row during an index scan. This is measured as a fraction of the cost of a sequential page fetch. The default is 0.001.

cpu_operator_cost (floating point)

Sets the planner's estimate of the cost of processing each operator in a WHERE clause. This is measured as a fraction of the cost of a sequential page fetch. The default is 0.0025.

17.6.3. Genetic Query Optimizer

geqo (boolean)

Enables or disables genetic query optimization, which is an algorithm that attempts to do query planning without exhaustive searching. This is on by default. The geqo_threshold variable provides a more granular way to disable GEQO for certain classes of queries.

geqo_threshold (integer)

Use genetic query optimization to plan queries with at least this many FROM items involved. (Note that an outer JOIN construct counts as only one FROM item.) The default is 12. For simpler queries it is usually best to use the deterministic, exhaustive planner, but for queries with many tables the deterministic planner takes too long.

geqo_effort (integer)

Controls the trade off between planning time and query plan efficiency in GEQO. This variable must be an integer in the range from 1 to 10. The default value is 5. Larger values increase the time spent doing query planning, but also increase the likelihood that an efficient query plan will be chosen.

geqo_effort doesn't actually do anything directly; it is only used to compute the default values for the other variables that influence GEQO behavior (described below). If you prefer, you can set the other parameters by hand instead.

geqo_pool_size (integer)

Controls the pool size used by GEQO. The pool size is the number of individuals in the genetic population. It must be at least two, and useful values are typically 100 to 1000. If it is set to zero (the default setting) then a suitable default is chosen based on geqo_effort and the number of tables in the query.

geqo_generations (integer)

Controls the number of generations used by GEQO. Generations specifies the number of iterations of the algorithm. It must be at least one, and useful values are in the same range as the pool size. If it is set to zero (the default setting) then a suitable default is chosen based on geqo_pool_size.

geqo_selection_bias (floating point)

Controls the selection bias used by GEQO. The selection bias is the selective pressure within the population. Values can be from 1.50 to 2.00; the latter is the default.

17.6.4. Other Planner Options

default_statistics_target (integer)

Sets the default statistics target for table columns that have not had a column-specific target set via ALTER TABLE SET STATISTICS. Larger values increase the time needed to do ANALYZE, but may improve the quality of the planner's estimates. The default is 10. For more information on the use of statistics by the PostgreSQL query planner, refer to Section 13.2.

constraint_exclusion (boolean)

Enables or disables the query planner's use of table constraints to optimize queries. The default is off.

When this parameter is on, the planner compares query conditions with table CHECK constraints, and omits scanning tables for which the conditions contradict the constraints. (Presently this is done only for child tables of inheritance scans.) For example:

CREATE TABLE parent(key integer, ...);
CREATE TABLE child1000(check (key between 1000 and 1999)) INHERITS(parent);
CREATE TABLE child2000(check (key between 2000 and 2999)) INHERITS(parent);
...
SELECT * FROM parent WHERE key = 2400;

With constraint exclusion enabled, this SELECT will not scan child1000 at all. This can improve performance when inheritance is used to build partitioned tables.

Currently, constraint_exclusion is disabled by default because it risks incorrect results if query plans are cached — if a table constraint is changed or dropped, the previously generated plan might now be wrong, and there is no built-in mechanism to force re-planning. (This deficiency will probably be addressed in a future PostgreSQL release.) Another reason for keeping it off is that the constraint checks are relatively expensive, and in many circumstances will yield no savings. It is recommended to turn this on only if you are actually using partitioned tables designed to take advantage of the feature.

Refer to Section 5.9 for more information on using constraint exclusion and partitioning.

from_collapse_limit (integer)

The planner will merge sub-queries into upper queries if the resulting FROM list would have no more than this many items. Smaller values reduce planning time but may yield inferior query plans. The default is 8. It is usually wise to keep this less than geqo_threshold.

join_collapse_limit (integer)

The planner will rewrite explicit inner JOIN constructs into lists of FROM items whenever a list of no more than this many items in total would result. Prior to PostgreSQL 7.4, joins specified via the JOIN construct would never be reordered by the query planner. The query planner has subsequently been improved so that inner joins written in this form can be reordered; this configuration parameter controls the extent to which this reordering is performed.

Note: At present, the order of outer joins specified via the JOIN construct is never adjusted by the query planner; therefore, join_collapse_limit has no effect on this behavior. The planner may be improved to reorder some classes of outer joins in a future release of PostgreSQL.

By default, this variable is set the same as from_collapse_limit, which is appropriate for most uses. Setting it to 1 prevents any reordering of inner JOINs. Thus, the explicit join order specified in the query will be the actual order in which the relations are joined. The query planner does not always choose the optimal join order; advanced users may elect to temporarily set this variable to 1, and then specify the join order they desire explicitly. Another consequence of setting this variable to 1 is that the query planner will behave more like the PostgreSQL 7.3 query planner, which some users might find useful for backward compatibility reasons.

Setting this variable to a value between 1 and from_collapse_limit might be useful to trade off planning time against the quality of the chosen plan (higher values produce better plans).


 
 
  Published courtesy of The PostgreSQL Global Development Group Design by Interspire