Next: SELECT INTO Up: Reference Manual Previous: ROLLBACK

Subsections

SELECT

Name

SELECT -- Retrieve rows from a table or view.

SELECT [ ALL | DISTINCT [ ON ( expression [, ...] ) ] ]

expression [ AS name ] [, ...]

[ INTO [ TEMPORARY | TEMP ] [ TABLE ] new_table ]

[ FROM table [ alias ] [, ...] ]

[ WHERE condition ]

[ GROUP BY column [, ...] ]

[ HAVING condition [, ...] ]

[ { UNION [ ALL ] | INTERSECT | EXCEPT } select ]

[ ORDER BY column [ ASC | DESC | USING operator ] [, ...] ]

[ FOR UPDATE [ OF class_name [, ...] ] ]

[ LIMIT { count | ALL } [ { OFFSET | , } start ] ]

Inputs

expression: The name of a table's column or an expression.
name: Specifies another name for a column or an expression using the AS clause. This name is primarily used to label the column for display. It can also be used to refer to the column's value in ORDER BY and GROUP BY clauses. But the name cannot be used in the WHERE or HAVING clauses; write out the expression instead.
TEMPORARY, TEMP: If TEMPORARY or TEMP is specified, the table is created unique to this session, and is automatically dropped on session exit.
new_table: If the INTO TABLE clause is specified, the result of the query will be stored in a new table with the indicated name. The target table (new_table ) will be created automatically and must not exist before this command. Refer to SELECT INTO for more information. Note: The CREATE TABLE AS statement will also create a new table from a select query.
table: The name of an existing table referenced by the FROM clause.
alias: An alternate name for the preceding table. It is used for brevity or to eliminate ambiguity for joins within a single table.
condition: A boolean expression giving a result of true or false. See the WHERE clause.
column: The name of a table's column.
select: A select statement with all features except the ORDER BY and LIMIT clauses.

Outputs

Rows: The complete set of rows resulting from the query specification.
count: The count of rows returned by the query.

Description

SELECT will return rows from one or more tables. Candidates for selection are rows which satisfy the WHERE condition; if WHERE is omitted, all rows are candidates. (See WHERE Clause.)

DISTINCT will eliminate duplicate rows from the result. ALL (the default) will return all candidate rows, including duplicates.

DISTINCT ON eliminates rows that match on all the specified expressions, keeping only the first row of each set of duplicates. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY items; see below. Note that "the first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. For example,

SELECT DISTINCT ON (location) location, time, report FROM weatherReports ORDER BY location, time DESC;

retrieves the most recent weather report for each location. But if we had not used ORDER BY to force descending order of time values for each location, we'd have gotten a report of unpredictable age for each location.

The GROUP BY clause allows a user to divide a table into groups of rows that match on one or more values. (See GROUP BY Clause.)

The HAVING clause allows selection of only those groups of rows meeting the specified condition. (See HAVING Clause.)

The ORDER BY clause causes the returned rows to be sorted in a specified order. If ORDER BY is not given, the rows are returned in whatever order the system finds cheapest to produce. (See ORDER BY Clause.)

The UNION operator allows the result to be the collection of rows returned by the queries involved. (See UNION Clause.)

The INTERSECT operator gives you the rows that are common to both queries. (See INTERSECT Clause.)

The EXCEPT operator gives you the rows returned by the first query but not the second query. (See EXCEPT Clause.)

The FOR UPDATE clause allows the SELECT statement to perform exclusive locking of selected rows.

The LIMIT clause allows a subset of the rows produced by the query to be returned to the user. (See LIMIT Clause.)

You must have SELECT privilege to a table to read its values (See the GRANT /REVOKE statements).

WHERE Clause

The optional WHERE condition has the general form:

WHERE boolean_expr

boolean_expr can consist of any expression which evaluates to a boolean value. In many cases, this expression will be:

expr cond_op expr

or

log_op expr

where cond_op can be one of: =, <, <=, >, >= or <>, a conditional operator like ALL, ANY, IN, LIKE, or a locally defined operator, and log_op can be one of: AND, OR, NOT. SELECT will ignore all rows for which the WHERE condition does not return TRUE.

GROUP BY Clause

GROUP BY specifies a grouped table derived by the application of this clause:

GROUP BY column [, ...]

GROUP BY will condense into a single row all selected rows that share the same values for the grouped columns. Aggregate functions, if any, are computed across all rows making up each group, producing a separate value for each group (whereas without GROUP BY, an aggregate produces a single value computed across all the selected rows). When GROUP BY is present, it is not valid for the SELECT output expression(s) to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.

An item in GROUP BY can also be the name or ordinal number of an output column (SELECT expression), or it can be an arbitrary expression formed from input-column values. In case of ambiguity, a GROUP BY name will be interpreted as an input-column name rather than an output column name.

HAVING Clause

The optional HAVING condition has the general form:

HAVING cond_expr

where cond_expr is the same as specified for the WHERE clause.

HAVING specifies a grouped table derived by the elimination of group rows that do not satisfy the cond_expr. HAVING is different from WHERE: WHERE filters individual rows before application of GROUP BY, while HAVING filters group rows created by GROUP BY.

Each column referenced in cond_expr shall unambiguously reference a grouping column, unless the reference appears within an aggregate function.

ORDER BY Clause

ORDER BY column [ ASC | DESC ] [, ...]

column can be either a result column name or an ordinal number.

The ordinal numbers refers to the ordinal (left-to-right) position of the result column. This feature makes it possible to define an ordering on the basis of a column that does not have a proper name. This is never absolutely necessary because it is always possible to assign a name to a result column using the AS clause, e.g.:

SELECT title, date_prod + 1 AS newlen FROM films ORDER BY newlen;

It is also possible to ORDER BY arbitrary expressions (an extension to SQL92), including fields that do not appear in the SELECT result list. Thus the following statement is legal:

SELECT name FROM distributors ORDER BY code;

Note that if an ORDER BY item is a simple name that matches both a result column name and an input column name, ORDER BY will interpret it as the result column name. This is the opposite of the choice that GROUP BY will make in the same situation. This inconsistency is mandated by the SQL92 standard.

Optionally one may add the keyword DESC (descending) or ASC (ascending) after each column name in the ORDER BY clause. If not specified, ASC is assumed by default. Alternatively, a specific ordering operator name may be specified. ASC is equivalent to USING '< ' and DESC is equivalent to USING '> '.

UNION Clause

table_query UNION [ ALL ] table_query

[ ORDER BY column [ ASC | DESC ] [, ...] ] where table_query specifies any select expression without an ORDER BY or LIMIT clause.

The UNION operator allows the result to be the collection of rows returned by the queries involved. The two SELECTs that represent the direct operands of the UNION must produce the same number of columns, and corresponding columns must be of compatible data types.

By default, the result of UNION does not contain any duplicate rows unless the ALL clause is specified.

Multiple UNION operators in the same SELECT statement are evaluated left to right. Note that the ALL keyword is not global in nature, being applied only for the current pair of table results.

INTERSECT Clause

table_query INTERSECT table_query

[ ORDER BY column [ ASC | DESC ] [, ...] ] where table_query specifies any select expression without an ORDER BY or LIMIT clause.

The INTERSECT operator gives you the rows that are common to both queries. The two SELECTs that represent the direct operands of the INTERSECT must produce the same number of columns, and corresponding columns must be of compatible data types.

Multiple INTERSECT operators in the same SELECT statement are evaluated left to right, unless parentheses dictate otherwise.

EXCEPT Clause

table_query EXCEPT table_query

[ ORDER BY column [ ASC | DESC ] [, ...] ] where table_query specifies any select expression without an ORDER BY or LIMIT clause.

The EXCEPT operator gives you the rows returned by the first query but not the second query. The two SELECTs that represent the direct operands of the EXCEPT must produce the same number of columns, and corresponding columns must be of compatible data types.

Multiple EXCEPT operators in the same SELECT statement are evaluated left to right, unless parentheses dictate otherwise.

LIMIT Clause

LIMIT { count | ALL } [ { OFFSET | , } start ] OFFSET start

where count specifies the maximum number of rows to return, and start specifies the number of rows to skip before starting to return rows.

LIMIT allows you to retrieve just a portion of the rows that are generated by the rest of the query. If a limit count is given, no more than that many rows will be returned. If an offset is given, that many rows will be skipped before starting to return rows.

When using LIMIT, it is a good idea to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query's rows --you may be asking for the tenth through twentieth rows, but tenth through twentieth in what ordering? You don't know what ordering, unless you specified ORDER BY.

As of Postgres 7.0, the query optimizer takes LIMIT into account when generating a query plan, so you are very likely to get different plans (yielding different row orders) depending on what you give for LIMIT and OFFSET. Thus, using different LIMIT/OFFSET values to select different subsets of a query result will give inconsistent results unless you enforce a predictable result ordering with ORDER BY. This is not a bug; it is an inherent consequence of the fact that SQL does not promise to deliver the results of a query in any particular order unless ORDER BY is used to constrain the order.

Usage

To join the table films with the table distributors:

SELECT f.title, f.did, d.name, f.date_prod, f.kind FROM distributors d, films f WHERE f.did = d.did title | did | name | date_prod | kind ---------------------------+-----+------------------+------------+---------- The Third Man | 101 | British Lion | 1949-12-23 | Drama The African Queen | 101 | British Lion | 1951-08-11 | Romantic Une Femme est une Femme | 102 | Jean Luc Godard | 1961-03-12 | Romantic Vertigo | 103 | Paramount | 1958-11-14 | Action Becket | 103 | Paramount | 1964-02-03 | Drama 48 Hrs | 103 | Paramount | 1982-10-22 | Action War and Peace | 104 | Mosfilm | 1967-02-12 | Drama West Side Story | 105 | United Artists | 1961-01-03 | Musical Bananas | 105 | United Artists | 1971-07-13 | Comedy Yojimbo | 106 | Toho | 1961-06-16 | Drama There's a Girl in my Soup | 107 | Columbia | 1970-06-11 | Comedy Taxi Driver | 107 | Columbia | 1975-05-15 | Action Absence of Malice | 107 | Columbia | 1981-11-15 | Action Storia di una donna | 108 | Westward | 1970-08-15 | Romantic The King and I | 109 | 20th Century Fox | 1956-08-11 | Musical Das Boot | 110 | Bavaria Atelier | 1981-11-11 | Drama Bed Knobs and Broomsticks | 111 | Walt Disney | | Musical (17 rows)

To sum the column len of all films and group the results by kind:

SELECT kind, SUM(len) AS total FROM films GROUP BY kind HAVING SUM(len) < INTERVAL '5 hour'; kind | total ----------+------- Comedy | 02:58 Romantic | 04:38 (2 rows)

The following two examples are identical ways of sorting the individual results according to the contents of the second column (name):

distributors: actors: did | name id | name -----+-------------- ----+---------------- 108 | Westward 1 | Woody Allen 111 | Walt Disney 2 | Warren Beatty 112 | Warner Bros. 3 | Walter Matthau ... ... SELECT distributors.name FROM distributors WHERE distributors.name LIKE 'W%' UNION SELECT actors.name FROM actors WHERE actors.name LIKE 'W%' name ---------------- Walt Disney Walter Matthau Warner Bros. Warren Beatty Westward Woody Allen

Compatibility

Extensions

Postgres allows one to omit the FROM clause from a query. This feature was retained from the original PostQuel query language:

SELECT distributors.* WHERE name = 'Westwood'; did | name -----+---------- 108 | Westward

SQL92

SELECT Clause

In the SQL92 standard, the optional keyword "AS" is just noise and can be omitted without affecting the meaning. The Postgres parser requires this keyword when renaming columns because the type extensibility features lead to parsing ambiguities in this context.

The DISTINCT ON phrase is not part of SQL92. Nor are LIMIT and OFFSET.

In SQL92, an ORDER BY clause may only use result column names or numbers, while a GROUP BY clause may only use input column names. Postgres extends each of these clauses to allow the other choice as well (but it uses the standard's interpretation if there is ambiguity). Postgres also allows both clauses to specify arbitrary expressions. Note that names appearing in an expression will always be taken as input-column names, not as result-column names.

UNION Clause

The SQL92 syntax for UNION allows an additional CORRESPONDING BY clause:

table_query UNION [ALL] [CORRESPONDING [BY (column [,...])]] table_query

The CORRESPONDING BY clause is not supported by Postgres.

Next: SELECT INTO Up: Reference Manual Previous: ROLLBACK

Bruce Momjian
2005-04-21