Command Prompt, Inc.
Arrays

Arrays

As documented in Chapter 3, PostgreSQL supports non-atomic values in individual table columns through data constructs called arrays. An array itself is not a data type, but an extension of any PostgreSQL data type.

Creating an Array Column

A simple array column is created by appending a pair of square brackets to the data type of the intended array column within a CREATE TABLE or ALTER TABLE statement. These brackets indicate that more than a single value of the described data type may be inserted without limit into that column. For example:

  single_array type[]   -- A single array of values.

Additional square brackets may be added to create multidimensional arrays, which may store an array of array values. For example:

  multi_array  type[][] -- A multidimensional array of values.

In theory, an integer value n could be supplied within the square brackets to produce a fixed-length array (one which always has n members, and no more). As of PostgreSQL 7.1.x, however, this restriction is not enforced, and there is no practical difference between an array created with a fixed length and one created without.

Example 7-16 creates a table named favorite_books. This table associates an integer value of an employee with a one-dimensional character string array of type text called books.

Example 7-16. Creating a table with an array column

booktown=# CREATE TABLE favorite_books
booktown-#              (employee_id integer, books text[]);
CREATE

The table created by Example 7-16 allows any number of book titles to be stored in a single array column, for each employee. The advantage of such an array of discrete text values over a single text string (which also, of course, could contain multiple titles) is that each title is kept physically separate from each other title in the array column. Since the system knows where each array value begins and ends, you can choose titles by their subscript, rather than having to manually parse them out of a long text string.

Creating a multidimensional array column is very similar. The only distinction is that another pair of square brackets follows the first pair, as shown earlier in this section. Example 7-17 creates a table called favorite_authors, with an employee_id column of type integer, and multidimensional text array of author_and_titles. This essentially creates an array of text arrays.

Example 7-17. Creating a table with a multidimensional array column

booktown=# CREATE TABLE favorite_authors (employee_id integer,
booktown(#                                authors_and_titles text[][]);
CREATE

Inserting Values into Array Columns

A special kind of syntax is used in order to insert multiple values into a single column. This syntax allows you to describe an array constant. As documented in Chapter 3, the syntax of an array constant (for referring to PostgreSQL array values in SQL statements) is a special arrangement of curly braces, double-quotes and commas, all bound by single-quotes. Double quotes are required only when working with an array of character strings. Therefore, the general forms of array constants are:

  '{ "text1" [, ...] }' -- A character string array.
  '{ numeric [, ...] }' -- A numeric array.

These syntax forms illustrate how to handle string and numeric arrays, but a column may be defined as an array of any arbitrary type (including boolean, date, and time types). Generally, if you would ordinarily use single-quotes to describe a value in a non-array context (such as with a string constant, or timestamp value), double-quotes should be used for that value in an array constant.

Example 7-18 inserts a pair of records into the favorite_books table. The first statement inserts a single favorite book for the employee with id 102, and the second statement inserts two titles for the employee with id 103.

Example 7-18 executes two SQL INSERT statements, which insert a pair of array constant values.

Example 7-18. Inserting array constants

booktown=# INSERT INTO favorite_books VALUES 
booktown-#             (102, '{"The Hitchhiker\'s Guide to the Galaxy"}');
INSERT 3628399 1
booktown=# INSERT INTO favorite_books VALUES 
booktown-#             (103, '{"The Hobbit", "Kitten, Squared"}');
INSERT 3628400 1

Notice that, in Example 7-18, curly braces are still required to insert a single value into an array. Notice also that the single-quote in the title (first INSERT statement) still requires a backslash preceding it, even though it is surrounded by double-quotes. This is because the array constant itself is parsed as if it were one long string constant, and subsequently interpreted as an array based on the context of its target column.

The insertion of values into a multidimensional array requires a pair of curly braces for each array; an array of arrays must therefore itself be bound in curly braces, while each of its member arrays should be separated by one another with commas. Example 7-19 inserts a single row containing a multidimensional array constant into the favorite_authors table, created in Example 7-17.

Example 7-19. Inserting values into multidimensional arrays

booktown=# INSERT INTO favorite_authors
booktown-#             VALUES (102,
booktown(#             '{{"J.R.R. Tolkien", "The Silmarillion"},
booktown'#               {"Charles Dickens", "Great Expectations"},
booktown'#               {"Ariel Denham", "Attic Lives"}}');
INSERT 3727961 1

Notice that the inserted multidimensional array in Example 7-19 contains three text arrays, which each have two members. There is no systematic relationship between these arrays, though the implied relationship from the context is that the first members of each array are authors corresponding to the second members of each array, which are the favorite titles from the associated author.

Selecting Values From Array Columns

Selecting an array column from a table will result in the entire array being returned in the same constant format described in the previous section. Example 7-20 retrieves the entire arrays for inserted rows in the books column of the favorite_books table.

Example 7-20. Selecting entire array values

booktown=# SELECT books FROM favorite_books;
                  books
------------------------------------------
 {"The Hitchhiker's Guide to the Galaxy"}
 {"The Hobbit","Kitten, Squared"}
(2 rows)

While it is helpful to be able to return the entire array, the ability to retrieve only a specific portion of an array is often more useful. To this end, you need to learn how to work with array subscripts and slices.

Array subscripts

The usefulness of arrays lies largely in the fact that you can use subscripts to specify the value that you wish to view. A subscript is an integer value surrounded by square brackets, that describes the value you want to select. This number describes the precedence of the value you wish to select, from left to right in the array.

Unlike arrays in programming languages such as C, PostgreSQL begins counting array elements at 1, not 0. Example 7-21 uses the [1] subscript on the books column of the favorite_books table to select only the first of an employee's favorite titles. Notice that the query returns values without braces or double-quotes. This is because a single text value need only be returned as a single text constant, not as an array.

Example 7-21. Selecting array values with subscripts

booktown=# SELECT books[1] FROM favorite_books;
                books
--------------------------------------
 The Hitchhiker's Guide to the Galaxy
 The Hobbit
(2 rows)

Specifying a subscript for an array element which holds no value results in a NULL value being selected. The IS NOT NULL keywords may be useful in conjunction with such selections. Example 7-22 demonstrates two queries; the first returns two rows, NULL value as well as a title. The second query only returns the title (and not the row with the NULL value as a result of its use of the WHERE clause, with the IS NOT NULL condition.

Example 7-22. Avoiding NULL values in arrays

booktown=# SELECT books[2] FROM favorite_books;
      books
-----------------
 
 Kitten, Squared
(2 rows)

booktown=# SELECT books[2] FROM favorite_books 
booktown-#                 WHERE books[2] IS NOT NULL;
      books
-----------------
 Kitten, Squared
(1 row)

Selecting from a multidimensional array requires an additional subscript following the initial subscript. The first subscript refers to which array that you are retrieving data from, while the second subscript refers to which member of the specified array is to be retrieved. Example 7-23 demonstrates selecting the first author, and associated title for that author, from the favorite_authors table created in Example 7-19.

Example 7-23. Selecting From a Multi-Dimensional Array

booktown=# SELECT authors_and_titles[1][1] AS author,
booktown-#        authors_and_titles[1][2] AS title
booktown-#        FROM favorite_authors;
     author     |      title
----------------+------------------
 J.R.R. Tolkien | The Silmarillion
(1 row)

Array slices

PostgreSQL also supports slices in array selection. These are similar to array subscripts, but describe a range of values to be returned. The syntax of a slice is a pair of integers, separated by a colon (:), surrounded by square brackets. For example, [2:5] specifies the second, third, fourth, and fifth array values of a given array. The result of a slice selection is returned as an array constant that is essentially a sub-array of the entire array (though a slice may extend from the beginning to the end of an array).

Example 7-24 selects the range of the first two book titles in the books test array column from the favorite_books table. Even though the first returned row has only one title, it is still returned as an array with one member value.

Example 7-24. Selecting array values with slices

booktown=# SELECT books[1:2] FROM favorite_books;
                  books
------------------------------------------
 {"The Hitchhiker's Guide to the Galaxy"}
 {"The Hobbit","Kitten, Squared"}
(2 rows)

Array slices can be somewhat unpredictable with multidimensional arrays as of PostgreSQL 7.1.x. It is therefore recommended to stick to exact subscript values when working with multidimensional arrays until this support is improved.

Array dimensions

It can be useful to know the number of values stored in an array. You may use the array_dims() function to accomplish this. It accepts as a parameter a single identifier, which is the name of the array column that you wish to perform the function on. The result is returned as a character string describing the array with the same syntax used in array slices. Example 7-25 calls the array_dims() function on the books column of the favorite_books table.

Example 7-25. Using array_dims( )

booktown=# SELECT array_dims(books) FROM favorite_books;
 array_dims
------------
 [1:1]
 [1:2]
(2 rows)

Updating Values in Array Columns

Values in array columns may be modified in one of three ways:

Complete modification

The entire array may be replaced with a new array constant.

Slice modification

A slice of an array (range between two values) may be replaced with a new array constant. The new array constant should have the same number of values within it as the splice to be updated.

Element modification

An individual value in the array may be replaced with a new constant of the base type of the array. You use a subscript to specify which array value to replace.

Replacing an array value with a new array puts no restriction on the number of values within the new array. There need not be the same number of values in the new array as in the existing one. For instance, suppose that the employee with id 102 wishes to add another favorite book to his list in the favorite_books table. This is achieved with an UPDATE statement in Example 7-26 which completely overwrites the previous value.

Example 7-26. Completely modifying an array

booktown=# UPDATE favorite_books
booktown-#        SET books='{"The Hitchhiker\'s Guide to the Galaxy",
booktown'#                    "The Restaurant at the End of the Universe"}'
booktown-#        WHERE employee_id = 102;
UPDATE 1

The same approach used in Example 7-26 can be used to set a slice of an array by attaching a slice descriptor to the end of the target identifier (e.g., books[1:3] would refer to the first, second and third values in the books array column). More commonly, though, situations arise where a single value within an array needs to be modified, instead of the entire array, or a slice of an array.

Updating a single value in an array is done by attaching a subscript to the target identifier to indicate the specific value to be modified. Example 7-27 updates the first array value of the books column, in the favorite_books table.

Example 7-27. Modifying an array subscript

booktown=# SELECT books[1] FROM favorite_books;
                  
                  books
------------------------------------------
 The Hitchhiker's Guide to the Galaxy
 The Hobbit
(2 rows)

booktown=# UPDATE favorite_books
booktown-#        SET books[1] = 'There and Back Again: A Hobbit\'s Holiday'
booktown-#        WHERE books[1] = 'The Hobbit';
UPDATE 1
booktown=# SELECT books[1] FROM favorite_books;
                  
                  books
------------------------------------------
 The Hitchhiker's Guide to the Galaxy
 There and Back Again: A Hobbit's Holiday
(2 rows)

Powered by Mammoth PostgreSQL
Copyright © 2000-2007 Command Prompt, Inc. All Rights Reserved. All trademarks property of their respective owners.