Asked  7 Months ago    Answers:  5   Viewed   25 times

I have a table that looks like this:

id    feh    bar
1     10     A
2     20     A
3      3     B
4      4     B
5      5     C
6      6     D
7      7     D
8      8     D

And I want it to look like this:

bar  val1   val2   val3
A     10     20 
B      3      4 
C      5        
D      6      7     8

I have this query that does this:

SELECT bar, 
   MAX(CASE WHEN abc."row" = 1 THEN feh ELSE NULL END) AS "val1",
   MAX(CASE WHEN abc."row" = 2 THEN feh ELSE NULL END) AS "val2",
   MAX(CASE WHEN abc."row" = 3 THEN feh ELSE NULL END) AS "val3"
FROM
(
  SELECT bar, feh, row_number() OVER (partition by bar) as row
  FROM "Foo"
 ) abc
GROUP BY bar

This is a makeshift approach and gets unwieldy when many new columns have to be created. Can the CASE expressions be improved to make this query more dynamic? I'd also love to see other approaches to doing this.

 Answers

48

If you have not installed the additional module tablefunc, run this command once per database:

CREATE EXTENSION tablefunc;

Answer to question

A very basic crosstab solution for your case:

SELECT * FROM crosstab(
  'SELECT bar, 1 AS cat, feh
   FROM   tbl_org
   ORDER  BY bar, feh')
 AS ct (bar text, val1 int, val2 int, val3 int);  -- more columns?

The special difficulty here is that there is no category (cat) in the base table. For the basic 1-parameter form we can just provide a dummy column with a dummy value serving as category. The value is ignored anyway.

This is one of the rare cases where the second parameter for the crosstab() function is not needed, because all NULL values only appear in dangling columns to the right by definition of this problem. And the order can be determined by the value.
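To make the fill rule concrete, here is a minimal Python sketch of what the 1-parameter form does with pre-sorted input. It is only an emulation for illustration, not the tablefunc implementation; the function name crosstab_one_param is made up for this example.

```python
from itertools import groupby


def crosstab_one_param(rows, n_values):
    """Emulate the fill rule of 1-parameter crosstab() (illustrative only).

    rows: (row_name, category, value) tuples, already sorted by row_name.
    n_values: number of value columns in the output.
    """
    result = []
    for row_name, group in groupby(rows, key=lambda r: r[0]):
        # The category is never inspected: values are placed left to right.
        values = [value for _, _, value in group][:n_values]  # surplus rows are skipped
        values += [None] * (n_values - len(values))           # pad on the right with NULLs
        result.append((row_name, *values))
    return result


# Data from the question, sorted by bar, feh; the category is the dummy value 1.
source = [("A", 1, 10), ("A", 1, 20), ("B", 1, 3), ("B", 1, 4),
          ("C", 1, 5), ("D", 1, 6), ("D", 1, 7), ("D", 1, 8)]

pivoted = crosstab_one_param(source, 3)
for row in pivoted:
    print(row)
```

Because values are only ever placed left to right, NULLs can appear only in the trailing columns, which is why the dummy category is enough here.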

If we had an actual category column with names determining the order of values in the result, we'd need the 2-parameter form of crosstab(). Here I synthesize a category column with the help of the window function row_number(), to base crosstab() on:

SELECT * FROM crosstab(
   $$
   SELECT bar, val, feh
   FROM  (
      SELECT *, 'val' || row_number() OVER (PARTITION BY bar ORDER BY feh) AS val
      FROM tbl_org
      ) x
   ORDER BY 1, 2
   $$
 , $$VALUES ('val1'), ('val2'), ('val3')$$         -- more columns?
) AS ct (bar text, val1 int, val2 int, val3 int);  -- more columns?
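For comparison, a rough Python sketch of the 2-parameter behavior: each value is placed in the column that belongs to its category, so gaps can appear anywhere. Again this is an illustrative emulation only; crosstab_two_param and the sparse sample row are hypothetical.

```python
from itertools import groupby


def crosstab_two_param(rows, categories):
    """Emulate category-based placement of 2-parameter crosstab() (illustrative only).

    rows: (row_name, category, value) tuples, already sorted by row_name.
    categories: ordered category list, like the result of the second query.
    """
    position = {cat: i for i, cat in enumerate(categories)}
    result = []
    for row_name, group in groupby(rows, key=lambda r: r[0]):
        values = [None] * len(categories)
        for _, category, value in group:
            if category in position:  # rows with an unlisted category are ignored
                values[position[category]] = value
        result.append((row_name, *values))
    return result


categories = ["val1", "val2", "val3"]

# Categories synthesized with row_number(), as in the query above.
rows = [("A", "val1", 10), ("A", "val2", 20), ("B", "val1", 3), ("B", "val2", 4),
        ("C", "val1", 5), ("D", "val1", 6), ("D", "val2", 7), ("D", "val3", 8)]
dense = crosstab_two_param(rows, categories)

# Hypothetical row with only the third category present: the gap is on the left.
sparse = crosstab_two_param([("X", "val3", 99)], categories)
print(dense)
print(sparse)
```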

The rest is pretty much run-of-the-mill. Find more explanation and links in these closely related answers.

Basics:
Read this first if you are not familiar with the crosstab() function!

  • PostgreSQL Crosstab Query

Advanced:

  • Pivot on Multiple Columns using Tablefunc
  • Merge a table and a change log into a view in PostgreSQL

Proper test setup

That's how you should provide a test case to begin with:

CREATE TEMP TABLE tbl_org (id int, feh int, bar text);
INSERT INTO tbl_org (id, feh, bar) VALUES
   (1, 10, 'A')
 , (2, 20, 'A')
 , (3,  3, 'B')
 , (4,  4, 'B')
 , (5,  5, 'C')
 , (6,  6, 'D')
 , (7,  7, 'D')
 , (8,  8, 'D');

Dynamic crosstab?

Not very dynamic, yet, as @Clodoaldo commented. Dynamic return types are hard to achieve with plpgsql. But there are ways around it - with some limitations.

So as not to complicate the rest further, I demonstrate with a simpler test case:

CREATE TEMP TABLE tbl (row_name text, attrib text, val int);
INSERT INTO tbl (row_name, attrib, val) VALUES
   ('A', 'val1', 10)
 , ('A', 'val2', 20)
 , ('B', 'val1', 3)
 , ('B', 'val2', 4)
 , ('C', 'val1', 5)
 , ('D', 'val3', 8)
 , ('D', 'val1', 6)
 , ('D', 'val2', 7);

Call:

SELECT * FROM crosstab('SELECT row_name, attrib, val FROM tbl ORDER BY 1,2')
AS ct (row_name text, val1 int, val2 int, val3 int);

Returns:

 row_name | val1 | val2 | val3
----------+------+------+------
 A        | 10   | 20   |
 B        |  3   |  4   |
 C        |  5   |      |
 D        |  6   |  7   |  8

Built-in feature of tablefunc module

The tablefunc module provides a simple infrastructure for generic crosstab() calls without providing a column definition list. A number of functions written in C (typically very fast):

crosstabN()

crosstab1() - crosstab4() are pre-defined. One minor point: they require and return all text. So we need to cast our integer values. But it simplifies the call:

SELECT * FROM crosstab4('SELECT row_name, attrib, val::text  -- cast!
                         FROM tbl ORDER BY 1,2');

Result:

 row_name | category_1 | category_2 | category_3 | category_4
----------+------------+------------+------------+------------
 A        | 10         | 20         |            |
 B        | 3          | 4          |            |
 C        | 5          |            |            |
 D        | 6          | 7          | 8          |

Custom crosstab() function

For more columns or other data types, we create our own composite type and function (once).
Type:

CREATE TYPE tablefunc_crosstab_int_5 AS (
  row_name text, val1 int, val2 int, val3 int, val4 int, val5 int);

Function:

CREATE OR REPLACE FUNCTION crosstab_int_5(text)
  RETURNS SETOF tablefunc_crosstab_int_5
AS '$libdir/tablefunc', 'crosstab' LANGUAGE c STABLE STRICT;

Call:

SELECT * FROM crosstab_int_5('SELECT row_name, attrib, val   -- no cast!
                              FROM tbl ORDER BY 1,2');

Result:

 row_name | val1 | val2 | val3 | val4 | val5
----------+------+------+------+------+------
 A        |   10 |   20 |      |      |
 B        |    3 |    4 |      |      |
 C        |    5 |      |      |      |
 D        |    6 |    7 |    8 |      |

One polymorphic, dynamic function for all

This goes beyond what's covered by the tablefunc module.
To make the return type dynamic I use a polymorphic type with a technique detailed in this related answer:

  • Refactor a PL/pgSQL function to return the output of various SELECT queries

1-parameter form:

CREATE OR REPLACE FUNCTION crosstab_n(_qry text, _rowtype anyelement)
  RETURNS SETOF anyelement AS
$func$
BEGIN
   RETURN QUERY EXECUTE 
   (SELECT format('SELECT * FROM crosstab(%L) t(%s)'
                , _qry
                , string_agg(quote_ident(attname) || ' ' || atttypid::regtype
                           , ', ' ORDER BY attnum))
    FROM   pg_attribute
    WHERE  attrelid = pg_typeof(_rowtype)::text::regclass
    AND    attnum > 0
    AND    NOT attisdropped);
END
$func$  LANGUAGE plpgsql;

Overload with this variant for the 2-parameter form:

CREATE OR REPLACE FUNCTION crosstab_n(_qry text, _cat_qry text, _rowtype anyelement)
  RETURNS SETOF anyelement AS
$func$
BEGIN
   RETURN QUERY EXECUTE 
   (SELECT format('SELECT * FROM crosstab(%L, %L) t(%s)'
                , _qry, _cat_qry
                , string_agg(quote_ident(attname) || ' ' || atttypid::regtype
                           , ', ' ORDER BY attnum))
    FROM   pg_attribute
    WHERE  attrelid = pg_typeof(_rowtype)::text::regclass
    AND    attnum > 0
    AND    NOT attisdropped);
END
$func$  LANGUAGE plpgsql;

pg_typeof(_rowtype)::text::regclass: There is a row type defined for every user-defined composite type, so that attributes (columns) are listed in the system catalog pg_attribute. The fast lane to get it: cast the registered type (regtype) to text and cast this text to regclass.
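To show what the format() call assembles, here is a small Python sketch that builds the same kind of statement from a list of (attribute name, type name) pairs, as they would come out of pg_attribute. The helpers quote_ident and quote_literal are simplified stand-ins written for this example, not the PostgreSQL functions: this quote_ident always double-quotes, and this quote_literal does not handle backslashes.

```python
def quote_ident(name):
    """Simplified stand-in for quote_ident(): always double-quotes the name."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(text):
    """Simplified stand-in for format('%L'): single-quotes, doubles embedded quotes."""
    return "'" + text.replace("'", "''") + "'"


def build_crosstab_sql(qry, attributes):
    """Build the dynamic statement for the 1-parameter variant (illustrative only).

    attributes: (attname, type name) pairs in attnum order.
    """
    col_defs = ", ".join(
        f"{quote_ident(name)} {type_name}" for name, type_name in attributes
    )
    return f"SELECT * FROM crosstab({quote_literal(qry)}) t({col_defs})"


# Attributes as they would be listed for tablefunc_crosstab_int_3.
attributes = [("row_name", "text"), ("val1", "integer"),
              ("val2", "integer"), ("val3", "integer")]

sql = build_crosstab_sql("SELECT row_name, attrib, val FROM tbl ORDER BY 1,2", attributes)
print(sql)
```

The column definition list is derived entirely from the row type, which is why one function can serve any number of output columns.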

Create composite types once:

You need to define once every return type you are going to use:

CREATE TYPE tablefunc_crosstab_int_3 AS (
    row_name text, val1 int, val2 int, val3 int);

CREATE TYPE tablefunc_crosstab_int_4 AS (
    row_name text, val1 int, val2 int, val3 int, val4 int);

...

For ad-hoc calls, you can also just create a temporary table to the same (temporary) effect:

CREATE TEMP TABLE temp_xtype7 (
    row_name text, x1 int, x2 int, x3 int, x4 int, x5 int, x6 int, x7 int);

Or use the type of an existing table, view or materialized view if available.

Call

Using above row types:

1-parameter form (no missing values):

SELECT * FROM crosstab_n(
   'SELECT row_name, attrib, val FROM tbl ORDER BY 1,2'
 , NULL::tablefunc_crosstab_int_3);

2-parameter form (some values can be missing):

SELECT * FROM crosstab_n(
   'SELECT row_name, attrib, val FROM tbl ORDER BY 1'
 , $$VALUES ('val1'), ('val2'), ('val3')$$
 , NULL::tablefunc_crosstab_int_3);

This one function works for all return types, while the crosstabN() framework provided by the tablefunc module needs a separate function for each.
If you have named your types in sequence as demonstrated above, you only have to replace the number at the end of the type name (the 3 in tablefunc_crosstab_int_3). To find the maximum number of categories in the base table:

SELECT max(count(*)) OVER () FROM tbl  -- returns 3
GROUP  BY row_name
LIMIT  1;
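The same count, sketched in Python against the rows of the test table tbl, in case the nested aggregate and window function is hard to read: count the rows per row_name, then take the largest count.

```python
from collections import Counter

# Rows of the test table tbl: (row_name, attrib, val).
rows = [("A", "val1", 10), ("A", "val2", 20), ("B", "val1", 3), ("B", "val2", 4),
        ("C", "val1", 5), ("D", "val3", 8), ("D", "val1", 6), ("D", "val2", 7)]

# count(*) ... GROUP BY row_name
rows_per_name = Counter(row_name for row_name, _, _ in rows)

# max(...) OVER () picks the largest group size.
max_categories = max(rows_per_name.values())
print(max_categories)
```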

That's about as dynamic as this gets if you want individual columns. Arrays as demonstrated by @Clodoaldo, a simple text representation, or the result wrapped in a document type like json or hstore can work for any number of categories dynamically.

Disclaimer:
It's always potentially dangerous when user input is converted to code. Make sure this cannot be used for SQL injection. Don't accept input from untrusted users (directly).

Call for original question:

SELECT * FROM crosstab_n('SELECT bar, 1, feh FROM tbl_org ORDER BY 1,2'
                       , NULL::tablefunc_crosstab_int_3);
Tuesday, June 1, 2021
 
borrible
answered 7 Months ago
39

The problem with your query is that b and c share the same timestamp 2012-01-02 00:00:00, and you have the timestamp column timeof first in your query, so - even though you added bold emphasis - b and c are just extra columns that fall in the same group 2012-01-02 00:00:00. Only the first (b) is returned since (quoting the manual):

The row_name column must be first. The category and value columns must be the last two columns, in that order. Any columns between row_name and category are treated as "extra". The "extra" columns are expected to be the same for all rows with the same row_name value.

Bold emphasis mine.
Just reverse the order of the first two columns to make entity the row name and it works as desired:

SELECT * FROM crosstab(
      'SELECT entity, timeof, status, ct
       FROM   t4
       ORDER  BY 1'
      ,'VALUES (1), (0)')
 AS ct (
    "Attribute" character
   ,"Section" timestamp
   ,"status_1" int
   ,"status_0" int);

entity must be unique, of course.

Reiterate

  • row_name first
  • (optional) extra columns next
  • category (as defined by the second parameter) and value last.

Extra columns are filled from the first row of each row_name partition. Values from other rows are ignored because there is only one column per row_name to fill. Typically those would be the same for every row of one row_name, but that's up to you.
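A rough Python sketch of that rule, using made-up sample rows (the real contents of t4 are not shown here). It emulates only the grouping and the handling of extra columns; crosstab_with_extra is a hypothetical name.

```python
from itertools import groupby


def crosstab_with_extra(rows, categories):
    """Emulate 2-parameter crosstab() with extra columns (illustrative only).

    rows: (row_name, *extra, category, value) tuples, already sorted by row_name.
    """
    position = {cat: i for i, cat in enumerate(categories)}
    result = []
    for row_name, group in groupby(rows, key=lambda r: r[0]):
        group = list(group)
        extra = group[0][1:-2]  # extra columns are taken from the first row only
        values = [None] * len(categories)
        for row in group:
            category, value = row[-2], row[-1]
            if category in position:
                values[position[category]] = value
        result.append((row_name, *extra, *values))
    return result


# Hypothetical data: entities b and c share one timestamp.
# Timestamp first: it becomes the row name, entity is just an extra column.
by_time = crosstab_with_extra(
    [("2012-01-01", "a", 1, 1), ("2012-01-02", "b", 1, 2), ("2012-01-02", "c", 0, 3)],
    [1, 0])

# Entity first: one output row per entity, timestamp is the extra column.
by_entity = crosstab_with_extra(
    [("a", "2012-01-01", 1, 1), ("b", "2012-01-02", 1, 2), ("c", "2012-01-02", 0, 3)],
    [1, 0])

print(by_time)
print(by_entity)
```

With the timestamp first, c disappears from the extra column and its value is merged into the row labeled b. With entity first, every entity gets its own row.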

For the different setup in your answer:

SELECT localt, entity
     , msrmnt01, msrmnt02, msrmnt03, msrmnt04, msrmnt05  -- , more?
FROM   crosstab(
        'SELECT dense_rank() OVER (ORDER BY localt, entity)::int AS row_name
              , localt, entity -- additional columns
              , msrmnt, val
         FROM   test
         -- WHERE  ???   -- instead of LIMIT at the end
         ORDER  BY localt, entity, msrmnt
         -- LIMIT ???'   -- instead of LIMIT at the end
     , $$SELECT generate_series(1,5)$$)  -- more?
     AS ct (row_name int, localt timestamp, entity int
          , msrmnt01 float8, msrmnt02 float8, msrmnt03 float8, msrmnt04 float8, msrmnt05 float8 -- , more?
            )
LIMIT 1000  -- ??!!

No wonder the queries in your test perform terribly. Your test setup has 14M rows and you process all of them before throwing most of them away with LIMIT 1000. For a reduced result set, add WHERE conditions or a LIMIT to the source query!

Plus, the array you work with is needlessly expensive on top of it. I generate a surrogate row name with dense_rank() instead.
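For readers unfamiliar with dense_rank(), a minimal Python sketch of how the surrogate row name comes about: every distinct (localt, entity) pair gets one consecutive integer, so rows of the same pair end up in the same crosstab group. The sample keys are hypothetical.

```python
def dense_rank(keys):
    """Return the dense rank (1, 2, 3, ... without gaps) for each key, in input order."""
    ranks = {key: i for i, key in enumerate(sorted(set(keys)), start=1)}
    return [ranks[key] for key in keys]


# Hypothetical (localt, entity) pairs, one per source row.
keys = [("2021-01-01", 1), ("2021-01-01", 1), ("2021-01-01", 2), ("2021-01-02", 1)]

row_names = dense_rank(keys)
print(row_names)
```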

db<>fiddle here - with a simpler test setup and fewer rows.

Sunday, June 6, 2021
 
SJain
answered 6 Months ago
57

UPDATE - 6/11/2015

Support-v7 library now includes PreferenceFragmentCompat. So it will be a better idea to use it.


Add the following project as a library project to your application.

https://github.com/kolavar/android-support-v4-preferencefragment

You can keep everything including your fragment transaction as it is. When importing the PreferenceFragment class, make sure the correct import is used.

import android.support.v4.preference.PreferenceFragment;

instead of

import android.preference.PreferenceFragment;
Thursday, June 10, 2021
 
skrilled
answered 6 Months ago
53

Can someone explain to me why you can't pass a key as reference?

Because the language does not support this. You'd be hard-pressed to find this ability in most languages, hence the term key.

So am I stuck with something like this?

Yes. The best way is to create a new array with the appropriate keys.

Any alternatives?

The only way to provide better alternatives is to know your specific situation. If your keys map to table column names, then the best approach is to leave the keys as is and escape them at their time of use in your SQL.

Sunday, August 22, 2021
 
Keat
answered 4 Months ago
93

You need to ORDER BY the first query accordingly. I use the simplified syntax ORDER BY <ordinal number> here.

SELECT *
FROM   crosstab(
        'SELECT client_id
               ,extract(year from date)
               ,sum(amount)
         FROM   orders
         GROUP  BY 1,2
         ORDER  BY 1,2',

        'SELECT extract(year from date)
         FROM   orders
         GROUP  BY 1
         ORDER  BY 1')
AS orders(
    row_name integer,
    year_2001 text,
    year_2002 text,
    year_2003 text,
    year_2004 text,
    year_2005 text,
    year_2006 text,
    year_2007 text,
    year_2008 text,
    year_2009 text,
    year_2010 text,
    year_2011 text);

The crosstab() function is not included in standard PostgreSQL but comes with the additional module tablefunc.

Edit for additional request

Version without crosstab() function: Only group by client_id or you will end up with multiple rows per client_id.

SELECT client_id
      ,sum(CASE WHEN extract(year from date) = 2001 THEN amount ELSE 0 END) AS year_2001
      ,sum(CASE WHEN extract(year from date) = 2002 THEN amount ELSE 0 END) AS year_2002
       -- ...
FROM   orders o
GROUP  BY 1
ORDER  BY 1;
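The conditional aggregation above can be sketched in Python as follows, assuming made-up order rows: one result row per client_id, one running sum per year, with 0 where a client has no orders in that year. The function name pivot_sums is hypothetical.

```python
from collections import defaultdict


def pivot_sums(orders, years):
    """Sum amounts per client and year, one output row per client (illustrative only).

    orders: (client_id, year, amount) tuples.
    years: the years that become output columns.
    """
    totals = defaultdict(lambda: {year: 0 for year in years})
    for client_id, year, amount in orders:
        per_year = totals[client_id]  # every client gets a row, like GROUP BY client_id
        if year in per_year:          # CASE WHEN year = ... THEN amount ELSE 0 END
            per_year[year] += amount
    return [(client_id, *(totals[client_id][year] for year in years))
            for client_id in sorted(totals)]


# Hypothetical orders: (client_id, year, amount).
orders = [(1, 2001, 100), (1, 2001, 50), (1, 2002, 30), (2, 2002, 70), (3, 1999, 10)]

result = pivot_sums(orders, [2001, 2002])
print(result)
```

Grouping by client_id alone is what keeps this at one row per client. Grouping by year as well would split each client across several rows.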
Tuesday, November 23, 2021
 
Kasun Sandaruwan
answered 1 Week ago