Postgresql 处理高级SQL - 1

/ Postgres / 没有评论 / 87浏览

表 + 数据

```sql
CREATE TABLE public.t_oil (
	region text NULL,
	country text NULL,
	"year" int4 NULL,
	production int4 NULL,
	consumption int4 NULL
);
```

下载并导入 数据文件

分组

  1. GROUP BY ROLLUP 在分组时, 附加所有分组总计记录

    SELECT
    	region,
    	AVG(production)
    FROM
    	t_oil
    GROUP BY
    	ROLLUP (region)
    

    结果

    region      |avg                  |
    ------------|---------------------|
                |2596.2942206654991243|
    MiddleEast  |1992.6036866359447005|
    NorthAmerica|4508.7153284671532847|
    
  2. GROUP BY CUBE 每一个列组合的小计记录,同时在末尾加上总计记录

    SELECT
    	region,
    	country,
    	AVG(production)
    FROM
    	t_oil
    GROUP BY
    	CUBE (region, country)
    

    结果 GROUP BY region + GROUP BY country + GROUP BY region, country + 总体平均值

    region      |country           |avg                  |
    ------------|------------------|---------------------|
                |                  |2596.2942206654991243|
    MiddleEast  |OtherMiddleEast   |  74.2173913043478261|
    MiddleEast  |Kuwait            |2083.6956521739130435|
    MiddleEast  |Yemen             | 307.2800000000000000|
    MiddleEast  |Syria             | 335.9767441860465116|
    MiddleEast  |Iraq              |1780.4130434782608696|
    MiddleEast  |SaudiArabien      |7641.8260869565217391|
    NorthAmerica|Canada            |2123.2173913043478261|
    NorthAmerica|USA               |9144.1777777777777778|
    MiddleEast  |UnitedArabEmirates|1936.0434782608695652|
    MiddleEast  |Israel            |                     |
    MiddleEast  |Oman              | 586.4545454545454545|
    MiddleEast  |Iran              |3631.6956521739130435|
    NorthAmerica|Mexico            |2359.5217391304347826|
    MiddleEast  |Qatar             | 609.8695652173913043|
    MiddleEast  |                  |1992.6036866359447005|
    NorthAmerica|                  |4508.7153284671532847|
                |Yemen             | 307.2800000000000000|
                |Iraq              |1780.4130434782608696|
                |Oman              | 586.4545454545454545|
                |OtherMiddleEast   |  74.2173913043478261|
                |Iran              |3631.6956521739130435|
                |USA               |9144.1777777777777778|
                |Qatar             | 609.8695652173913043|
                |Israel            |                     |
                |Mexico            |2359.5217391304347826|
                |Syria             | 335.9767441860465116|
                |SaudiArabien      |7641.8260869565217391|
                |UnitedArabEmirates|1936.0434782608695652|
                |Kuwait            |2083.6956521739130435|
                |Canada            |2123.2173913043478261|
    
  3. GRUOPING SETS 在一个 group by 查询中,根据不同的维度组合进行聚合,等价于将不同维度的 group by 结果集进行 union all

    SELECT
    	region,
    	country,
    	AVG(production)
    FROM
    	t_oil
    GROUP BY
    	GROUPING SETS ((), region, country)
    

    结果

    region      |country           |avg                  |
    ------------|------------------|---------------------|
                |                  |2596.2942206654991243|
    MiddleEast  |                  |1992.6036866359447005|
    NorthAmerica|                  |4508.7153284671532847|
                |Yemen             | 307.2800000000000000|
                |Iraq              |1780.4130434782608696|
                |Oman              | 586.4545454545454545|
                |OtherMiddleEast   |  74.2173913043478261|
                |Iran              |3631.6956521739130435|
                |USA               |9144.1777777777777778|
                |Qatar             | 609.8695652173913043|
                |Israel            |                     |
                |Mexico            |2359.5217391304347826|
                |Syria             | 335.9767441860465116|
                |SaudiArabien      |7641.8260869565217391|
                |UnitedArabEmirates|1936.0434782608695652|
                |Kuwait            |2083.6956521739130435|
                |Canada            |2123.2173913043478261|
    
  4. FILTER 子句, 部分聚集

    SELECT
    	region,
    	AVG(production) FILTER (WHERE "year" < 1990) AS OLD,
    	AVG(production) FILTER (WHERE "year" >= 1990) AS OLD,
    	AVG(production)
    FROM
    	t_oil
    GROUP BY ROLLUP (region)
    

    结果

    region      |old                  |old                  |avg                  |
    ------------|---------------------|---------------------|---------------------|
                |2408.5939597315436242|2801.1831501831501832|2596.2942206654991243|
    MiddleEast  |1747.3258928571428571|2254.2333333333333333|1992.6036866359447005|
    NorthAmerica|4410.2702702702702703|4624.3492063492063492|4508.7153284671532847|
    
  5. WITHIN GROUP 有序集

    数据被正常分组, 然后将每个分组中的数据按照给定的条件排序, 最后在这种排序好的数据上做计算

    SELECT
    	region,
    	percentile_disc(0.5) WITHIN GROUP (ORDER BY production DESC )
    FROM
    	t_oil
    GROUP BY 1
    

    结果

    region      |percentile_disc|
    ------------|---------------|
    MiddleEast  |           1106|
    NorthAmerica|           3054|
    
    SELECT
    	region,
    	percentile_disc(0.5) WITHIN GROUP (ORDER BY production )
    FROM
    	t_oil
    GROUP BY 1
    

    结果

    region      |percentile_disc|
    ------------|---------------|
    MiddleEast  |           1082|
    NorthAmerica|           3054|
    

    percentile_disc 将跳过该分组的 50% 并且返回想要的值, 取值范围 0 ~ 1

    Postgresql为用户提供两种 percentile_函数, percentile_disc 将返回一个值, 它是数据集中实际包含的值。 percentile_cont 将在找不到精确匹配时插值。

    SELECT
    	PERCENTILE_DISC(0.62) WITHIN GROUP ( ORDER BY id),
    	PERCENTILE_CONT(0.62) WITHIN GROUP ( ORDER BY id)
    FROM
    	GENERATE_SERIES(1, 5) AS id;
    

    结果

    percentile_disc|percentile_cont|
    ---------------|---------------|
                  4|           3.48|
    
  6. rank 假象聚集

    SELECT
    	region,
    	RANK(9000) WITHIN GROUP (
    	ORDER BY production DESC NULLS LAST)
    FROM
    	t_oil
    GROUP BY
    	ROLLUP(1)
    

    结果 如果某地区日产9000桶, 那将是北美地区第27好的年份以及中东地区21好

    region      |rank|
    ------------|----|
    MiddleEast  |  21|
    NorthAmerica|  26|
                |  46|
    

窗口函数

  1. OVER

    SELECT
    	region,
    	production,
    	country,
    	"year",
    	consumption,
    	AVG(production) OVER()
    FROM
    	t_oil
    LIMIT 5
    

    结果 增加一列统计所有均值

    region      |production|country|year|consumption|avg                  |
    ------------|----------|-------|----|-----------|---------------------|
    NorthAmerica|      9579|USA    |1966|      12100|2596.2942206654991243|
    NorthAmerica|     10219|USA    |1967|      12567|2596.2942206654991243|
    NorthAmerica|     10600|USA    |1968|      13405|2596.2942206654991243|
    NorthAmerica|     10828|USA    |1969|      14153|2596.2942206654991243|
    NorthAmerica|     11297|USA    |1970|      14710|2596.2942206654991243|
    
  2. PARTITION BY 划分数据

    SELECT
    	region,
    	production,
    	country,
    	"year",
    	consumption,
    	AVG(production) OVER(PARTITION BY country)
    FROM
    	t_oil
    LIMIT 5
    

    结果 增加一列按照country划分数据统计均值

    region      |production|country|year|consumption|avg                  |
    ------------|----------|-------|----|-----------|---------------------|
    NorthAmerica|      1012|Canada |1966|       1167|2123.2173913043478261|
    NorthAmerica|      3332|Canada |2010|       2316|2123.2173913043478261|
    NorthAmerica|      3202|Canada |2009|       2190|2123.2173913043478261|
    NorthAmerica|      3207|Canada |2008|       2315|2123.2173913043478261|
    NorthAmerica|      3290|Canada |2007|       2361|2123.2173913043478261|
    
  3. 窗口中排序与数据

    	SELECT
    	region,
    	production,
    	country,
    	"year",
    	consumption,
    	AVG(production) OVER(PARTITION BY country ORDER BY "year")
    FROM
    	t_oil
    LIMIT 5
    

    结果

    region      |production|country|year|consumption|avg                  |
    ------------|----------|-------|----|-----------|---------------------|
    NorthAmerica|       920|Canada |1965|       1108| 920.0000000000000000|
    NorthAmerica|      1012|Canada |1966|       1167| 966.0000000000000000|
    NorthAmerica|      1106|Canada |1967|       1246|1012.6666666666666667|
    NorthAmerica|      1194|Canada |1968|       1322|1058.0000000000000000|
    NorthAmerica|      1306|Canada |1969|       1380|1107.6000000000000000|
    

待完成

  1. 滑动窗口

  2. 提取窗口子句

  3. 内建窗口函数 rank, dense_rank, ntile, lead, lag, first_value, nth_value, last_value, row_number