//
you're reading...
SQL

A Short Guide to Oracle SQL [VII] – Aggregate Functions

Group function takes an entire column of data or set of data as its arguments and produces a single result that summarizes the set of data. The most commonly used Aggregate functions are COUNT, SUM, AVG, MAX, and MIN. Aggregate functions can appear in SELECT lists and in ORDER BY and HAVING clauses.

Here we talk about group functions mention above as well as GROUP BY and HAVING clause of SELECT statement and ROLLUP and CUBE extension of GROUP BY clause.

NOTE:

    • Group functions do not process NULL values except the COUNT(*), GROUPING, and GROUPING_ID function. Use NVL function to include null values.
    • The arguments for AVG, SUM, VARIANCE and STDDEV must be of any number data type or any data type that can be implicitly converted to numeric data type.
    • Commonly used with the GROUP BY clause.
    • Use DISTINCT or UNIQUE to avoid duplicate values in calculation.
    • Unlike Scalar function, Group function can be nested only two levels deep.

COUNT
COUNT({ * | [ DISTINCT | ALL ] expr }) function returns the number of row selected by the query.
COUNT(*) counts the number of row in a table/in a group irrespective of NULL values. If there are no rows in table, it returns zero.

 
hr@O11R2> SELECT COUNT(*) "All Rows" 
       FROM employees;

  All Rows 
---------- 
       107 

hr@O11R2> SELECT COUNT(*) 
     FROM employees
     WHERE commission_pct IS NOT NULL;

  COUNT(*)
----------
        35

hr@O11R2> SELECT COUNT(*) 
     FROM employees
     WHERE commission_pct IS NULL;

  COUNT(*)
----------
        72

Specifying a expression with COUNT [COUNT(expr)] only counts the rows where expression is not null.

 
hr@O11R2> SELECT COUNT(*) "All Rows", COUNT(commission_pct) "Rows with Value"  
FROM employees;

  All Rows  Rows with Value
----------  ---------------
       107               35

You can also use DISTINCT or UNIQUE keyword with column/expr inside the COUNT function to determine the number of distinct values for the argument. Default is ALL.

 

hr@O11R2> SELECT DISTINCT sal 
      FROM scott.emp;

       SAL
----------
      2450
      5000
      1300
      1250
      2850
      2975
      1100
      3000
       800
      1600
      1500
       950

12 rows selected.

hr@O11R2> SELECT DISTINCT COUNT(sal) 
      FROM scott.emp;

COUNT(SAL)
----------
        14

hr@O11R2> SELECT COUNT(DISTINCT sal) 
      FROM scott.emp;
      
COUNT(DISTINCTSAL)
------------------
                12

hr@O11R2> SELECT COUNT(*) "All Rows", COUNT(DISTINCT commission_pct) "Distinct Val", COUNT(commission_pct) "Rows with Value"  
      FROM employees;

  All Rows Distinct Val Rows with Value
---------- ------------ ---------------
       107            7              35

SUM
SUM([ DISTINCT | ALL ] expr) returns the sum of the expression x.

hr@O11R2> SELECT SUM(salary)
     FROM employees
     WHERE department_id=60;

SUM(SALARY)
-----------
      28800

hr@O11R2> SELECT SUM(salary)
     FROM employees;

SUM(SALARY)
-----------
     691416

Rows with NULL values for expression are ignored.

AVG
AVG([ DISTINCT | ALL ] expr) computes the average of expression. DISTINCT and ALL keyword have the same meaning here.


hr@O11R2> SELECT AVG(salary)
     FROM employees
     WHERE department_id=60;

AVG(SALARY)
-----------
       5760

hr@O11R2> SELECT AVG(salary)
     FROM employees;

AVG(SALARY)
-----------
 6461.83178

hr@O11R2> SELECT ROUND(AVG(salary))
     FROM employees;

ROUND(AVG(SALARY))
------------------
              6462

Rows having null value will be ignored. Use NULL handling functions to include/substitute null value.

MAX
MAX([ DISTINCT | ALL ] expr) returns the highest value of the given expression. The data type for expr is not limited to nUMBER and can be of character, date time data type.

hr@O11R2> SELECT MAX(salary)
     FROM employees;

MAX(SALARY)
-----------
      24000

hr@O11R2> SELECT MAX(salary)
     FROM employees
     WHERE department_id=60;

MAX(SALARY)
-----------
       9000

It also ignore rows having null values. Specifying DISTINCT or ALL with MAX and MIN function does not affect their functionality.

MIN
MIN([ DISTINCT | ALL ] expr) returns the lowest value of the given expression. Like MAX, it also work with data type other than numeric.

hr@O11R2> SELECT MIN(salary)
     FROM employees
     WHERE department_id=60;

MIN(SALARY)
-----------
       4200

hr@O11R2> SELECT MIN(salary)
     FROM employees;

MIN(SALARY)
-----------
       2100

MIN also ignore null values just like MAX, AVG etc.

GROUP BY Clause
Group by is used to divide the table’s row into smaller group. When used with group functions it returns a single row of summary information for each group. The expression specified in the GROUP BY clause is used to collects each group of rows.

SELECT  [DISTINCT | UNIQUE] [*|columnname1 [ AS alias], …]
FROM      tablename;
[WHERE    condition]
[GROUP BY group_by_expression]
[ORDER BY columnname];
hr@O11R2> SELECT department_name "Department", MAX(salary) "Max Salary", 
                 MIN(salary) "Min Salary", AVG(salary) "Average", COUNT(employee_id) "Employees" 
      FROM employees
      JOIN departments
      USING (department_id)
      GROUP BY department_name;

Department                     Max Salary Min Salary    Average  Employees
------------------------------ ---------- ---------- ---------- ----------
Administration                       4400       4400       4400          1
Accounting                          12008       8300      10154          2
Purchasing                          11000       2500       4150          6
Human Resources                      6500       6500       6500          1
IT                                   9000       4200       5760          5
Public Relations                    10000      10000      10000          1
Executive                           24000      17000 19333.3333          3
Shipping                             8200       2100 3475.55556         45
Sales                               14000       6100 8955.88235         34
Finance                             12008       6900 8601.33333          6
Marketing                           13000       6000       9500          2

11 rows selected.

NOTE:

    • A GROUP BY clause cannot contains column aliases.
    • If GROUP BY clause is present in the SELECT statement, then SELECT list can contains only grouping column (attribute appearing in the group by clause), aggregate functions (on any columns) or literals/constants.
      But the GROUP BY clause contains attributes that are not in the SELECT list.

Having Clause

A HAVING clause specify a condition on the GROUP function. It is different from WHERE clause in the sense that WHERE clause removes the rows (apply condition) before grouping. Another reason to specify HAVING clause is, WHERE clause does not work with Aggregate functions.

SELECT  [DISTINCT | UNIQUE] [*|columnname1 [ AS alias], …]
FROM      tablename;
[WHERE    condition]
[GROUP BY group_by_expression]
[HAVING   group_condition]
[ORDER BY columnname];
hr@O11R2> SELECT department_name "Department", MAX(salary) "Max Salary",
                         MIN(salary) "Min Salary", AVG(salary) "Average", COUNT(employee_id) "Employees"
              FROM employees
              JOIN departments
              USING (department_id)
              WHERE location_id <> 2500
              GROUP BY department_name
              HAVING COUNT(employee_id)>1;

Department                     Max Salary Min Salary    Average  Employees
------------------------------ ---------- ---------- ---------- ----------
Accounting                          12008       8300      10154          2
Purchasing                          11000       2500       4150          6
IT                                   9000       4200       5760          5
Executive                           24000      17000 19333.3333          3
Shipping                             8200       2100 3475.55556         45
Finance                             12008       6900 8601.33333          6
Marketing                           13000       6000       9500          2

7 rows selected.

ROLLUP
ROLLUP produces (group) subtotal (at multiple levels, aggregating from right to left) and a grand total.

Look at these examples:

hr@O11R2> SELECT region_name, count(employee_id) "No. of Employees"
     FROM emp_details_view
     GROUP BY region_name;

REGION_NAME               No. of Employees
------------------------- ----------------
Europe                                  36
Americas                                70

hr@O11R2> SELECT region_name, city, count(employee_id) "No. of Employees"
     FROM emp_details_view
     GROUP BY region_name, city;

REGION_NAME               CITY                           No. of Employees
------------------------- ------------------------------ ----------------
Europe                    London                                        1
Europe                    Munich                                        1
Americas                  Southlake                                     5
Americas                  South San Francisco                          45
Europe                    Oxford                                       34
Americas                  Toronto                                       2
Americas                  Seattle                                      18

7 rows selected.

hr@O11R2> SELECT region_name, city, count(employee_id) "No. of Employees"
     FROM emp_details_view
     GROUP BY ROLLUP (region_name, city);                                      --ROLLUP op

REGION_NAME               CITY                           No. of Employees
------------------------- ------------------------------ ----------------
Europe                    London                                        1
Europe                    Munich                                        1
Europe                    Oxford                                       34
Europe                                                                 36       -- First Group subtotal
Americas                  Seattle                                      18
Americas                  Toronto                                       2
Americas                  Southlake                                     5
Americas                  South San Francisco                          45
Americas                                                               70       -- Second Group subtotal
                                                                      106       -- Grand total

10 rows selected.

hr@O11R2> SELECT region_name, country_name, city, count(employee_id) "No. of Employees"
     FROM emp_details_view
     GROUP BY ROLLUP (region_name, country_name, city);

REGION_NAME        COUNTRY_NAME                 CITY                  No. of Employees
-----------------  ---------------------------- --------------------  ---------------
Europe             Germany                      Munich                  1              -- (region, country, city)
Europe             Germany                                              1              -- (region, country)
Europe             United Kingdom               London                  1              -- (region, country, city)                 
Europe             United Kingdom               Oxford                 34
Europe             United Kingdom                                      35              -- (region, country)
Europe                                                                 36              -- (region)
Americas           Canada                       Toronto                 2              -- (region, country, city)
Americas           Canada                                               2              -- (region, country)
Americas           United States of America     Seattle                18              -- (region, country, city)
Americas           United States of America     Southlake               5
Americas           United States of America     South San Francisco    45
Americas           United States of America                            68              -- (region, country)
Americas                                                               70              -- (region)
                                                                      106              -- Grand Total
14 rows selected.

CUBE
CUBE generates a subtotal for every possible combinations (of columns) specified in the grouped field as well as a grand total.

hr@O11R2> SELECT region_name, city, count(employee_id) "No. of Employees"
     FROM emp_details_view
     GROUP BY CUBE (region_name, city)
     ORDER BY region_name, city;

REGION_NAME               CITY                           No. of Employees
------------------------- ------------------------------ ----------------
Americas                  Seattle                                      18
Americas                  South San Francisco                          45
Americas                  Southlake                                     5
Americas                  Toronto                                       2
Americas                                                               70          
Europe                    London                                        1
Europe                    Munich                                        1
Europe                    Oxford                                       34
Europe                                                                 36          
                          London                                        1          
                          Munich                                        1
                          Oxford                                       34
                          Seattle                                      18
                          South San Francisco                          45
                          Southlake                                     5
                          Toronto                                       2
                                                                      106

17 rows selected.

hr@O11R2> SELECT region_name, country_name, city, count(employee_id) "No. of Employees"
  2  FROM emp_details_view
  3  GROUP BY CUBE (region_name, country_name, city)
  4  ORDER BY region_name;

REGION_NAME COUNTRY_NAME                CITY                        No. of Employees
----------- --------------------------- --------------------------- ----------------
Americas    Canada                      Toronto                                    2
Americas    Canada                                                                 2
Americas    United States of America    Seattle                                   18
Americas    United States of America    South San Francisco                       45
Americas    United States of America    Southlake                                  5
Americas    United States of America                                              68
Americas                                Seattle                                   18
Americas                                South San Francisco                       45
Americas                                Southlake                                  5
Americas                                Toronto                                    2
Americas                                                                          70
Europe      Germany                     Munich                                     1
Europe      Germany                                                                1
Europe      United Kingdom              London                                     1
Europe      United Kingdom              Oxford                                    34
Europe      United Kingdom                                                        35
Europe                                  London                                     1
Europe                                  Munich                                     1
Europe                                  Oxford                                    34
Europe                                                                            36
            Canada                      Toronto                                    2
            Canada                                                                 2
            Germany                     Munich                                     1
            Germany                                                                1
            United Kingdom              London                                     1
            United Kingdom              Oxford                                    34
            United Kingdom                                                        35
            United States of America    Seattle                                   18
            United States of America    South San Francisco                       45
            United States of America    Southlake                                  5
            United States of America                                              68
                                        London                                     1
                                        Munich                                     1
                                        Oxford                                    34
                                        Seattle                                   18
                                        South San Francisco                       45
                                        Southlake                                  5
                                        Toronto                                    2
                                                                                 106

39 rows selected.
Advertisements

Discussion

No comments yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: