Db2 UDB Developer Tips: Query Tuning Tips

1) E.g. Table1 has 1000 rows and Table2 has 1 row.
Select * from table1, table2 will give faster result than Select * from table2, table1

2) If 3 tables are joined, select the intersection table as the driving table. Intersection table is the table that has many tables dependent on it.

3) Joins: Table joins should be written first before any condition in the Where clause. The condition that filters the max records should be at the end below the joins. Parsing is done from BOTTOM to TOP.

4) Avoid using Select * from

5) Use DECODE to improve performance. (DECODE is like COALESCE)

6) Count(*) is faster than Count(1). Count(pkey) is the fastest though.

7) Restrict the records with WHERE clause instead of using HAVING clause.

8) Minimize Table lookup in Query:
e.g. Select * from tab1
where col1 = (select col1 from tab2 where colx = 3)
and col2 = (select col2 from tab2 where colx = 4)
an efficient way to this is:
Select * from tab1
where (col1, col2) = (select col1, col2 from tab2 where col2 = 3)
The same approach can be used for updates.

9) Use Exists clause instead of In clause and Not Exists instead of Not In clause

10) Use Exists in place of Distinct clause
E.g. Select Distinct a.col1, a.col2 From tab1 a, tab2 b
Where a.col3 = b.col3
Instead the query can be written as:
Select a.col1, a.col2
From tab1 a
Where Exists (select 'X' from tab2 b
Where a.col3 = b.col3)

11) Use Explain Plan to get the query execution process.

12) Use Indexes for faster retrieval of data.

13) Index will be used if the query has the column on which the index is created. If the columns that are not present in the index are selected, the index is not used.

14) Avoid use of UNION clause as far as possible

15) Avoid Is Null and Is Not Null on indexed columns

16) Using Hints helps in performance improvement

17) Avoid typecasting of indexed columns

18) Not, !=, <, || will disable use of Indexes

19) Arithmetic operations in the Where clause will disable indexes

20) Use OR clause instead of In clause

e.g. Select * from tab where col1 in ('a','b')
instead use: Select * from tab where col1 = 'a' or col1 = 'b'

21) Avoid unnecessary use of Union, Distinct, minus, intersect, order by and group by

22) DISTINCT - always results in a sort

UNION - always results in a sort

UNION ALL - does not sort, but retains any duplicates

23) ORDER BY

–may be faster if columns are indexed

–use it to guarantee the sequence of the data

24) GROUP BY

–specify only columns that need to be grouped

–may be faster if the columns are indexed

–do not include extra columns in SELECT list or GROUP BY because DB2 must sort the rows

24) Create indexes for columns you frequently:

–ORDER BY

–GROUP BY (better than a DISTINCT)

–SELECT DISTINCT

–JOIN

25) When the results of a join must be sorted -

–limiting the ORDER BY to columns of a single table can avoid a sort

–specifying columns from multiple tables causes a sort

26) Favor coding explicit INNER and LEFT OUT joins over RIGHT OUTER joins

–EXPLAIN converts RIGHT to LEFT join

27)BETWEEN is usually more efficient than <= predicate and the >= predicate . Except when comparing a host variable to 2 columns

28) Avoid the % or the _ at the beginning because it prevents DB2 from using a matching index and may cause a scan. Use the % or the _ at the end to encourage index usage

29) For Subquery - when using negation logic:

–Use NOT Exists (DB2 tests non-existence)

–Instead of NOT IN (DB2 must materialize the complete result set)

30) Use EXISTS to test for a condition and get a True or False returned by DB2 and not return any rows to the query

31) After the indexes, place the predicate that will eliminate the greatest number of rows first

32) Hi,

Check this points...It may help U.

1) Avoid distinct where ever possible. Check whether distinct is

required or not. No distinct when PK or UK are retrieved.

2) One can consider usage of union where OR condition exits &

eliminate distincts.

3) Conditions which are likely to fail should be kept first in a set

of conditions separated by AND.

4) Always use aliases.

5) Do not involve columns in an expression.

select * from emp where salary/12 >= 4000;

The query should be:

select * from emp where salary >= 4000 * 12;

i.e. Avoid using Arithmetic within SQL statements.Arithmetic in a SQL

statement will cause DB2 to avoid the use of an index.

6) Try to avoid usage of in-built or user defined functions.

select * from employee where substr(name,1,1) = 'G';

The query should be:

select * from employee where name = 'G%';

7) Avoid datatype mismatch since it may lead to implicit/explicit

casting.

select * from emp where sal = '1000';

The query should be:

select * from emp where sal = 1000;

8) Substitute unnecessary group by & having with where clause.

select avg(salary) as avgsalary, dept from employee group by dept

having dept = 'information systems';

The query should be:

select avg(salary) as avgsalary, dept from employee where dept =

'information systems';

9)Saving CPU in your multiple counts SQL statements

When requiring multiple counts you can choose to write multiple SQL statements (or write a program) such as:

SELECT COUNT(*) AS UNDER_40K WHERE SALARY < 40000

AND

SELECT COUNT(*) AS ABOVE_40K WHERE SALARY >= 40000

Or you can choose to simulate these multiple counts in one SQL using a combination of SUM and CASE in one pass of the data. Note SUM has been used here due to prior limitations with COUNT.

SELECT SUM(CASE WHEN SALARY < 40000
THEN 1 ELSE 0 END) AS UNDER_40K
,SUM(CASE WHEN SALARY >= 40000
THEN 1 ELSE 0 END) AS ABOVE_40K
FROM DSN8710.EMP

The theory is by tagging the row with a 1 if the WHEN condition is true, the SUM will add all of the 1’s, resulting in a final count of the number of rows matching the WHEN condition. A failure of the WHEN clause will result in the row being tagged with a zero or a NULL (my example defaults to zero). Neither will impact the final count.

ELSE 0 vs ELSE NULL

Click More to find out how you can save CPU!

To save CPU when running these queries you may want to consider changing the ELSE 0 to ELSE NULL (or leave out the ELSE, which defaults to NULL).

SELECT SUM(CASE WHEN SALARY < 40000
THEN 1 ELSE NULL END) AS UNDER_40K,
SUM(CASE WHEN SALARY >= 40000 THEN 1 ELSE NULL END) AS ABOVE_40K
FROM DSN8710.EMP

Since column functions ignore NULL, by specifying ELSE NULL for the false conditions, DB2 will not attempt to SUM these rows. Whereas specifying ELSE 0, DB2 will attempt to add the zero to the final SUM.

Specifying ELSE NULL can save you significant CPU if many conditions are optional.

Using COUNT instead of SUM

To take the CPU improvement a step further, DB2 V7 for z/OS expands the capabilities of the COUNT function to support expressions. This functionality is already there for DB2 UDB on other platforms.

The COUNT function can therefore be applied to the multiple counts problem, and is more logical than using SUM to simulate counts.

SELECT COUNT(CASE WHEN SALARY < 40000
THEN ‘’ ELSE NULL END) AS UNDER_40K
,COUNT(CASE WHEN SALARY >= 40000
THEN ‘’ ELSE NULL END) AS ABOVE_40K
FROM DSN8710.EMP

When using the COUNT function, the ELSE condition must assign NULLs, since COUNT will count the occurrences of non-NULL values, whereas SUM would sum up the one’s. It is therefore not important what is assigned for the THEN condition within the COUNT. The example above uses the empty string ‘’ because it requires less internal storage than the integer value 1.

In a sample test, usage of this COUNT syntax outperformed the SUM (with ELSE NULL) by approximately 4%

Db2 UDB Developer Tips

Monday, April 29, 2013

Query Tuning Tips

No comments:

Post a Comment