Monday, April 29, 2013

How can I delete x number of rows



How can I delete x number of rows?
DB2 allows a DELETE against a fullselect, so the first x rows can be removed like this:
delete from (select 1 from howardg.prod_bal fetch first 10 rows only)
How can I delete x number of rows in a loop?

CREATE PROCEDURE DELETE_IN_LOOP()
  LANGUAGE SQL
  BEGIN
    DECLARE at_end INTEGER DEFAULT 0;
    DECLARE not_found CONDITION FOR SQLSTATE '02000';
    DECLARE CONTINUE HANDLER FOR not_found SET at_end = 1;

    -- Delete in batches of 1,000,000 rows; the NOT FOUND handler fires
    -- once no more rows match the predicate and ends the loop.
    REPEAT
      delete from (select * from TABSCHEMA.TABNAME_TB a where a.CIL1 = 'Y' and a.DATE = '2012-09-04' fetch first 1000000 rows only );
    UNTIL ( at_end = 1)
    END REPEAT;
  END

The goal is to avoid a log-full condition. If the delete relies on a table scan, the procedure may scan the table several times over (roughly five full scans in this example), so consider creating an index on the predicate columns before running the delete.
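As a minimal sketch (reusing the table and column names from the example above; the index name is made up), an index on the filter columns lets each batched DELETE locate its rows without a full scan:

CREATE INDEX TABSCHEMA.IX_TABNAME_DEL ON TABSCHEMA.TABNAME_TB ("CIL1", "DATE");
-- Optionally drop it once the cleanup is finished:
-- DROP INDEX TABSCHEMA.IX_TABNAME_DEL;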
How can I delete x number of rows in a loop for any table?
CREATE PROCEDURE DELETE_WITH_COND(IN tbname varchar(100), IN vpred varchar(200))
    MODIFIES SQL DATA
    NOT DETERMINISTIC
    NULL CALL
    LANGUAGE SQL
BEGIN
DECLARE not_found CONDITION FOR SQLSTATE '02000';
DECLARE at_end INTEGER DEFAULT 0;
DECLARE txt VARCHAR(2000);
DECLARE stmt STATEMENT;
DECLARE CONTINUE HANDLER FOR not_found SET at_end = 1;

-- Build and prepare a batched DELETE for the given table and predicate.
SET txt = 'delete from (select * from ' || tbname || ' where ' || vpred || ' fetch first 1000000 rows only)';
PREPARE stmt FROM txt;

    -- Re-execute the same prepared statement until no row qualifies.
    REPEAT
        EXECUTE stmt;
    UNTIL ( at_end = 1)
    END REPEAT;
END
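A possible invocation, reusing the table and predicate from the earlier example (both arguments are illustrative only):

CALL DELETE_WITH_COND('TABSCHEMA.TABNAME_TB', 'CIL1 = ''Y'' AND DATE = ''2012-09-04''');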

Query Tuning Tips



1) E.g. Table1 has 1000 rows and Table2 has 1 row.
 Select * from table1, table2 will give a faster result than Select * from table2, table1
2) If 3 tables are joined, select the intersection table as the driving table. The intersection table is the one the other tables depend on (i.e. the table joined to the most other tables).
3) Joins: table joins should be written before any other condition in the WHERE clause. The condition that filters out the most records should be placed at the end, below the joins, since parsing is done from BOTTOM to TOP.
4) Avoid using Select * from
5) Use DECODE to improve performance. (DECODE is similar to a CASE expression.)
6) Count(*) is faster than Count(1). Count(pkey) is the fastest though.
7) Restrict the records with WHERE clause instead of using HAVING clause.
8) Minimize Table lookup in Query:
 e.g. Select * from tab1
 where col1 = (select col1 from tab2 where colx = 3)
 and col2 = (select col2 from tab2 where colx = 3)
 a more efficient way to do this is:
 Select * from tab1
 where (col1, col2) = (select col1, col2 from tab2 where colx = 3)
 The same approach can be used for updates.
9) Use EXISTS instead of IN, and NOT EXISTS instead of NOT IN (see the example after this list)
10) Use Exists in place of Distinct clause
 E.g. Select Distinct a.col1, a.col2 From tab1 a, tab2 b
 Where a.col3 = b.col3
 Instead the query can be written as:
 Select a.col1, a.col2
 From tab1 a
 Where Exists (select 'X' from tab2 b
 Where a.col3 = b.col3)
11) Use Explain Plan to get the query execution process.
12) Use Indexes for faster retrieval of data.
13) An index will be used if the query has the column on which the index is created. If columns that are not present in the index are selected, the index alone cannot satisfy the query (no index-only access).
14) Avoid use of UNION clause as far as possible
15) Avoid Is Null and Is Not Null on indexed columns
16) Using Hints helps in performance improvement
17) Avoid typecasting of indexed columns
18) NOT, !=, <> and || on indexed columns will disable use of indexes
19) Arithmetic operations on indexed columns in the Where clause will disable indexes
20) Use OR clause instead of In clause
 e.g. Select * from tab where col1 in ('a','b')
 instead use: Select * from tab where col1 = 'a' or col1 = 'b'
21) Avoid unnecessary use of Union, Distinct, minus, intersect, order by and group by
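A quick sketch of tip 9, using the same illustrative tab1/tab2 tables as the earlier examples:

-- IN form:
Select * from tab1 a where a.col1 in (select b.col1 from tab2 b)
-- EXISTS form, usually preferred:
Select * from tab1 a where exists (select 1 from tab2 b where b.col1 = a.col1)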

22) DISTINCT - always results in a sort
      UNION - always results in a sort
      UNION ALL - does not sort, but retains any duplicates
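For instance (tab1/tab2 are illustrative), if duplicates cannot occur or do not matter, UNION ALL avoids the sort that UNION performs:

Select col1 from tab1
Union              -- sorts to remove duplicates
Select col1 from tab2

Select col1 from tab1
Union all          -- no sort, duplicates retained
Select col1 from tab2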

23) ORDER BY
may be faster if columns are indexed
use it to guarantee the sequence of the data

24) GROUP BY
specify only columns that need to be grouped
may be faster if the columns are indexed
do not include extra columns in SELECT list or GROUP BY because DB2 must sort the rows

25) Create indexes for columns you frequently use in:
ORDER BY
GROUP BY (better than a DISTINCT)
SELECT DISTINCT
JOIN
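As a sketch (the orders table, its cust_no column, and the index name are made-up examples), one index can support grouping, ordering, and the join:

CREATE INDEX ix_orders_custno ON orders (cust_no);
-- helps: SELECT cust_no, COUNT(*) FROM orders GROUP BY cust_no
-- helps: SELECT ... FROM customer c JOIN orders o ON o.cust_no = c.cust_no ORDER BY o.cust_no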

26) When the results of a join must be sorted -
limiting the ORDER BY to columns of a single table can avoid a sort
specifying columns from multiple tables causes a sort

27) Favor coding explicit INNER and LEFT OUTER joins over RIGHT OUTER joins
EXPLAIN shows RIGHT joins converted to LEFT joins


28) BETWEEN is usually more efficient than a pair of <= and >= predicates, except when comparing a host variable to 2 columns

29) Avoid % or _ at the beginning of a LIKE pattern because it prevents DB2 from using a matching index and may cause a scan. Put % or _ at the end of the pattern to encourage index usage
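For example (emp and lastname are illustrative names):

SELECT * FROM emp WHERE lastname LIKE 'SM%';   -- leading characters known: a matching index can be used
SELECT * FROM emp WHERE lastname LIKE '%SON';  -- leading wildcard: likely forces a scan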

30) For subqueries using negation logic:
use NOT EXISTS (DB2 tests for non-existence)
instead of NOT IN (DB2 must materialize the complete result set)
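A sketch with the same illustrative tab1/tab2 tables (note the two forms also behave differently when tab2.col3 can be NULL):

SELECT * FROM tab1 a
WHERE NOT EXISTS (SELECT 1 FROM tab2 b WHERE b.col3 = a.col3);
-- usually preferable to:
-- SELECT * FROM tab1 a WHERE a.col3 NOT IN (SELECT b.col3 FROM tab2 b);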

31) Use EXISTS to test for a condition: DB2 returns true or false without returning any rows to the query

32) After the indexed predicates, place the predicate that will eliminate the greatest number of rows first

33) Additional points worth checking:

1) Avoid DISTINCT wherever possible; check whether it is actually
required. No DISTINCT is needed when a primary key or unique key is retrieved.

2) Consider using UNION where an OR condition exists and
eliminate the DISTINCT (see the example after this list).

3)         Conditions which are likely to fail should be kept first in a set
of conditions separated by AND.

4)         Always use aliases.

5) Do not put columns inside an expression.
      select * from emp where salary/12 >= 4000;
The query should be:
select * from emp where salary >= 4000 * 12;
i.e. avoid arithmetic on the column side of a predicate. Arithmetic applied to a
column in a predicate will cause DB2 to avoid the use of an index on that column.

6) Try to avoid applying built-in or user-defined functions to columns in predicates.
select * from employee where substr(name,1,1) = 'G';
The query should be:
select * from employee where name like 'G%';

7)         Avoid datatype mismatch since it may lead to implicit/explicit
casting.
select * from emp where sal = '1000';
The query should be:
select * from emp where sal = 1000;

8) Substitute an unnecessary GROUP BY and HAVING with a WHERE clause.
select avg(salary) as avgsalary, dept from employee group by dept
having dept = 'information systems';
The query should be:
select avg(salary) as avgsalary from employee where dept =
'information systems';
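A sketch of point 2 above (emp, deptno and job are illustrative): each leg of the UNION can use its own index, and UNION itself removes duplicates, so no extra DISTINCT is needed:

SELECT * FROM emp WHERE deptno = 10
UNION
SELECT * FROM emp WHERE job = 'CLERK';
-- instead of:
-- SELECT DISTINCT * FROM emp WHERE deptno = 10 OR job = 'CLERK';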
9) Saving CPU in your multiple-count SQL statements


When requiring multiple counts you can choose to write multiple SQL statements (or write a program) such as:

SELECT COUNT(*) AS UNDER_40K FROM DSN8710.EMP WHERE SALARY < 40000

AND

SELECT COUNT(*) AS ABOVE_40K FROM DSN8710.EMP WHERE SALARY >= 40000

Or you can choose to simulate these multiple counts in one SQL statement, using a combination of SUM and CASE in a single pass of the data. Note that SUM is used here because of prior limitations with COUNT.

SELECT SUM(CASE WHEN SALARY < 40000
THEN 1 ELSE 0 END) AS UNDER_40K
,SUM(CASE WHEN SALARY >= 40000
THEN 1 ELSE 0 END) AS ABOVE_40K
FROM DSN8710.EMP

The theory is that by tagging a row with a 1 when the WHEN condition is true, the SUM adds up all of the 1's, giving a final count of the number of rows matching the WHEN condition. When the WHEN clause is not satisfied, the row is tagged with a zero or a NULL (the example above defaults to zero); neither affects the final count.

ELSE 0 vs ELSE NULL

To save CPU when running these queries you may want to consider changing the ELSE 0 to ELSE NULL (or leave out the ELSE, which defaults to NULL).

SELECT SUM(CASE WHEN SALARY < 40000
THEN 1 ELSE NULL END) AS UNDER_40K,
SUM(CASE WHEN SALARY >= 40000 THEN 1 ELSE NULL END) AS ABOVE_40K
FROM DSN8710.EMP

Since column functions ignore NULLs, specifying ELSE NULL for the false conditions means DB2 does not attempt to SUM those rows, whereas with ELSE 0 DB2 must add the zero to the final SUM.

Specifying ELSE NULL can save you significant CPU if many conditions are optional.

Using COUNT instead of SUM

To take the CPU improvement a step further, DB2 V7 for z/OS expands the capabilities of the COUNT function to support expressions. This functionality is already there for DB2 UDB on other platforms.

The COUNT function can therefore be applied to the multiple counts problem, and is more logical than using SUM to simulate counts.

SELECT COUNT(CASE WHEN SALARY < 40000
THEN '' ELSE NULL END) AS UNDER_40K
,COUNT(CASE WHEN SALARY >= 40000
THEN '' ELSE NULL END) AS ABOVE_40K
FROM DSN8710.EMP

When using the COUNT function, the ELSE condition must assign NULLs, since COUNT counts occurrences of non-NULL values, whereas SUM adds up the 1's. It therefore does not matter what is assigned by the THEN condition within the COUNT. The example above uses the empty string '' because it requires less internal storage than the integer value 1.

In a sample test, this COUNT syntax outperformed the SUM (with ELSE NULL) by approximately 4%.
 




Monday, April 22, 2013

How to create an incremental replicated MQT

How to create an incremental replicated MQT:




There are two ways: if you are inserting into your table, follow Step 1; if you are using LOAD, follow Step 2.

STEP 1:

CREATE TABLE SCHEMA.SAMPLE LIKE SCHEMA.SAMPLE ;

CREATE UNIQUE INDEX "SCHEMA"."XUIt_30" ON "SCHEMA"."SAMPLE"
                ("RUN_ID" ASC,
                 "TABLE_ID" ASC,
                 "BUSINESS_DATE" ASC,
                 "ROW_ID" ASC)
                CLUSTER ALLOW REVERSE SCANS;
-- DDL Statements for primary key on Table "SCHEMA"."SAMPLE"

ALTER TABLE "SCHEMA"."SAMPLE"
        ADD CONSTRAINT "XUIHUBt_30" PRIMARY KEY
                ("RUN_ID",
                 "TABLE_ID",
                 "BUSINESS_DATE",
                 "ROW_ID");
DELETE FROM SCHEMA.SAMPLE

INSERT INTO SCHEMA.SAMPLE SELECT * FROM SCHEMA.SAMPLE FETCH FIRST 1000 ROWS ONLY;

-- Create the MQT (summary table) with deferred refresh
CREATE SUMMARY TABLE SCHEMA.R_SAMPLE AS (SELECT *
FROM SCHEMA.SAMPLE ) DATA INITIALLY DEFERRED REFRESH DEFERRED;

REFRESH TABLE SCHEMA.R_SAMPLE ;

-- Create the staging table that captures changes for incremental (PROPAGATE IMMEDIATE) refresh
CREATE TABLE SCHEMA.STG_SAMPLE FOR SCHEMA.R_SAMPLE PROPAGATE IMMEDIATE;
SET INTEGRITY FOR SCHEMA.STG_SAMPLE IMMEDIATE CHECKED;

-- Full refresh to establish a consistent starting point for incremental maintenance
REFRESH TABLE SCHEMA.R_SAMPLE NOT INCREMENTAL;

SELECT COUNT(1) FROM SCHEMA.R_SAMPLE;

INSERT INTO SCHEMA.SAMPLE  SELECT a.* FROM SCHEMA.SAMPLE a WHERE (RUN_ID,BUSINESS_DATE,ROW_ID)
NOT IN (SELECT RUN_ID,BUSINESS_DATE,ROW_ID FROM SCHEMA.SAMPLE b );

SELECT COUNT(1) FROM SCHEMA.STG_SAMPLE;
SELECT COUNT(1) FROM SCHEMA.R_SAMPLE;

-- Apply only the changes captured in the staging table
REFRESH TABLE SCHEMA.R_SAMPLE INCREMENTAL;

SELECT COUNT(1) FROM SCHEMA.R_SAMPLE;
SELECT COUNT(1) FROM SCHEMA.STG_SAMPLE;





STEP 2:

db2 "export to SCHEMA.SAMPLE of del select * FROM SCHEMA.SAMPLE"

CREATE TABLE SCHEMA.SAMPLE LIKE SCHEMA.HUB_TAPS_CASH ;

CREATE UNIQUE INDEX "SCHEMA"."XUIHUBt_30" ON "SCHEMA"."SAMPLE"
                ("RUN_ID" ASC,
                 "TABLE_ID" ASC,
                 "BUSINESS_DATE" ASC,
                 "ROW_ID" ASC)
                CLUSTER ALLOW REVERSE SCANS;
-- DDL Statements for primary key on Table "SCHEMA"."SAMPLE"

ALTER TABLE "SCHEMA"."SAMPLE"
        ADD CONSTRAINT "XUIHUBt_30" PRIMARY KEY
                ("RUN_ID",
                 "TABLE_ID",
                 "BUSINESS_DATE",
                 "ROW_ID");

select count(1) from SCHEMA.SAMPLE;

db2 " load client from /user/SCHEMA.SAMPLE of del  rowcount 1000 INSERT INTO SCHEMA.SAMPLE "

CREATE SUMMARY TABLE SCHEMA.R_SAMPLE AS (SELECT *
FROM SCHEMA.SAMPLE ) DATA INITIALLY DEFERRED REFRESH DEFERRED;

REFRESH TABLE SCHEMA.R_SAMPLE ;

CREATE TABLE  SCHEMA.STG_SAMPLE FOR  SCHEMA.R_SAMPLE PROPAGATE IMMEDIATE;
SET INTEGRITY FOR SCHEMA.STG_SAMPLE IMMEDIATE CHECKED;

REFRESH TABLE SCHEMA.R_SAMPLE NOT INCREMENTAL;

SELECT COUNT(1) FROM SCHEMA.R_SAMPLE;

db2 " load client from /user/SCHEMA.HUB_TAPS_CASH of del   INSERT INTO SCHEMA.SAMPLE "
-- After the LOAD, check integrity on the base table before refreshing the staging table and the MQT
SET INTEGRITY FOR SCHEMA.SAMPLE IMMEDIATE CHECKED;


----
SET INTEGRITY FOR SCHEMA.STG_SAMPLE IMMEDIATE CHECKED;
REFRESH TABLE SCHEMA.R_SAMPLE INCREMENTAL;

SELECT COUNT(1) FROM SCHEMA.R_SAMPLE;
SELECT COUNT(1) FROM SCHEMA.STG_SAMPLE;

How to run dynamic SQL using creator's authorizations

How to run dynamic SQL using creator's authorizations

We can have two types of SQL in a procedure:
1) Static
2) Dynamic

One very important difference between them is that, when the procedure is executed, DB2 uses the authorization ID of the package owner for authorization checking of static SQL, but the authorization ID of the package executor for dynamic SQL.
We can override this behavior for dynamic SQL by binding with DYNAMICRULES BIND.

Bind behavior
At run time, DB2 uses all the rules that apply to static SQL for authorization and qualification.
That is, it takes the authorization ID of the package owner as the value used for authorization checking of dynamic SQL
statements, and the package default qualifier for implicit qualification of unqualified object references within dynamic SQL statements.

-- Set the routine options so the procedure below is created with DYNAMICRULES BIND
call set_routine_opts(get_routine_opts() || ' DYNAMICRULES BIND')@
DROP PROCEDURE SCHEMA.REFRESH_MQT@

CREATE PROCEDURE SCHEMA.REFRESH_MQT(IN IN_TABLE_NAME VARCHAR(128),
                OUT o_sqlcode integer,
                OUT o_sqlstate CHAR(5),
                OUT o_MESSAGE VARCHAR(255))
LANGUAGE SQL
MODIFIES SQL DATA
DYNAMIC RESULT SETS 0
P1:BEGIN
DECLARE SQLCODE INTEGER DEFAULT 0;
DECLARE SQLSTATE CHAR(5) DEFAULT '     ';
DECLARE stmt VARCHAR(228) ;

DECLARE CONTINUE HANDLER FOR SQLEXCEPTION, SQLWARNING, NOT FOUND
EH: BEGIN
  get diagnostics EXCEPTION 1 o_MESSAGE = MESSAGE_TEXT;
        VALUES (SQLSTATE,SQLCODE) INTO o_sqlstate, o_sqlcode;
END EH;

SET stmt='SET INTEGRITY FOR
SCHEMA.'||TRIM(IN_TABLE_NAME)||' IMMEDIATE CHECKED';
EXECUTE IMMEDIATE stmt;

SET stmt='REFRESH TABLE
SCHEMA.R_'||TRIM(IN_TABLE_NAME);
EXECUTE IMMEDIATE stmt;

END P1
@

-- Restore the default bind rules
call set_routine_opts(NULL)@
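A possible test call once the procedure exists (the table name is illustrative; the three parameter markers receive the SQLCODE, SQLSTATE and message text output parameters):

CALL SCHEMA.REFRESH_MQT('SAMPLE', ?, ?, ?)@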

How to pass parameters in command line

How to pass parameters in command line:

We can use this approach to test procedure code from the command prompt: a declared global temporary table holds the rows, and a compound (BEGIN ATOMIC) statement with a local variable takes the place of a procedure parameter.

DECLARE GLOBAL TEMPORARY TABLE SESSION.MFS  (
    TABLE_ID                   INTEGER      NOT NULL,
    RUN_ID                     DECIMAL(10)  NOT NULL,
    BUSINESS_DATE              DATE         NOT NULL,
    LOAD_TIME_START            TIMESTAMP,
    LOAD_TIME_END              TIMESTAMP
    )
WITH REPLACE
ON COMMIT PRESERVE ROWS
NOT LOGGED;

BEGIN ATOMIC
  -- The local variable busdate stands in for the procedure parameter being tested
  DECLARE busdate DATE ;
  SET busdate = '2013-01-31' ;
  insert into SESSION.MFS select table_id,run_id,business_date,load_time_start,load_time_end from liquidity.met_feed_status where business_date= busdate;
END

select * From SESSION.MFS
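To run the statements above from the command line, one option (the file name here is assumed) is to save them to a script and use an alternate statement terminator, so the BEGIN ATOMIC ... END block is sent to DB2 as a single statement:

db2 -td@ -vf test_mfs.sql

In that case each statement in the file, including the compound block, would end with @ instead of ;.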

How to convert all date values to the ISO date format

How to convert all date values to the ISO date format "YYYY-MM-DD":

/user/home>db2 "values(current date)"
 1
----------
04/13/2011

  1 record(s) selected.
/user/home>which db2
/root/ibmdb2/PROJ/sqllib/9.5.4/exec/sqllib/bin/db2
/user/home>cd /root/ibmdb2/PROJ/sqllib/9.5.4/exec/sqllib//bnd
/root/ibmdb2/PROJ/sqllib/9.5.4/exec/sqllib//bnd@>db2 bind @db2ubind.lst datetime ISO

LINE    MESSAGES FOR db2ubind.lst
------  --------------------------------------------------------------------
        SQL0061W  The binder is in progress.

LINE    MESSAGES FOR db2ueiwi.bnd
------  --------------------------------------------------------------------
2239   SQL0204N  "SYSTEM.SYSUSERAUTH" is an undefined name.
                  SQLSTATE=42704
2243   SQL0204N  "SYSTEM.SYSUSERAUTH" is an undefined name.
                  SQLSTATE=42704

LINE    MESSAGES FOR db2clpnc.bnd
------  --------------------------------------------------------------------
        SQL0595W  Isolation level "NC" has been escalated to "UR".
                  SQLSTATE=01526

LINE    MESSAGES FOR db2arxnc.bnd
------  --------------------------------------------------------------------
        SQL0595W  Isolation level "NC" has been escalated to "UR".
                  SQLSTATE=01526

LINE    MESSAGES FOR db2ats_sps.bnd
------  --------------------------------------------------------------------
1168   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1198   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1229   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1477   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1494   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1512   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1550   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1674   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1691   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1710   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1727   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1890   SQL0204N  "SYSTOOLS.ADMINTASKSTATUS" is an undefined name.
                  SQLSTATE=01532
1945   SQL0204N  "SYSTOOLS.ADMINTASKSTATUS" is an undefined name.
                  SQLSTATE=01532
1957   SQL0204N  "SYSTOOLS.ADMINTASKS" is an undefined name.
                  SQLSTATE=01532
1974   SQL0204N  "SYSTOOLS.ADMINTASKSTATUS" is an undefined name.
                  SQLSTATE=01532

LINE    MESSAGES FOR db2ubind.lst
------  --------------------------------------------------------------------
        SQL0091N  Binding was ended with "0" errors and "19"
                  warnings.


/user/home@>db2 "values(current date)"

1
----------
2011-04-13

  1 record(s) selected.

How to convert rows to columns

How to convert rows to columns:

We have created a sample table for this exercise:

CREATE TABLE SCHEMA.column_to_row (line# int, word varchar(100));
insert into SCHEMA.column_to_row values (1,'The'), (1,'Horse'),(1,'is'), (1,'white');
insert into SCHEMA.column_to_row values (2,'The'), (2,'Rose'),(2,'is'), (2,'Red');
select * from SCHEMA.column_to_row with ur;

You can use XML elements for converting rows to columns:
 
SELECT line#, replace(replace(xml2clob(xmlagg(xmlelement(NAME a, s.word))),'<A>',''),'</A>',' ,')
FROM SCHEMA.column_to_row s
where line#=1
GROUP BY line#