This was first published on https://blog.dbi-services.com/insert-into-gtt-bulk-with-appendvalues (2015-01-09)
Republishing here for new followers. The content relates to the versions available at the publication date.
This post is about a case I had where a GTT was generating too much undo. The effects were amplified by the fact that the database had flashback logging enabled (see Jonathan Lewis' answer to my question on the OTN forum about that, but that's for the next post).
Nothing here is specific to GTTs, but generating undo (and the related redo) is even more awkward on a GTT. Here is the query:
INSERT INTO SCDAT.INLISTS_INT_SINGLE(INCOUNT,INNUM) VALUES (:v1 ,:v2 )
And SCDAT.INLISTS_INT_SINGLE is a GTT with only two columns and a primary key on them. Maybe they would have created it as an IOT, but that's not (yet?) possible for a GTT. Why is that table used so much? It's the application's way to pass a list of values of variable size. Is it a good design?
But every software editor must keep code maintainable and cannot have several different SQL statements depending on the input. Always using collections risks allocating large structures in the PGA. Queries with 1000 values or variables in several IN lists are even worse. And Oracle does not offer an in-memory temporary table as other RDBMS do. So the design is OK, but we need to optimize it.
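The pattern itself is easy to sketch outside Oracle. Here is a minimal illustration in Python using the stdlib sqlite3 module as a stand-in (the orders table and its data are made up for the example): the variable-size list goes into a two-column temporary table, and a single static SQL statement joins against it whatever the list size.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A made-up data table to query against.
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)])

# The "IN list" table, mirroring the two-column GTT of the post:
# one row per value, tagged with a list id (INCOUNT).
conn.execute("""CREATE TEMP TABLE inlists_int_single
                (incount INTEGER, innum INTEGER,
                 PRIMARY KEY (incount, innum))""")
wanted = [2, 4]  # the list we would otherwise splice into an IN clause
conn.executemany("INSERT INTO inlists_int_single VALUES (0, ?)",
                 [(v,) for v in wanted])

# One static statement, whatever the size of 'wanted':
rows = conn.execute("""
    SELECT o.id, o.amount
    FROM orders o
    JOIN inlists_int_single l ON l.innum = o.id AND l.incount = 0
    ORDER BY o.id
""").fetchall()
print(rows)  # [(2, 20.0), (4, 40.0)]
```

The point is that the SQL text never changes with the list size, so the statement stays shared and maintainable.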
Let’s build the test case:
SQL> connect demo/demo
Connected.
SQL>
SQL> DROP TABLE "INLISTS_INT_SINGLE"
  2  /

Table dropped.

SQL> CREATE GLOBAL TEMPORARY TABLE "INLISTS_INT_SINGLE"
  2   ( "INCOUNT" NUMBER(10,0) CONSTRAINT "C_INLISTS_INT_SINGLE_INCOUNT" NOT NULL ENABLE,
  3     "INNUM" NUMBER(14,0) DEFAULT 0 CONSTRAINT "C_INLISTS_INT_SINGLE_INNUM" NOT NULL ENABLE,
  4     CONSTRAINT "P_INLISTS_INT_SINGLE" PRIMARY KEY ("INCOUNT", "INNUM") RELY ENABLE
  5   ) ON COMMIT DELETE ROWS
  6  /

Table created.
I’ve kept the names because it may help if someone has the same issue and is googling for it. And – once again – there is no problem with that software. It is globally using Oracle in the right way.
I run 1000 single-row inserts from a PL/SQL loop, and I include that in another loop in order to do it 100 times and get more significant statistics.
SQL> set timing on echo on pagesize 1000 trimspool on
SQL> connect demo/demo
Connected.
SQL> begin
  2   for n in 1..1e2 loop
  3    for i in 1..1e3 loop insert into INLISTS_INT_SINGLE values(mod(i,4),i); end loop;
  4    commit;
  5   end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:13.55
I check my session statistics:
SQL> select name,value from v$mystat join v$statname using(statistic#)
     where ( name like 'redo%' or name like '%undo%' or name like '%split%' ) and value>0;

NAME                             VALUE
--------------------------- ----------
redo entries                    203077
redo size                     36491420
undo change vector size       20672820
leaf node splits                   400
leaf node 90-10 splits             100

That's a lot of undo, and it is the major part of the redo. We don't expect that for an insert, because there is no previous value to store. Except for the index.
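A quick back-of-the-envelope check on the numbers above, per inserted row:

```python
# Session statistics from the row-by-row run above.
rows = 100 * 1000            # 100 loops of 1000 single-row inserts
undo = 20672820              # 'undo change vector size' in bytes
redo = 36491420              # 'redo size' in bytes

print(undo // rows, redo // rows)   # ~206 bytes of undo, ~364 bytes of redo per row
print(round(undo / redo, 2))        # undo is more than half of the redo
```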
Here is the tkprof result:
INSERT INTO INLISTS_INT_SINGLE VALUES (MOD(:B1 ,4),:B1 )

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0          0
Execute 100000     24.18      25.11          0       1918     314937     100000
Fetch        0      0.00       0.00          0          0          0          0
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total   100001     24.18      25.11          0       1918     314937     100000

And here is the raw trace dump, because I want to compare the bind-variable trace with the array insert later.
PARSING IN CURSOR #140716771910880 len=55 dep=1 uid=111 oct=2 lid=111 tim=372733000350 hv=593926914 ad='b7a458a8' sqlid='8s7ru50jqd6s2'
BINDS #140716771910880:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1206001 frm=00 csi=00 siz=48 off=0
  kxsbbbfp=7ffb2d341f28  bln=22  avl=02  flg=05
  value=1
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1206001 frm=00 csi=00 siz=0 off=24
  kxsbbbfp=7ffb2d341f40  bln=22  avl=02  flg=01
  value=1
BINDS #140716771910880:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1206001 frm=00 csi=00 siz=48 off=0
  kxsbbbfp=7ffb2d341f28  bln=22  avl=02  flg=05
  value=2
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1206001 frm=00 csi=00 siz=0 off=24
  kxsbbbfp=7ffb2d341f40  bln=22  avl=02  flg=01
  value=2
EXEC #140716771910880:c=47000,e=87289,p=1,cr=121,cu=20,mis=1,r=1,dep=1,og=1,plh=0,tim=372733087698
One optimization is to insert in bulk:
SQL> connect demo/demo
Connected.
SQL> declare
  2   type c_type is table of INLISTS_INT_SINGLE%rowtype;
  3   c c_type;
  4  begin
  5   select mod(rownum,4),rownum bulk collect into c from dual connect by 1e3>level;
  6   for n in 1..1e2 loop
  7    forall i in 1 .. c.count insert into INLISTS_INT_SINGLE values c(i);
  8    commit;
  9   end loop;
 10  end;
 11  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.83

This is much better, with 3x less undo generated:
SQL> select name,value from v$mystat join v$statname using(statistic#)
     where ( name like 'redo%' or name like '%undo%' or name like '%split%' ) and value>0;

NAME                             VALUE
-------------------------- -----------
redo entries                     31787
redo size                      9674492
undo change vector size        6892060
leaf node splits                   400
leaf node 90-10 splits             100

And this is what the application I was looking at was doing. That can be seen in tkprof, where the number of rows per execution shows the average array size:
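For comparison, the client-side analogue of FORALL is the array interface exposed by most drivers. A minimal sketch with Python's stdlib sqlite3 module (standing in for an Oracle driver): executemany() binds the whole array to a single statement execution instead of one round trip per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same two-column shape as the GTT in the post (no ON COMMIT semantics here).
conn.execute("CREATE TABLE inlists_int_single (incount INTEGER, innum INTEGER)")

# 1000 (mod(i,4), i) rows, mirroring the test case above.
data = [(i % 4, i) for i in range(1, 1001)]

# One execution for the whole array, like the PL/SQL FORALL.
conn.executemany("INSERT INTO inlists_int_single VALUES (?, ?)", data)
count = conn.execute("SELECT count(*) FROM inlists_int_single").fetchone()[0]
print(count)  # 1000
```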
INSERT INTO INLISTS_INT_SINGLE VALUES (:B1 ,:B2 )

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0          0
Execute    100      0.78       0.78          0       2009      68277     100000
Fetch        0      0.00       0.00          0          0          0          0
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total      101      0.79       0.78          0       2009      68277     100000

However, I expected to see it in the bind trace as well, but I can see only the first value of each array:
BINDS #140157072517608:
 Bind#0
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=13 fl2=204001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f78dc973530  bln=22  avl=02  flg=09
  value=1
 Bind#1
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=13 fl2=204001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f78dc973548  bln=22  avl=02  flg=09
  value=1
EXEC #140157072517608:c=7000,e=7144,p=0,cr=23,cu=682,mis=1,r=1000,dep=1,og=1,plh=0,tim=372772065407

The only difference is the internal flag oacflg=13 instead of oacflg=03, but unfortunately this is not documented.
OK, since 11g there is one more way to improve this: direct-path insert. The APPEND_VALUES hint is the equivalent of APPEND, except that it can be used with INSERT … VALUES instead of INSERT from SELECT. Of course, it makes sense only with array inserts – not when inserting only one row.
SQL> connect demo/demo
Connected.
SQL> declare
  2   type c_type is table of INLISTS_INT_SINGLE%rowtype;
  3   c c_type;
  4  begin
  5   select mod(rownum,4),rownum bulk collect into c from dual connect by 1e3>level;
  6   for n in 1..1e2 loop
  7    forall i in 1 .. c.count insert /*+ APPEND_VALUES */ into INLISTS_INT_SINGLE values c(i);
  8    commit;
  9   end loop;
 10  end;
 11  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:04.35

SQL> select name,value from v$mystat join v$statname using(statistic#)
     where ( name like 'redo%' or name like '%undo%' or name like '%split%' ) and value>0;

NAME                             VALUE
------------------------------ -------
redo entries                      6509
redo size                      4113096
undo change vector size        3005700
leaf node splits                   200
leaf node 90-10 splits             200
This is much better. And I don’t think we can expect any drawback when inserting into an empty GTT.
INSERT /*+ APPEND_VALUES */ INTO INLISTS_INT_SINGLE VALUES (:B1 ,:B2 )

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse      100      0.03       0.04          0          0          0          0
Execute    100      0.85       1.08          0       6908       7421     100000
Fetch        0      0.00       0.00          0          0          0          0
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total      200      0.89       1.13          0       6908       7421     100000

BINDS #139994276856992:
 Bind#0
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=13 fl2=204001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f52f52501b8  bln=22  avl=02  flg=09
  value=1
 Bind#1
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=13 fl2=204001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f52f52501d0  bln=22  avl=02  flg=09
  value=1
EXEC #139994276856992:c=541000,e=561632,p=29,cr=3145,cu=99,mis=1,r=1000,dep=1,og=1,plh=3581094869,tim=372773523105
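Putting the three runs side by side, a quick summary computed from the session statistics gathered above:

```python
# (redo size, undo change vector size) in bytes for 100,000 rows,
# taken from the v$mystat outputs of the three runs above.
runs = {
    "row-by-row":    (36491420, 20672820),
    "bulk":          ( 9674492,  6892060),
    "append_values": ( 4113096,  3005700),
}

base_redo, base_undo = runs["row-by-row"]
for name, (redo, undo) in runs.items():
    print(f"{name:14s} redo /{base_redo / redo:.1f}  undo /{base_undo / undo:.1f}")
# Bulk divides the undo by about 3, and APPEND_VALUES by almost 7.
```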
The first thing to do when we insert several rows is to insert them in bulk. The second is to use the APPEND_VALUES hint. There are other ways to avoid passing lots of values in an IN list, and that's for a future post.