This was first published on https://blog.dbi-services.com/12c-extended-datatypes-better-than-clob (2014-05-23)
Republishing here for new followers. The content is related to the the versions available at the publication date
12c has introduced character strings that can can go above 4000 bytes. In the previous versions, in PL/SQL only we were allowed to have VARCHAR2 up to 32k. In SQL the VARCHAR2 datatype was limited to 4000 and CHAR was limited to 2000. That became a bit small especially when lot of applications needed to store unicode characters.
From 12c we can have SQL datatypes having up to 32k bytes for VARCHAR2, CHAR and RAW. It’s not allowed by default. We need to set max_string_size=extended and recompile views with utl32k. Nice improvement. But is it a good idea to use that new feature when we already have CLOB for large character strings ? The New Features documentation is clear about that: extended datatypes have been introduced to be compatible with other databases – not to replace existing features.
I will not go into the details how they are stored. Information about that is available elsewhere. See for example @ludodba recent blog post Where are Extended Data Types stored?. Extended datatypes are stored as chained rows if you just extend an existing table, or as a LOB if you defined them on a new table. Chained rows is clearly not a good option, so, given that you (re)create the tables, their storage is similar to CLOB.
But there is something that I don’t like with LOBS: they are fetched row by row. When you select a row you get only the handle. And you get the CLOB later when you access to it through the handle. Did you ever try to datapump a table with LOBs through network_link? Huge amount of roundtrips and very bad performance. It’s one rare case where doing expdp/impdp with a dumpfile is better. For very large objects, you will do several roundtrips anyway, so this is not an issue. But with character strings that are just a few kilobytes having them as LOB introduces an ineffective overhead.
Let’s compare the fetch behaviour with those new extended datatypes. For my demo, I’ll use a table with a clob column “C” and an extended varchar2 column “E”, and insert same data into both columns.
SQL> create table TEST ( C clob , E varchar2(9000) ); Table created. SQL> insert into TEST select lpad(rownum,9000,'x'),lpad(rownum,9000,'x') from dual connect by level <=10; 10 rows created.
Here is the autotrace when reading the CLOB from 10 rows:
SQL> set autotrace trace stat SQL> select C from TEST; 10 rows selected. Statistics ---------------------------------------------------------- 2 recursive calls 0 db block gets 27 consistent gets 20 physical reads 0 redo size 93936 bytes sent via SQL*Net to client 2722 bytes received via SQL*Net from client 22 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
For only 10 rows I’ve made 22 roundtrips. This is the problem with LOBs. Too many roundtrips. Well there is another problem that I’ll not show here, which is the fact that you can fetch the lob a long time after, even when the cursor is closed. It does consistent read so you have to set your undo_retention accordingly.
Now here is the same data from the extended varchar2 column:
SQL> select E from TEST; 10 rows selected. Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 56 consistent gets 0 physical reads 0 redo size 90501 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
Here I got the same volume (10 times 9000 characters) but this time I did only 2 roundtrips. Let’s go further and trace with sql_trace. LOB calls are instrumented since 11g so we can see them from the trace file:
PARSING IN CURSOR #139894737850360 len=18 dep=0 uid=103 oct=3 lid=103 tim=8952647276 hv=844696927 ad='77e1a518' sqlid='132sh6wt5k3az' select C from TEST END OF STMT PARSE #139894737850360:c=0,e=82,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=895260 EXEC #139894737850360:c=0,e=69,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=897416 FETCH #139894737850360:c=0,e=97,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=1357081020,tim=848200 LOBREAD: c=999,e=10893,p=2,cr=1,cu=0,tim=8952659696 FETCH #139894737850360:c=0,e=43,p=0,cr=1,cu=0,mis=0,r=1,dep=0,og=1,plh=1357081020,tim=860372 LOBREAD: c=1000,e=1614,p=2,cr=1,cu=0,tim=8952662447 FETCH #139894737850360:c=0,e=47,p=0,cr=1,cu=0,mis=0,r=1,dep=0,og=1,plh=1357081020,tim=863495 LOBREAD: c=2000,e=657,p=2,cr=1,cu=0,tim=8952664615 FETCH #139894737850360:c=0,e=40,p=0,cr=1,cu=0,mis=0,r=1,dep=0,og=1,plh=1357081020,tim=865575 LOBREAD: c=0,e=706,p=2,cr=1,cu=0,tim=8952666808 FETCH #139894737850360:c=0,e=37,p=0,cr=1,cu=0,mis=0,r=1,dep=0,og=1,plh=1357081020,tim=867552 LOBREAD: c=1000,e=949,p=2,cr=1,cu=0,tim=8952669193 FETCH #139894737850360:c=0,e=92,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=1357081020,tim=869825 LOBREAD: c=0,e=844,p=2,cr=1,cu=0,tim=8952671276 FETCH #139894737850360:c=0,e=68,p=0,cr=1,cu=0,mis=0,r=1,dep=0,og=1,plh=1357081020,tim=872168 LOBREAD: c=1000,e=756,p=2,cr=1,cu=0,tim=8952673521 FETCH #139894737850360:c=0,e=44,p=0,cr=1,cu=0,mis=0,r=1,dep=0,og=1,plh=1357081020,tim=874712 LOBREAD: c=999,e=914,p=2,cr=1,cu=0,tim=8952676180 FETCH #139894737850360:c=0,e=64,p=0,cr=1,cu=0,mis=0,r=1,dep=0,og=1,plh=1357081020,tim=877352 LOBREAD: c=0,e=931,p=2,cr=1,cu=0,tim=8952678875 FETCH #139894737850360:c=0,e=52,p=0,cr=1,cu=0,mis=0,r=1,dep=0,og=1,plh=1357081020,tim=879774 LOBREAD: c=1000,e=795,p=2,cr=1,cu=0,tim=8952681136 FETCH #139894737850360:c=1000,e=7,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=891850 STAT #139894737850360 id=1 cnt=10 pid=0 pos=1 obj=100085 op='TABLE ACCESS FULL TEST (cr=16 pr=0 pw=0 time=60 us cost=5 size=20980 card=10)' CLOSE #139894737850360:c=0,e=31,dep=0,type=0,tim=8952684289
And the sql_trace with the same data from the extended datatype.
PARSING IN CURSOR #139895028091224 len=18 dep=0 uid=103 oct=3 lid=103 tim=8954178349 hv=1829009117 ad='7b48ba08' sqlid='4kq232tqh8xqx' select E from TEST END OF STMT PARSE #139895028091224:c=0,e=64,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=895345 EXEC #139895028091224:c=0,e=36,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=895435 FETCH #139895028091224:c=1000,e=896,p=0,cr=8,cu=0,mis=0,r=1,dep=0,og=1,plh=1357081020,tim=899458 FETCH #139895028091224:c=4000,e=3660,p=0,cr=48,cu=0,mis=0,r=9,dep=0,og=1,plh=1357081020,tim=89535 STAT #139895028091224 id=1 cnt=10 pid=0 pos=1 obj=100085 op='TABLE ACCESS FULL TEST (cr=56 pr=0 pw=0 time=670 us cost=5 size=20010 card=10)' CLOSE #139895028091224:c=0,e=13,dep=0,type=0,tim=8954214996
So there is is one big advantage over CLOB: the column values are returned without additional roundtrips.
That would mean that if you have character strings that may be between 4k and 32k then extended datatypes can be a good option. It’s a new feature however, and designed for another goal (easy migration from other databases). So it’s something to test carefully and the tests must integrate all you infrastructure components (backups, exports, replication, etc).
Thanks! Intersting, that we already talked about it, but without public tests – http://orasql.org/2013/07/13/oracle-12c-extended-varchars/#comment-964086726
Thanks Sayan for the link. Especially for _scalar_type_lob_storage_threshold. So we can choose between lob storage or chained rows…
Hi, On oracle 12c compatible 12.0.0.0, changed to extended with sysdba privileges. I can create a table with varchar2(16000) as column now and insert a string > 4000 bytes; but only when connected as sysdba. When connected as a normal user rather than sysdba, I cannot play with varchar2 >4000 bytes, an error ORA-60019 is thrown. Can you explain why?
Hi Shadab, I think it’s not related with the user but with the default tablespace. SYS has SYSTEM as default tablespace, which is in MSSM. Securefiles can be stored only on ASSM tablespaces, and extended datatypes are stored as securefiles. Regards, Franck.
Did you try
select dbms_lob.substr(C, 9000, 1) from TEST;
with Oracle 12? At least in Oracle 11 with LOBs of 4000 characters (the oracle 11 limit the round trips are avoided as well:
Regards,Peter
Hi Peter, Very good remark, thanks. It’s the same in 12c. Note that the way I’ve inserted the rows with lpad(rownum,9000,’x’) works only in 12c with extended datatypes. Without, the output of lpad is transparently truncated to 4000 as it is a varchar2. Regards, Franck.