This was first published on https://blog.dbi-services.com/is-cdb-stable-now-after-one-patchset-and-two-psu (2015-02-16)
Republishing here for new followers. The content relates to the versions available at the publication date.
There has been the announcement that non-CDB is deprecated, and the reaction that CDB is not yet stable.
Well. Let’s talk about the major issue I’ve encountered. Multitenant is there for consolidation. What is the major requirement of consolidation? It’s availability. If you put all your databases on one server, managed by one instance, then you don’t expect a failure.
When 12c came out in 12.1.0.1 (and even earlier, as we were beta testers), David Hueber encountered an important issue: when a SYSTEM datafile was lost, we could not recover it without stopping the whole CDB. That’s bad, of course.
When Patchset 1 came out (and we were beta testers again), I checked whether that had been solved. I saw that they had introduced the undocumented “_enable_pdb_close_abort” parameter in order to allow a shutdown abort of a PDB. But that was worse: when I dropped a SYSTEM datafile, the whole CDB instance crashed immediately. I opened an SR, and Bug 19001390 ‘PDB system tablespace media failure causes the whole CDB to crash’ was created for that. All is documented in that blog post.
Now the bug status is: fixed in 12.1.0.2.1 (Oct 2014) Database Patch Set Update
Good. I’ve installed the latest PSU, which is 12.1.0.2.2 (Jan 2015), and I test the most basic recovery situation: loss of a non-system tablespace in one PDB.
Here it is:
RMAN> report schema;

Report of database schema for database with db_unique_name CDB

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    800      SYSTEM               YES     /u02/oradata/CDB/system01.dbf
3    770      SYSAUX               NO      /u02/oradata/CDB/sysaux01.dbf
4    270      UNDOTBS1             YES     /u02/oradata/CDB/undotbs01.dbf
5    250      PDB$SEED:SYSTEM      NO      /u02/oradata/CDB/pdbseed/system01.dbf
6    5        USERS                NO      /u02/oradata/CDB/users01.dbf
7    490      PDB$SEED:SYSAUX      NO      /u02/oradata/CDB/pdbseed/sysaux01.dbf
11   260      PDB2:SYSTEM          NO      /u02/oradata/CDB/PDB2/system01.dbf
12   520      PDB2:SYSAUX          NO      /u02/oradata/CDB/PDB2/sysaux01.dbf
13   5        PDB2:USERS           NO      /u02/oradata/CDB/PDB2/PDB2_users01.dbf
14   250      PDB1:SYSTEM          NO      /u02/oradata/CDB/PDB1/system01.dbf
15   520      PDB1:SYSAUX          NO      /u02/oradata/CDB/PDB1/sysaux01.dbf
16   5        PDB1:USERS           NO      /u02/oradata/CDB/PDB1/PDB1_users01.dbf

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1    60       TEMP                 32767       /u02/oradata/CDB/temp01.dbf
2    20       PDB$SEED:TEMP        32767       /u02/oradata/CDB/pdbseed/pdbseed_temp012015-02-06_07-04-28-AM.dbf
3    20       PDB1:TEMP            32767       /u02/oradata/CDB/PDB1/temp012015-02-06_07-04-28-AM.dbf
4    20       PDB2:TEMP            32767       /u02/oradata/CDB/PDB2/temp012015-02-06_07-04-28-AM.dbf
RMAN> host "rm -f /u02/oradata/CDB/PDB1/PDB1_users01.dbf";
host command complete

RMAN> alter system checkpoint;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00601: fatal error in recovery manager
RMAN-03004: fatal error during execution of command
ORA-01092: ORACLE instance terminated. Disconnection forced
RMAN-03002: failure of sql statement command at 02/19/2015 22:51:55
ORA-03113: end-of-file on communication channel
Process ID: 19135
Session ID: 357 Serial number: 41977
ORACLE error from target database:
ORA-03114: not connected to ORACLE
Ok, but I have the PSU:
$ /u01/app/oracle/product/12102EE/OPatch/opatch lspatches
19769480;Database Patch Set Update : 12.1.0.2.2 (19769480)
Here is the alert.log:
Completed: alter database open
2015-02-19 22:51:46.460000 +01:00
Shared IO Pool defaulting to 20MB. Trying to get it from Buffer Cache for process 19116.
===========================================================
Dumping current patch information
===========================================================
Patch Id: 19769480
Patch Description: Database Patch Set Update : 12.1.0.2.2 (19769480)
Patch Apply Time: 2015-02-19 22:14:05 GMT+01:00
Bugs Fixed: 14643995,16359751,16870214,17835294,18250893,18288842,18354830,18436647,18456643,18610915,18618122,18674024,18674047,18791688,18845653,18849537,18885870,18921743,18948177,18952989,18964939,18964978,18967382,18988834,18990693,19001359,19001390,19016730,19018206,19022470,19024808,19028800,19044962,19048007,19050649,19052488,19054077,19058490,19065556,19067244,19068610,19068970,19074147,19075256,19076343,19077215,19124589,19134173,19143550,19149990,19154375,19155797,19157754,19174430,19174521,19174942,19176223,19176326,19178851,19180770,19185876,19189317,19189525,19195895,19197175,19248799,19279273,19280225,19289642,19303936,19304354,19309466,19329654,19371175,19382851,19390567,19409212,19430401,19434529,19439759,19440586,19468347,19501299,19518079,19520602,19532017,19561643,19577410,19597439,19676905,19706965,19708632,19723336,19769480,20074391,20284155
===========================================================
2015-02-19 22:51:51.113000 +01:00
db_recovery_file_dest_size of 4560 MB is 18.72% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Setting Resource Manager plan SCHEDULER[0x4446]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager CDB plan DEFAULT_MAINTENANCE_PLAN via parameter
2015-02-19 22:51:54.892000 +01:00
Errors in file /u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_ckpt_19102.trc:
ORA-63999: data file suffered media failure
ORA-01116: error in opening database file 16
ORA-01110: data file 16: '/u02/oradata/CDB/PDB1/PDB1_users01.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_ckpt_19102.trc:
ORA-63999: data file suffered media failure
ORA-01116: error in opening database file 16
ORA-01110: data file 16: '/u02/oradata/CDB/PDB1/PDB1_users01.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
USER (ospid: 19102): terminating the instance due to error 63999
System state dump requested by (instance=1, osid=19102 (CKPT)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_diag_19090_20150219225154.trc
ORA-1092 : opitsk aborting process
2015-02-19 22:52:00.067000 +01:00
Instance terminated by USER, pid = 19102
You can see the bug number in ‘Bugs Fixed’, and yet the instance still terminates after a media failure on a PDB datafile. That’s bad news.
I’ve lost one datafile. At the first checkpoint, the CDB crashed. I’ll have to open an SR again. But for sure, consolidation through the multitenant architecture is not yet ready for sensitive production.
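Scanning that long ‘Bugs Fixed’ list by eye is error-prone, so here is a minimal sketch of a check against the patch dump in the alert.log. The function name bug_listed and the alert.log file name in the usage comment are my own assumptions, not something from Oracle. It also assumes the list runs from the ‘Bugs Fixed:’ line up to the next ‘=====’ separator line, as in the extract above. Keep in mind that it only tells you the bug number is listed, which, as this test shows, does not prove the crash is gone.

```shell
# Hypothetical helper (name is an assumption): returns 0 when bug number $1
# appears in the "Bugs Fixed:" list of the alert log file $2, else 1.
# Assumes the list ends at the next line made only of '=' characters.
bug_listed() {
  awk -v bug="$1" '
    /Bugs Fixed:/      { in_list = 1 }   # start of the patch bug list
    in_list && /^=+$/  { in_list = 0 }   # separator line closes the list
    in_list {
      # split the line on anything that is not a digit, compare whole numbers
      n = split($0, parts, /[^0-9]+/)
      for (i = 1; i <= n; i++) if (parts[i] == bug) found = 1
    }
    END { exit (found ? 0 : 1) }
  ' "$2"
}

# Usage (the alert log file name is an assumption based on the trace directory above):
#   bug_listed 19001390 /u01/app/oracle/diag/rdbms/cdb/CDB/trace/alert_CDB.log \
#     && echo "19001390 is listed as fixed"
```

Comparing whole numbers rather than using a plain substring grep avoids a false match on a longer bug number that merely contains the one you are looking for.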
That leaves me uncomfortable.
Can’t it be related to the “_datafile_write_errors_crash_instance” parameter?
Hi Raphaël, Thanks a lot. Yes, the issue doesn’t reproduce when “_datafile_write_errors_crash_instance”=false. I have to investigate that parameter, which is always bad in my opinion, and even worse in multitenant. As I’ve put the link to this blog post in the SR I’ve opened, we will probably see that suggested as a workaround. But then I wonder what patch 19001390 did. Regards, Franck.
Note that the parameter helps to work around the non-system datafile issue. But the instance still crashes when a system datafile is missing, which is the bug that was supposed to be fixed and is described here: http://www.dbi-services.com/index.php/blog/entry/pdb-media-failure-may-case-the-whole-cdb-to-crash
Hi Franck, has this bug finally been fixed?
Hi Maciej, There are parameters to change the behavior. I detailed all that in: UKOUG OracleScene http://viewer.zmags.com/publication/dd0ea62e#/dd0ea62e/18 Depending on your HA protection, you may change the default. Regards, Franck.