Experiences of a Freelance DBA: 2009

Summary
The scenario here is that you have a backup of a database from a week ago. It's an Oracle 10.2 DB running with an ASM. You wish to restore this database to different machine from its original location (though this is not really the important issue). Armed with a week's worth of archivelog files, you wish to restore from your old backup and then recover the database through all the log files you have available.

Unfortunately, during the week you had to add some datafiles to the database. You then issue the correct restore commands (to get the DB image unpacked from its backupsets), and then you issue your recover until command. Unfortunately you are greeted with the good old RMAN-06094 message.

RMAN-03002: failure of recover command at 09/05/2009 08:28:22

RMAN-06094: datafile 177 must be restored

It complains about a datafile that needs to be restored. But, you know that this file is one of a few which were created during the week. They are not in the backupset, but, Oracle is insisting that this file should be restored before it can continue with the DB recovery.

This blog is an account of my experience of such an issue and shows the complete case history from restoring the controlfile through to recovering the DB up to the point I needed.

I hope it helps you if you are faced with a similar issue.

RMAN> restore controlfile from '/mnt/adaptBU/adaptBU/29AUG09/c-3688516520-20090904-01';
Starting restore at 05-SEP-2009 05:04:54
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=321 devtype=DISK

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:08
output filename=+DG_DATA/ad11prd/controlfile/current1
output filename=+DG_DATA/ad11prd/controlfile/current2
Finished restore at 05-SEP-2009 05:05:06

-- No RMAN output shown for this set of commands.
RMAN> alter database mount;
RMAN>crosscheck backupset;
RMAN>delete expired backupset;
RMAN>list backup summary;

RMAN>catalog start with '/mnt/adaptBU/adaptBU/29AUG09/';

RMAN> list backup summary device type disk;

List of Backups
===============
Key TY LV S Device Type Completion Time #Pieces #Copies Compressed Tag
------- -- -- - ----------- -------------------- ------- ------- ---------- ---
9867 B F A DISK 29-AUG-2009 17:43:38 1 1 YES HRMAXPD225_AD11PRD1_HOT_290809
9868 B F A DISK 29-AUG-2009 17:44:01 1 1 YES HRMAXPD225_AD11PRD1_HOT_290809
9869 B F A DISK 29-AUG-2009 17:46:45 1 1 YES HRMAXPD225_AD11PRD1_HOT_290809
9870 B F A DISK 29-AUG-2009 17:46:51 1 1 YES HRMAXPD225_AD11PRD1_HOT_290809
9872 B A A DISK 29-AUG-2009 17:47:13 1 1 YES TAG20090829T174710
9873 B A A DISK 29-AUG-2009 17:47:15 1 1 YES TAG20090829T174710
9874 B A A DISK 29-AUG-2009 17:47:17 1 1 YES TAG20090829T174710
9875 B A A DISK 29-AUG-2009 17:47:18 1 1 YES TAG20090829T174710
9876 B F A DISK 29-AUG-2009 17:47:19 1 1 NO TAG20090829T174719
10117 B A A DISK 04-SEP-2009 18:05:55 1 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10118 B A A DISK 04-SEP-2009 18:07:33 2 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10119 B A A DISK 04-SEP-2009 18:09:25 2 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10120 B A A DISK 04-SEP-2009 18:09:56 2 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10121 B A A DISK 04-SEP-2009 18:13:08 2 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10122 B A A DISK 04-SEP-2009 18:15:17 2 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10123 B A A DISK 04-SEP-2009 18:16:54 2 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10124 B A A DISK 04-SEP-2009 18:19:59 1 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10125 B A A DISK 04-SEP-2009 18:21:06 2 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10126 B A A DISK 04-SEP-2009 18:22:19 2 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10127 B A A DISK 04-SEP-2009 18:23:48 1 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10128 B A A DISK 04-SEP-2009 18:28:08 1 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10129 B A A DISK 04-SEP-2009 18:28:17 2 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10130 B A A DISK 04-SEP-2009 18:28:35 1 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10131 B A A DISK 04-SEP-2009 18:29:17 1 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10132 B A A DISK 04-SEP-2009 18:30:02 1 1 YES HRMAXPD225_AD11PRD1_HOT_040909
10134 B F A DISK 04-SEP-2009 18:30:14 1 1 YES TAG20090904T183013
10135 B F A DISK 04-SEP-2009 18:30:15 1 1 NO TAG20090904T183015

-- Off we go, we've cataloged our backup set (which we brought over from the production system),
-- now lets get the Datafiles out of the backupsets...

RMAN> restore database until time "to_date('29-AUG-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";

-- We are about to offload a whole load of archive log files into our Flash Recovery Area.
-- BUMP UP THE db_recovery_file_dest_size to hold all uncompressed archivelogs...
RMAN> alter system set db_recovery_file_dest_size=100G;

RMAN> recover database until time "to_date('01-SEP-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";

--Just a check of the ASM +DG_FLASH/AD11PRD/ARCHIVELOG area when the restoration moved from the unpack stage to the apply stage.

ARCHIVELOG UNPROT COARSE SEP 05 07:00:00 Y 512 81349 41650688 41943040 thread_3_seq_1661.3975.696757821

ARCHIVELOG UNPROT COARSE SEP 05 07:00:00 Y 512 190888 97734656 99614720 thread_3_seq_1662.3937.696757699

ARCHIVELOG UNPROT COARSE SEP 05 07:00:00 Y 512 190888 97734656 99614720 thread_3_seq_1663.3938.696757701

ARCHIVELOG UNPROT COARSE SEP 05 07:00:00 Y 512 190888 97734656 99614720 thread_3_seq_1664.3939.696757703

ARCHIVELOG UNPROT COARSE SEP 05 07:00:00 Y 512 180813 92576256 94371840 thread_3_seq_1665.3957.696757769

ARCHIVELOG UNPROT COARSE SEP 05 07:00:00 Y 512 180662 92498944 94371840 thread_3_seq_1666.3958.696757769

ARCHIVELOG UNPROT COARSE SEP 05 07:00:00 Y 512 50744 25980928 26214400 thread_3_seq_1667.3976.696757829

ARCHIVELOG UNPROT COARSE SEP 05 07:00:00 Y 512 2 1024 1048576 thread_3_seq_1668.3980.696757833

ARCHIVELOG UNPROT COARSE SEP 05 07:00:00 Y 512 8933 4573696 5242880 thread_3_seq_1669.3979.696757833

ARCHIVELOG UNPROT COARSE SEP 05 07:00:00 Y 512 190917 97749504 99614720 thread_3_seq_1670.3916.696757627

.....

ASMCMD> Sat Sep 5 08:19:02 2009

-- Now at 08:27am

RMAN> recover database until time "to_date('02-SEP-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";

-- Immediately, this came back with the dreaded error as seen earlier...

Starting recover at 05-SEP-2009 08:28:09
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 09/05/2009 08:28:22
RMAN-06094: datafile 177 must be restored

--At this time, from SQL*Plus...
-- v$datafile where file# > 176
177 1357906524 +DG_DATA/ad11prd/datafile/webint_data.481.696515055
178 1390246309 +DG_DATA/ad11prd/datafile/adapt_data.482.696614533
179 1390251108 +DG_DATA/ad11prd/datafile/adapt_data.483.696614539
180 1390253412 +DG_DATA/ad11prd/datafile/adapt_data.484.696614545
181 1390257349 +DG_DATA/ad11prd/datafile/adapt_data.485.696614551
182 1390261826 +DG_DATA/ad11prd/datafile/adapt_data.486.696614557

--And also for webint's datafiles.... (file 177 is named as a webint file (tablespace name)).
-- v$datafile name like '%webint%'
+DG_DATA/ad11prd/datafile/webint_data.537.696750009
+DG_DATA/ad11prd/datafile/webint_data.825.696750057
+DG_DATA/ad11prd/datafile/webint_data.471.696751555
+DG_DATA/ad11prd/datafile/webint_data.831.696753039
+DG_DATA/ad11prd/datafile/webint_data.454.696753017
+DG_DATA/ad11prd/datafile/webint_data.481.696515055

-- Now from checking the alert logs, I determined that the file 177 was added on 02-Sep-2009 at 12:24 (ish)...
-- So I wanted to see if I could recover to a time just before this event. - I could...
RMAN> recover database until time "to_date('02-SEP-2009 12:00:00','DD-MON-YYYY HH24:MI:SS')";

Done at: 09:08

-- Move up to the minute before the file addition...
RMAN> recover database until time "to_date('02-SEP-2009 12:24:00','DD-MON-YYYY HH24:MI:SS')";
Complete: at 09:19

-- Now try to hop over the minute and boom...
RMAN> recover database until time "to_date('02-SEP-2009 12:25:00','DD-MON-YYYY HH24:MI:SS')";

Same error about file# 177 as before.

-- Now issue that command that you know couldn't actually succeed as you know that the file is not available in any backupsets...
RMAN> restore datafile 177 until time "to_date('02-SEP-2009 12:25:00','DD-MON-YYYY HH24:MI:SS')";

--RMAN responds with...
Starting restore at 05-SEP-2009 09:28:53
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4

creating datafile fno=177 name=+DG_DATA/ad11prd/datafile/webint_data.481.696515055
restore not done; all files readonly, offline, or already restored
Finished restore at 05-SEP-2009 09:29:03

-- This line says that Oracle MADE the file at this time. Before, it had not been available, but, now it's been created - this is a good sign, even though the message

-- carries a bit of a negative message (the bit about the restore not being done) - never mind.

RMAN> recover database until time "to_date('02-SEP-2009 12:25:00','DD-MON-YYYY HH24:MI:SS')";

-- Now when the same command (as was executed earlier) is re-submitted, Oracle prgresses to recover into the newly created file #177.

Starting recover at 05-SEP-2009 09:29:20
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4

starting media recovery

archive log thread 1 sequence 12817 is already on disk as file +DG_FLASH/ad11prd/archivelog/2009_09_05/thread_1_seq_12817.2964.696762513

archive log thread 2 sequence 9024 is already on disk as file +DG_FLASH/ad11prd/archivelog/2009_09_05/thread_2_seq_9024.3346.696762813

archive log thread 3 sequence 1772 is already on disk as file +DG_FLASH/ad11prd/archivelog/2009_09_05/thread_3_seq_1772.3343.696762749

archive log filename=+DG_FLASH/ad11prd/archivelog/2009_09_05/thread_1_seq_12817.2964.696762513 thread=1 sequence=12817
archive log filename=+DG_FLASH/ad11prd/archivelog/2009_09_05/thread_2_seq_9024.3346.696762813 thread=2 sequence=9024
archive log filename=+DG_FLASH/ad11prd/archivelog/2009_09_05/thread_3_seq_1772.3343.696762749 thread=3 sequence=1772
media recovery complete, elapsed time: 00:00:02
Finished recover at 05-SEP-2009 09:29:43

-- Wow, now for the first time it's actually hopped over the missing file...
-- Time to push on with the recovery...
RMAN> recover database until time "to_date('02-SEP-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";

-- Now it progresses...
-- Let's move it to the next day...
09:59

RMAN> recover database until time "to_date('03-SEP-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";

Starting recover at 05-SEP-2009 10:02:15
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 09/05/2009 10:02:29
RMAN-06094: datafile 178 must be restored
-- Again we receive the old 06094 message for the next datafile that was added.... (But at least we know what to do this time)...

RMAN> restore datafile 178 until time "to_date('03-SEP-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";

Starting restore at 05-SEP-2009 10:03:01
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4

creating datafile fno=178 name=+DG_DATA/ad11prd/datafile/adapt_data.482.696614533
restore not done; all files readonly, offline, or already restored
Finished restore at 05-SEP-2009 10:03:11

-- Good, a negative looking message but at least it says that it made the dreaded missing file....

RMAN> recover database until time "to_date('03-SEP-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";

-- Re-run our next target attempt...

Starting recover at 05-SEP-2009 10:03:19
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 09/05/2009 10:03:33
RMAN-06094: datafile 179 must be restored

-- Now we get an error for the next file... (so deal with this too)..
RMAN> restore datafile 179 until time "to_date('03-SEP-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";

Starting restore at 05-SEP-2009 10:04:48
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4

creating datafile fno=179 name=+DG_DATA/ad11prd/datafile/adapt_data.483.696614539
restore not done; all files readonly, offline, or already restored
Finished restore at 05-SEP-2009 10:04:58

-- And so on through the remaining files that need to be created...
RMAN> restore datafile 180 until time "to_date('03-SEP-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";
RMAN> restore datafile 181 until time "to_date('03-SEP-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";
RMAN> restore datafile 182 until time "to_date('03-SEP-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";

-- Now, finally, we are ready to have an uninterrupted run to our next recovery target.
RMAN> recover database until time "to_date('03-SEP-2009 18:00:00','DD-MON-YYYY HH24:MI:SS')";

--10:54 complete.

-- Finally, issue the recovery command that will get us all the way to the endpoint.
-- Obviously, this could have been issued much earlier if we had been better prepared for our troubles.

RMAN> recover database until time "to_date('04-SEP-2009 17:58:00','DD-MON-YYYY HH24:MI:SS')";

-- Eventually, the recovery managed to get to the endpoint withoug further issues.

Experiences of a Freelance DBA

Saturday, September 5, 2009

RMAN-06094 While recovering through new datafile

Tuesday, August 4, 2009

Monotonicity in Contiguous Perfstat Snapshots

About Me

Followers

Blog Archive