When a mirror is not a mirror

by Volker Weber

Turn back the clock two years and imagine you want to run a web server for your small company, hosted in your ISP's datacenter. You go and buy a DELL Poweredge pizza box, complete with two IDE drives. You install Windows on disk 1, make two partitions for system and data and then you tell Windows to mirror disk 1 to disk 2. Just in case one of the disks fails later on. What you don't know yet is that IBM IC35L020AVER07 disks are not what you want in your server.

Fast forward to Oct 23, 2005. Disk 1 dies. Server crashes. You buy two new disks, just because the other disk is also two years old now. Disks are cheap and server uptime is more important than saving a few bucks. What is the plan? You remove disk 1 and boot from disk 2. This is a mirror of disk 1, right?

Wrong.

Disk 1 has an MBR, one DELL service partition and two partitions you built: system and data. However, you find out that disk 2 only has three partitions, all of them "dynamic volumes". There is no boot record, so the disk won't boot at all. What Microsoft is not telling you:

Keep in mind that you cannot boot to a drive that contains only dynamic volumes. If you mirror your system drive, be sure to partition the drive first and then delete the partition and mirror it. This creates a bootable partition in the MBR.

Nobody told you this. Windows conveniently built a mirror that won't work. Neat, eh?

The trouble does not stop here. You create a BartPE disk, a live Windows Boot CD, admin's best friend. But you won't be able to see disk 2 when disk 1 is not present. You first have to use Diskpart to break the mirror. This ain't easy since Diskpart insists on seeing the other disk. Would you dare to break the mirror before having saved the most valuable data from the disk?

Well, normally you would just ignore Windows, whip out your handy Linux boot disk and mount the NTFS drives to get to the data. Not with dynamic volumes though:

If a partition table entry of type 0x42 is present in the legacy partition table, then W2K ignores the legacy partition table and uses a proprietary partition table and a proprietary partitioning scheme (LDM or DDM). As the Microsoft KnowledgeBase writes: Pure dynamic disks (those not containing any hard-linked partitions) have only a single partition table entry (type 42) to define the entire disk. Dynamic disks store their volume configuration in a database located in a 1-MB private region at the end of each dynamic disk.
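To make the quoted description concrete, here is a minimal sketch in Python of how one might check whether a disk's first sector carries a type 0x42 entry. It assumes the classic MBR layout (four 16-byte entries starting at byte 446, boot signature 0x55AA at bytes 510-511) and works on a raw 512-byte sector you have already read from a disk or an image file. The function names are made up for illustration, and it only looks at the legacy partition table; it does not read the LDM database at the end of the disk.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # legacy partition table starts here
ENTRY_SIZE = 16           # four entries of 16 bytes each
BOOT_SIGNATURE = b"\x55\xaa"
DYNAMIC_DISK_TYPE = 0x42  # partition type used by Windows dynamic disks


def parse_partition_table(sector: bytes):
    """Return the non-empty entries of a legacy MBR partition table."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least one full 512-byte sector")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("no 0x55AA boot signature, not an MBR")

    entries = []
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        # status (1), CHS start (3, skipped), type (1), CHS end (3, skipped),
        # first LBA (4), sector count (4), all little-endian
        status, ptype, first_lba, sectors = struct.unpack_from(
            "<B3xB3xII", sector, offset
        )
        if ptype == 0:
            continue  # unused slot
        entries.append(
            {
                "index": index,
                "bootable": status == 0x80,
                "type": ptype,
                "first_lba": first_lba,
                "sectors": sectors,
            }
        )
    return entries


def is_dynamic_disk(sector: bytes) -> bool:
    """True if any legacy table entry has the dynamic disk type 0x42."""
    return any(e["type"] == DYNAMIC_DISK_TYPE for e in parse_partition_table(sector))
```

You would feed it the first 512 bytes of a disk image, for example one made with dd (the file name is hypothetical): open("disk2.img", "rb").read(512). If is_dynamic_disk returns True, a tool that only understands the legacy table will not show you the volumes you expect.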

Tune in tomorrow when the story continues.

Comments

"...and then you tell Windows to mirror disk 1 to disk 2."
Bad idea. You want to rely on hardware (a RAID controller), not on Windows to do that for you. Leave the OS (*any* OS) out of that equation.
The other thing you'll want to make sure of next time you set something up: After you set up a mirror, disconnect a drive and see what happens. Kinda like backup software: Don't just make a backup, also make sure you can actually do successful restores.

Thomas Gumz, 2005-10-25

Very sound advice. But too late in this case.

Volker Weber, 2005-10-25

Thomas, I completely agree with the sentiment re: testing your mirror setup.

However, your comment re: (*any* OS) is not right. Take a serious OS, say AIX or Solaris or OS400, and they've been doing OS mirroring successfully for many years, since hardware based mirroring/RAID was expensive, slow, possibly only supported one adapter per loop/chain and was a pig to manage.

Take AIX for example, we have installed over 500 AIX HA/CMP clusters over the years, most split over two sites, and all use AIX LVM mirroring. We have not once had an issue where a disk failed and we could not continue to boot/run off the other disk(s). In fact, we migrated an entire datacenter in this manner - add a third mirror in site A (so we now have two in site A and one in site B), break mirror in site B (the one being migrated), add mirror in site C (the new one), break original site A mirror. Thus a whole cluster moved without once losing protection. Try doing that with some proprietary RAID controller...

It is Windows that lets the side down again - bolting disk management onto a codebase that did not support it from day 1, and as is their wont, not doing it properly. We in the IT business have to stop using other technology to cover up for MSFT's inadequacies...

Stuart

Stuart McIntyre, 2005-10-25

Is there such a thing as an MS "Free Acetylsalicylic Acid" program that comes with your server license?

Philipp Sury, 2005-10-25

If I were in your position I would try:

1. Making a copy with dd and keeping the original disk stored in a safe place. Doing the tests only with the copy.

2. Setting up a new system with a new mirror set based on dynamic volumes and CREATE THE BOOTFLOPPY as recommended by MS.

3. Breaking the test-mirror and trying to recover the data from disk 2 by using the bootfloppy.

4. If this works, swapping the test drives with the defective mirror disk and repeating step 3.

5. If this fails doing the test again with server 2003 because maybe the tools are better or more fault tolerant.

Good Luck.

Robert Krauss, 2005-10-25

On real servers, running an OS designed by people with brains, you would turn on geom support, mirror disk 1 to disk 2 (or all slices on disk 1 to disk 2) and run. If one of your disks breaks while the OS is running (very likely after a few years of operation), you will take notice from the log and the mirror will run degraded. If you choose to restart the system, you can take out the broken disk and decide whether you want to operate the OS on the remaining disk in geom mode (which would allow you to add a new disk and rebuild it in the background) or as a "native" single disk (transparently, without geom support) for maintenance purposes. It's completely your choice, whatever makes more sense in this moment. Good solutions are transparent and can be configured forwards and backwards, according to your needs.

Damn, Windows is so 1980 when it comes to its boot-process or, say, filesystem and storage media handling. But, yeah, your servers boot in a somewhat ridiculous process (much like DOS did, to be honest) and have a nice graphical interface that makes you more productive (according to MS' marketing blah blah blah)…

In daily operations of mission critical devices, marketing arguments don't count.

Karsten W. Rohrbach, 2005-10-25


Man, that really sucks. (STBY as a friend of mine used to say = "sucks to be you")

I assume you read this already:

http://support.microsoft.com/?kbid=113977

> Once Windows NT is successfully booted from the fault tolerant boot floppy, the files and
> directories on the mirror drive are available for normal disk operations. Even if the disk containing
> the primary partition is lost, no differences are apparent to you (unless you study the status
> information displayed in Disk Administrator). But fault tolerance no longer exists: if the remaining
> partition is lost, all its data is lost also. So it is safer to break the current mirror set, configure
> a new boot and system partition, and create a new mirror.

Sounds like loads of fun. Good luck!

Eric Anderson, 2005-10-25

Dynamic Volumes have had me for a spin already, I know how you feel.

Fortunately there was no important data lost, so I never bothered. But for recoveries I normally use RTOOLS (www.r-tt.com). As I now see, part of the R-Studio feature list is:

"Recognition and parsing Dynamic (Windows 2000/XP/2003), Basic and BSD (UNIX) partitions layout schema."

Steffen Gutermann, 2005-10-26

After sitting in a datacenter until 2am trying to fix this exact same problem, I know the answer.

Give up, go home & buy a server with hardware RAID.

I tried this for a further 2 DAYS trying to get it to work. It doesn't. It's Shite. It's Microsoft.

Warren Elsmore, 2005-10-26

Ghost your dynamic disk onto a new disk, to a basic partition.

Put new disk in machine, keep original disk in a safe place.

Boot off Windows CD to recovery console, run fixmbr.

Presto.

Mark Andrews, 2007-09-15
