Possible Drive Failure - Need to clone

Dhp4

Member
Jul 12, 2004
18
0
151
Hello,

Last night the server stopped responding and the NOC had to reboot it. While it was starting up, one of the techs noticed I/O errors on the console. I have the drive scheduled to be tested tonight and I, for one, am not looking forward to the downtime.

Searching through the forums I’ve found scattered information on effectively 'cloning' the hard disk (as it still does run) and am looking for clarification on some issues.

First, if I were to use dd to clone the drive, wouldn’t the drive that is running (the failing one) have to be mounted read-only to keep it from changing while it’s being cloned? How would this be done? I do have serial console access, so I can insert flags into GRUB when the server comes up.

Second, is there anything I have to do to the cloned drive once it’s done to allow it to boot?

Hopefully this will give me and other people in the same boat some clarification on this subject. I’ve heard dd can be a dangerous yet helpful tool, and I think I speak for everyone when I say I don’t want 120 GB of corrupted data sitting around.

Thanks,

Don
 

nyjimbo

Well-Known Member
Jan 25, 2003
1,136
1
168
New York
Well, I don't know if this will help but....

How many drives are in the machine, how many hold live data and how many are backup?

If you do a "dd" clone, do you know if the datacenter will do the drive swapping for you each time you clone? Will they do the clone for you?

Right now, are ALL services up and running, and can you tell if all partitions look normal and files are accessible?

If you don't have a backup, which is what it sounds like, I would really see if they can put in an empty drive, format it as one big backup partition and start "tar"ing or "pax"ing data, partition by partition, as much as you can.

The drive is probably dying and you should already be prepared for enough loss to require reloading parts of cPanel, system tools, etc., but I don't know what your situation is in regard to backups and your ISP/NOC policy on drive replacement.
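The "tar" approach suggested here could be sketched roughly as follows. This is only an illustration: the function name backup_tree and the paths in the usage note are made up, and it assumes the empty drive is already formatted and mounted somewhere outside the tree being archived.

```shell
#!/bin/sh
# Sketch: archive one directory tree (for example, one mounted partition)
# into a gzip-compressed tar file on the backup drive.
# backup_tree is a hypothetical helper name, not a standard tool.
backup_tree() {
    src_dir=$1    # tree to archive, e.g. a partition's mount point
    archive=$2    # output file; must live OUTSIDE src_dir
    # -C switches into src_dir first, so paths are stored relative to it
    # and can be restored under any directory later.
    tar -czf "$archive" -C "$src_dir" .
}
```

Usage would look something like `backup_tree /home /mnt/backup/home.tgz`, repeated once per partition, where /mnt/backup is an assumed mount point for the new drive. If tar hits unreadable files on a failing disk it reports them and exits non-zero, which is itself a useful hint about where the damage is.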
 

Dhp4

Thanks for the response,

I do have backups of all the accounts on a NAS server. I looked through my /var/log/messages but did not find much in the way of hard drive errors. Maybe I am looking in the wrong place.

As for whether the datacenter will do the clone for me, the answer is no. They claim it's for liability reasons, but I have a backup already.
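For anyone searching the logs the same way, here is a small sketch of one way to look. It assumes the kernel messages contain the text "I/O error", which is common but depends on the kernel and driver; the helper name scan_io_errors is made up for this example.

```shell
#!/bin/sh
# Sketch: print every line of a log file that mentions an I/O error.
# scan_io_errors is a hypothetical helper name.
scan_io_errors() {
    logfile=$1
    # -i ignores case; grep exits with status 1 when nothing matches.
    grep -i 'I/O error' "$logfile"
}
```

Typical usage would be `scan_io_errors /var/log/messages`. The same pattern can be applied to the kernel ring buffer with `dmesg | grep -i 'I/O error'`, since errors seen on the console at boot may not have reached the log file.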

From the research I have done and what the DC has told me, this is what is going to happen:

They will install the current drive as primary and the new one as secondary.

I will boot the server into single-user mode from the GRUB boot loader screen.

I'll then run "mount -o remount,ro /" to make the drive read-only, so the bits can't change while they are being copied.

Once the copy is done I'll have the DC swap the drives and hopefully be good to go.
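The copy step in the plan above might look roughly like this. It is a sketch under assumptions, not a tested procedure for this server: clone_disk is a made-up helper name, and the device names in the usage note are placeholders that must be checked (for example with `fdisk -l`) before anything is run, because swapping source and target destroys the data.

```shell
#!/bin/sh
# Sketch: raw copy of a source disk (or file) onto a target with dd.
# clone_disk is a hypothetical helper name.
clone_disk() {
    src=$1    # disk to read from (the failing one)
    dst=$2    # disk to write to (the new one); its contents are overwritten
    # conv=noerror : keep going after a read error instead of stopping
    # conv=sync    : pad short or failed reads with zeros up to bs, so data
    #                after a bad spot stays at the same offset on the target
    dd if="$src" of="$dst" bs=64k conv=noerror,sync
}
```

Usage would be along the lines of `clone_disk /dev/hda /dev/hdb`, with both names being assumptions. A smaller bs loses less data around each bad sector, since a failed read zero-fills a whole block, at the cost of a slower copy. The target needs to be at least as large as the source.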

My questions are: first, do you see anything wrong with doing that? Second, how will I know what (if anything) is corrupted so I can get in and fix it? Is there anything I can do now to find out? I ran a SMART test and it passed, so I'm not sure the data is bad; it could just be a bad drive cable.
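One rough way to look for damaged files ahead of time is to try reading every file and note which ones fail. The sketch below assumes that a file sitting on a bad sector produces a read error; it will not catch data that reads back cleanly but is wrong, which only a comparison against the backups would show. The helper name find_unreadable is made up.

```shell
#!/bin/sh
# Sketch: list regular files under a directory that cannot be read in full.
# find_unreadable is a hypothetical helper name.
find_unreadable() {
    top=$1
    # -xdev keeps find on one filesystem, so /proc and other mounts are skipped.
    find "$top" -xdev -type f | while IFS= read -r f; do
        # Read the whole file and discard it; print the name if the read fails.
        cat "$f" > /dev/null 2>&1 || printf '%s\n' "$f"
    done
}
```

Running `find_unreadable /home > /root/unreadable.txt` (paths are examples only) gives a list of files to restore from backup. Note that this reads the whole disk once, which puts extra load on a drive that may already be failing.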

Thanks for your input,

Don
 

mahinder

Well-Known Member
Jun 12, 2003
69
0
156
matrix
I am in the same situation. Can anyone tell me how to clone a failing hard drive? I am running CentOS 4.x with cPanel 11.