The Community Forums

Backups and broken RAID1

Discussion in 'Data Protection' started by kenashkov, May 1, 2011.

  1. kenashkov

    kenashkov Active Member

    Joined:
    Nov 23, 2006
    Messages:
    33
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    Sofia, Bulgaria
    cPanel Access Level:
    Root Administrator
    Hello everyone,

    One of our servers has two drives, with each partition mirrored in RAID1. Whenever the scheduled backup runs over the weekend, the array breaks. It then starts rebuilding while the backup is still running, and the two processes together kill the server. I don't have proof, but the backup seems to cause the break, since the array is fine during normal operation. What could be the reason for that?

    Vesko Kenashkov
     
  2. cPanelTristan

    cPanelTristan Quality Assurance Analyst
    Staff Member

    Joined:
    Oct 2, 2010
    Messages:
    7,623
    Likes Received:
    21
    Trophy Points:
    38
    Location:
    somewhere over the rainbow
    cPanel Access Level:
    Root Administrator
    Hello Vesko,

    What are you using to perform the backup operations? Is this cpbackup or something else being used?

    Next, do you have any logs that show us the exact errors occurring during this time?

    Thanks!
     
  3. kenashkov

    kenashkov Active Member

    Joined:
    Nov 23, 2006
    Messages:
    33
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    Sofia, Bulgaria
    cPanel Access Level:
    Root Administrator
    Hi Tristan,

    I'm sorry I didn't specify this explicitly: we are using the cpbackup functionality over FTP. I was just wondering whether the load created by tar/gzip could have something to do with the RAID1 breaking. /var/log/messages contains no useful messages about why the array broke (just the sync messages).
    Is there a setting to change the priority of the backup process, or do I have to change it manually each time with nice?

    Vesko Kenashkov
     
  4. cPanelTristan

    cPanelTristan Quality Assurance Analyst
    Staff Member

    Joined:
    Oct 2, 2010
    Messages:
    7,623
    Likes Received:
    21
    Trophy Points:
    38
    Location:
    somewhere over the rainbow
    cPanel Access Level:
    Root Administrator
    Hi Vesko,

    I'm not certain whether the load is causing the RAID1 to break, but it seems unlikely to me that high load alone would keep causing that issue.

    As for changing the priority of the backup, there isn't an option to renice the process or set its nice value, but you might want to consider submitting a feature request to set a nice value for the cpbackup process in the WHM > Backup > Configure Backup area. I could imagine that a lower-priority nice value would be beneficial to users who have load-based issues during the backup. The location for feature requests is the following:

    Feature Requests for cPanel and WHM

    Thanks.
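
Since WHM offers no priority setting, one workaround is to start the backup by hand through a small wrapper that lowers its CPU priority. The sketch below is only an illustration: `run_niced` is a hypothetical helper, and the cpbackup path in the usage comment is an assumption that should be checked on the server. Note that nice only affects CPU scheduling; disk I/O priority is handled separately (for example with `ionice`).

```python
import os
import subprocess


def run_niced(cmd, increment=10):
    """Run cmd with its niceness raised by `increment` (lower CPU priority).

    `run_niced` is a hypothetical helper name, not part of cPanel.
    POSIX only: preexec_fn runs in the child just before the command starts.
    """
    return subprocess.run(
        cmd,
        preexec_fn=lambda: os.nice(increment),  # raise niceness in the child only
        capture_output=True,
        text=True,
        check=False,
    )


# Possible usage (the script path is an assumption, verify it first):
# run_niced(["/scripts/cpbackup"], increment=19)
```

The parent process keeps its own priority; only the child and anything it spawns (tar, gzip) inherit the higher nice value.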
     
  5. JerrySmith

    JerrySmith Active Member

    Joined:
    Apr 21, 2011
    Messages:
    35
    Likes Received:
    0
    Trophy Points:
    6
    Hello,

    I am not a RAID expert, but I thought I would share my thoughts with you.

    I have never heard of a backup process or high load causing a RAID to fail. It is very possible it would cause a performance bottleneck but should not cause an actual failure of the RAID.

    Are you using hardware or software RAID?

    If you are using hardware RAID, it is possible that the controller card is failing and that heavy I/O is what makes the array fail. I would have your datacenter replace the card to see if this resolves the issue.

    Another potential cause would be one or more of the drives in the array failing. I would check smartctl to see if there are any errors on the drives; your datacenter can assist with this.
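
As a rough illustration of the smartctl check, the sketch below runs `smartctl -H` on a device and extracts the overall health verdict. The helper names `smart_health` and `check_drive` are hypothetical, the regex assumes the ATA-style output line ("SMART overall-health self-assessment test result: ..."), and SCSI/SAS drives word that line differently. A PASSED verdict does not rule out a failing drive, so this is a first look only.

```python
import re
import subprocess

# ATA drives report the verdict on a line like:
#   SMART overall-health self-assessment test result: PASSED
HEALTH_RE = re.compile(
    r"SMART overall-health self-assessment test result:\s*(\S+)"
)


def smart_health(output):
    """Return the verdict word from `smartctl -H` output, or None if absent."""
    match = HEALTH_RE.search(output)
    return match.group(1) if match else None


def check_drive(device):
    """Run `smartctl -H` on a device (usually needs root) and parse the result."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes to flag problems
    )
    return smart_health(result.stdout)


# Possible usage (device names are assumptions for a two-drive server):
# for dev in ("/dev/sda", "/dev/sdb"):
#     print(dev, check_drive(dev))
```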
     
  6. kenashkov

    kenashkov Active Member

    Joined:
    Nov 23, 2006
    Messages:
    33
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    Sofia, Bulgaria
    cPanel Access Level:
    Root Administrator
    Hello,

    The RAID is software and no device is failing. As it only happened while a backup was running, we decided there could be some connection between the two. We will be looking elsewhere for the problem...
    Thank you for the comments.

    Vesko Kenashkov
     
  7. nobodyk

    nobodyk Well-Known Member

    Joined:
    Aug 1, 2010
    Messages:
    90
    Likes Received:
    0
    Trophy Points:
    6
    This is actually common with hard drives that are not RAID-qualified. What type of hard drives are you using?
     
  8. JerrySmith

    JerrySmith Active Member

    Joined:
    Apr 21, 2011
    Messages:
    35
    Likes Received:
    0
    Trophy Points:
    6
    Hello,

    Thanks for bringing that to my attention, nobodyk.

    In my earlier reply, I forgot to mention that there is a big difference between consumer and RAID-class HDDs.

    This could very well be causing his issue with software RAID.
     
  9. kenashkov

    kenashkov Active Member

    Joined:
    Nov 23, 2006
    Messages:
    33
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    Sofia, Bulgaria
    cPanel Access Level:
    Root Administrator
    Today is Sunday, the backup is running again, and a resync of the RAID1 volumes has just started.
    So it appears that high I/O load can cause the RAID1 to resync... I simply can't find any other explanation, as these two events always coincide.

    I'm not sure the issue is related to the hard drives, as this is software RAID and I imagine it should be hardware agnostic. Maybe there is an issue with the RAID implementation in the kernel.

    I don't think the hard drives are anything special: bog-standard SATA 2 drives from WD.
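
To tie the resync to the backup window more precisely than the sync messages in /var/log/messages allow, the state in /proc/mdstat could be sampled periodically (for example from cron) and logged with a timestamp. The sketch below is a minimal parser under stated assumptions: `parse_mdstat` is a hypothetical helper, array names are assumed to look like `md0`, and only the common `[2/2] [UU]` status and percentage progress lines are handled.

```python
import re

# Status such as "[2/1] [U_]": configured members / active members, then a map
# where "_" marks a missing member.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# md reports "resync", "recovery" or "check" with a percentage while it works.
PROGRESS_RE = re.compile(r"(resync|recovery|check)\s*=\s*([\d.]+)%")


def parse_mdstat(text):
    """Return {array: {"degraded": bool, "activity": (kind, percent) or None}}."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            arrays[current] = {"degraded": False, "activity": None}
            continue
        if current is None:
            continue
        status = STATUS_RE.search(line)
        if status:
            total, active = int(status.group(1)), int(status.group(2))
            arrays[current]["degraded"] = active < total or "_" in status.group(3)
        progress = PROGRESS_RE.search(line)
        if progress:
            arrays[current]["activity"] = (progress.group(1), float(progress.group(2)))
    return arrays


# Possible usage on the server itself:
# with open("/proc/mdstat") as fh:
#     print(parse_mdstat(fh.read()))
```

The kind of activity is worth recording too: a "recovery" on a degraded array suggests a member was dropped and re-added, while a "resync" with all members present points elsewhere.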
     
  10. kenashkov

    kenashkov Active Member

    Joined:
    Nov 23, 2006
    Messages:
    33
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    Sofia, Bulgaria
    cPanel Access Level:
    Root Administrator
    On further investigation, it seems that the server suffers from a hardware problem, most probably a memory error.
     