Problem with aborted journals / Read-only filesystem

aag73

Member
Mar 28, 2008
Mexico
We have 8 RHEL 5 / cPanel servers with ThePlanet. In the past 12-15 months we have had 7 occurrences of aborted journals that cause the /home partition to go read-only. This basically renders the server useless: we have to request a manual fsck and usually end up with around 1-2 hours of downtime.

The first couple of times it happened, I thought they were random occurrences of data corruption. They usually happen after the deletion of a "nonexistent" file:

linux kernel: EXT3-fs warning (device sda8): ext3_unlink: Deleting nonexistent file (2097887), 0

After that, the journal aborts:

linux kernel: EXT3-fs error (device sda8): ext3_lookup: unlinked inode 2097370 in dir #2097369
linux kernel: EXT3-fs error (device sda8): ext3_journal_start_sb: Detected aborted journal

The partition then goes read-only.
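
For anyone comparing notes across several servers, the three messages quoted above can be matched mechanically. This is only a minimal sketch in Python, assuming the kernel log lines have exactly the shape shown in this post; the function name `scan` and the layout of the report are made up for illustration.

```python
import re

# Patterns modeled on the three kernel messages quoted in this post.
# Other kernel versions may word them differently (assumption).
UNLINK_RE = re.compile(
    r"EXT3-fs warning \(device (\w+)\): ext3_unlink: "
    r"Deleting nonexistent file \((\d+)\)"
)
LOOKUP_RE = re.compile(
    r"EXT3-fs error \(device (\w+)\): ext3_lookup: "
    r"unlinked inode (\d+) in dir #(\d+)"
)
ABORT_RE = re.compile(
    r"EXT3-fs error \(device (\w+)\): ext3_journal_start_sb: "
    r"Detected aborted journal"
)


def scan(lines):
    """Group suspicious inodes, directories and journal aborts by device."""
    report = {}

    def entry(device):
        # One record per device: inode numbers, directory inodes, abort flag.
        return report.setdefault(
            device, {"inodes": set(), "dirs": set(), "aborted": False}
        )

    for line in lines:
        m = UNLINK_RE.search(line)
        if m:
            entry(m.group(1))["inodes"].add(int(m.group(2)))
            continue
        m = LOOKUP_RE.search(line)
        if m:
            rec = entry(m.group(1))
            rec["inodes"].add(int(m.group(2)))
            rec["dirs"].add(int(m.group(3)))
            continue
        m = ABORT_RE.search(line)
        if m:
            entry(m.group(1))["aborted"] = True
    return report


# The log lines from this post, used as sample input.
SAMPLE = [
    "linux kernel: EXT3-fs warning (device sda8): ext3_unlink: Deleting nonexistent file (2097887), 0",
    "linux kernel: EXT3-fs error (device sda8): ext3_lookup: unlinked inode 2097370 in dir #2097369",
    "linux kernel: EXT3-fs error (device sda8): ext3_journal_start_sb: Detected aborted journal",
]

if __name__ == "__main__":
    for device, rec in scan(SAMPLE).items():
        print(device, sorted(rec["inodes"]), sorted(rec["dirs"]), rec["aborted"])
```

While the filesystem is still readable, a directory inode reported this way can be mapped back to a path with something like `find /home -xdev -inum 2097369`.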

Tracking down the corrupted files always leads to the cur folder inside a customer's mail directory. The filenames all appear in red. It puzzles me that it is always a mailbox file that causes this.

Has anyone else seen this? Is there a known cause, or any way to avoid or fix it?

Warm Regards and many thanks.