Hey all, I'm pretty close to nailing down the issue here, but I want to cover all my bases (and post in great detail in case others face a similar situation, as I use these forums to troubleshoot all the time and appreciate the details). There are 4 questions at the end after I describe my issue.
TL;DR: Using the "Rearrange an Account" function in WHM and then removing the partition that the accounts were on also removes the symlinks created on the removed drive. That breaks PHP (it fails to start), since PHP continues to point at the old drive unless you repair the hard-coded links in FPM-PHP. What is the proper way to do this?
Details:
1. Running multi-php (FPM-PHP 5.6, 7.0, 7.1, 7.2)
2. Ran out of space, so a few months ago I added a new temp drive/partition and moved a couple of accounts to it using the "Rearrange an Account" feature of WHM. All was good.
3. Finally upgraded the main hard drive, so I moved the accounts back and rebooted. All was good. Once I had verified everything, I unmounted the temp drive, removed it from fstab, and rebooted again to verify.
4. After the reboot, any site using FPM-PHP 7.1 was throwing a 503 error. When I attempted to restart FPM-PHP via WHM (restart the FPM-PHP service for Apache), it would only say that FPM-PHP 71 failed (no success message for the others, just that this one failed). I verified via systemctl that FPM-PHP 71 wasn't running.
5. The Apache log showed these errors:
AH01079: failed to make connection to backend: httpd-UDS
No such file or directory: AH02454: FCGI: attempt to connect to Unix domain socket /opt/cpanel/ea-php71/root/usr/var/run/php-fpm/
I assume both of these were because FPM-PHP 71 wasn't running, so Apache could not connect to it.
6. The PHP 71 error log had a HUGE amount of data (about 50MB) made up of these entries:
[01-Jun-2018 22:02:48] WARNING: [pool domain_com] child 13250 said into stderr: "ERROR: [pool domain_com] failed to chdir(/mnt/home2/username): No such file or directory (2)"
[01-Jun-2018 22:02:48] WARNING: [pool domain_com] child 13250 said into stderr: "ERROR: [pool domain_com] child failed to initialize", pipe is closed
[01-Jun-2018 22:02:48] WARNING: [pool domain_com] child 13250 exited with code 70 after 0.003479 seconds from start
[01-Jun-2018 22:02:48] NOTICE: [pool domain_com] child 13251 started
domain_com - was the failing domain
username - was the username for said domain
The path in "failed to chdir(/mnt/home2/username)" was pointing to the temp drive I had just removed.
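With 50MB of near-identical entries, it helps to boil the log down to the distinct pool/path pairs that are failing. Here's a rough Python sketch of that idea. The regex is inferred only from the sample lines above (not from anything official), and summarize_chdir_failures is just a name I made up:

```python
import re
from collections import Counter

# Pattern inferred from the sample "failed to chdir" entries in this post;
# other PHP-FPM versions may word the message differently.
CHDIR_RE = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] failed to chdir\((?P<path>[^)]+)\)"
)

def summarize_chdir_failures(lines):
    """Count chdir failures per (pool, path) pair across log lines."""
    counts = Counter()
    for line in lines:
        match = CHDIR_RE.search(line)
        if match:
            counts[(match.group("pool"), match.group("path"))] += 1
    return counts

# Two of the entries quoted above, used as sample input.
SAMPLE = [
    '[01-Jun-2018 22:02:48] WARNING: [pool domain_com] child 13250 said into '
    'stderr: "ERROR: [pool domain_com] failed to chdir(/mnt/home2/username): '
    'No such file or directory (2)"',
    '[01-Jun-2018 22:02:48] NOTICE: [pool domain_com] child 13251 started',
]

for (pool, path), count in summarize_chdir_failures(SAMPLE).items():
    print(f"{pool}: {path} ({count} failures)")
```

For a real log you would pass the open file object instead of SAMPLE, since the function only needs something it can iterate line by line.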
So I determined that when you rearrange an account, it creates symlinks, which is why everything worked fine after the accounts had moved, as long as the drive was attached. And as the cPanel documentation notes, anything with hard-coded paths will need to be changed, or problems could arise (in this case, FPM-PHP 71).
Only FPM-PHP 71 failed, and no other version, because the few domains I moved were all on 71. All other versions operated normally.
For now I've just left the temp drive attached to keep things working normally.
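Before pulling the drive again, a quick way to see what each home path actually is on disk (real directory, symlink, or gone) is something like the sketch below. describe_path is a hypothetical helper, and /home/username is my assumption for the normal home location; only /mnt/home2/username comes from the log above:

```python
import os

def describe_path(path):
    """Return a one-line description of what a path currently is on disk."""
    if os.path.islink(path):
        target = os.path.realpath(path)
        # os.path.exists() follows symlinks, so it is False for a dangling link.
        state = "ok" if os.path.exists(path) else "dangling"
        return f"{path}: symlink -> {target} ({state})"
    if os.path.isdir(path):
        return f"{path}: real directory"
    if os.path.exists(path):
        return f"{path}: exists, not a directory"
    return f"{path}: missing"

# /home/username is an assumed location; /mnt/home2/username is from the log.
for candidate in ("/home/username", "/mnt/home2/username"):
    print(describe_path(candidate))
```

This only reads the filesystem, so it is safe to run while the temp drive is still attached and again after unmounting it to compare.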
Questions:
1. What is the proper way to change/repair the links for FPM-PHP 71 so I can remove the partition? (I didn't want to just go edit things in case there's a cPanel script I should be running.)
2. Obviously at some point after the first move to the temp drive months ago, a similar thing happened: symlinks were created on the main drive, but then the paths were switched permanently in the FPM-PHP config. What caused this? An Apache rebuild? A PHP pool switch? Just curious for future debugging.
3. Why did FPM-PHP 71 fail to start? Out of a dozen domains using 71, only two were moved (which led to a 50MB log file), but shouldn't FPM-PHP 71 still have started and worked for the 10 domains that weren't moved? Did PHP 71 just take a dump because it was caught in an endless cycle and decided to stop trying to start because 2 of the 12 links were no longer there?
4. Are there other hard-coded scripts I should be checking for in a standard WHM/cPanel install?
Obviously a reboot didn't fix the links, but would an Apache rebuild have sorted out all the issues after everything was moved?
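In the meantime, a read-only check like the sketch below could at least show where the old path is still referenced, without editing anything. The config directory is my assumption about where the per-domain pool files live (I'm guessing from the /opt/cpanel/ea-php71/root/ prefix in the Apache error), and find_stale_references is a made-up name:

```python
import os

# Assumed location of the PHP 7.1 pool configs; adjust if yours differ.
CONFIG_DIR = "/opt/cpanel/ea-php71/root/etc/php-fpm.d"
# The old mount point, taken from the chdir errors in the log.
STALE_PREFIX = "/mnt/home2"

def find_stale_references(config_dir, stale_prefix):
    """Yield (file, line_number, line) for every line mentioning stale_prefix."""
    for root, _dirs, files in os.walk(config_dir):
        for name in sorted(files):
            full = os.path.join(root, name)
            try:
                with open(full, errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if stale_prefix in line:
                            yield full, number, line.rstrip("\n")
            except OSError:
                continue  # unreadable file, skip it

for path, number, line in find_stale_references(CONFIG_DIR, STALE_PREFIX):
    print(f"{path}:{number}: {line}")
```

It prints nothing if the directory doesn't exist or nothing matches, so an empty result alone doesn't prove the configs are clean.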
Thanks for any help! (And I hope this can help someone in the future!)