Did something change with MySQL/MariaDB this morning?

GoWilkes

Well-Known Member
Sep 26, 2006
613
25
168
cPanel Access Level
Root Administrator
I see that /etc/my.cnf has a last modified time of 8/7/21 3:51:44. It doesn't look like any data has changed, but MariaDB crashed at that same time and won't restart!

This is what I see when I try to restart via WHM:

Aug 07 12:33:25 my.server.com systemd[1]: Starting MariaDB 10.3.31 database server...
Aug 07 12:33:26 my.server.com mysqld[18824]: 2021-08-07 12:33:26 0 [Note] /usr/sbin/mysqld (mysqld 10.3.31-MariaDB-log) starting as process 18824 ...
Aug 07 12:33:27 my.server.com systemd[1]: mariadb.service: main process exited, code=exited, status=1/FAILURE
Aug 07 12:33:27 my.server.com systemd[1]: Failed to start MariaDB 10.3.31 database server.
Aug 07 12:33:27 my.server.com systemd[1]: Unit mariadb.service entered failed state.
Aug 07 12:33:27 my.server.com systemd[1]: mariadb.service failed.
 

GoWilkes
I also see that CSF reported a "suspicious process running under user mysql" at 3:51am:

Time: Sat Aug 7 03:51:15 2021 -0400
PID: 5871 (Parent PID:5871)
Account: mysql
Uptime: 3800666 seconds


Executable:

/usr/sbin/mysqld;610e3b52 (deleted)

The file system shows this process is running an executable file that has been deleted. This typically happens when the original file has been replaced by a new file when the application is updated. To prevent this being reported again, restart the process that runs this executable file. See csf.conf and the PT_DELETED text for more information about the security implications of processes running deleted executable files.


Command Line (often faked in exploits):

/usr/sbin/mysqld
 

ankeshanand

Well-Known Member
Mar 29, 2021
195
58
103
India
cPanel Access Level
Root Administrator
I don't see any new updates for MySQL/MariaDB. The server does not automatically update anything unless the upcp script runs, so maybe someone tried to exploit the server internally. Could you try restarting the MariaDB server and posting the error log?

Also, sometimes the process is not fully closed, which leaves the port occupied even after the process has partially exited with an error.
Try:
Code:
lsof -i :3306
and see if anything comes up
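Something like this (a sketch; it assumes lsof is installed, and reports the port as free otherwise) will show whether a half-dead mysqld is still holding the port:

```shell
# Check whether anything is still bound to the MariaDB port (3306).
# If lsof is missing or nothing is listening, the port is treated as free.
if lsof -i :3306 >/dev/null 2>&1; then
    PORT_STATUS="in-use"
    lsof -i :3306   # show which PID still holds the port
else
    PORT_STATUS="free"
fi
echo "port 3306 is $PORT_STATUS"
```

If the port is still held by a stale process, killing that PID (or `systemctl kill mariadb`) should let the service start cleanly.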
 

GoWilkes
My server provider was able to get it back online, the problem seems to be related to InnoDB... again. I had a major crash in May that was related to InnoDB, and they ended up having to use this in my.cnf to get it back online:

innodb_force_recovery=4

That caused about 75% of my tables to be completely deleted, and I have no explanation on that one. I changed all of the InnoDB tables to MyISAM, but WordPress still uses InnoDB by default, so that's not a permanent solution.

This time, changing that to "5" got me back up and running.

I read and more or less understand the "what" of forced recovery, but I don't understand the "why". Why am I having this issue, and what's the source of the problem? What's the permanent solution?
 

ankeshanand
This should not actually be happening! We also have InnoDB enabled, and our Munin server monitor shows its storage occasionally decreasing and increasing, but it has never caused an issue. You may need a cPanel expert to look at it to make it stable. I also hope you had weekly backups of the whole server so there was no major data loss.

If you used innodb_force_recovery=4, your databases may well be corrupt by now! Were you able to recover them?
 

GoWilkes
I went through every database and table, and I don't see any corrupted tables anywhere. The only InnoDB tables are in the "mysql" and "sys" databases, and none of them seem to be having an issue.

I see that all of the other tables in "sys" are "View", though, and show "~0 Rows". The InnoDB table sys_config is the only one with data, and it has 6 rows. Is that how it's supposed to be?

Can I safely convert the tables in "mysql" and "sys" to MyISAM, and then disable InnoDB entirely?
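For context, bulk engine conversion is usually scripted along these lines (a sketch only: `mydb` is a placeholder user database, and pointing this at the `mysql` or `sys` system schemas is generally unsafe):

```shell
# Build ALTER statements for every InnoDB table in one database.
# DB is a placeholder name; do NOT point this at the mysql or sys schemas.
DB="mydb"
SQL="SELECT CONCAT('ALTER TABLE \`', table_schema, '\`.\`', table_name, '\` ENGINE=MyISAM;')
FROM information_schema.tables
WHERE table_schema = '$DB' AND engine = 'InnoDB';"
echo "$SQL"
# With a live server and root credentials, the statements would be applied with:
#   mysql -N -e "$SQL" | mysql
```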
 

ankeshanand
On all our standard cPanel shared hosting servers, the view tables show ~0 rows and sys_config has 6 rows, so yes, that's how it's supposed to be.

I would not suggest converting them, as InnoDB has been the default storage engine for MySQL releases after 5.5 because of its differences from MyISAM. If you still want to convert, you may want to read cPanel's support article on the subject.
 

GoWilkes
Looking in the mysql log file, I see these from about an hour ago (so well after I got it back online):

2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`server_cost` in the cache. Attempting to load the tablespace with space id 19
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`innodb_table_stats` in the cache. Attempting to load the tablespace with space id 30
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`transaction_registry` in the cache. Attempting to load the tablespace with space id 32
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`slave_master_info` in the cache. Attempting to load the tablespace with space id 16
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`innodb_index_stats` in the cache. Attempting to load the tablespace with space id 29
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`help_keyword` in the cache. Attempting to load the tablespace with space id 7
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`slave_relay_log_info` in the cache. Attempting to load the tablespace with space id 15
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`gtid_executed` in the cache. Attempting to load the tablespace with space id 18
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`help_relation` in the cache. Attempting to load the tablespace with space id 6
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`help_topic` in the cache. Attempting to load the tablespace with space id 4
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`slave_worker_info` in the cache. Attempting to load the tablespace with space id 17
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`engine_cost` in the cache. Attempting to load the tablespace with space id 20
2021-08-07 15:28:30 84786 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`help_category` in the cache. Attempting to load the tablespace with space id 5
2021-08-07 15:28:59 85122 [ERROR] InnoDB: Failed to find tablespace for table `sys`.`sys_config` in the cache. Attempting to load the tablespace with space id 21
So there's definitely an ongoing problem. I'm guessing that this is because InnoDB is set to read-only with force_recovery at 5, but that doesn't help me figure out the original problem that made me change it in the first place :-(

What difference would I actually see by changing "mysql" and "sys" to MyISAM? It looks like the root of all of my database problems is InnoDB! I lost at least $100 in the 8 hours that it was down today, so it's going to have to be a significant difference to be worth the headache.

The only InnoDB setting I have in my.cnf is:

Code:
innodb_force_recovery=5
innodb_file_per_table=1

Here's my entire my.cnf file (minus comments) if you think it's important:

Code:
[mysqld]
performance-schema=0

## InnoDB is used in "mysql" and "sys" databases, but I'd love to disable it entirely
# innodb=OFF
# default_storage_engine=MyISAM

innodb_force_recovery=5
innodb_file_per_table=1

join_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

symbolic-links=0

max_allowed_packet=268435456

low_priority_updates=1

concurrent_insert=ALWAYS

table_open_cache=2k
table_definition_cache=8k
open_files_limit=10000

query_cache_size=128M
query_cache_limit=2M

key_buffer_size=1G

sql-mode=''

slow_query_log=1
long_query_time=1
slow_query_log_file=/var/log/mysqld.slow.log

log-error=/var/log/mysqld.log
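For anyone checking the same thing, here is a quick sketch (it assumes the standard /etc/my.cnf path, and reports nothing set if the file is absent) to confirm whether a forced-recovery level is still pinning InnoDB read-only:

```shell
# Report whether my.cnf still pins a forced-recovery level; at level 4 or
# higher, InnoDB is read-only and writes to the mysql/sys tables will fail.
CNF=/etc/my.cnf
LEVEL=$(grep -E '^innodb_force_recovery' "$CNF" 2>/dev/null | cut -d= -f2 | tr -d ' ')
if [ -n "$LEVEL" ] && [ "$LEVEL" -ge 4 ] 2>/dev/null; then
    MSG="forced recovery level $LEVEL set: InnoDB is read-only"
else
    MSG="no forced recovery level >= 4 set"
fi
echo "$MSG"
```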
 

ankeshanand
The main problem you will see with MyISAM is table-level locking, which creates difficulties like "Waiting for table level lock", whereas InnoDB uses row-level locking.

I don't think InnoDB itself was the problem. It's a storage engine; the problem was something else, but you overwrote it during recovery, causing the data loss (in my view).

You could have tried the following before forcing recovery:
Code:
/usr/local/cpanel/bin/build_mysql_conf
/scripts/upcp -sync
These would have rebuilt the MySQL configuration and reset any errant settings to their defaults.
 

ankeshanand
I would suggest opening a cPanel support ticket if you cannot fix it yourself. You could also contact a certified system administrator to resolve it.
 

GoWilkes
At this point, I guess it's really too late to worry about the lost data from May. I only had a backup from December 2020 (!!!) so I had to roll things back to that point, and since then everyone has moved on. So I'm afraid that, even if I could recover it, putting it back NOW would cause more confusion than it would solve :-(

And this time there was no apparent data loss, it was just that MySQL was crashing over and over until the server provider increased force_recovery to 5, which is a HUGE concern for me!! It was set to 4 before, and per MariaDB's documentation the difference is:

4 (SRV_FORCE_NO_IBUF_MERGE): does not calculate table statistics and prevents insert buffer merges.
5 (SRV_FORCE_NO_UNDO_LOG_SCAN): treats incomplete transactions as committed, and does not look at the undo logs when starting.


I'm not sure that the undo logs could have caused such a problem, so if changing it from 4 to 5 "fixed" it then I'm guessing there was an "incomplete transaction"? But I have no clue how to find what it was.

This is what's in the error log at the time of the initial crash:

Code:
2021-08-07  3:51:18 0 [Note] /usr/sbin/mysqld (initiated by: unknown): Normal shutdown
2021-08-07  3:51:18 0 [Note] Event Scheduler: Purging the queue. 0 events
2021-08-07  3:51:36 0 [Note] InnoDB: Starting shutdown...
2021-08-07  3:51:36 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
2021-08-07  3:51:36 0 [Note] InnoDB: Buffer pool(s) dump completed at 210807  3:51:36
2021-08-07 03:51:40 0x7fc7e2844700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.3.30/storage/innobase/trx/trx0rseg.cc line 361
InnoDB: Failing assertion: UT_LIST_GET_LEN(rseg->undo_list) == 0
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
InnoDB: about forcing recovery.
210807  3:51:40 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

Server version: 10.3.30-MariaDB-log
key_buffer_size=1073741824
read_buffer_size=131072
max_used_connections=107
max_threads=153
thread_count=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1384935 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55fe4228733e]
/usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x55fe41d2100f]
sigaction.c:0(__restore_rt)[0x7fc8d82e3630]
:0(__GI_raise)[0x7fc8d7f3c387]
:0(__GI_abort)[0x7fc8d7f3da78]
/usr/sbin/mysqld(+0x4f590e)[0x55fe41a5b90e]
/usr/sbin/mysqld(+0xa8f00b)[0x55fe41ff500b]
/usr/sbin/mysqld(+0xa9339a)[0x55fe41ff939a]
/usr/sbin/mysqld(+0xa624c2)[0x55fe41fc84c2]
/usr/sbin/mysqld(+0x94e4df)[0x55fe41eb44df]
/usr/sbin/mysqld(_Z22ha_finalize_handlertonP13st_plugin_int+0x34)[0x55fe41d236c4]
/usr/sbin/mysqld(+0x5e4cb4)[0x55fe41b4acb4]
/usr/sbin/mysqld(+0x5e7b2e)[0x55fe41b4db2e]
/usr/sbin/mysqld(_Z15plugin_shutdownv+0x73)[0x55fe41b4e573]
/usr/sbin/mysqld(+0x51918a)[0x55fe41a7f18a]
/usr/sbin/mysqld(_Z10unireg_endv+0x3b)[0x55fe41a7f46b]
/usr/sbin/mysqld(+0x51cfef)[0x55fe41a82fef]
/usr/sbin/mysqld(kill_server_thread+0xe)[0x55fe41a8320e]
pthread_create.c:0(start_thread)[0x7fc8d82dbea5]
/lib64/libc.so.6(clone+0x6d)[0x7fc8d80049fd]
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             62987                62987                processes 
Max open files            40000                40000                files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       289154               289154               signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        
Core pattern: core

And then this is repeated over and over:

Code:
2021-08-07  3:51:42 0 [Note] InnoDB: Using Linux native AIO
2021-08-07  3:51:42 0 [Note] InnoDB: !!! innodb_force_recovery is set to 4 !!!
2021-08-07  3:51:42 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2021-08-07  3:51:42 0 [Note] InnoDB: Uses event mutexes
2021-08-07  3:51:42 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
2021-08-07  3:51:42 0 [Note] InnoDB: Number of pools: 1
2021-08-07  3:51:42 0 [Note] InnoDB: Using SSE2 crc32 instructions
2021-08-07  3:51:42 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2021-08-07  3:51:42 0 [Note] InnoDB: Completed initialization of buffer pool
2021-08-07  3:51:42 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2021-08-07  3:51:43 0 [ERROR] InnoDB: Page [page id: space=0, page number=6] log sequence number 690626569522 is in the future! Current system log sequence number 690626204952.
2021-08-07  3:51:43 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
2021-08-07  3:51:43 0 [ERROR] InnoDB: Page [page id: space=0, page number=315] log sequence number 690626569566 is in the future! Current system log sequence number 690626204952.
2021-08-07  3:51:43 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
2021-08-07  3:51:43 0 [ERROR] InnoDB: corrupted TRX_NO 6bf91679110c0251
2021-08-07  3:51:43 0 [Note] InnoDB: Retry with innodb_force_recovery=5
2021-08-07  3:51:43 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption
2021-08-07  3:51:43 0 [Note] InnoDB: Starting shutdown...
2021-08-07  3:51:43 0 [ERROR] Plugin 'InnoDB' init function returned error.
2021-08-07  3:51:43 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2021-08-07  3:51:43 0 [Note] Plugin 'FEEDBACK' is disabled.
2021-08-07  3:51:43 0 [ERROR] Could not open mysql.plugin table. Some plugins may be not loaded
2021-08-07  3:51:43 0 [ERROR] Unknown/unsupported storage engine: InnoDB
2021-08-07  3:51:43 0 [ERROR] Aborting
 

cPRex

Jurassic Moderator
Staff member
Oct 19, 2014
7,494
1,009
313
cPanel Access Level
Root Administrator
It's important to note that even our cPanel support isn't able to assist with InnoDB issues. We recommend seeking out a qualified MySQL admin to help with those issues, especially if you're seeing this happen more than once on the same system.

In regards to software updates mentioned earlier, it would depend on your update settings, but a minor version change can happen. For example, you could be updated from MySQL 5.7.34 to 5.7.35 automatically.
 

GoWilkes
A few years ago that wouldn't have been an issue, @cPRex, but after the quarantine I'm lucky to pay my own bills... hiring a pro just isn't going to be possible. Especially when it's to fix a problem that, as far as I can tell, is really a bug with MariaDB.

Googling individual error messages led me to others hitting the same assertion failure. Their errors happened during specific functions while mine seems random, but given the identical error messages I have to think that something that ran automatically on the server caused the same problem they had.

But the error appears to be directly within InnoDB, and on my end it's ONLY used for MySQL (MariaDB) system tables. By process of elimination, there's just no way that the problem is on my end.
 

sparek-3

Well-Known Member
Aug 10, 2002
2,039
229
368
cPanel Access Level
Root Administrator
It looks like there was an update for MariaDB 10.3 some time over the weekend.

I'm seeing a 10.3.31 update, while our servers are still running 10.3.30.

Are you sure your MariaDB was updated over the weekend?

No clue as to whether or not the update would have caused your issues, but seems to line up.

I think I'll wait a few more days before pushing this update to our servers in case this 10.3.31 update is an issue.
 

GoWilkes
Great catch!! I AM running 10.3.31, and now I recognize this in the error log:

2021-08-07 03:51:40 0x7fc7e2844700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.3.30/storage/innobase/trx/trx0rseg.cc line 361
InnoDB: Failing assertion: UT_LIST_GET_LEN(rseg->undo_list) == 0
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
InnoDB: about forcing recovery.
So it was running 10.3.30 at the time of the error.

My server provider changed innodb_force_recovery to 5 shortly after I started this thread at 12:51, and then I see that the log shows 10.3.31 at 12:55:

2021-08-07 12:55:50 0 [Note] InnoDB: 10.3.31 started; log sequence number 690626204943; transaction id 0
So they definitely have to be connected.
 

GoWilkes
Awesome!!

For future readers, I THINK that I've narrowed down the problem.

Remember that in May I had an InnoDB crash that wiped out a large chunk of my data. I still haven't figured out the cause of that crash, but the resolution was to add this to my.cnf:

innodb_force_recovery=4

I changed all of the InnoDB tables to MyISAM except for those in "mysql" and "sys", so I left the force_recovery line in an attempt to prevent future crashes.

But in doing this, it set the InnoDB tables in "mysql" and "sys" to read-only.

I THINK that this was the reason for last week's crash; the update couldn't write to those tables, so instead of exiting gracefully it crashed.

Based on this, @cPRex, my suggestion for a fix would be for future updates to make sure that everything is readable and writable before doing anything else, and if not then die gracefully and send an alert.
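A sketch of that kind of pre-flight check (the probe statements and the `test` schema are placeholders for illustration, not anything cPanel actually runs):

```shell
# Build a write probe: create, insert into, and drop a scratch InnoDB table.
# If InnoDB is read-only (e.g. innodb_force_recovery >= 4), this would fail,
# and an updater could abort gracefully instead of crashing the server.
PROBE="CREATE TABLE test.__rw_probe (id INT) ENGINE=InnoDB;
INSERT INTO test.__rw_probe VALUES (1);
DROP TABLE test.__rw_probe;"
echo "$PROBE"
# With a live server:
#   mysql -e "$PROBE" || { echo "InnoDB is read-only; aborting update"; exit 1; }
```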
 