MySQL DBA请注意 不要被Sleep会话蒙蔽了双眼
生产中有业务反应,进行一个insert操作,一直报错,报错信息如下:
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
为了效果展示,我进行了复刻
mysql> insert into t1 values (1,'bbb');
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
看到这个报错,提示也很明显了,锁超时了,那就要看看谁占用了该锁。
登录数据库 show processlist 或者select * from information_schema.processlist 查看。
查看是否有与该表相关的sql占用了该表的一些行锁。
结果发现抛开sleep的会话,没有任何的锁冲突和占用。
select id,user,host,db,command,time,state,info from information_schema.processlist where command!='sleep' order by time desc;
+---------+-----------------+----------------------+------+---------+---------+------------------------+------------------------------------------------------------------------------------------------------------------------------+
| id | user | host | db | command | time | state | info |
+---------+-----------------+----------------------+------+---------+---------+------------------------+------------------------------------------------------------------------------------------------------------------------------+
| 1 | event_scheduler | localhost | NULL | Daemon | 3545406 | Waiting on empty queue | NULL |
| 4703731 | root | ip:端口 | test | Query | 0 | executing | select id,user,host,db,command,time,state,info from information_schema.processlist where command!='sleep' order by time desc |
+---------+-----------------+----------------------+------+---------+---------+------------------------+------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
查看锁相关的信息
例如innodb_locks、innodb_lock_waits、sys.x$innodb_lock_waits或 sys.innodb_lock_waits没有发现任何锁问题。
那么问题出在哪里呢?
不妨查查sleep的会话中有没有什么秘密
mysql> SELECT trx_mysql_thread_id AS PROCESSLIST_ID,-> NOW(),-> TRX_STARTED,-> TO_SECONDS(now())-TO_SECONDS(trx_started) AS TRX_LAST_TIME ,-> USER,-> HOST,-> DB,-> TRX_QUERY-> FROM INFORMATION_SCHEMA.INNODB_TRX trx-> JOIN INFORMATION_SCHEMA.processlist pcl ON trx.trx_mysql_thread_id=pcl.id-> WHERE trx_mysql_thread_id != connection_id()-> AND TO_SECONDS(now())-TO_SECONDS(trx_started) >= 5 ;
+----------------+---------------------+---------------------+---------------+------+----------------------+------+-----------+
| PROCESSLIST_ID | NOW() | TRX_STARTED | TRX_LAST_TIME | USER | HOST | DB | TRX_QUERY |
+----------------+---------------------+---------------------+---------------+------+----------------------+------+-----------+
| 4711241 | 2025-09-01 17:58:40 | 2025-09-01 17:42:49 | 951 | root | 10.251.101.136:63792 | test | NULL |
+----------------+---------------------+---------------------+---------------+------+----------------------+------+-----------+
1 row in set (0.00 sec)
发现这条特殊的sleep会话
mysql> select * from INFORMATION_SCHEMA.processlist where id =4711241;
+---------+------+----------------------+------+---------+------+-------+------+
| ID | USER | HOST | DB | COMMAND | TIME | STATE | INFO |
+---------+------+----------------------+------+---------+------+-------+------+
| 4711241 | root | 10.251.101.136:63792 | test | Sleep | 1032 | | NULL |
+---------+------+----------------------+------+---------+------+-------+------+
1 row in set (0.00 sec)
为什么要从innodb_trx表去查询,因为:
INNODB_TRX 只包含活跃事务(未提交的事务)
processlist 包含所有连接(包括空闲连接)
大多数连接处于 Sleep 状态,但没有活跃事务
所以我说通过查看processlist的会话,很容易忽视sleep状态的会话。当前会话一定要结合innodb_trx表分析。
谜底揭晓
mysql> SELECT -> trx.trx_id,-> trx.trx_state,-> trx.trx_started,-> trx.trx_mysql_thread_id as thread_id,-> trx.trx_query,-> trx.trx_operation_state,-> trx.trx_tables_in_use,-> trx.trx_tables_locked,-> trx.trx_rows_locked,-> trx.trx_rows_modified,-> p.USER,-> p.HOST,-> p.DB,-> p.COMMAND,-> p.TIME as idle_time,-> p.STATE,-> p.INFO as last_query,-> TIMESTAMPDIFF(SECOND, trx.trx_started, NOW()) as trx_duration_sec-> FROM INFORMATION_SCHEMA.INNODB_TRX trx-> LEFT JOIN INFORMATION_SCHEMA.PROCESSLIST p -> ON trx.trx_mysql_thread_id = p.ID-> WHERE trx.trx_mysql_thread_id = 4711241;
+------------+-----------+---------------------+-----------+-----------+---------------------+-------------------+-------------------+-----------------+-------------------+------+----------------------+------+---------+-----------+-------+------------+------------------+
| trx_id | trx_state | trx_started | thread_id | trx_query | trx_operation_state | trx_tables_in_use | trx_tables_locked | trx_rows_locked | trx_rows_modified | USER | HOST | DB | COMMAND | idle_time | STATE | last_query | trx_duration_sec |
+------------+-----------+---------------------+-----------+-----------+---------------------+-------------------+-------------------+-----------------+-------------------+------+----------------------+------+---------+-----------+-------+------------+------------------+
| 3015071439 | RUNNING | 2025-09-01 17:42:49 | 4711241 | NULL | NULL | 0 | 1 | 1 | 1 | root | 10.251.101.136:63792 | test | Sleep | 1555 | | NULL | 1555 |
+------------+-----------+---------------------+-----------+-----------+---------------------+-------------------+-------------------+-----------------+-------------------+------+----------------------+------+---------+-----------+-------+------------+------------------+
1 row in set (0.00 sec)mysql>
mysql> SELECT -> THREAD_ID,-> EVENT_ID,-> SQL_TEXT,-> TIMER_WAIT/1000000000 as execution_time_sec,-> CURRENT_SCHEMA,-> MESSAGE_TEXT-> FROM performance_schema.events_statements_history-> WHERE THREAD_ID = (SELECT THREAD_ID -> FROM performance_schema.threads -> WHERE PROCESSLIST_ID = 4711241)-> ORDER BY EVENT_ID DESC-> LIMIT 10;
+-----------+----------+----------------------------------+--------------------+----------------+--------------+
| THREAD_ID | EVENT_ID | SQL_TEXT | execution_time_sec | CURRENT_SCHEMA | MESSAGE_TEXT |
+-----------+----------+----------------------------------+--------------------+----------------+--------------+
| 5821754 | 8 | insert into t1 values (1,'aaa') | 0.3886 | test | NULL |
| 5821754 | 7 | begin | 0.1184 | test | NULL |
| 5821754 | 6 | NULL | 0.2328 | test | NULL |
| 5821754 | 5 | show tables | 0.1258 | test | NULL |
| 5821754 | 4 | show databases | 0.7046 | test | NULL |
| 5821754 | 3 | NULL | 0.0624 | NULL | NULL |
| 5821754 | 2 | SELECT DATABASE() | 0.1558 | NULL | NULL |
| 5821754 | 1 | select @@version_comment limit 1 | 0.1633 | NULL | NULL |
+-----------+----------+----------------------------------+--------------------+----------------+--------------+
8 rows in set (0.03 sec)
通过上面的信息大家也能看到了我做了什么操作,我在另一个会话中提前开启了一个事务,但未提交。