Innodb undo之 undo结构简析


水平有限,如果有误请指出


参考:

  • 阿里内核月报
  • 姜老师的MySQL内核:innodb存储引擎

做一个简单的记录,自己备用

一、大体结构

rollback segments(128)

undo segments(1024)

undo log (header insert/modify 分开的) <-> undo page

undo record
undo record

作为undo segments的第一个undo page可以存放多个事物的undo log,因为如果这个块的undo 记录没有填满3/4则会进入 rollback segment的cache list,那么下次可以继续使用,但是如果第一个块不足以装下事物的undo 记录,那么很显然需要分配新的undo page,这种情况下一个undo page就只能包含一个事物的undo记录了。
事物每次需要分配rollback segments然后分配undo segments然后初始化好undo log header,insert和update/delete需要分配不同的undo segments,一个undo segments往往对应了一个undo log,undo log可以包含多个undo record(因为从debug来看undo log header的初始化只做了一次),对于操作的每行都会留下一个undo record作为mvcc构建历史版本的基础。
undo生成的基本单位是undo record,每行记录都会包含一个undo record,而rollback ptr指向的是undo record的偏移量,对于每行的记录都会去判断其可见性,如果需要构建前版本则通过本指针进行构建包含:

  • 第1位是否是insert 第2到8位是undo segment id 第9到40位为page no 第41位到56位为 offset

每一个undo log包含一个trx_undo_t结构体
每一个rollback segments包含一个trx_rseg_t结构体

二、物理结构

  • undo page header 每一个undo page都包含
/** Transaction undo log page header offsets */
/* @{ */
#define TRX_UNDO_PAGE_TYPE  0   /*!< TRX_UNDO_INSERT or
                    TRX_UNDO_UPDATE */
#define TRX_UNDO_PAGE_START 2   /*!< Byte offset where the undo log
                    records for the LATEST transaction
                    start on this page (remember that
                    in an update undo log, the first page
                    can contain several undo logs) */
#define TRX_UNDO_PAGE_FREE  4   /*!< On each page of the undo log this
                    field contains the byte offset of the
                    first free byte on the page */
#define TRX_UNDO_PAGE_NODE  6   /*!< The file list node in the chain
                    of undo log pages */

  • undo semgent header 第一个page 才会用 undo segment header信息
#define TRX_UNDO_STATE      0   /*!< TRX_UNDO_ACTIVE, ... */

#ifndef UNIV_INNOCHECKSUM

#define TRX_UNDO_LAST_LOG   2   /*!< Offset of the last undo log header
                    on the segment header page, 0 if
                    none */
#define TRX_UNDO_FSEG_HEADER    4   /*!< Header for the file segment which
                    the undo log segment occupies */
#define TRX_UNDO_PAGE_LIST  (4 + FSEG_HEADER_SIZE)
                    /*!< Base node for the list of pages in
                    the undo log segment; defined only on
                    the undo log segment's first page */
  • 每一个undo log
    • undo log header
    • undo log record 相应的undo实际内容
    • undo log record 相应的undo实际内容

undo log header 包含

#define TRX_UNDO_TRX_ID     0   /*!< Transaction id */
#define TRX_UNDO_TRX_NO     8   /*!< Transaction number of the
                    transaction; defined only if the log
                    is in a history list */
#define TRX_UNDO_DEL_MARKS  16  /*!< Defined only in an update undo
                    log: TRUE if the transaction may have
                    done delete markings of records, and
                    thus purge is necessary */
#define TRX_UNDO_LOG_START  18  /*!< Offset of the first undo log record
                    of this log on the header page; purge
                    may remove undo log record from the
                    log start, and therefore this is not
                    necessarily the same as this log
                    header end offset */
#define TRX_UNDO_XID_EXISTS 20  /*!< TRUE if undo log header includes
                    X/Open XA transaction identification
                    XID */
#define TRX_UNDO_DICT_TRANS 21  /*!< TRUE if the transaction is a table
                    create, index create, or drop
                    transaction: in recovery
                    the transaction cannot be rolled back
                    in the usual way: a 'rollback' rather
                    means dropping the created or dropped
                    table, if it still exists */
#define TRX_UNDO_TABLE_ID   22  /*!< Id of the table if the preceding
                    field is TRUE */
#define TRX_UNDO_NEXT_LOG   30  /*!< Offset of the next undo log header
                    on this page, 0 if none */
#define TRX_UNDO_PREV_LOG   32  /*!< Offset of the previous undo log
                    header on this page, 0 if none */
#define TRX_UNDO_HISTORY_NODE   34  /*!< If the log is put to the history
                    list, the file list node is here */

/* Note: the writing of the undo log old header is coded by a log record
MLOG_UNDO_HDR_CREATE or MLOG_UNDO_HDR_REUSE. The appending of an XID to the
header is logged separately. In this sense, the XID is not really a member
of the undo log header. TODO: do not append the XID to the log header if XA
is not needed by the user. The XID wastes about 150 bytes of space in every
undo log. In the history list we may have millions of undo logs, which means
quite a large overhead. */

/** X/Open XA Transaction Identification (XID) */
/* @{ */
/** xid_t::formatID */
#define TRX_UNDO_XA_FORMAT  (TRX_UNDO_LOG_OLD_HDR_SIZE)
/** xid_t::gtrid_length */
#define TRX_UNDO_XA_TRID_LEN    (TRX_UNDO_XA_FORMAT + 4)
/** xid_t::bqual_length */
#define TRX_UNDO_XA_BQUAL_LEN   (TRX_UNDO_XA_TRID_LEN + 4)
/** Distributed transaction identifier data */
#define TRX_UNDO_XA_XID     (TRX_UNDO_XA_BQUAL_LEN + 4)

三、分配步骤和写入

  • 第一步为 分配rollback segments
#0  get_next_redo_rseg (max_undo_logs=128, n_tablespaces=4) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0trx.cc:1138
#1  0x0000000001c0bce8 in trx_assign_rseg_low (max_undo_logs=128, n_tablespaces=4, rseg_type=TRX_RSEG_TYPE_REDO)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0trx.cc:1314
#2  0x0000000001c1097d in trx_set_rw_mode (trx=0x7fffd7804080) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0trx.cc:3352
#3  0x0000000001a64013 in lock_table (flags=0, table=0x7ffeac012ae0, mode=LOCK_IX, thr=0x7ffe7c92ef48)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/lock/lock0lock.cc:4139
#4  0x0000000001b7950e in row_search_mvcc (buf=0x7ffe7c92e350 "\377", mode=PAGE_CUR_GE, prebuilt=0x7ffe7c92e7d0, match_mode=1, direction=0)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0sel.cc:5100
#5  0x00000000019d5443 in ha_innobase::index_read (this=0x7ffe7c92de10, buf=0x7ffe7c92e350 "\377", key_ptr=0x7ffe7cd57590 "\004", key_len=4, 
    find_flag=HA_READ_KEY_EXACT) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/handler/ha_innodb.cc:9536
#6  0x0000000000f9345a in handler::index_read_map (this=0x7ffe7c92de10, buf=0x7ffe7c92e350 "\377", key=0x7ffe7cd57590 "\004", keypart_map=1, 
    find_flag=HA_READ_KEY_EXACT) at /root/mysqlc/percona-server-locks-detail-5.7.22/sql/handler.h:2942
#7  0x0000000000f83e44 in handler::ha_index_read_map (this=0x7ffe7c92de10, buf=0x7ffe7c92e350 "\377", key=0x7ffe7cd57590 "\004", keypart_map=1, 
    find_flag=HA_READ_KEY_EXACT) at /root/mysqlc/percona-server-locks-detail-5.7.22/sql/handler.cc:3248

第二步 对于主键每行更改操作都会调用trx_undo_report_row_operation 他会分配undo segments 并且会负责写入undo record

#0  trx_undo_report_row_operation (flags=0, op_type=2, thr=0x7ffe7c932828, index=0x7ffea4016590, clust_entry=0x7ffe7c932cc0, update=0x0, cmpl_info=0, 
    rec=0x7fffb580d369 "", offsets=0x7fffec0f3e00, roll_ptr=0x7fffec0f3688) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0rec.cc:1866
#1  0x0000000001c5795b in btr_cur_del_mark_set_clust_rec (flags=0, block=0x7fffb4ccaae0, rec=0x7fffb580d369 "", index=0x7ffea4016590, offsets=0x7fffec0f3e00, 
    thr=0x7ffe7c932828, entry=0x7ffe7c932cc0, mtr=0x7fffec0f38f0) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/btr/btr0cur.cc:4894
#2  0x0000000001b9f218 in row_upd_del_mark_clust_rec (flags=0, node=0x7ffe7c932550, index=0x7ffea4016590, offsets=0x7fffec0f3e00, thr=0x7ffe7c932828, referenced=0, 
    mtr=0x7fffec0f38f0) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:2778
#3  0x0000000001b9f765 in row_upd_clust_step (node=0x7ffe7c932550, thr=0x7ffe7c932828)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:2923
#4  0x0000000001b9fc74 in row_upd (node=0x7ffe7c932550, thr=0x7ffe7c932828) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:3042
#5  0x0000000001ba0155 in row_upd_step (thr=0x7ffe7c932828) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:3188
#6  0x0000000001b3d3a0 in row_update_for_mysql_using_upd_graph (mysql_rec=0x7ffe7c9318d0 "\375\001", prebuilt=0x7ffe7c931d50)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0mysql.cc:3040
#7  0x0000000001b3d6a1 in row_update_for_mysql (mysql_rec=0x7ffe7c9318d0 "\375\001", prebuilt=0x7ffe7c931d50)
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0mysql.cc:3131
#8  0x00000000019d47c3 in ha_innobase::delete_row (this=0x7ffe7c931390, record=0x7ffe7c9318d0 "\375\001")
    at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/handler/ha_innodb.cc:9141

大概流程

switch (op_type)
{
    case TRX_UNDO_INSERT_OP:
        undo = undo_ptr->insert_undo; //如果是 insert 则使用insert_undo 类型为trx_undo_t 指针

        if (undo == NULL) { //如果已经分配了就不用分配了

            err = trx_undo_assign_undo( //分配undo segment 同时初始化 undo log header
                trx, undo_ptr, TRX_UNDO_INSERT);
            undo = undo_ptr->insert_undo;
  ...
        }
        break;
    default:
        ut_ad(op_type == TRX_UNDO_MODIFY_OP); //断言

        undo = undo_ptr->update_undo;

        if (undo == NULL) {
            err = trx_undo_assign_undo(
                trx, undo_ptr, TRX_UNDO_UPDATE); //分配undo segment 同时初始化 undo log header
            undo = undo_ptr->update_undo;
    ...
    }

  ...
  case TRX_UNDO_INSERT_OP://注意是每行都会操作
            offset = trx_undo_page_report_insert( //写入insert undo log record
                undo_page, trx, index, clust_entry, &mtr);
            break;
        default:
            ut_ad(op_type == TRX_UNDO_MODIFY_OP); //写入delete update undo log record
            offset = trx_undo_page_report_modify(
                undo_page, trx, index, rec, offsets, update,
                cmpl_info, clust_entry, &mtr);
        }
    ...
    
    *roll_ptr = trx_undo_build_roll_ptr( //构建rollback ptr 主键中每行都有这个 用于MVCC构建回滚版本
                op_type == TRX_UNDO_INSERT_OP,
                undo_ptr->rseg->id, page_no, offset);

四、分解undo log record

我将undo log record的写入到了错误日志,下面进行简单的分解。
表结构如下:

mysql> show create table t1;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `id1` int(11) NOT NULL,
  `id2` int(11) DEFAULT NULL,
  PRIMARY KEY (`id1`),
  KEY `id2` (`id2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
  • insert 的undo记录,具体构造在trx_undo_page_report_insert中
    语句
mysql> insert into t1 values(28,28);
Query OK, 1 row affected (0.00 sec)

输出如下:

trx_undo_assign_undo:assign undo space:
RSEG SLOT:34,RSEG SPACE ID:2 PAGE NO:3
UNDO SLOT:0,UNDO SPACE ID:2 UNDO LOG HEADER PAGE NO:27,UNDO LOG HEADER OFFSET:86,UNDO LOG LAST PAGE:27
trx_undo_page_report_insert:undo log record
TABLE_NAME:test/t1 TRX_ID:12591,UODO RECORD LEN:10
 len 10; hex 011e0b0032048000001c; 

011e0b0032048000001c就是undo record的实际记录解析如下:

011c page内部本undo record结束的位置
0b 类型为 #define  TRX_UNDO_INSERT_REC 11(0X0b)
00 undo no,提交才会有
32 table_id 可以查询 INNODB_SYS_TABLES 对照
04 字段长度4个字节
8000001c 我插入的记录主键 28(0X1c)
  • update 的undo记录,具体构造在trx_undo_page_report_modify中
    语句:
mysql> update t1 set id2=1000 where id1=14;
Query OK, 1 row affected (5 min 40.91 sec)
Rows matched: 1  Changed: 1  Warnings: 0

输出如下:

trx_undo_assign_undo:assign undo space:
RSEG SLOT:41,RSEG SPACE ID:1 PAGE NO:5
UNDO SLOT:1,UNDO SPACE ID:1 UNDO LOG HEADER PAGE NO:37,UNDO LOG HEADER OFFSET:1389,UNDO LOG LAST PAGE:37
trx_undo_page_report_modify:undo log record
TABLE_NAME:test/t1 TRX_ID:12604,UODO RECORD LEN:47
 len 47; hex 06560c0032000000003136e0260000002c052e048000000e010304800003e7000e00048000000e0304800003e70627; 

06560c0032000000003136e0260000002c052e048000000e010304800003e7000e00048000000e030480
就是undo record的记录
大体解析如下:

0656 :page内部本undo record结束的位置
0c:类型为 #define  TRX_UNDO_UPD_EXIST_REC  12(0X0c)
00: undo no,提交才会有
32: table_id 可以查询 INNODB_SYS_TABLES 对照
00:
0000003136e0:事物ID
260000002c052e:undo回滚指针
04:主键长度
8000000e:主键值
01
03:位置
04:被修改值的长度
800003e7:值为999(0x3e7)
000e:接下来字符的长度,记录原始值?
00:位置
04:长度
8000000e:主键值
03:位置
04:长度
800003e7:值为999(0x3e7)
0627:page内部本undo record开始的位置,0X0656-0X0627就是长度

作者微信:gp_22389860

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 157,298评论 4 360
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 66,701评论 1 290
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 107,078评论 0 237
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 43,687评论 0 202
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 52,018评论 3 286
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 40,410评论 1 211
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 31,729评论 2 310
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 30,412评论 0 194
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 34,124评论 1 239
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 30,379评论 2 242
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 31,903评论 1 257
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 28,268评论 2 251
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 32,894评论 3 233
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 26,014评论 0 8
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 26,770评论 0 192
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 35,435评论 2 269
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 35,312评论 2 260

推荐阅读更多精彩内容