[PHP Troubleshooting] Tracking down a php-fpm coredump caused by altering database columns

Phplayers 2019-06-27

Shunfengche (ride-sharing) operations R&D team, Huang Tao (黄桃)

Background

Production occasionally raises two kinds of alerts: php-fpm processes dumping core, and php-fpm processes exceeding their memory limit. The two alerts usually fire close together in time.

Root-cause analysis

Because the two alerts are close in time, they may share a cause. Log in to the machine and open the core file produced by the crash in gdb:

cd /***/coresave/
gdb /***/php7/sbin/php-fpm -c core.php-fpm.12121.1528322653
bt
#0  0x0000003f0f089770 in memcpy () from /lib64/libc.so.6
#1  0x00000000006403c5 in zend_string_init (stmt=0x7f505b478380, colno=<value optimized out>)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_string.h:159
#2  pdo_mysql_stmt_describe (stmt=0x7f505b478380, colno=<value optimized out>)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/ext/pdo_mysql/mysql_statement.c:705
#3  0x000000000063a795 in pdo_stmt_describe_columns (stmt=0x7f505b478380) at /**/**/offcial_code/php/7.0.6/php-7.0.6/ext/pdo/pdo_stmt.c:206
#4  0x000000000063add5 in zim_PDOStatement_execute (execute_data=<value optimized out>, return_value=0x7f505b4157e0)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/ext/pdo/pdo_stmt.c:523
#5  0x00007f5054056bf1 in hp_execute_internal (execute_data=0x7f505b415830, return_value=0x7f505b4157e0) at /**/**/xhprof/extension/xhprof.c:1775
#6  0x0000000000879702 in ZEND_DO_FCALL_SPEC_HANDLER (execute_data=0x7f505b415680)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:844
#7  0x0000000000841a40 in execute_ex (ex=<value optimized out>) at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:417
#8  0x00007f50540583d1 in hp_execute_ex (execute_data=0x7f505b415680) at /**/**/xhprof/extension/xhprof.c:1748
#9  0x000000000087957a in ZEND_DO_FCALL_SPEC_HANDLER (execute_data=0x7f505b4155a0)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:800
#10 0x0000000000841a40 in execute_ex (ex=<value optimized out>) at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:417
#11 0x00007f50540583d1 in hp_execute_ex (execute_data=0x7f505b4155a0) at /**/**/xhprof/extension/xhprof.c:1748
#12 0x000000000087957a in ZEND_DO_FCALL_SPEC_HANDLER (execute_data=0x7f505b4154a0)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:800
#13 0x0000000000841a40 in execute_ex (ex=<value optimized out>) at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:417
#14 0x00007f50540583d1 in hp_execute_ex (execute_data=0x7f505b4154a0) at /**/**/xhprof/extension/xhprof.c:1748
#15 0x000000000087957a in ZEND_DO_FCALL_SPEC_HANDLER (execute_data=0x7f505b4153b0)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:800
#16 0x0000000000841a40 in execute_ex (ex=<value optimized out>) at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:417
#17 0x00007f50540583d1 in hp_execute_ex (execute_data=0x7f505b4153b0) at /**/**/xhprof/extension/xhprof.c:1748
#18 0x00000000007f52a8 in zend_call_function (fci=0x7ffebd295f40, fci_cache=0x7ffebd295f90)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_execute_API.c:866
#19 0x0000000000709fef in zif_call_user_func_array (execute_data=0x7f505b415330, return_value=0x7f505b415300)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/ext/standard/basic_functions.c:4811
#20 0x00007f5054056bf1 in hp_execute_internal (execute_data=0x7f505b415330, return_value=0x7f505b415300) at /**/**/xhprof/extension/xhprof.c:1775
#21 0x0000000000879702 in ZEND_DO_FCALL_SPEC_HANDLER (execute_data=0x7f505b415290)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:844
#22 0x0000000000841a40 in execute_ex (ex=<value optimized out>) at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:417
#23 0x00007f50540583d1 in hp_execute_ex (execute_data=0x7f505b415290) at /**/**/xhprof/extension/xhprof.c:1748
#24 0x000000000087957a in ZEND_DO_FCALL_SPEC_HANDLER (execute_data=0x7f505b4150f0)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:800
#25 0x0000000000841a40 in execute_ex (ex=<value optimized out>) at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:417
#26 0x00007f50540583d1 in hp_execute_ex (execute_data=0x7f505b4150f0) at /**/**/xhprof/extension/xhprof.c:1748
#27 0x0000000000887037 in ZEND_INCLUDE_OR_EVAL_SPEC_TMPVAR_HANDLER (execute_data=0x7f505b415030)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:40848
#28 0x0000000000841a40 in execute_ex (ex=<value optimized out>) at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:417
#29 0x00007f50540583d1 in hp_execute_ex (execute_data=0x7f505b415030) at /**/**/xhprof/extension/xhprof.c:1748
#30 0x000000000089496b in zend_execute (op_array=0x7f505b46a000, return_value=<value optimized out>)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend_vm_execute.h:458
#31 0x0000000000802233 in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /**/**/offcial_code/php/7.0.6/php-7.0.6/Zend/zend.c:1427
#32 0x00000000007a4b40 in php_execute_script (primary_file=0x7ffebd298990) at /**/**/offcial_code/php/7.0.6/php-7.0.6/main/main.c:2494
#33 0x00000000008a27fe in main (argc=<value optimized out>, argv=<value optimized out>)
    at /**/**/offcial_code/php/7.0.6/php-7.0.6/sapi/fpm/fpm/fpm_main.c:1968

The backtrace shows that the coredump occurs in memcpy, while the PDO extension is copying data it read back from MySQL.

Use the gdb helper script shipped with PHP, .gdbinit, to view the corresponding PHP-level backtrace (zbacktrace):

source /tmp/php-src-PHP-7.0.6/.gdbinit
zbacktrace
[0x7f505b415830] PDOStatement->execute(array(1)[0x7f505b415890]) [internal function]
[0x7f505b415680] ***\***\Mysql->run("SELECT\40*\40FROM\40table_name\40WHERE\40\40user_id\40=\40?", array(1)[0x7f505b4156f0], "select") /**/**/**/**/helper/mysql.php:270
[0x7f505b4155a0] ***\***\Mysql->select("table_name", "\40user_id\40=\40?\40", array(1)[0x7f505b415620]) /**/**/**/**/helper/mysql.php:364
[0x7f505b4154a0] ***\***\Model\User\FaceAuth->getInfoByUidForApi("144796772") /**/**/**/**/model/user/faceauth.php:200
[0x7f505b4153b0] ***\***\Controller\User\FaceAuthInfo->index(array(2)[0x7f505b415410]) /**/**/***/***/controller/user/faceauthinfo.php:50
[0x7f505b415330] call_user_func_array(array(2)[0x7f505b415390], array(1)[0x7f505b4153a0]) [internal function]
[0x7f505b415290] ***\Framework\Base\Router->run(array(2)[0x7f505b4152f0]) /**/**/**/**/base/router.php:96
[0x7f505b4150f0] (main) /**/**/**/**/framework.php:139
[0x7f505b415030] (main) /**/**/***/***/index.php:46

The failure occurs while executing "SELECT * FROM table_name WHERE  user_id = ?" (gdb prints each space in the string as the octal escape \40).
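The zbacktrace output is easier to read once the octal escapes are turned back into characters (octal 40 is ASCII 32, a space). The following is a minimal sketch of such a decoder; the function name decode_octal_escapes is made up for illustration and is not part of gdb or PHP.

```c
#include <stddef.h>

/* Hypothetical helper: copy `in` to `out`, replacing each backslash followed
 * by one to three octal digits with the character it encodes.
 * Writes at most out_size - 1 characters plus a terminating NUL and
 * returns the number of characters written. */
size_t decode_octal_escapes(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0) {
        return 0;
    }
    while (*in != '\0' && n + 1 < out_size) {
        if (in[0] == '\\' && in[1] >= '0' && in[1] <= '7') {
            unsigned value = 0;
            int digits = 0;

            in++; /* skip the backslash */
            while (digits < 3 && *in >= '0' && *in <= '7') {
                value = value * 8 + (unsigned)(*in - '0');
                in++;
                digits++;
            }
            out[n++] = (char)value;
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return n;
}
```

Passing the string printed by zbacktrace through this function yields the SQL text with ordinary spaces.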

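There are two possibilities: either something is wrong with the bound parameter, or something is wrong with the data the statement returned.

Check the parameter first.

What was passed in at the time? Print the value at address 0x7f505b415620: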

(gdb) print ((zval *)0x7f505b415620)
$1 = (zval *) 0x7f505b415620
(gdb) p $1
$2 = (zval *) 0x7f505b415620
(gdb) p *$1
$3 = {value = {lval = 139983105408624, dval = 6.9160843380575109e-310, counted = 0x7f505b45be70, str = 0x7f505b45be70, arr = 0x7f505b45be70,
    obj = 0x7f505b45be70, res = 0x7f505b45be70, ref = 0x7f505b45be70, ast = 0x7f505b45be70, zv = 0x7f505b45be70, ptr = 0x7f505b45be70, ce = 0x7f505b45be70,
    func = 0x7f505b45be70, ww = {w1 = 1531297392, w2 = 32592}}, u1 = {v = {type = 7 '\a', type_flags = 28 '\034', const_flags = 0 '\000',
      reserved = 0 '\000'}, type_info = 7175}, u2 = {var_flags = 0, next = 0, cache_slot = 0, lineno = 0, num_args = 0, fe_pos = 0, fe_iter_idx = 0}}
 // the argument is an array (type = 7); next, look at what the array contains
(gdb) p *$1.value.arr.arData.key
Cannot access memory at address 0x0 // the key is NULL
p *$1.value.arr.arData.val.value.str.val@30
$10 = "144796772\000\000\000\377\377\377\377\001\000\000\000\006\000\000\000\bH\024\252+0"
// the bound parameter is user_id = 144796772, so the input is fine
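The gdb output above shows type_info = 7175 alongside type = 7 and type_flags = 28. The sketch below shows how those two bytes are packed into type_info, assuming the PHP 7 zval layout in which the low byte is the type tag and the next byte holds the flags. The function names are made up for illustration; only the type numbering (6 = IS_STRING, 7 = IS_ARRAY, and so on) comes from PHP 7.

```c
#include <stdint.h>

/* Low byte of zval.u1.type_info: the type tag. */
unsigned zval_type_from_info(uint32_t type_info)
{
    return type_info & 0xffu;
}

/* Second byte of zval.u1.type_info: the type flags. */
unsigned zval_flags_from_info(uint32_t type_info)
{
    return (type_info >> 8) & 0xffu;
}

/* Names of the basic PHP 7 type tags; anything else is reported as "other". */
const char *zval_type_name(unsigned type)
{
    switch (type) {
    case 0: return "IS_UNDEF";
    case 1: return "IS_NULL";
    case 2: return "IS_FALSE";
    case 3: return "IS_TRUE";
    case 4: return "IS_LONG";
    case 5: return "IS_DOUBLE";
    case 6: return "IS_STRING";
    case 7: return "IS_ARRAY";
    case 8: return "IS_OBJECT";
    default: return "other";
    }
}
```

For the value printed above, 7175 is 0x1C07: the low byte 0x07 is the array type and the next byte 0x1C is 28, matching the type and type_flags fields gdb displays.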

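So what about the data the statement returned?

The core file shows that after the data came back, pdo_mysql_stmt_describe was called, and the coredump happened right after that. Inside this function the variable stmt holds the result of executing the SQL statement.
Next, look at what stmt contains, starting from the frame pdo_mysql_stmt_describe (stmt=0x7f505b478380, colno=<value optimized out>). Note: to print the value at an address you need to know the variable's type, and the corresponding data structures have to be looked up in the source, otherwise you cannot dig any further. Here stmt is of type pdo_stmt_t.
Keep the following gdb variables in mind: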

(gdb) print ((pdo_stmt_t *)0x7f505b478380)
$24 = (pdo_stmt_t *) 0x7f505b478380 // stmt is now $24
(gdb) p (pdo_mysql_stmt *)$24.driver_data
$31 = (pdo_mysql_stmt *) 0x7f505b45a500
(gdb) p (struct pdo_column_data *) $24.columns
$41 = (struct pdo_column_data *) 0x7f505b4b5300
 
The PHP source of pdo_mysql_stmt_describe is as follows:
static int pdo_mysql_stmt_describe(pdo_stmt_t *stmt, int colno) /* {{{ */
{
    pdo_mysql_stmt *S = (pdo_mysql_stmt*)stmt->driver_data; /* gdb variable $31 */
    struct pdo_column_data *cols = stmt->columns; /* gdb variable $41 */
    int i;
 
    PDO_DBG_ENTER("pdo_mysql_stmt_describe");
    PDO_DBG_INF_FMT("stmt=%p", S->stmt);
    if (!S->result) {
        PDO_DBG_RETURN(0);
    }
 
    if (colno >= stmt->column_count) {
        /* error invalid column */
        PDO_DBG_RETURN(0);
    }
 
    /* fetch all on demand, this seems easiest
    ** if we've been here before bail out
    */
    if (cols[0].name) {
        PDO_DBG_RETURN(1);
    }
    for (i = 0; i < stmt->column_count; i++) {
    /* stmt->column_count is 17 here, the total number of columns in table_name */
 
        if (S->H->fetch_table_names) {
            cols[i].name = strpprintf(0, "%s.%s", S->fields[i].table, S->fields[i].name);
        } else {
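            cols[i].name = zend_string_init(S->fields[i].name, S->fields[i].name_length, 0);
            /* Hypothesis 1: the coredump is in the memcpy() inside zend_string_init; is it an out-of-bounds read of protected memory? */
            /* Hypothesis 2: is the memory-limit overrun also caused here, by an oversized S->fields[i].name_length passed to zend_string_init? */
        }
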
        cols[i].precision = S->fields[i].decimals;
        cols[i].maxlen = S->fields[i].length;
 
#ifdef PDO_USE_MYSQLND
        if (S->stmt) {
            cols[i].param_type = PDO_PARAM_ZVAL;
        } else
#endif
        {
            cols[i].param_type = PDO_PARAM_STR;
        }
    }
    PDO_DBG_RETURN(1);
}

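Two hypotheses: (1) the coredump happens in the memcpy() inside zend_string_init, so is it an out-of-bounds access to protected memory? (2) Does the memory-limit overrun also originate here, because the S->fields[i].name_length passed to zend_string_init is far too large? Keep printing values to check.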

Verifying the cause

Print the data that was returned at the time to test the hypotheses above:

// first look at the value of S->fields[i]
(gdb) p $31.fields[0]
$79 = {sname = 0x7f505b45fdc0, name = 0x7f505b45fdd8 "id", org_name = 0x7f505b403a17 "id", table = 0x7f505b403a03 "table_name",
  org_table = 0x7f505b403a0d "table_name", db = 0x7f505b4039f4 "db_name", catalog = 0x7f505b4039f0 "def", def = 0x0, length = 20, max_length = 0,
  name_length = 2, org_name_length = 2, table_length = 9, org_table_length = 9, db_length = 14, catalog_length = 3, def_length = 0, flags = 49699,
  decimals = 0, charsetnr = 63, type = MYSQL_TYPE_LONGLONG, root = 0x7f505b4039f0 "def", root_len = 45}
// name = "id", table = "table_name": this is clearly the metadata of one column of the table. pdo_mysql_stmt_describe iterates over S->fields and copies the values into cols.
// keep printing:
(gdb) p $31.fields[0].name_length
$64 = 2
(gdb) p $31.fields[1].name_length
$65 = 7
(gdb) p $31.fields[2].name_length
$66 = 10
(gdb) p $31.fields[3].name_length
$67 = 13
(gdb) p $31.fields[4].name_length
$68 = 16
(gdb) p $31.fields[13].name_length
$69 = 5
(gdb) p $31.fields[14].name_length
$70 = 8
(gdb) p $31.fields[15].name_length
$71 = 0
(gdb) p $31.fields[16].name_length
$72 = 5863687 // here is the culprit: $31.fields[16].name_length holds a huge value
(gdb) p $31.fields[16] // print the whole element: the access is out of bounds
$109 = {sname = 0x7f505b464028, name = 0xffffffff00001406 <Address 0xffffffff00001406 out of bounds>, // out of bounds
  org_name = 0x800000000b88b287 <Address 0x800000000b88b287 out of bounds>, table = 0x7f505b45d0c0 "Asia/Shanghai",
  org_table = 0x7f505b45d120 "\020\321E[P\177", db = 0xffffffff00001406 <Address 0xffffffff00001406 out of bounds>,
  catalog = 0x80000652f67b748b <Address 0x80000652f67b748b out of bounds>, def = 0x7f505b45d140 "0j\003IP\177", length = 139983105413536,
  max_length = 8589939718, name_length = 5863687, org_name_length = 2147483648, table_length = 1531302336, org_table_length = 32592, db_length = 1531302432,
  catalog_length = 32592, def_length = 5126, flags = 3, decimals = 1929407563, charsetnr = 2147537079, type = 1531302464,
  root = 0x7f505b464078 "\300MF[P\177", root_len = 18446744069414589446}

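The actual size of the stmt->driver_data->fields array does not match the size recorded in stmt->column_count, so the loop over the array reads past its end. If the memory it touches is protected by the system, the process dumps core. Otherwise it picks up a very large length and initializes an enormous string, which pushes memory usage past the PHP limit.

Why does the stmt->driver_data->fields array disagree with stmt->column_count? Trace back to where stmt is initialized and see whether something goes wrong there.

In the PHP source, pdo_mysql_stmt_execute calls pdo_mysql_stmt_execute_prepared_mysqlnd, which sets both column_count and fields: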

static int pdo_mysql_stmt_execute_prepared_mysqlnd(pdo_stmt_t *stmt) /* {{{ */
{
    pdo_mysql_stmt *S = stmt->driver_data;
    pdo_mysql_db_handle *H = S->H;
    int i;

    PDO_DBG_ENTER("pdo_mysql_stmt_execute_prepared_mysqlnd");

    if (mysql_stmt_execute(S->stmt)) {//mysqlnd/mysqlnd_ps.c 619
        pdo_mysql_error_stmt(stmt);
        PDO_DBG_RETURN(0);
    }

    if (S->result) {
        /* TODO: add a test to check if we really have zvals here... */
        mysql_free_result(S->result);
        S->result = NULL;
    }

    /* for SHOW/DESCRIBE and others the column/field count is not available before execute */
    stmt->column_count = mysql_stmt_field_count(S->stmt);
    for (i = 0; i < stmt->column_count; i++) {
        mysqlnd_stmt_bind_one_result(S->stmt, i);
    }

    S->result = mysqlnd_stmt_result_metadata(S->stmt);
    if (S->result) {
        S->fields = mysql_fetch_fields(S->result);
        /* if buffered, pre-fetch all the data */
        if (H->buffered) {
            if (mysql_stmt_store_result(S->stmt)) {
                PDO_DBG_RETURN(0);
            }
        }
    }

    pdo_mysql_stmt_set_row_count(stmt);
    PDO_DBG_RETURN(1);
}
The two assignments:
stmt->column_count = mysql_stmt_field_count(S->stmt); // returns 17
S->fields = mysql_fetch_fields(S->result); // but the array actually holds only 15 entries

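The root of the problem is that the macro mysql_stmt_field_count (which supplies the array size) and the macro mysql_fetch_fields (which supplies the array contents) return values that do not correspond.

The two macros are defined as follows: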

mysql_stmt_field_count:
#define mysql_stmt_field_count(s)       mysqlnd_stmt_field_count((s))
#define mysqlnd_stmt_field_count(stmt)      (stmt)->m->get_field_count((stmt)) // function pointer, resolves to php_mysqlnd_stmt_field_count_pub
mysql_fetch_fields:
#define mysql_fetch_fields(r) mysqlnd_fetch_fields((r))
#define mysqlnd_fetch_fields(result) (result)->m.fetch_fields((result)) // function pointer, resolves to php_mysqlnd_res_fetch_fields_pub

Now look at how the two functions are implemented:

stmt->column_count: php_mysqlnd_stmt_field_count_pub ultimately returns stmt->driver_data.stmt.data.field_count.

The stmt->driver_data->fields array: php_mysqlnd_res_fetch_fields_pub calls php_mysqlnd_res_meta_fetch_field_pub, which ultimately returns stmt->driver_data.result.meta.fields. The length of this array can differ from the value held in stmt->driver_data.stmt.data.field_count.

Why do the two differ?

Hypothesis: the length of stmt->driver_data.result.meta.fields and the value of stmt->driver_data.stmt.data.field_count are read in two separate exchanges with MySQL.

First look at the core of the select function:

$t1 = microtime(true);
$pdoStmt = $this->pdo->prepare($sql); // ends up in php_mysqlnd_stmt_prepare_pub
$prepareSuccess = true;
$return = true;
$result = $pdoStmt->execute($bind); // ends up in php_mysqlnd_stmt_execute_pub
In gdb, set a breakpoint on each of the two functions:
b php_mysqlnd_stmt_prepare_pub
b php_mysqlnd_stmt_execute_pub

Inside php_mysqlnd_stmt_execute_pub, stmt->driver_data.stmt.data.field_count is already initialized (it was set in php_mysqlnd_stmt_prepare_pub), whereas stmt->driver_data.result.meta.fields is only assigned later, in php_mysqlnd_res_read_result_metadata_pub. This is what allows the two to disagree.

The backtrace at php_mysqlnd_res_read_result_metadata_pub is:

#0  php_mysqlnd_res_read_result_metadata_pub (result=0x7ffff61e8040, conn=0x7ffff61b0400)
    at /**/offcial_code/php/7.0.6/php-7.0.6/ext/mysqlnd/mysqlnd_result.c:357
#1  0x0000000000798971 in mysqlnd_query_read_result_set_header (conn=0x7ffff61b0400, s=<value optimized out>)
    at /**/offcial_code/php/7.0.6/php-7.0.6/ext/mysqlnd/mysqlnd_result.c:533
#2  0x000000000079eb71 in mysqlnd_stmt_execute_parse_response (s=0x7ffff61f0000, type=<value optimized out>)
    at /**/offcial_code/php/7.0.6/php-7.0.6/ext/mysqlnd/mysqlnd_ps.c:504
#3  0x000000000079bc6c in php_mysqlnd_stmt_execute_pub (s=0x7ffff61f0000) at /**/offcial_code/php/7.0.6/php-7.0.6/ext/mysqlnd/mysqlnd_ps.c:614

Check whether the stmt used here is the same variable that php_mysqlnd_stmt_prepare_pub used, by printing both addresses and comparing them:

stmt address in php_mysqlnd_stmt_prepare_pub: $38 = (MYSQLND_STMT_DATA *) 0x7fffed40a700
S->stmt.data address in php_mysqlnd_stmt_execute_pub: $46 = (MYSQLND_STMT_DATA *) 0x7fffed40a700

The addresses match. Continuing through the code, the overall call flow of php_mysqlnd_stmt_execute_pub is as follows:

Step 1:
if(!stmt){}else{ // stmt holds the data returned by the first exchange with MySQL (php_mysqlnd_stmt_prepare_pub)
    result = stmt.result // result of the first call; it carries the table's column count and so on (mysqlnd_result.c:525)
}
Step 2:
// Initialize result->meta with result->field_count, i.e. the column count returned by the first call to MySQL. This is the crux: using rset_header->field_count here would avoid the problem.
result->meta = result->m.result_meta_init(result->field_count, result->persistent); // mysqlnd_result.c:371
Step 3:
// rset_header is initialized at mysqlnd_result.c:413
conn->field_count = rset_header->field_count; // rset_header is the response read from MySQL the second time (inside php_mysqlnd_stmt_execute_pub); mysqlnd_result.c:499
Step 4:
stmt->field_count = stmt->result->field_count = conn->field_count; // the column count from the second read is stored in stmt->field_count; mysqlnd_ps.c:541
Step 5:
stmt->column_count = mysql_stmt_field_count(S->stmt); // returns stmt ? stmt->field_count : 0, so stmt->driver_data.stmt.field_count (the stmt->field_count of step 4) is assigned to stmt->column_count; mysql_statement.c:293
Step 6:
S->fields = mysql_fetch_fields(S->result); // stmt->driver_data.result.meta.fields is assigned to stmt->driver_data.fields; mysql_statement.c:300
Step 7:
pdo_mysql_stmt_describe runs the loop below. If the two reads from the database returned different field_count values, this can cause a coredump or push memory past the limit.
for (i = 0; i < stmt->column_count; i++) {
    cols[i].name = zend_string_init(S->fields[i].name, S->fields[i].name_length, 0);
    cols[i].precision = S->fields[i].decimals;
    cols[i].maxlen = S->fields[i].length;
}

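The suspected cause: the table was being altered on a setup with one primary and several replicas, going from 15 columns to 17. The first call, php_mysqlnd_stmt_prepare_pub (the prepare), read the table metadata from replica A, where the schema change had not yet been applied, so the column count was 15 and 15 was stored in stmt.driver_data.stmt.data.result.field_count. The second call, php_mysqlnd_stmt_execute_pub, read from replica B, where the table already had 17 columns, so the column count it obtained was 17. The field array, however, had been initialized with the first count, 15 (step 2 above), while the later loop over the array used the second count, 17. The loop therefore read past the end of the array, which produced the alerts described at the start.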
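The seven steps can be condensed into a toy model that tracks only the two counts. This is a sketch under the assumptions of the analysis above, not mysqlnd code: the struct and function names are hypothetical, and the real state is spread across several mysqlnd and pdo_mysql structures.

```c
/* Hypothetical, simplified stand-in for the statement state. */
struct stmt_model {
    unsigned meta_field_count; /* entries actually allocated in result->meta->fields */
    unsigned column_count;     /* loop bound later used by pdo_mysql_stmt_describe */
};

/* prepare_cols: column count seen at prepare time (15 in the incident).
 * execute_cols: column count in the execute response header (17 in the incident). */
struct stmt_model model_prepare_then_execute(unsigned prepare_cols, unsigned execute_cols)
{
    struct stmt_model m;
    unsigned result_field_count = prepare_cols;   /* step 1: result comes from the prepare */
    unsigned conn_field_count;
    unsigned stmt_field_count;

    m.meta_field_count = result_field_count;      /* step 2: meta sized with the prepare count */
    conn_field_count = execute_cols;              /* step 3: header of the execute response */
    stmt_field_count = conn_field_count;          /* step 4 */
    m.column_count = stmt_field_count;            /* step 5 */
    return m;
}

/* Number of loop iterations in step 7 that index past the end of the array. */
unsigned out_of_bounds_reads(struct stmt_model m)
{
    return m.column_count > m.meta_field_count
        ? m.column_count - m.meta_field_count
        : 0;
}
```

With 15 columns at prepare time and 17 at execute time the model reports two out-of-bounds iterations, which corresponds to fields[15] and fields[16] in the gdb session; when both counts agree it reports none.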

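Recommendations

1. When iterating over the field array (stmt->driver_data->fields), the code should not use stmt->column_count as the loop bound. stmt->driver_data.result.meta.field_count is more appropriate: it lives in the same struct as stmt->driver_data.result.meta.fields (the array contents), is initialized at the same time, and is not affected by other logic.

2. When initializing the field array (result->m.result_meta_init(result->field_count, result->persistent); mysqlnd_result.c:371), the code should not use the column count returned by the first exchange with MySQL but the one obtained in the second, i.e. rset_header->field_count.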
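Recommendation 1 can be illustrated with a small bounded loop. The types and the function describe_bounded below are hypothetical simplifications, not the pdo_mysql structures. The sketch also goes one step further than the recommendation by taking the smaller of the two counts, which is an assumption on my part about what a defensive version might do.

```c
#include <stddef.h>

/* Hypothetical, reduced versions of a field entry and its metadata block. */
struct field_model {
    const char *name;
    size_t name_length;
};

struct meta_model {
    struct field_model *fields; /* array contents */
    unsigned field_count;       /* count stored next to the array */
};

/* Walk the field array without trusting column_count alone.
 * Returns the number of entries visited and stores the sum of their
 * name lengths in *total_name_bytes. */
unsigned describe_bounded(const struct meta_model *meta, unsigned column_count,
                          size_t *total_name_bytes)
{
    unsigned limit = meta->field_count; /* the count that was set together with the array */
    size_t total = 0;
    unsigned i;

    if (column_count < limit) {
        limit = column_count;
    }
    for (i = 0; i < limit; i++) {
        total += meta->fields[i].name_length;
    }
    *total_name_bytes = total;
    return limit;
}
```

Called with a two-entry array and a column_count of 4, the loop stops after two entries instead of reading two elements past the end.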

gdb breakpoints used:

b php_mysqlnd_stmt_prepare_pub
b pdo_stmt.c:511
b mysqlnd_ps.c:540
b php_mysqlnd_stmt_execute_pub
b mysqlnd_stmt_execute_parse_response
b mysqlnd_result_meta_init

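This investigation was not easy. PDO involves many intertwined data structures, and the names in the code are nearly identical from one layer to the next, which makes them hard to keep apart. Candidate causes had to be ruled out and re-traced again and again. To follow the analysis yourself, set a breakpoint on each code line listed in the steps under "the overall call flow of php_mysqlnd_stmt_execute_pub" above and step through them in gdb.

PHP version: PHP 7.0.6

A rough summary of the core data structures PDO relies on, compiled during this investigation: pdo结构体梳理 (PDO struct overview)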
