A Summary of Oracle Parallel Processes

天涯客Blog 2012-08-28

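How it works:

A task is split into several smaller tasks that are processed at the same time. The server process that issued the SQL becomes the query coordinator (QC) process, which coordinates and schedules the slave processes and merges their result sets before returning them to the client.

A parallel operation uses one of two kinds of granule: partition granules and block range granules. Block range granules are defined dynamically while the SQL runs and generally spread the work more evenly across the slave processes. This matters because the speed of a parallel operation is determined by its slowest slave process.

When a single SQL statement performs several operations, for example a scan and a sort, several sets of slave processes are allocated.

Parallelizing a single operation is called intra-operation parallelism, while the interaction between different sets of slave processes is inter-operation parallelism; the latter requires communication between the sets.

The process that sends data is the producer and the process that receives it is the consumer. A producer sends data to a consumer through a table queue in the SGA, and each producer-consumer pair has its own table queue.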

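Producers and consumers can exchange data in the following ways:

Broadcast -- each producer sends its data to all consumers

Round-robin -- each producer sends rows to the consumers in turn

Range -- each producer sends rows within a given range to a specific consumer

Hash -- each producer uses a hash function to decide which consumer receives a row

QC Random -- each producer sends its rows to the coordinator in no particular order; ordering does not matter

QC Order -- each producer sends its rows to the coordinator in order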
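The producer/consumer traffic described above can be inspected after the fact. The following is a minimal sketch, assuming a parallel statement has just finished in the same session (V$PQ_TQSTAT is only populated for the session that ran the parallel SQL):

```sql
-- One row per process per table queue: who produced, who consumed, and how much.
-- An uneven NUM_ROWS across consumers of the same TQ_ID suggests skewed distribution.
SELECT dfo_number,
       tq_id,
       server_type,   -- Producer, Consumer (or Ranger for range distribution)
       process,       -- slave name such as P000, or QC for the coordinator
       num_rows,
       bytes
FROM   v$pq_tqstat
ORDER  BY dfo_number, tq_id, server_type DESC, process;
```

Since a parallel operation is only as fast as its slowest slave, comparing NUM_ROWS between the consumers of one table queue is a quick way to spot an unbalanced distribution.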

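The execution plan of a parallel operation can show the following operations:

P -> S --- parallel to serial; for example, the last step of every plan, which sends data to the coordinator process, works this way

P -> P --- one parallel operation sends data to another parallel operation; used when there are several sets of parallel processes

S -> P --- serial to parallel; this is relatively inefficient and should be avoided where possible

PCWP --- parallel combined with parent; the processes involved belong to the same set, so there is no communication between sets

PCWC --- parallel combined with child; again within a single set, with no communication between sets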
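These markers appear in the IN-OUT column of the plan output. A minimal sketch of how to display them, assuming a table named t exists in the current schema (the table name and the degree of 4 are illustrative):

```sql
-- Explain a statement with a parallel hint, then print the plan.
-- The TQ, IN-OUT and PQ Distrib columns only appear for parallel plans.
EXPLAIN PLAN FOR
SELECT /*+ PARALLEL(t 4) */ COUNT(*)
FROM   t;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```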

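Parameter configuration

The number of parallel processes an instance can use is limited. The instance maintains a pool of slave processes, similar to a connection pool; the coordinator requests slave processes from the pool and returns them once execution has finished.

parallel_min_servers: the number of slave processes created when the instance starts; the default is 0. This value normally needs changing only when SQL statements spend too much time creating slave processes; the related wait event is os thread startup.

parallel_max_servers: the maximum number of slave processes available.

parallel_execution_message_size: the table queues mentioned earlier live in the large pool (which holds non-reusable data structures). Each table queue consists of 3-4 buffers, and this parameter sets the size of each buffer.

parallel_automatic_tuning: deprecated since 10g; when set to true, Oracle uses the large pool for table queues.

parallel_min_percent: the default is 0 and the valid range is 0-100. With 0, Oracle provides as many of the requested slave processes as it can, and runs the statement serially if fewer than 2 can be allocated. With a non-zero value, the statement must receive at least that percentage of the slave processes it requested, otherwise ORA-12827 is raised. For example, with parallel_min_percent=25 and a statement requesting 16 slaves, at least 16*25/100=4 must be available, otherwise ORA-12827 is raised.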
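The arithmetic in the parallel_min_percent example can be checked directly, and the parameter can be set for a single session. This is only a sketch; the values 16 and 25 come from the example above:

```sql
-- Minimum number of slaves that must be granted for a request of 16 at 25 percent.
SELECT 16 * 25 / 100 AS min_slaves FROM dual;   -- 4

-- Apply the threshold to the current session only.
-- A parallel statement that cannot get at least 25% of its requested slaves
-- then fails with ORA-12827 instead of being silently downgraded.
ALTER SESSION SET parallel_min_percent = 25;
```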

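parallel_adaptive_multi_user: defaults to true in 10g. When false, a statement is given as many slave processes as it requests as long as any remain; when true, Oracle lowers the statement's degree of parallelism according to the current load so that the slave processes are not exhausted.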
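To see how these parameters are currently set on an instance, a query along the following lines can be used. The value 8 in the ALTER SYSTEM statement is purely illustrative, not a recommendation:

```sql
-- Current values of the parallel execution parameters.
SELECT name, value, isdefault
FROM   v$parameter
WHERE  name LIKE 'parallel%'
ORDER  BY name;

-- Pre-start some slaves if sessions are seen waiting on "os thread startup".
-- 8 is a hypothetical value; size it to the workload.
ALTER SYSTEM SET parallel_min_servers = 8;
```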

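How to use parallel

There are three ways:

1 Set a degree of parallelism on the table/index

2 Use the parallel hint

3 Enable parallel query/DML/DDL at the session level; query is enabled by default

Of the three, the hint has the highest priority, and force parallel in turn overrides the degree of parallelism defined on the table or index. To disable parallel operations completely for the instance, set parallel_max_servers to 0.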
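The three approaches look roughly like this. It is a sketch only: the table name t and the degree of 4 are assumptions chosen to match the case study later in this post:

```sql
-- 1. Object level: store a default degree of parallelism on the table.
ALTER TABLE t PARALLEL 4;

-- 2. Statement level: request parallelism with a hint.
SELECT /*+ PARALLEL(t 4) */ COUNT(*) FROM t;

-- 3. Session level: parallel query is enabled by default, parallel DML is not.
ALTER SESSION ENABLE PARALLEL DML;

-- Force a degree for every query in the session, overriding the table setting.
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 4;

-- Turn parallel query off again for this session.
ALTER SESSION DISABLE PARALLEL QUERY;
```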

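When to use it

Parallel operations deliver their best results only when both of the following conditions hold:

1 The system has plenty of idle resources (CPU, I/O and memory)

2 The SQL takes too long to run serially, for example more than 10 seconds, because initializing a parallel operation (creating slave processes and table queues) itself consumes resources.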

Commonly used views

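V$PQ_SESSTAT-- lists session-level parallel execution statistics

V$PQ_SYSSTAT-- lists system-level parallel execution statistics

V$PQ_SLAVE-- lists statistics for each active parallel process

V$PQ_TQSTAT-- provides a detailed report of message traffic at the table queue level. Data is valid only when queried from a session that is executing parallel SQL statements; it records the parallel execution information of the current session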

V$PX_BUFFER_ADVICE-- provides statistics on historical and projected maximum buffer usage by all parallel queries. You can consult this view to reconfigure SGA size in response to insufficient memory problems for parallel queries

V$PX_PROCESS-- contains information about the parallel processes, including status, session ID, process ID, and other information

V$PX_PROCESS_SYSSTAT-- shows the status of query servers and provides buffer allocation statistics

V$PX_SESSION-- shows data about query server sessions, groups, sets, and server numbers. It also displays real-time data about the processes working on behalf of parallel execution. This table includes information about the requested degree of parallelism (DOP) and the actual DOP granted to the operation.

V$PX_SESSTAT-- provides a join of the session information from V$PX_SESSION and the V$SESSTAT table.
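As a quick illustration of two of these views, the sketch below lists the slave processes in the pool and the pool-level counters. It assumes the querying user has SELECT access to the V$ views:

```sql
-- Each slave in the pool and whether it is currently in use or available.
SELECT server_name, status, pid, spid, sid, serial#
FROM   v$px_process
ORDER  BY server_name;

-- Pool-level counters such as servers in use, available, started and highwater.
SELECT statistic, value
FROM   v$px_process_sysstat
WHERE  statistic LIKE 'Servers%';
```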

To see which coordinator each parallel process belongs to, either of the following two SQL queries can be used:

col username for a12

col "QC SID" for A6

col "SID" for A6

col "QC/Slave" for A8

col "Req. DOP" for 9999

col "Actual DOP" for 9999

col "Slaveset" for A8

col "Slave INST" for A9

col "QC INST" for A6

set pages 300 lines 300

col wait_event format a30

select

decode(px.qcinst_id,NULL,username,' - '||lower(substr(pp.SERVER_NAME,

length(pp.SERVER_NAME)-4,4) ) )"Username",

decode(px.qcinst_id,NULL, 'QC', '(Slave)') "QC/Slave" ,

to_char( px.server_set) "SlaveSet",

to_char(s.sid) "SID",

to_char(px.inst_id) "Slave INST",

decode(sw.state,'WAITING', 'WAIT', 'NOT WAIT' ) as STATE,

case  sw.state WHEN 'WAITING' THEN substr(sw.event,1,30) ELSE NULL end as wait_event ,

decode(px.qcinst_id, NULL ,to_char(s.sid) ,px.qcsid) "QC SID",

to_char(px.qcinst_id) "QC INST",

px.req_degree "Req. DOP",

px.degree "Actual DOP"

from gv$px_session px,

gv$session s ,

gv$px_process pp,

gv$session_wait sw

where px.sid=s.sid (+)

and px.serial#=s.serial#(+)

and px.inst_id = s.inst_id(+)

and px.sid = pp.sid (+)

and px.serial#=pp.serial#(+)

and sw.sid = s.sid

and sw.inst_id = s.inst_id

order by

  decode(px.QCINST_ID,  NULL, px.INST_ID,  px.QCINST_ID),

  px.QCSID,

  decode(px.SERVER_GROUP, NULL, 0, px.SERVER_GROUP),

  px.SERVER_SET,

  px.INST_ID;

 

SELECT px.SID "SID", p.PID, p.SPID "SPID", px.INST_ID "Inst",

       px.SERVER_GROUP "Group", px.SERVER_SET "Set",

       px.DEGREE "Degree", px.REQ_DEGREE "Req Degree", w.event "Wait Event"

FROM GV$SESSION s, GV$PX_SESSION px, GV$PROCESS p, GV$SESSION_WAIT w

WHERE s.sid (+) = px.sid AND s.inst_id (+) = px.inst_id AND

      s.sid = w.sid (+) AND s.inst_id = w.inst_id (+) AND

      s.paddr = p.addr (+) AND s.inst_id = p.inst_id (+)

ORDER BY DECODE(px.QCINST_ID,  NULL, px.INST_ID,  px.QCINST_ID), px.QCSID,

DECODE(px.SERVER_GROUP, NULL, 0, px.SERVER_GROUP), px.SERVER_SET, px.INST_ID;

 

Viewing physical read statistics for the parallel processes

SELECT QCSID, SID, INST_ID "Inst", SERVER_GROUP "Group", SERVER_SET "Set",

  NAME "Stat Name", VALUE

FROM GV$PX_SESSTAT A, V$STATNAME B

WHERE A.STATISTIC# = B.STATISTIC# AND NAME LIKE 'physical reads'

  AND VALUE > 0 ORDER BY QCSID, QCINST_ID, SERVER_GROUP, SERVER_SET;

QCSID  SID   Inst   Group  Set    Stat Name          VALUE     

------ ----- ------ ------ ------ ------------------ ----------

     9     9      1               physical reads           3863

     9     7      1      1      1 physical reads              2

     9    21      1      1      1 physical reads              2

     9    18      1      1      2 physical reads              2

     9    20      1      1      2 physical reads              2

 

Viewing system-wide statistics related to parallel execution

SELECT NAME, VALUE FROM GV$SYSSTAT

WHERE UPPER (NAME) LIKE '%PARALLEL OPERATIONS%'

OR UPPER (NAME) LIKE '%PARALLELIZED%' OR UPPER (NAME) LIKE '%PX%';

NAME                                               VALUE     

-------------------------------------------------- ----------

queries parallelized                                      347

DML statements parallelized                                 0

DDL statements parallelized                                 0

DFO trees parallelized                                    463

Parallel operations not downgraded                         28

Parallel operations downgraded to serial                   31

Parallel operations downgraded 75 to 99 pct               252

Parallel operations downgraded 50 to 75 pct               128

Parallel operations downgraded 25 to 50 pct                43

Parallel operations downgraded 1 to 25 pct                 12

PX local messages sent                                  74548

PX local messages recv'd                                74128

PX remote messages sent                                     0

PX remote messages recv'd                                   0

 

Viewing the memory used by PX (this instance has no large pool allocated)

SQL> select * from v$sgastat where name like 'PX%';

 

POOL         NAME                            BYTES

------------ -------------------------- ----------

shared pool  PX subheap                   11651336

shared pool  PX msg pool                 103310848

shared pool  PX QC deq stats                  1480

shared pool  PX QC msg stats                  2288

shared pool  PX subheap desc                   256

shared pool  PX msg pool struct               1088

shared pool  PX server deq stats              1480

shared pool  PX server msg stats              2288

 

 

Case study

create table t as select owner, object_name name from dba_objects where owner in ('SYSMAN','ORDSYS','PUBLIC','SYS');

create table m(owner varchar2(20));

insert into m values('SYS');

-- gather statistics
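The original post does not show how the statistics were gathered. One common way, given here only as an assumption about what that step might look like, is DBMS_STATS on both tables in the current schema:

```sql
-- Commit the inserted row, then gather optimizer statistics for T and M.
COMMIT;

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'T');
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'M');
END;
/
```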

Run: select * from t, m where t.owner=m.owner and m.owner='SYS';

With parallelism disabled on both tables:

---------------------------------------------------------------------------

| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |      | 23736 |  2202K|    55   (4)| 00:00:01 |

|*  1 |  HASH JOIN         |      | 23736 |  2202K|    55   (4)| 00:00:01 |

|*  2 |   TABLE ACCESS FULL| M    |     1 |    12 |     2   (0)| 00:00:01 |

|*  3 |   TABLE ACCESS FULL| T    | 23736 |  1923K|    52   (2)| 00:00:01 |

---------------------------------------------------------------------------

Enable parallelism for table T only: alter table t parallel 4;

-----------------------------------------------------------------------------------------------------------------

| Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |

-----------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT        |          | 23213 |   770K|    17   (6)| 00:00:01 |        |      |            |

|   1 |  PX COORDINATOR         |          |       |       |            |          |        |      |            |

|   2 |   PX SEND QC (RANDOM)   | :TQ10001 | 23213 |   770K|    17   (6)| 00:00:01 |  Q1,01 | P->S | QC (RAND)  |

|*  3 |    HASH JOIN            |          | 23213 |   770K|    17   (6)| 00:00:01 |  Q1,01 | PCWP |            |

|   4 |     BUFFER SORT         |          |       |       |            |          |  Q1,01 | PCWC |            |

|   5 |      PX RECEIVE         |          |     1 |     4 |     2   (0)| 00:00:01 |  Q1,01 | PCWP |            |

|   6 |       PX SEND BROADCAST | :TQ10000 |     1 |     4 |     2   (0)| 00:00:01 |        | S->P | BROADCAST  |

|*  7 |        TABLE ACCESS FULL| M        |     1 |     4 |     2   (0)| 00:00:01 |        |      |            |

|   8 |     PX BLOCK ITERATOR   |          | 23213 |   680K|    14   (0)| 00:00:01 |  Q1,01 | PCWC |            |

|*  9 |      TABLE ACCESS FULL  | T        | 23213 |   680K|    14   (0)| 00:00:01 |  Q1,01 | PCWP |            |

-----------------------------------------------------------------------------------------------------------------

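-- Table T is accessed in parallel with block granules, while M is still accessed serially; step 6 shows S->P, and the rows are sent to the parallel processes by broadcast.

Enable parallelism for table M as well: alter table m parallel 4;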

-----------------------------------------------------------------------------------------------------------------

| Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |

-----------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT        |          | 23213 |   770K|    17   (6)| 00:00:01 |        |      |            |

|   1 |  PX COORDINATOR         |          |       |       |            |          |        |      |            |

|   2 |   PX SEND QC (RANDOM)   | :TQ10001 | 23213 |   770K|    17   (6)| 00:00:01 |  Q1,01 | P->S | QC (RAND)  |

|*  3 |    HASH JOIN            |          | 23213 |   770K|    17   (6)| 00:00:01 |  Q1,01 | PCWP |            |

|   4 |     PX RECEIVE          |          |     1 |     4 |     2   (0)| 00:00:01 |  Q1,01 | PCWP |            |

|   5 |      PX SEND BROADCAST  | :TQ10000 |     1 |     4 |     2   (0)| 00:00:01 |  Q1,00 | P->P | BROADCAST  |

|   6 |       PX BLOCK ITERATOR |          |     1 |     4 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |

|*  7 |        TABLE ACCESS FULL| M        |     1 |     4 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |

|   8 |     PX BLOCK ITERATOR   |          | 23213 |   680K|    14   (0)| 00:00:01 |  Q1,01 | PCWC |            |

|*  9 |      TABLE ACCESS FULL  | T        | 23213 |   680K|    14   (0)| 00:00:01 |  Q1,01 | PCWP |            |

-----------------------------------------------------------------------------------------------------------------

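-- M is now accessed in parallel as well, with block granules; because the SQL does not request any ordering, the data is finally sent to the coordinator using the RAND method

-- M's result set is distributed by broadcast; the hint /*+ pq_distribute(t,hash,hash) */ changes this to hash distribution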

-----------------------------------------------------------------------------------------------------------------

| Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |

-----------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT        |          | 23213 |   770K|    17   (6)| 00:00:01 |        |      |            |

|   1 |  PX COORDINATOR         |          |       |       |            |          |        |      |            |

|   2 |   PX SEND QC (RANDOM)   | :TQ10002 | 23213 |   770K|    17   (6)| 00:00:01 |  Q1,02 | P->S | QC (RAND)  |

|*  3 |    HASH JOIN BUFFERED   |          | 23213 |   770K|    17   (6)| 00:00:01 |  Q1,02 | PCWP |            |

|   4 |     PX RECEIVE          |          |     1 |     4 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |

|   5 |      PX SEND HASH       | :TQ10000 |     1 |     4 |     2   (0)| 00:00:01 |  Q1,00 | P->P | HASH       |

|   6 |       PX BLOCK ITERATOR |          |     1 |     4 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |

|*  7 |        TABLE ACCESS FULL| M        |     1 |     4 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |

|   8 |     PX RECEIVE          |          | 23213 |   680K|    14   (0)| 00:00:01 |  Q1,02 | PCWP |            |

|   9 |      PX SEND HASH       | :TQ10001 | 23213 |   680K|    14   (0)| 00:00:01 |  Q1,01 | P->P | HASH       |

|  10 |       PX BLOCK ITERATOR |          | 23213 |   680K|    14   (0)| 00:00:01 |  Q1,01 | PCWC |            |

|* 11 |        TABLE ACCESS FULL| T        | 23213 |   680K|    14   (0)| 00:00:01 |  Q1,01 | PCWP |            |

-----------------------------------------------------------------------------------------------------------------
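For reference, the hinted statement behind the last plan would look roughly like the sketch below. The two NOPARALLEL statements at the end are an added cleanup step, not something the original post ran:

```sql
-- Ask for hash distribution on both inputs of the join to T.
SELECT /*+ pq_distribute(t, hash, hash) */ *
FROM   t, m
WHERE  t.owner = m.owner
AND    m.owner = 'SYS';

-- Optional cleanup: return both tables to serial access after the test.
ALTER TABLE t NOPARALLEL;
ALTER TABLE m NOPARALLEL;
```

With hash distribution both row sources are redistributed (TQ10000 and TQ10001), which is why this plan has three table queues where the broadcast plan had two.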
