EvilOctal Security Team (邪恶八进制信息安全团队) Technical Discussion Group's Archiver

pub!1c 2006-2-12 11:25

[Repost] Oracle Diagnostic Case: SGA and Swap, Part 2

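<P><FONT face="verdana, arial, helvetica" size=2><SPAN class=java script id=text1393127 style="FONT-SIZE: 12px">Source: EvilOctal Security Team (邪恶八进制信息安全团队)</SPAN></FONT></P>
<P><FONT face="verdana, arial, helvetica" size=2><SPAN class=java script style="FONT-SIZE: 12px">Case description:<BR><BR>This is a large production system.<BR>When the problem occurred, a large number of user processes had piled up on the system.<BR>User requests were not answered in time, while new processes kept trying to connect,<BR>so the available connections were quickly exhausted.<BR><BR>Database version: 9.2.0.3<BR>Operating system: Solaris 8</SPAN></FONT></P>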

pub!1c 2006-2-12 11:26

<P><FONT face="verdana, arial, helvetica" size=2><SPAN class=java script id=text1393129 style="FONT-SIZE: 12px"><STRONG>1. Check the alert file</STRONG><BR><BR>The log records the following errors, which indicate a problem with disk asynchronous I/O:<BR><FONT color=blue><BR>WARNING: aiowait timed out 2 times<BR>Tue Aug 26 15:33:32 2003<BR>WARNING: aiowait timed out 2 times<BR>Tue Aug 26 15:33:34 2003<BR>WARNING: aiowait timed out 2 times<BR>Tue Aug 26 15:33:36 2003<BR>WARNING: aiowait timed out 2 times<BR>Tue Aug 26 15:33:38 2003<BR>WARNING: aiowait timed out 2 times<BR>Tue Aug 26 15:33:43 2003<BR>WARNING: aiowait timed out 1 times<BR>Tue Aug 26 15:33:46 2003<BR>WARNING: aiowait timed out 1 times<BR>Tue Aug 26 15:33:49 2003<BR>WARNING: aiowait timed out 1 times<BR>Tue Aug 26 15:33:51 2003<BR>WARNING: aiowait timed out 1 times<BR>Tue Aug 26 15:33:52 2003<BR>WARNING: aiowait timed out 1 times<BR>Tue Aug 26 15:33:53 2003<BR>WARNING: aiowait timed out 1 times<BR>.............<BR></FONT><BR>Asynchronous I/O is known to have problems on some Sun releases,<BR>and it is enabled by default:<BR></FONT>
<BLOCKQUOTE><PRE><FONT size=2><FONT face=verdana,arial,helvetica>Code:</FONT><HR></FONT><CODE><FONT color=#000000>
<BR>SQL> show parameter disk_a
<BR>
<BR>NAME                                 TYPE        VALUE
<BR>------------------------------------ ----------- ------------------------------
<BR>disk_asynch_io                       boolean     TRUE<BR></FONT></CODE><HR></PRE></BLOCKQUOTE><FONT face="verdana, arial, helvetica"><BR><BR><FONT size=2>To address this, we disabled asynchronous I/O writes for the database.</FONT></SPAN></FONT>
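<P>When an alert log holds many of these warnings, a tally by reported timeout count shows how severe the stalls are. The following is a minimal sketch in Python, not part of the original diagnosis. The helper name <CODE>tally_aiowait</CODE> is hypothetical, and the sample text is the excerpt quoted in step 1.</P>

```python
import re
from collections import Counter

# Alert log excerpt quoted in step 1 (warnings interleaved with timestamps).
ALERT_LOG = """\
WARNING: aiowait timed out 2 times
Tue Aug 26 15:33:32 2003
WARNING: aiowait timed out 2 times
Tue Aug 26 15:33:34 2003
WARNING: aiowait timed out 2 times
Tue Aug 26 15:33:36 2003
WARNING: aiowait timed out 2 times
Tue Aug 26 15:33:38 2003
WARNING: aiowait timed out 2 times
Tue Aug 26 15:33:43 2003
WARNING: aiowait timed out 1 times
Tue Aug 26 15:33:46 2003
WARNING: aiowait timed out 1 times
Tue Aug 26 15:33:49 2003
WARNING: aiowait timed out 1 times
Tue Aug 26 15:33:51 2003
WARNING: aiowait timed out 1 times
Tue Aug 26 15:33:52 2003
WARNING: aiowait timed out 1 times
Tue Aug 26 15:33:53 2003
WARNING: aiowait timed out 1 times
"""

AIOWAIT = re.compile(r"^WARNING: aiowait timed out (\d+) times$")


def tally_aiowait(text):
    """Count aiowait warnings, grouped by the timeout count they report."""
    counts = Counter()
    for line in text.splitlines():
        match = AIOWAIT.match(line.strip())
        if match:
            counts[int(match.group(1))] += 1
    return counts


counts = tally_aiowait(ALERT_LOG)
print(dict(counts))          # {2: 5, 1: 6}
print(sum(counts.values()))  # 11 warnings within about 20 seconds
```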

pub!1c 2006-2-12 11:26

<P><FONT face="verdana, arial, helvetica" size=2><SPAN class=java script id=text1393133 style="FONT-SIZE: 12px"><STRONG>2. Shared memory problem<BR></STRONG><BR>The alert file also records the following error:<BR><FONT color=blue><BR>Tue Aug 26 21:37:40 2003<BR>WARNING: EINVAL creating segment of size 0x0000000190400000<BR>fix shm parameters in /etc/system or equivalent<BR></FONT><BR><BR>This message indicates that the kernel shared memory parameters are set too low or do not match the SGA.<BR><BR>We checked the system configuration file:<BR><BR><FONT color=blue><BR>$ cat /etc/system<BR>.......................<BR>set shmsys:shminfo_shmmax=4096000000 <BR>set shmsys:shminfo_shmmin=1<BR>set shmsys:shminfo_shmmni=200<BR>set shmsys:shminfo_shmseg=200<BR>set semsys:seminfo_semmap=1024<BR>set semsys:seminfo_semmni=2048<BR>set semsys:seminfo_semmns=2048<BR>set semsys:seminfo_semmnu=2048<BR>set semsys:seminfo_semume=200<BR>set semsys:seminfo_semmsl=2048<BR></FONT><BR><BR>The maximum shared memory segment size (shmmax) is set to only about 4 GB.</SPAN></FONT></P>
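<P>The size in the EINVAL warning is given in hexadecimal, so it helps to decode it before comparing it with shmmax. The following is a minimal sketch in Python, using only the two values quoted in step 2. The variable names are illustrative.</P>

```python
# Values copied from the alert log and /etc/system excerpts in step 2.
requested_segment = int("0x0000000190400000", 16)  # bytes Oracle tried to allocate
shmmax = 4096000000                                 # shmsys:shminfo_shmmax, in bytes

MIB = 1024 ** 2

print(requested_segment)           # 6715080704
print(requested_segment // MIB)    # 6404 (MiB)
print(shmmax / MIB)                # 3906.25 (MiB)
print(requested_segment > shmmax)  # True: the request exceeds the kernel limit
```

<P>The requested segment is roughly 1.6 times the configured limit, which is consistent with the kernel rejecting the request with EINVAL.</P>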

pub!1c 2006-2-12 11:26

<P><FONT face="verdana, arial, helvetica" size=2><SPAN class=java script id=text1393134 style="FONT-SIZE: 12px"><STRONG>3. Check the SGA settings<BR></STRONG><BR>SQL*Plus: Release 9.2.0.3.0 - Production on Tue Aug 26 21:46:35 2003<BR><BR>Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.<BR><BR><BR>Connected to:<BR>Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production<BR>With the Partitioning, OLAP and Oracle Data Mining options<BR>JServer Release 9.2.0.3.0 - Production<BR><BR>SQL> show sga<BR><BR><FONT color=blue>Total System Global Area 6695660272 bytes</FONT><BR>Fixed Size 740080 bytes<BR>Variable Size 2399141888 bytes<BR>Database Buffers 4294967296 bytes<BR>Redo Buffers 811008 bytes<BR><BR>The SGA is configured at close to 7 GB (about 6.7 GB), well above the shmmax limit, which is why the error in step 2 appeared.</SPAN></FONT></P>
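<P>As a cross-check, the four components reported by <CODE>show sga</CODE> should add up to the reported total, and the total can be set against shmmax. The following is a minimal sketch in Python using the figures from steps 2 and 3. The dictionary and variable names are illustrative.</P>

```python
# Component sizes in bytes, as reported by "show sga" in step 3.
sga_components = {
    "Fixed Size": 740080,
    "Variable Size": 2399141888,
    "Database Buffers": 4294967296,
    "Redo Buffers": 811008,
}
reported_total = 6695660272  # "Total System Global Area"
shmmax = 4096000000          # shmsys:shminfo_shmmax from /etc/system

computed_total = sum(sga_components.values())

print(computed_total == reported_total)    # True
print(round(computed_total / 1024**3, 2))  # 6.24 (GiB)
print(computed_total - shmmax)             # 2599660272 bytes above shmmax
```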

pub!1c 2006-2-12 11:27

<P><FONT face="verdana, arial, helvetica" size=2><SPAN class=java script id=text1393136 style="FONT-SIZE: 12px"><STRONG>4. Swap problem</STRONG><BR><BR>We used top to check how the system was running:<BR></FONT>
<BLOCKQUOTE><PRE><FONT size=2><FONT face=verdana,arial,helvetica>Code:</FONT><HR></FONT><CODE><FONT color=#000000>
<BR># /usr/local/bin/top
<BR>
<BR>last pid: 16899;  load averages:  0.82,  0.81,  0.83    21:49:05
<BR>1230 processes: 1228 sleeping, 1 running, 1 on cpu
<BR>CPU states: 50.1% idle,  7.4% user,  8.6% kernel, 33.9% iowait,  0.0% swap
<BR>Memory: 8192M real, 118M free, 12G swap in use, 11G swap free
<BR>
<BR>  PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
<BR>15751 oracle    11  44    0 6456M 6408M sleep    0:02  0.49% oracle
<BR>15725 oracle    11  58    0 6458M 6410M sleep    0:02  0.46% oracle
<BR>  251 root      12  48    0 7096K 1944K sleep  126:00  0.45% picld
<BR>16540 oracle    11  58    0 6458M 6411M sleep    0:01  0.45% oracle
<BR>16766 root       1  43    0 3744K 2248K cpu/1    0:01  0.41% top
<BR>16408 oracle    11  58    0 6457M 6410M sleep    0:01  0.34% oracle
<BR>15989 oracle    11  58    0 6458M 6409M sleep    0:01  0.34% oracle
<BR>15919 oracle    11  58    0 6457M 6409M sleep    0:02  0.30% oracle
<BR>16404 oracle    11  58    0 6457M 6409M sleep    0:00  0.28% oracle
<BR>16327 oracle    11  55    0 6457M 6410M sleep    0:00  0.27% oracle
<BR>14870 oracle    11  58    0 6457M 6412M sleep    0:05  0.24% oracle
<BR>16851 oracle    11  35    0 6457M 6411M sleep    0:00  0.22% oracle
<BR>16467 oracle    11  58    0 6457M 6409M sleep    0:00  0.21% oracle
<BR>16163 oracle    11  58    0 6457M 6408M sleep    0:03  0.21% oracle
<BR>15159 oracle    11  58    0 6457M 6408M sleep    0:05  0.21% oracle<BR></FONT></CODE><HR></PRE></BLOCKQUOTE><FONT face="verdana, arial, helvetica"><BR><BR><BR><FONT size=2><FONT color=red>Memory: 8192M real, 118M free, 12G swap in use, 11G swap free</FONT><BR><BR>The system has only 8 GB of RAM, of which just 118 MB of physical memory is free,<BR>and 12 GB of swap is already in use.<BR><BR>Our preliminary assessment was as follows:<BR><BR>The SGA is too large (nearly 7 GB on a machine with 8 GB of RAM), which causes heavy swapping at run time.<BR><BR>The heavy swapping in turn causes disk problems,<BR>which is most likely the reason for the<BR>WARNING: aiowait timed out 1 times<BR>messages seen in step 1.<BR><BR>The heavy swapping also makes database performance drop sharply,<BR>so user requests are not answered quickly; they block and pile up until the database stops responding.</FONT></SPAN></FONT>
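<P>A rough memory budget makes the assessment concrete. The following is a minimal sketch in Python under two simplifying assumptions that are not stated in the original post: the whole SGA has to stay resident in RAM, and memory used by the kernel is ignored. The figures are taken from steps 3 and 4, and the variable names are illustrative.</P>

```python
MIB = 1024 ** 2

ram_bytes = 8192 * MIB   # "8192M real" from the top output
sga_bytes = 6695660272   # "Total System Global Area" from show sga
process_count = 1230     # "1230 processes" from the top output

# Memory left for everything outside the SGA (assumption: SGA fully resident).
left_bytes = ram_bytes - sga_bytes
left_mib = left_bytes / MIB
per_process_mib = left_mib / process_count

print(left_bytes)                 # 1894274320
print(round(left_mib, 1))         # 1806.5
print(round(per_process_mib, 2))  # 1.47
```

<P>Under these assumptions, less than 1.5 MiB of physical memory per process remains outside the SGA, which fits the heavy swap usage reported by top.</P>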

pub!1c 2006-2-12 11:27

<P><FONT face="verdana, arial, helvetica" size=2><SPAN class=java script id=text1393139 style="FONT-SIZE: 12px"><STRONG>5. Solution</STRONG><BR><BR>The problem was caused mainly by an improperly sized SGA, so we reduced the SGA immediately:<BR><BR>SQL> show sga<BR><BR><FONT color=blue>Total System Global Area 3591870848 bytes</FONT><BR>Fixed Size 735616 bytes<BR>Variable Size 1442840576 bytes<BR>Database Buffers 2147483648 bytes<BR>Redo Buffers 811008 bytes<BR><BR>With the SGA reduced to about 3.6 GB, swapping dropped, the database ran stably, and user requests were answered quickly.<BR><BR>Problem solved.</SPAN></FONT></P>

pub!1c 2006-2-12 11:27

<P><FONT face="verdana, arial, helvetica" size=2><SPAN class=java script id=text1393141 style="FONT-SIZE: 12px"><STRONG>6. System status</STRONG><BR><BR>System status after the adjustment:<BR></FONT>
<BLOCKQUOTE><PRE><FONT size=2><FONT face=verdana,arial,helvetica>Code:</FONT><HR></FONT><CODE><FONT color=#000000>
<BR>$ top
<BR>
<BR>last pid: 12745;  load averages:  0.46,  0.79,  0.65    22:22:49
<BR>228 processes: 227 sleeping, 1 on cpu
<BR>CPU states: 92.3% idle,  5.0% user,  1.6% kernel,  1.1% iowait,  0.0% swap
<BR>Memory: 8192M real, 3817M free, 4015M swap in use, 15G swap free
<BR>
<BR>  PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
<BR>12610 oracle     1  51    0 3511M   22M sleep    0:04  1.96% oracle
<BR>12595 oracle     1  48    0 3511M   22M sleep    0:03  0.92% oracle
<BR>12630 oracle     1  38    0 3511M   21M sleep    0:01  0.84% oracle
<BR>12614 oracle     1  46    0 3511M   22M sleep    0:01  0.64% oracle
<BR>12620 oracle     1  58    0 3511M   22M sleep    0:01  0.53% oracle
<BR>12709 oracle     1  48    0 3511M   21M sleep    0:00  0.45% oracle
<BR>  265 root      11  38    0 7032K 1920K sleep    3:16  0.42% picld
<BR>12729 oracle     1   0    0 3511M   20M sleep    0:00  0.26% oracle
<BR>12741 oracle     1  58    0 2768K 1760K cpu/3    0:00  0.19% top
<BR>12745 oracle     1  44    0 3506M   16M sleep    0:00  0.17% oracle
<BR>12711 oracle     1  48    0 3506M   16M sleep    0:00  0.11% oracle
<BR>12738 oracle     1  43    0 3506M   16M sleep    0:00  0.06% oracle
<BR> 7606 oracle     1  45    0   17M 6928K sleep    0:07  0.05% tnslsnr
<BR>12721 oracle     1  34    0 3506M   16M sleep    0:00  0.05% oracle
<BR>12723 oracle     1  53    0 3506M   16M sleep    0:00  0.05% oracle<BR></FONT></CODE><HR></PRE></BLOCKQUOTE><FONT face="verdana, arial, helvetica"><BR><BR><FONT size=2>Since the adjustment, the system has been running stably to this day.</FONT></SPAN></FONT>
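<P>The effect of the change is easiest to see by putting the two top "Memory:" lines side by side. The following is a minimal sketch in Python that parses both lines as quoted in this thread. The helper name <CODE>parse_memory_line</CODE> is hypothetical, and the pattern is only assumed to match this particular top output format.</P>

```python
import re

# Multipliers that convert top's K/M/G suffixes to MiB.
UNIT_MIB = {"K": 1 / 1024, "M": 1, "G": 1024}

MEMORY_LINE = re.compile(
    r"Memory: (\d+)([KMG]) real, (\d+)([KMG]) free, "
    r"(\d+)([KMG]) swap in use, (\d+)([KMG]) swap free"
)


def parse_memory_line(line):
    """Return the four figures of a top 'Memory:' line, converted to MiB."""
    match = MEMORY_LINE.search(line)
    if match is None:
        raise ValueError("unrecognized Memory line: " + line)
    groups = match.groups()
    keys = ("real", "free", "swap_in_use", "swap_free")
    return {
        key: int(groups[2 * i]) * UNIT_MIB[groups[2 * i + 1]]
        for i, key in enumerate(keys)
    }


before = parse_memory_line(
    "Memory: 8192M real, 118M free, 12G swap in use, 11G swap free"
)
after = parse_memory_line(
    "Memory: 8192M real, 3817M free, 4015M swap in use, 15G swap free"
)

print(before["free"], after["free"])                 # 118 3817
print(before["swap_in_use"] - after["swap_in_use"])  # 8273 (MiB less swap in use)
```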

pub!1c 2006-2-12 11:27

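<P><FONT face="verdana, arial, helvetica" size=2><SPAN class=java script id=text1393142 style="FONT-SIZE: 12px"><STRONG>A short summary from the editor:<BR></STRONG><BR>This case is very similar to another one I described earlier:<BR>both are database problems caused by an improperly sized SGA.<BR><BR>The problem itself is not complicated,<BR>and this kind of issue should be avoided at the database planning and build stage.<BR><BR>At the time, it felt more like a psychological test to me:<BR>with all the bosses standing behind you, can you stay calm and quickly find and fix the problem?<BR><BR>On Sun systems, "aiowait timed out" appears in many situations and has many causes;<BR>I will cover them in later case studies.<BR><BR></SPAN></FONT></P>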

© 1999-2008 EvilOctal Security Team