Mystery performance issue with a background process

Recently I ran into some serious performance issues on an 11.2 database. The mystery: neither the AWR nor the ADDM report showed any obvious problems, but the Linux system showed a very high load. So I tried to get some kind of live view… Cloud Control was not available yet, so I used the great and free tool Mumbai.

At first it all looks quite normal… but wait, what are these resources at the bottom of the chart? Resource group “other”… great, now I know exactly what it is 🙁 But in the lower right corner is the list of top consuming processes, and there are some “E00x” processes there.

Just to be sure, I checked with plain SQL whether the chart is correct:

select
   count(*) count
  ,trunc(SAMPLE_TIME, 'HH24') hour
  ,session_id
  ,session_serial#
  ,program
-- and whatever might be interesting, like EVENT, P1 & P1TEXT, etc.
from DBA_HIST_ACTIVE_SESS_HISTORY
where session_type='BACKGROUND'      -- just get background processes
  and wait_class != 'Idle'           -- ignore all idle processes
  and program not like 'rman%'       -- rman is a background process that is allowed to be non-idle for a longer time
  and sample_time > sysdate-1        -- only last 24 hours
group by trunc(SAMPLE_TIME, 'HH24'), session_id  ,session_serial#  ,program -- don't forget to add any columns you added to the select list
having count(*) > 180                -- only show if process was active half the time
order by 2 desc
;

The output looked like this:

COUNT   HOUR                    SESSION_ID      SESSION_SERIAL#   PROGRAM
360     201x-xx-xx 02:00:00     1282            21723             oracle@orcl (E002)
359     201x-xx-xx 01:00:00     1282            21723             oracle@orcl (E002)
360     201x-xx-xx 00:00:00     1282            21723             oracle@orcl (E002)

Wow, what’s this? Just to explain the column “count”: we are querying an ASH table. ASH samples active sessions every second, and every 10th sample is flushed to disk, so we have 6 samples per minute or 360 per hour… meaning that in the above query the background process was active the whole time.
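To make the arithmetic explicit, here is a minimal sketch that turns the sample count into a percentage of the hour and splits it by session state and wait event. It assumes a single-instance database (on RAC you would also group by instance_number) and that the pattern '%(E0%' is good enough to catch the E00x processes; both are my assumptions, not part of the original query.

```sql
-- Sketch: hourly share of ASH samples for the E00x processes,
-- broken down by session state and wait event.
select
   trunc(sample_time, 'HH24')        as hour
  ,program
  ,session_state                     -- 'ON CPU' or 'WAITING'
  ,event                             -- NULL while the session is on CPU
  ,count(*)                          as samples
  ,round(count(*) / 360 * 100, 1)    as pct_of_hour  -- 360 = 6 samples/min * 60 min
from DBA_HIST_ACTIVE_SESS_HISTORY
where session_type = 'BACKGROUND'
  and program like '%(E0%'           -- assumption: matches E000, E001, E002, ...
  and sample_time > sysdate - 1      -- only last 24 hours
group by trunc(sample_time, 'HH24'), program, session_state, event
order by hour desc, samples desc
;
```

Keep in mind that the current hour is not complete yet, so its percentage will look lower than it really is.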

So next step: figure out what the hell the E00x process is doing. Thankfully the Oracle documentation is (as usual) quite good. It’s the EMON Slave Process, which “Performs database event management and notifications”. OK, but why are they, or at least some of these processes, running all the time? After some research I found Oracle bug 9735536 and a corresponding bugfix… and as a workaround the suggestion to kill the Ennn processes together with the EMNC (EMON Coordinator Process), and voilà, the system runs smoothly again… at least for a few hours, until the next EMON process goes mad.
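For the workaround you first need to know which OS processes belong to EMON. A minimal sketch, assuming the program column of v$process shows the process name in brackets the same way ASH does (for example “oracle@orcl (E002)”); the like patterns are my assumption:

```sql
-- Sketch: list the EMON slaves (E000, E001, ...) and the coordinator (EMNC)
-- together with their OS process ids.
select
   p.spid                            -- OS process id on the database server
  ,p.program
  ,s.sid
  ,s.serial#
from v$process p
left join v$session s on s.paddr = p.addr
where p.program like '%(E0%'         -- assumption: EMON slave processes
   or p.program like '%(EMNC)%'      -- assumption: EMON coordinator
order by p.program
;
```

The spid column is the process id you would see on the Linux side. For the kill itself, follow the steps described for the bug rather than this sketch.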

After installing the patch the issue was solved, so we can continue planning the 12c upgrade…

Benjamin
