UKOUG 2011 - Ash Outliers

Doug's Oracle Blog

  • Home
  • Papers
  • Books
  • C.V.
  • Fun
  • Oracle Blog
  • Personal Blog

Jan 10: UKOUG 2011 - Ash Outliers

My final UKOUG 2011 post is about another of my favourite presentations -  "ASH Outliers: Detecting Unusual Events in Active Session History" by John Beresniewicz. (JB for short, but Marco Gralike made a fairly good stab at pronouncing his surname correctly during the introduction.)

I'd been looking forward to this presentation because I'd already been aware of his work in this area for a while and was supposed to be helping out (but I'll come back to that later ...). I'd also expected to see it in 2010 but JB withdrew the abstract later. Although I knew a lot of the content already I enjoyed it because of the subject and because JB's always likely to make me see something old from a new angle. Anyway, what was it all about?

One of the most important limitations of ASH data is that, because the data is sampled, it doesn't include every individual timed event that sessions waited for and is inherently biased towards longer duration (and more common) events. As long as you understand this design decision then you can understand sensible ways to use the data and also less sensible ways!

For example, it is not sensible to write queries against ASH data that SUM(time_waited) or AVG(time_waited) because the results look sensible but are fairly meaningless. If you sum all the wait times in ASH but ASH doesn't include all activity then what does the total value represent? Likewise, if you calculate the average 'log file parallel write' time in ASH for a specific period, it's virtually guaranteed that the result will be higher than the true average because ASH will tend to capture more long duration events than short ones.

Generally it's more sensible to use simple COUNT functions and make the assumption that a sample represents a second of time. It's obviously an approximation, but one that works surprisingly well as reasonably long experience of the OEM performance pages and ASH queries has shown me.

However, although the Oracle employees I've discussed this with encourage people to use the TIME_WAITED column in ASH with caution, I've personally found it useful on many occasions when ASH is the only session-level data available because I'm trying to diagnose a problem after the event. In fact, ASH can be extremely useful in that situation because, as well as having session-specific information, it contains it for all the sessions. So I confess that when querying ASH data using SQL, I've often found the TIME_WAITED of individual rows to be a useful indication of what's been going on if unusual or particularly long waits appear immediately prior to a serious problem. By looking at the series of events that led up to a problem, I can see the interactions between those sessions via blocking session information and at least build up an approximate view of what happened.

JB's presentation was essentially about a single query that he's been working on that is designed to look at the TIME_WAITED values in ASH data to identify outliers - events with unusually long wait times relative to the usual wait times for that event. The thinking being that the appearance of long wait times for certain events could indicate the root cause of serious system-wide performance problems. (Note that AWR data would be next to useless for this because the problems might only occur for a few seconds in the lead-up to the problem and that level of detail would be aggregated to insignificance in the wider AWR time scope.)

Having hopefully said enough technical things in this post now to keep grumpy people happy, allow me a humorous interlude. From the start of the presentation, JB kept alluding to a couple of things

a) The singular lack of response he'd had from people he'd shared the query with. Fortunately no names were mentioned but he was clearly griping about the limited response he'd had from me after sending me the query a while back and how long it had taken. What can I say? I'm *busy*! ;-)

b) Why the feedback is so important to him - because people in Oracle Development appear very interested in what customers are doing on their real Production systems. Even though Oracle do have their own internal Production systems, they're only a small fraction of what customers are doing so it's difficult to know what will work. In any case, I guess Oracle don't let developers play around on their production systems, like most companies.

So here's the deal. As JB mentioned, he is probably the least on-line guy there is. He might do DBA 2.0, but he doesn't do Web 2.0! So he suggested that I could get the script out there to other people to try on their own interesting systems and see whether they found it useful.

Here it is

Note that this version references the V$ rather than GV$ version of ASH data so only uses officially supported features as JB alluded to during the presentation. i.e. a RAC version is possible but this is single-instance for now, just to let people give it a try.

To save a veritable deluge of mail (yeah, JB, you *wish*! LOL) it might be best to post any feedback on the script as comments here or I could forward it on.

Oh, and something I heard later that a few people had struggled with during the presentation was the concept of Significance Levels that are central to the script and the presentation. Fortunately, I already host a short document on the subject, again courtesy of JB, here. However, as that's more focussed on Adaptive Thresholds and time grouping in particular, you might find the discussion on this AskTom thread more generic and relevant to the ASH Outliers script.

All in all, a terrific presentation, as I expected and something that people might want to try out. It’s certainly an interesting concept and an attempt to automate something I’ve been doing manually for a few years. Thanks, JB!
Posted by Doug Burns Comments: (11) Trackbacks: (2)

Trackbacks
Trackback specific URI for this entry

PingBack
Weblog: jonathanlewis.wordpress.com
Tracked: Jan 11, 20:06
PingBack
Weblog: timurakhmadeev.wordpress.com
Tracked: Feb 15, 17:39

Comments
Display comments as (Linear | Threaded)

#1 - Raj Jamadagni 2012-01-10 13:12 - (Reply)

Thanks for sharing this Doug, something new to learn and experiment.

Raj

#1.1 - Doug Burns said:
2012-01-10 13:16 - (Reply)

Good to see you round these parts, Raj ;-)

Hope all is going well for you ...

#2 - Marcin Przepiorowski said:
2012-01-10 14:21 - (Reply)

Hi Doug,

Thanks for sharing it. Unfortunately I didn't attend UKOUG and I was looking for any information about John Beresniewicz presentation.

I will definitely come back with my feedback.

regards,
Marcin

#2.1 - Doug Burns said:
2012-01-10 15:22 - (Reply)

Hi Marcin,

That would be cool. At some point, JB's slides will also appear, hopefully not just on the closed UKOUG website, and I'll post a link to those too.

Cheers, Doug

#3 - AlexB 2012-01-11 02:47 - (Reply)

Hi Doug,

Looks quite interesting .. Will give it a try for sure

Regards,
Alex

#4 - Tony Sleight 2012-04-05 11:21 - (Reply)

Apologies for a late addition to this post. I attended John's presentation and was intending to run his script. Business has intervened and I have just got round to running it. I have results for four confidence levels, 0.99, 0.999, 0.9999 and 0.99999. I have enclosed the latter here and can supply all four in a zip file if required. I would be interested to find out how the data is used and the results of any analysis.

Thanks for providing an interesting and entertaining post.

CODE:
SQL> exec :SIGLEVEL := 0.99999; PL/SQL procedure successfully completed. SQL> @ash_outliers.sql EVENT_BUCKET  SAMPLE_ID SESSION_ID BUCKET_SIGLEVEL EVENT                                                            ASH_TIME_WAITED_MILLI SQL_ID                                                         ------------ ---------- ---------- --------------- ---------------------------------------------------------------- --------------------- -------------                                                             9  116761150        793         .999991 db file scattered read                                                             308 2zdjkv4v04qtj                                                             9  116761150       1078         .999991 db file scattered read                                                             308 6vyv0su64kw06                                                             9  116761198        507         .999991 db file scattered read                                                             347 20az2htmb5552                                                             9  116761198        590         .999991 db file scattered read                                                             347 6vyv0su64kw06                                                             9  116762871        683         .999991 db file scattered read                                                             258 6vyv0su64kw06                                                             9  116763252        639         .999991 db file scattered read                                                             301 d2c8s3sdnwjww                                                             9  116763252        989         .999991 db file scattered read                                                             299 afm6rk58krjg5                                                             9  116763252        491         .999992 read by other session                                                              299 1rfyttshvh021                                                             9  116763748        493         .999991 db file scattered read                                                             430 3umbqy8czuh7q                                                             9  116765634        472         .999991 db file scattered read                                                             264 20az2htmb5552                                                             9  116767025        874         .999991 db file scattered read                                                             401 5xxzr0a0du9z4                                                             9  116767025       1078         .999991 db file scattered read                                                             399 6vyv0su64kw06                                                             9  116767025        638         .999992 read by other session                                                              392 045v8f20jyfzn                                                             9  116767336       1078         .999991 db file scattered read                                                             445 6vyv0su64kw06                                                             9  116767743        657         .999991 db file scattered read                                                             264 frctz0hsh0c24                                                            10  116767815       1044               1 db file scattered read                                                             562 1fcy4cv4qyu3a                                                             9  116767932        912         .999991 db file scattered read                                                             287 40c23zt96x0zy                                                             9  116769426        626         .999991 db file scattered read                                                             314 1stkjwtt24q5c                                                             9  116769426        754         .999991 db file scattered read                                                             314 6vyv0su64kw06                                                             9  116772767        479         .999991 db file scattered read                                                             279 6vyv0su64kw06                                                             9  116772767        811         .999991 db file scattered read                                                             279 arthgva4hh26n                                                             9  116772767        478         .999992 read by other session                                                              275 fz3y37ut7jhbj                                                             9  116774494        479         .999991 db file scattered read                                                             429 6vyv0su64kw06                                                             9  116774494        879         .999991 db file scattered read                                                             429 6vyv0su64kw06                                                            10  116774494        969               1 db file scattered read                                                             515 gp015mru5f7uu                                                             9  116775733        491         .999991 db file scattered read                                                             404 f6t2601ms22g9                                                             9  116775841        588         .999991 db file scattered read                                                             271 6vyv0su64kw06                                                             9  116775841        888         .999991 db file scattered read                                                             267 6vyv0su64kw06                                                             9  116775841        611         .999992 read by other session                                                              267 6vyv0su64kw06                                                             9  116778248        711         .999991 db file scattered read                                                             398 2q4x319qzk7dw                                                             9  116779815        711         .999991 db file scattered read                                                             308 6vyv0su64kw06                                                             9  116780000        491         .999991 db file scattered read                                                             354 gw78f9as8g3y4                                                             9  116780000        579         .999991 db file scattered read                                                             356 6vyv0su64kw06                                                             9  116781922        754         .999991 db file scattered read                                                             312 a43zt1pcv1gkk                                                             9  116781922        872         .999991 db file scattered read                                                             311 db1xpnk8xpft1                                                            10  116785158        737         .999999 db file sequential read                                                            587 8bv420vpdumk3                                                            10  116785214        904               1 db file scattered read                                                             609 6vyv0su64kw06                                                            10  116785214        491         .999999 db file sequential read                                                            566 8hb1k97yuu1q0                                                            13  116785767        912         .999999 SQL*Net more data to dblink                                                       4876 3f99fpr9qkhc8                                                             9  116800681        761         .999991 db file scattered read                                                             320 dg35457cz2d97                                                             9  116803002        771         .999991 db file scattered read                                                             399 7gta8b7amd4tm                                                             9  116803002        943         .999991 db file scattered read                                                             424 6vyv0su64kw06                                                             9  116803254        701         .999992 read by other session                                                              499 gty9k9uzyjjvf                                                             9  116803254        832         .999992 read by other session                                                              499 9bf4fdj6c9t7h                                                             9  116803259        853         .999991 db file scattered read                                                             368 736cr74b0ct9w                                                             9  116803874        853         .999991 db file scattered read                                                             400 736cr74b0ct9w                                                             9  116804422        572         .999991 db file scattered read                                                             385 5y5zxva87u4z8                                                             9  116804422        953         .999991 db file scattered read                                                             384 1rfyttshvh021                                                             9  116804428        793         .999991 db file scattered read                                                             312 4vx2mukd2wn5y                                                             9  116804428        943         .999991 db file scattered read                                                             311 c3f5u9v45phhm                                                             9  116804817       1013         .999991 db file scattered read                                                             361 0n3zc01qv3swa                                                             9  116804817        767         .999992 read by other session                                                              361 g5gs5r0y8z3cv                                                             9  116804999        836         .999991 db file scattered read                                                             360 1aq0pjju8b88u                                                            10  116806019        701         .999999 SQL*Net more data to client                                                        882                                                                           9  116807750        474         .999991 db file scattered read                                                             285 6vyv0su64kw06                                                             9  116807814        561         .999991 db file scattered read                                                             312 ddtg2zhtpats3                                                             9  116807907        618         .999991 db file scattered read                                                             305 6a5yb01nkdcty                                                             9  116807907        793         .999992 direct path read temp                                                              305 gh3d2yzmgrm75                                                             9  116808172        813         .999991 db file scattered read                                                             279 19n6dumfp860h                                                             9  116809366        514         .999991 db file scattered read                                                             420 c8muw72tpgh3k                                                             9  116809366        938         .999991 db file scattered read                                                             422 130kgftjys163                                                            10  116810315        938         .999999 read by other session                                                              597 130kgftjys163                                                             9  116810318        601         .999992 read by other session                                                              440 fz3y37ut7jhbj                                                             9  116810491        709         .999991 db file scattered read                                                             346 59rnzvxy0gc52                                                             9  116810491        836         .999991 db file scattered read                                                             346 6vpavatp3dktm                                                             9  116810491        882         .999992 read by other session                                                              346 dwv9x43470yyz                                                             9  116811046        597         .999991 db file scattered read                                                             282 89bgy764bfxtp                                                             9  116811046       1050         .999992 read by other session                                                              282 bbjqxdysm3j6u                                                             9  116811488        465         .999991 db file scattered read                                                             290 6vyv0su64kw06                                                             9  116812146        800         .999992 read by other session                                                              267 b9ukxvp5c0b02                                                             9  116812146       1025         .999992 read by other session                                                              267 69s59855hb6bs                                                             9  116812537        921         .999991 db file scattered read                                                             275 3q47kv8sn552u                                                             9  116812537       1063         .999992 read by other session                                                              275 679x87rmg33sh                                                             9  116813633        807         .999991 db file scattered read                                                             274 7dfnn2wamydt5                                                             9  116813925        561         .999992 read by other session                                                              418 ddtg2zhtpats3                                                            13  116814139        853         .999999 SQL*Net more data to dblink                                                       5550 3f99fpr9qkhc8                                                  76 rows selected.

#4.1 - Doug Burns said:
2012-04-06 21:20 - (Reply)

I know what you mean about business, but you've done a whole lot more than I have! ;-)

Actually, I do run the outliers scripts intermittently on different systems but still getting my head around where it's most useful

#4.2 - JB 2012-04-12 01:33 - (Reply)

Hi Tony -
Thanks for posting your output from the script. As far as how the data is used and what conclusions might be drawn, it would probably be good to recall the motivating use case for developing the script in the first place:

RAC performance experts were noting that the EM database Top Activity page is not particularly useful for diagnosing performance problems where unusually long waits are implicated in cascade-effect performance issues. Top Activity and other EM pages show ASH data rolled up into one or more aggregation buckets, individual events are lost in the aggregation. So the query was designed to provide a way to select out events from ASH satisfying an event-independent measure of "unusually long" based on using v$system _event as representing the statistical distribution of event wait times, and defining "unusual" in terms of statistical rarity. Since the motivating use case assumed RAC, it was important to get the "gv$" version working for the UKOUG presentation.

So for the motivating use case, we would be hoping to pick up some unusually long events in one or more nodes that seem to be time-proximate to known performance events, perhaps also (but not necessarily) signaled by other unusually long events.

Now it seems we might do better than just the motivating use case, and perhaps that is what you are interested in. For instance, your output shows a lot of outliers in "db file scattered read" but the next question is "these may be unusual, but are they unexpected?" If we are looking for a 1-in-a-million event, but there have been 10 million events since we began looking, then we might well expect to see some of these sampled into ASH. So this is a bit more difficult perhaps but more generally useful. We could try to write a query that answers the question:

Which events/classes are showing larger than expected counts of highly unusual events sampled into ASH?

From your data there is not a lot of hard conclusions to draw, but it is interesting that the two very long (~5 second) events were both for "SQL*Net more data to dblink" and same SQL_ID both times. So perhaps this is expected or perhaps it is showing some issue in the application. Other than that there is some clustering of unusual events in time (same sample_id) but not in the extreme, usually only 2 or 3 sessions sampled together.

Thanks for coming to the UKOUG presentation.
JB

#4.2.1 - Doug Burns said:
2012-04-12 10:18 - (Reply)

Thanks, JB. Based on Tony's comments below, it looks like the outliers script could add some useful cross-verification of issues as well as the original intention.

#5 - Tony Sleight 2012-04-12 08:59 - (Reply)

Thank you John for the explanation and analysis of the supplied data. I had thought the outliers were infrequent events that could be disguised within the granularity of the AWR snapshot period. It is interesting to learn the reasons behind the development of the query.
The results of the ASH outliers query has reinforced the AWR analysis I have performed on our system for user perceived performance issues. The same SQL ID have appeared in both reports. I shall be visiting the client site next week to perform further analysis and measure performance gain by making changes to some execution plans which these queries have highlighted as problem areas. The specific 'SQL*Net more data to client' event is due to our use of Multi Master Replication technology and represents the call to the deferred push job. This call can take a while when columns with a large data content are pushed to the secondary database server.
I have now added the ASH outliers query to my standard tool set. It is fast to run and the data produced is concise enough to pinpoint potential problem areas.
Thanks again for the blog post (Doug) and the query/analysis (John).

#5.1 - JB 2012-04-12 18:15 - (Reply)

Great, you made my day!


Add Comment

Standard emoticons like :-) and ;-) are converted to images.
E-Mail addresses will not be displayed and will only be used for E-Mail notifications.
BBCode format allowed
 
 

Statistics on Partitioned Tables

Contents

Part 1 - Default options - GLOBAL AND PARTITION
Part 2 - Estimated Global Stats
Part 3 - Stats Aggregation Problems I
Part 4 - Stats Aggregation Problems II
Part 5 - Minimal Stats Aggregation
Part 6a - COPY_TABLE_STATS - Intro
Part 6b - COPY_TABLE_STATS - Mistakes
Part 6c - COPY_TABLE_STATS - Bugs and Patches
Part 6d - COPY_TABLE_STATS - A Light-bulb Moment
Part 6e - COPY_TABLE_STATS - Bug 10268597

Comments

jonathanlewis.wordpress.com about 10053 Trace Files - Different Plan in Different Environments
Sat, 01.06.2013 11:26
Doug Burns about 10053 Trace Files - Different Plan in Different Environments
Tue, 02.04.2013 08:57
You're welcome. Now I just nee d to pull my finger out and ac tually come up [...]
Howard Rogers about 10053 Trace Files - Different Plan in Different Environments
Mon, 01.04.2013 23:08
Makes a big difference, so tha nks for that! With two brow ser windows, o [...]

Upcoming Presentations


Bookmark

Open All | Close All

Syndicate This Blog

  • XML RSS 2.0 feed
  • ATOM/XML ATOM 1.0 feed
  • XML RSS 2.0 Comments
  • Feedburner Feed

Powered by

Serendipity PHP Weblog

Show tagged entries

xml 11g
xml ACE
xml adaptive thresholds
xml ASH
xml Audit Vault
xml AWR
xml Blogging
xml conferences
xml Cuddly Toys
xml Database Refresh
xml DBMS_STATS
xml Direct Path Reads
xml Fun
xml grid control
xml hotsos 2010
xml listener
xml Locking
xml oow
xml oow2009
xml optimiser
xml OTN
xml Parallel
xml Partitions
xml Patching
xml swingbench
xml The Reality Gap
xml time matters
xml ukoug
xml ukoug2009
xml Unix/Shell
xml Useful Links

Disclaimer

For the avoidance of any doubt, all views expressed here are my own and not those of past or current employers, clients, friends, Oracle Corporation, my Mum or, indeed, Flatcat. If you want to sue someone, I suggest you pick on Tigger, but I hope you have a good lawyer. Frankly, I doubt any of the former agree with my views or would want to be associated with them in any way.

Design by Andreas Viklund | Conversion to s9y by Carl