Mar 3: Recurring Conversations – Incremental Statistics (Part 2)
Trackbacks
Trackback specific URI for this entry
No Trackbacks
In the first part, I explained that Incremental Statistics are designed to allow a partitioned tables Global Statistics to be updated based on a combination of
1) Statistics gathered by analysing the contents of one or more partitions that have just been loaded or have been updated (and see this blog post for more depth on what 'updated' means!)
2) The statistics of existing partitions which are represented by synopses that are already stored in the SYSAUX tablespace.
Using this combination, we can avoid scanning every partition in the table every time we want to update the Global Stats which is an expensive operation that is likely to be unfeasible every time we load some data into a large table.
For me, the key word here is Incremental. Global Statistic updates are an incremental process, building on previous statistics (represented by the synopses) and only updating the Global Statistics based on the changes introduced by loading new partitions.
Understanding this might clear up another area of confusion I keep coming across. After upgrading their database to 11g, people often want to try out Incremental Global Stats on one of their existing large tables because they've always struggled to keep their Global Stats consistent and up to date. Maybe it's just the sites I work at but I'd say this is the most popular use case. Incrementals for a planned large partitioned table in your new systems might be a sensible idea, but there are a lot more existing systems out there with Global Stats collection problems that people have struggled with for years.
Most people I've spoken to initially had the impression that they simply flick the INCREMENTAL switch and perhaps modify some of the parameters to their existing DBMS_STATS calls so that GRANULARITY is AUTO and they use AUTO sampling sizes. All of which is discussed in the various white papers and blog posts out there.
Then they get a hell of a surprise when the very first gather runs for ages! How long is ages? I don't know in your particular case but I've seen this running for hours and hour and hours and people are crying in to their keyboards wondering why something that was supposed to make things run more quickly is so much slower than their usual stats calls.
The best way I've found to explain this phenomenon is to concentrate on the synopses that describe the existing partitions. Where do you think they come from? How are they calculated and populated if you don't ask Oracle to look at the existing data in your enormous table? That's what needs to happen. In order to make future updates to your Global Stats much more efficient, we first need to establish the baseline describing your existing data that Oracle will use as the foundation for the later incremental updates.
Generating the synopses as the baseline for future improvements will be a relatively painful for the largest tables (if it wasn't, you probably wouldn't be so interested in Incrementals ), but it does only have to happen once. You just need to understand that it does have to happen and plan for it as part of your migration.
My personal suggestion is usually to just delete all of the existing stats and start from scratch with modern default parameter values and tidy up any stats-related junk that might be lingering around large, critical tables. Painful but probably worth it!
#1 - Connor 2014-03-04 02:48 - (Reply)
And just to add...if you flip the switch on incremental stats on a table with many partitions (ie, thousands), then you might get some nasty surprises when your supposed 'faster' gather stats spends more time managing synopses than actually working out any statistics
As with anything...test carefully
#1.1 - Doug Burns said:
2014-03-05 16:25 - (Reply)
Too right! I was going to do one more short post, linking back to the first and the other more detailed blog posts that are out there covering that.
Just wanted to get some of the basics right first
#1.1.1 - antony 2014-04-14 16:13 - (Reply)
'GLOBAL AND PARTITION' will not collect statistics for sub-partitions.Would it be worth including it as side note?
#1.1.1.1 - Doug Burns said:
2014-04-15 06:23 - (Reply)
Yes, excellent point and I should have probably mentioned it. The whole way that Oracle handles stats on subpartitions is a bit of a bolt-on after the partition stats and works a little differently.
Probably even more reason to use AUTO.
#2 - Luis Marques said:
2014-07-07 21:02 - (Reply)
Excelent post Doug.
I've been implementing in a new DW, incremental statistics. All went fine with testing until i've found this:
Bug 13844984 Slow INCREMENTAL Statistics Gathering
That is a really interesting bug, only happens when there's a single column unique index, deleting synopsis takes ages, actually it takes more than deleting all stats and collect again without incremental.
Please be aware
Contents
Part 1 - Default options - GLOBAL AND PARTITION
Part 2 - Estimated Global Stats
Part 3 - Stats Aggregation Problems I
Part 4 - Stats Aggregation Problems II
Part 5 - Minimal Stats Aggregation
Part 6a - COPY_TABLE_STATS - Intro
Part 6b - COPY_TABLE_STATS - Mistakes
Part 6c - COPY_TABLE_STATS - Bugs and Patches
Part 6d - COPY_TABLE_STATS - A Light-bulb Moment
Part 6e - COPY_TABLE_STATS - Bug 10268597
A couple of posts about Incremental Stats confusion
Part 1
Part 2
Design by Andreas Viklund | Conversion to s9y by Carl