Feb 28: Statistics on Partitioned Tables - Part 5
Although there might be other ways of doing it, there are two distinct approaches you are likely to use.
1) Create a temporary load table, load it with data, gather statistics on it and then exchange it with the relevant subpartition in the real table.
2) Create a temporary load table, load it with data, exchange it with the relevant subpartition and then gather stats on the subpartition.
Pete Scott left a comment on a previous post stating that he rarely uses approach 1, so no doubt he'll leave another comment here expanding on his reasons.
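Before getting into the detail, here's a rough sketch of what approach 2 looks like, reusing the table, load table and subpartition names from the examples further down - exchange first, then gather stats directly against the exchanged subpartition (adjust the names and granularity to suit):

SQL> ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20100209_GROT WITH TABLE load_tab1;
SQL> exec dbms_stats.gather_table_stats( -
>      ownname     => 'TESTUSER', -
>      tabname     => 'TEST_TAB1', -
>      partname    => 'P_20100209_GROT', -
>      granularity => 'SUBPARTITION');

The rest of this post works through approach 1.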

First of all I'll recreate TEST_TAB1 as it was at the start of the series, add a new partition (and, by implication, the related subpartitions) and create a separate table that I'll load the data into.
SQL> ALTER TABLE TEST_TAB1
  2  ADD PARTITION P_20100209 VALUES LESS THAN (20100210);

Table altered.

SQL> DROP TABLE LOAD_TAB1;

Table dropped.

SQL> CREATE TABLE LOAD_TAB1
  2  AS SELECT * FROM TEST_TAB1 WHERE 1=0;

Table created.

SQL> CREATE UNIQUE INDEX LOAD_TAB1_IX1 ON LOAD_TAB1
  2  (REPORTING_DATE, SOURCE_SYSTEM, SEQ_ID)
  3  NOPARALLEL COMPRESS 1;

Index created.
Now I'll use LOAD_TAB1 to repeat the same process for the four different subpartitions - INSERT data into LOAD_TAB1, gather stats on it and then exchange it with the relevant subpartition of TEST_TAB1.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'GROT', 400, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'GROT', 600, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'GROT', 900, 'Z');
1 row created.
SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1');
PL/SQL procedure successfully completed.
SQL> ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20100209_GROT WITH TABLE load_tab1;
Table altered.
SQL> ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20100209_GROT REBUILD UNUSABLE LOCAL INDEXES;
Table altered.
SQL> TRUNCATE TABLE LOAD_TAB1;
Table truncated.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'JUNE', 400, 'U');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'JUNE', 600, 'U');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'JUNE', 900, 'U');
1 row created.
SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1');
PL/SQL procedure successfully completed.
SQL> ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20100209_JUNE WITH TABLE load_tab1;
Table altered.
SQL> ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20100209_JUNE REBUILD UNUSABLE LOCAL INDEXES;
Table altered.
SQL> TRUNCATE TABLE LOAD_TAB1;
Table truncated.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'HALO', 400, 'N');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'HALO', 600, 'N');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'HALO', 900, 'N');
1 row created.
SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1');
PL/SQL procedure successfully completed.
SQL> ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20100209_HALO WITH TABLE load_tab1;
Table altered.
SQL> ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20100209_HALO REBUILD UNUSABLE LOCAL INDEXES;
Table altered.
SQL> TRUNCATE TABLE LOAD_TAB1;
Table truncated.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'ZZZZ', 400, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'ZZZZ', 600, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'ZZZZ', 900, 'Z');
1 row created.
SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1');
PL/SQL procedure successfully completed.
SQL> ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20100209_OTHERS WITH TABLE load_tab1;
Table altered.
SQL> ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20100209_OTHERS REBUILD UNUSABLE LOCAL INDEXES;
Table altered.
All of the P_20100209 subpartitions have stats that were swapped in as part of the partition exchange operation so hopefully there'll be some aggregated global statistics.
SQL> select table_name, global_stats, last_analyzed, num_rows
  2  from dba_tables
  3  where table_name='TEST_TAB1'
  4  and owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     GLO LAST_ANALYZED          NUM_ROWS
------------------------------ --- -------------------- ----------
TEST_TAB1                      NO

SQL> select table_name, partition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_partitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     PARTITION_NAME                 GLO LAST_ANALYZED          NUM_ROWS
------------------------------ ------------------------------ --- -------------------- ----------
TEST_TAB1                      P_20100131                     NO
TEST_TAB1                      P_20100201                     NO
TEST_TAB1                      P_20100202                     NO
TEST_TAB1                      P_20100203                     NO
TEST_TAB1                      P_20100204                     NO
TEST_TAB1                      P_20100205                     NO
TEST_TAB1                      P_20100206                     NO
TEST_TAB1                      P_20100207                     NO
TEST_TAB1                      P_20100209                     NO

9 rows selected.
Oh, well, that doesn't seem to have worked. Maybe the LOAD_TAB1 stats weren't gathered correctly or didn't appear as part of the subpartition exchange operation?
SQL> select table_name, subpartition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_subpartitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     SUBPARTITION_NAME              GLO LAST_ANALYZED          NUM_ROWS
------------------------------ ------------------------------ --- -------------------- ----------
TEST_TAB1                      P_20100131_GROT                NO
TEST_TAB1                      P_20100131_HALO                NO
TEST_TAB1                      P_20100131_JUNE                NO
TEST_TAB1                      P_20100131_OTHERS              NO

<<output snipped>>

TEST_TAB1                      P_20100209_GROT                NO  28-FEB-2010 21:41:47          3
TEST_TAB1                      P_20100209_HALO                NO  28-FEB-2010 21:41:49          3
TEST_TAB1                      P_20100209_JUNE                NO  28-FEB-2010 21:41:49          3
TEST_TAB1                      P_20100209_OTHERS              NO  28-FEB-2010 21:41:50          3

36 rows selected.
The subpartition stats are ok, then, but the aggregation process hasn't happened. That's because _minimal_stats_aggregation is set to TRUE (the default), which instructs Oracle to minimise aggregation operations; one of the ways it does so is to not aggregate statistics as a result of a partition exchange operation, leaving you to do that manually by gathering stats on the table partition. If we were to modify the parameter to a non-default value (and, being an underscore parameter, that's your own choice at your own risk ...), we would see different behaviour. I ran the same script, but with this small addition that changes the parameter setting at the session level.
SQL> alter session set "_minimal_stats_aggregation"=FALSE;

Session altered.
Which will change the end result to this ...
SQL> select table_name, global_stats, last_analyzed, num_rows
  2  from dba_tables
  3  where table_name='TEST_TAB1'
  4  and owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     GLO LAST_ANALYZED          NUM_ROWS
------------------------------ --- -------------------- ----------
TEST_TAB1                      NO

SQL> select table_name, partition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_partitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     PARTITION_NAME                 GLO LAST_ANALYZED          NUM_ROWS
------------------------------ ------------------------------ --- -------------------- ----------
TEST_TAB1                      P_20100131                     NO
TEST_TAB1                      P_20100201                     NO
TEST_TAB1                      P_20100202                     NO
TEST_TAB1                      P_20100203                     NO
TEST_TAB1                      P_20100204                     NO
TEST_TAB1                      P_20100205                     NO
TEST_TAB1                      P_20100206                     NO
TEST_TAB1                      P_20100207                     NO
TEST_TAB1                      P_20100209                     NO  28-FEB-2010 21:41:53         12

9 rows selected.
Note that there are still no statistics at the table level because not all of the partitions have stats yet, so aggregation can't take place, but there are aggregated statistics on the P_20100209 partition, because all of the relevant subpartitions do have stats.
All you need to remember is that the default setting of _minimal_stats_aggregation means that, unless you explicitly gather statistics on the partitions you've just exchanged, aggregation will not take place! Actually, copying stats will invoke the aggregation process too, but I'll deal with that in the next post. (Updated later. That last sentence might not be true. I've just tried something at home and I'm seeing different results at work, so more investigation is needed.)
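In practice, with the default setting, that means following the exchanges with an explicit gather against the affected partition. A rough sketch, reusing the names from the example above (adjust the granularity to whatever your site standard is):

SQL> exec dbms_stats.gather_table_stats( -
>      ownname     => 'TESTUSER', -
>      tabname     => 'TEST_TAB1', -
>      partname    => 'P_20100209', -
>      granularity => 'PARTITION');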
Oh, and there's much more on this subject over on Randolf Geist's blog post.
#1 - Noons said:
2010-02-28 23:38 - (Reply)
I was waiting for this post to come through. Yup, no doubt about it: you just gave me heaps of ideas to solve a big problem I have with our partition archiving.
Thanks heaps, man!
#2 - Doug Burns said:
2010-02-28 23:48 - (Reply)
No problem. Glad it's helping and actually, spawning ideas works just as well for me as coming up with answers!
#3 - Pete Scott said:
2010-03-03 08:31 - (Reply)
I have tried hard not to respond to Doug's challenge. I will write on the Rittman Mead blog about this - but first I need to submit some conference slides.
But a quick reason why I am more of an "exchange and then gather stats" person is that I don't always have my exchange table indexed in the same way as the partitioned table, and if Oracle was, say, gathering column-level stats on the indexed columns I would not have a like-for-like statistics exchange... I suppose this 'not having indexes' is a hangover from some generic PL/SQL we developed for a system with 80+ partitioned tables and a (far too) large number of tablespaces, and the need to create on the fly the table we were to exchange in the correct tablespace; adding the indexes seemed a step too far in getting the number of moving parts down.
Now if there was a "CREATE EMPTY TABLE AS COPY OF xxx INCLUDING INDEXES" option then I might have worked harder on the second option and also found the issue Doug describes so well here.
#4 - Doug Burns said:
2010-03-03 09:01 - (Reply)
Yeah, yeah, yeah ... we *all* have conference slides to get ready
Thanks for the thoughts, though - makes sense now you explain it that way.
I think one of the reasons why we gather before the exchange is that we want to eliminate the small lag while a subpartition has its stats gathered after the exchange, but I wasn't involved in the initial design or implementation to be honest.
#5 - Pete Scott said:
2010-03-03 09:32 - (Reply)
Yes, that can be a valid reason - I had the luxury of having time to gather more accurate stats before the users started to hit the newly loaded partitions....
#6 - Doug Burns said:
2010-03-03 09:48 - (Reply)
Yeah, we're near-real-time all the way round here - one of the consequences of global organisations.
But now I believe I'm straying onto the territory of your upcoming presentation.
#7 - Kamus said:
2010-03-30 15:35 - (Reply)
Very impressive series; next time I will use this topic as a presentation at the ACOUG conference.
#8 - Hans-Peter
2010-03-31 07:25 - (Reply)
Hi Doug,
In one of our databases I saw a non partitioned table for which global_stats was set to NO.
How can that happen?
regards Hans-Peter
#9 - Hans-Peter
2010-03-31 13:46 - (Reply)
Furthermore, I see partitions with no subpartitions which have global_stats set to NO (version 9.2.0...).
While replaying this part of your series on a 10.2.0.4 database, my subpartitions of partition P_20100209 have global_stats set to 'YES', whereas your test shows NO.
Can you elaborate on these issues?
regards Hans-Peter
#10 - Doug Burns said:
2010-04-04 23:42 - (Reply)
Hi Hans-Peter,
The reason that immediately jumps to mind is that stats were gathered using the ANALYZE command (which doesn't collect Global Stats). I just tested this on a freshly created non-partitioned table in 10.2.0.4. dbms_stats.gather_table_stats with default options leaves GLOBAL_STATS=YES, analyze table compute statistics leaves it set to NO. You could try the same approach yourself quite quickly and should see the same result.
This is another danger when dealing with stats on complex databases - well-meaning people with out-of-date knowledge can use ANALYZE TABLE and screw everything up.
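If you want to reproduce that quick test, something along these lines should do it (GS_TEST is just a made-up name) - the first SELECT should show GLOBAL_STATS=NO and the second YES.

SQL> CREATE TABLE GS_TEST AS SELECT * FROM ALL_OBJECTS WHERE ROWNUM <= 100;
SQL> ANALYZE TABLE GS_TEST COMPUTE STATISTICS;
SQL> SELECT GLOBAL_STATS FROM USER_TABLES WHERE TABLE_NAME = 'GS_TEST';
SQL> exec dbms_stats.gather_table_stats(USER, 'GS_TEST');
SQL> SELECT GLOBAL_STATS FROM USER_TABLES WHERE TABLE_NAME = 'GS_TEST';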
#11 - Doug Burns said:
2010-04-04 23:45 - (Reply)
So, to reiterate an important message - if your tables are partitioned you should be using DBMS_STATS.
Actually, if your tables *aren't* partitioned, you should still be using DBMS_STATS
#12 - Doug Burns said:
2010-04-04 23:51 - (Reply)
Sorry, Hans-Peter, I wasn't able to approve this comment when the spam blocker caught it, due to outages.
On the partitions with global_stats=NO, maybe my other reply answers that question?
As for the difference in test results, could you perhaps email me all of the steps you ran? Because I'm cutting and pasting pieces from a larger test script, it's possible I've screwed up somewhere and I would like to test your steps on my db to confirm.
It shouldn't be too hard for you to work out my mail address from the rest of the site
#13 - Bob Carlin
2010-04-20 23:34 - (Reply)
One of the new capabilities in 11g is to derive global stats from the partition stats. This means that when you add a new partition, you can gather the partition's stats and then re-compute (not gather) the global stats. The feature is known as Incremental Statistics.
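As a rough sketch of how that gets switched on (using the table from the examples in this series and assuming 11g, with defaults for everything else):

SQL> exec dbms_stats.set_table_prefs('TESTUSER', 'TEST_TAB1', 'INCREMENTAL', 'TRUE');
SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'AUTO');

After that, table-level gathers derive the global figures from the partition-level synopses rather than re-scanning every partition.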
#14 - Doug Burns said:
2010-04-21 08:09 - (Reply)
Hi Bob,
That's where I'm actually going with this series, it's just taking me a while to get there
Cheers,
Doug
#15 - Jeff Moss said:
2010-07-13 09:32 - (Reply)
I was having this problem (minimal stats aggregation) with a PEL process and knew you had this series here, so after reading it, I understood my problem and was able to sort our stats out.
Thanks mate.
Jeff
#16 - Doug Burns said:
2010-07-13 17:36 - (Reply)
No problem, mate. One day I'll finish them
#17 - hemant
2011-11-16 16:42 - (Reply)
Are you planning to update these docs for 11.2.0.2? I was wondering if you have any research on using scale_factor with some kind of average for past columns.
#18 - Doug Burns said:
2011-11-27 13:19 - (Reply)
Sorry for the delay in replying - I've been up to my eyeballs in conferences, mail etc.
I'm constantly looking at new stats improvements, how incrementals and copy_stats are coming along and will pick up these blog posts at some point in the future.
However, given the current state of copy_stats (for my client's needs at least), we'll probably address those issues with Oracle first before looking at scale_factor. Whilst scale_factor seems a logical addition, I'm not sure we're likely to use it much because, if you use a combination of gathered and copied stats, I think the copied stats should reflect changes in data distribution quite well.