Statistics on Partitioned Tables - Part 6a - C ...

Doug's Oracle Blog

  • Home
  • Papers
  • Books
  • C.V.
  • Fun
  • Oracle Blog
  • Personal Blog

Apr 22: Statistics on Partitioned Tables - Part 6a - COPY_TABLE_STATS - Intro

[Phew. At last. The first draft of this was dated more than two weeks ago .... One of the problems with blogging about copying stats was the balance between explaining it and pointing out some of the problems I've encountered. So I've broken up this post, with a little explanation first ...]

Oracle have a lot of on-site consultants, designers and developers who are out working with customers on very large databases and they no doubt have their own internal systems too, so you shouldn't make the mistake of thinking they're unaware of the problems that people face in gathering accurate and timely statistics on large partitioned tables. I don't say that based on any inside knowledge but by

  • Attending conferences, listening to presentations, reading White Papers and blog posts; and
  • Because they are constantly introducing new features to try to address the issues.
Over the next few posts, I'll look at some of those new features, some of which appear more successful than others. First of all, copying statistics using DBMS_STATS.COPY_TABLE_STATS.

Copying stats is a variation of a technique I've seen used on a couple of Oracle Data Warehouse projects in the past where we decided that spending time and system resources gathering object statistics wasn't useful. Think about some of the problems you'll face when gathering stats on large partitioned objects that I've highlighted so far :-

  • It takes time.

  • It sucks system resources.

  • As you change your strategy and the parameters to reduce the resource and time consumption, the stats are likely to be less accurate.
Then consider why we're gathering these stats in the first place :-

  • To describe database objects and data to the optimiser by setting values in the data dictionary (e.g. Number of Rows, Number of Distinct Values, etc.)

  • To help the optimiser identify the optimal execution plan
If we know the precise contents of specific objects at all times (unlikely, but bear with me), why bother trawling through the database examining the contents when we could just set the relevant values in the data dictionary manually? DBMS_STATS has several procedures that can help you achieve that, for example GET_COLUMN_STATS, SET_COLUMN_STATS, SET_TABLE_STATS etc.

We could set the stats for specific tables, indexes or partitions manually and then choose not to gather stats for those objects. As I said, you really need to know your data for this to work, although I would suggest that it's not as important that the stats are absolutely precise as that they are an accurate enough description of the data to lead to optimal plans.

There are a couple of cases where this might prove particularly useful.

  • You add a new partition each month and so the partition starts out empty and fills up gradually over the course of the month. If you think about it, this is likely to lead to fluctuating execution plans as the contents of the partition change. That's what a cost-based optimiser does - calculates new plans based on changing data distribution - but it also implies plan instability and you might want to avoid that if possible. By setting stats manually you can 'pretend' that you start off with a full partition and have the optimiser evaluate plans based on that fixed assumption.

  • You add new partitions very frequently and have very little time to gather statistics before users are likely to report against the data. It might prove useful to set some approximate synthetic stats quickly to get reasonable plans and then gather more detailed stats later when time allows.
The latter is similar to the situation we're faced with on the system I'm working on at the moment, so we were very keen to investigate copying stats as a means to reduce the stats gathering workload and improve the stats quality into the bargain.

COPY_TABLE_STATS when used on partitioned tables is designed to achieve a similar effect to setting stats manually at a similarly low cost but with a little more intelligence. For example, if we are adding a new partition for each new REPORTING_DATE on our example table, maybe it's reasonable to assume that the data volumes and distribution of values is similar to the previous REPORTING_DATE? If so, why not avoid setting fixed stats based on fixed assumptions about the contents of a partition, but copy the stats from a previous partition to the new partition?

Here is a very basic example of how this might look, based on my example table.

At the end of Part 5, the stats on TEST_TAB1 looked like this (column stats excluded to simplify things). Important Note - the bold text in this output has been edited to correct an earlier error in this post. See part 6b for explanation.

SQL> select  table_name, global_stats, last_analyzed, num_rows
  2  from dba_tables
  3  where table_name='TEST_TAB1'
  4  and owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     GLO LAST_ANALYZED          NUM_ROWS                                           
------------------------------ --- -------------------- ----------                                          
TEST_TAB1                      NO                                                                           

SQL> select  table_name, partition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_partitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     PARTITION_NAME                 GLO LAST_ANALYZED          NUM_ROWS           
------------------------------ ------------------------------ --- -------------------- ----------           
TEST_TAB1                      P_20100131                     NO                                            
TEST_TAB1                      P_20100201                     NO                                            
TEST_TAB1                      P_20100202                     NO                                            
TEST_TAB1                      P_20100203                     NO                                            
TEST_TAB1                      P_20100204                     NO                                            
TEST_TAB1                      P_20100205                     NO                                            
TEST_TAB1                      P_20100206                     NO                                            
TEST_TAB1                      P_20100207                     NO                                            
TEST_TAB1                      P_20100209                     NO  28-MAR-2010 15:38:33         12           

9 rows selected.

SQL> select  table_name, subpartition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_subpartitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     SUBPARTITION_NAME              GLO LAST_ANALYZED          NUM_ROWS           
------------------------------ ------------------------------ --- -------------------- ----------           
TEST_TAB1                      P_20100131_GROT                NO                                            
TEST_TAB1                      P_20100131_HALO                NO                                            
TEST_TAB1                      P_20100131_JUNE                NO                                            
TEST_TAB1                      P_20100131_OTHERS              NO                      

<<snipped>>                                              
 
TEST_TAB1                  P_20100209_GROT            NO  28-MAR-2010 15:38:32      3           
TEST_TAB1                  P_20100209_HALO            NO  28-MAR-2010 15:38:32      3           
TEST_TAB1                  P_20100209_JUNE            NO  28-MAR-2010 15:38:32      3           
TEST_TAB1                  P_20100209_OTHERS          NO  28-MAR-2010 15:38:33      3           

36 rows selected.

Next I'm going to add a new partition (and by implication, the related subpartitions) to contain rows with a REPORTING_DATE of 20100210, then empty LOAD_TAB1 and insert some appropriate rows into it.

SQL> ALTER TABLE TEST_TAB1
  2  ADD  PARTITION P_20100210 VALUES LESS THAN (20100211);

Table altered.

SQL> TRUNCATE TABLE LOAD_TAB1;

Table truncated.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 1000, 'P');

1 row created.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 30000, 'P');

1 row created.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 2000, 'P');

1 row created.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 10000, 'N');

1 row created.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 2400, 'P');

1 row created.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 500, 'P');

1 row created.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 1200, 'P');

1 row created.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 400, 'P');

1 row created.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 600, 'P');

1 row created.

SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 700, 'Z');

1 row created.

SQL> COMMIT;

Commit complete.

There are 10 rows that will be loaded into the new partition, all with a SOURCE_SYSTEM of GROT, so all will be added to that subpartition. Now I'll use partition exchange to move the data in LOAD_TAB1 into the P_20100210_GROT subpartition.

SQL> ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20100210_GROT WITH TABLE load_tab1;

Table altered.

SQL> ALTER TABLE test_tab1 MODIFY SUBPARTITION  P_20100210_GROT REBUILD UNUSABLE LOCAL INDEXES;

Table altered.

Now to copy some stats. First I'll try copying the statistics from the previous partition (P_20100209) to the new partition. i.e. I am performing a Partition-level copy. Important Note - the bold text in this output has been edited to correct an earlier error in this post. See part 6b for explanation.

SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 

'P_20100209', dstpartname => 'P_20100210');

PL/SQL procedure successfully completed.

Let's see what has been copied

SQL> select  table_name, global_stats, last_analyzed, num_rows
  2  from dba_tables
  3  where table_name='TEST_TAB1'
  4  and owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     GLO LAST_ANALYZED          NUM_ROWS                                       
------------------------------ --- -------------------- ----------                                       
TEST_TAB1                      NO                                                                        

SQL>
SQL> select  table_name, partition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_partitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     PARTITION_NAME                 GLO LAST_ANALYZED          NUM_ROWS        
------------------------------ ------------------------------ --- -------------------- ----------        
TEST_TAB1                      P_20100131                     NO                                         
TEST_TAB1                      P_20100201                     NO                                         
TEST_TAB1                      P_20100202                     NO                                         
TEST_TAB1                      P_20100203                     NO                                         
TEST_TAB1                      P_20100204                     NO                                         
TEST_TAB1                      P_20100205                     NO                                         
TEST_TAB1                      P_20100206                     NO                                         
TEST_TAB1                      P_20100207                     NO                                         
TEST_TAB1                      P_20100209                     NO  28-MAR-2010 15:38:38         12        
TEST_TAB1                      P_20100210                     NO                                         

10 rows selected.

SQL> select  table_name, subpartition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_subpartitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     SUBPARTITION_NAME              GLO LAST_ANALYZED          NUM_ROWS        
------------------------------ ------------------------------ --- -------------------- ----------        
TEST_TAB1                      P_20100131_GROT                NO                                         
TEST_TAB1                      P_20100131_HALO                NO                                         

<<snipped>>

TEST_TAB1                  P_20100209_GROT            NO  28-MAR-2010 15:38:32      3        
TEST_TAB1                  P_20100209_HALO            NO  28-MAR-2010 15:38:32      3        
TEST_TAB1                  P_20100209_JUNE            NO  28-MAR-2010 15:38:32      3        
TEST_TAB1                  P_20100209_OTHERS          NO  28-MAR-2010 15:38:33      3        
TEST_TAB1                  P_20100210_GROT            NO  28-MAR-2010 15:38:33      3    
TEST_TAB1                      P_20100210_HALO                NO                                         
TEST_TAB1                      P_20100210_JUNE                NO                                         
TEST_TAB1                      P_20100210_OTHERS              NO                                         

40 rows selected.

Mmmm, that's quite interesting. Although there were valid stats on all of the P_20100209 subpartitions, it looks like nothing was copied. It appears that in our situation, where we only gather stats on subpartitions and allow them to aggregate up to the partition and table levels, I need to copy the statistics subpartition-by-subpartition, which is what I'll start looking at in the next post.

Important Note - Please go on to read part 6b. An error on my part in the output I pasted above made it look like there no statistics copied at all. In fact, subpartition statistics for P_20100210_GROT were copied but it's still a confusing picture because I was using a Partition-level copy and yet there are no copied statistics for three of the subpartitions even though the source subpartitions had valid stats, nor for the partition itself.

Which means it's still true that the Partition-level copy will not serve our particular purposes, so next I'll try copying statistics at the individual subpartition level.

I've updated the date of this post so hopefully it will re-appear in aggregators for those who read it earlier.



Posted by Doug Burns Comments: (4) Trackbacks: (6)

Trackbacks
Trackback specific URI for this entry

Statistics on Partitioned Tables - Contents
When Jonathan Lewis decided it was time to post a list of the Partition Stats posts on his blog and Noons suggested I made them easier to track down, I listened. So this post will link to the others and, at least in the short term, I've also included link
Weblog: Doug's Oracle Blog
Tracked: Apr 13, 05:55
PingBack
Weblog: jonathanlewis.wordpress.com
Tracked: Apr 13, 06:05
Statistics on Partitioned Tables - Part 6b - COPY_TABLE_STATS
Sigh ... these posts have become a bit of a mess. There are so many different bits and pieces I want to illustrate and I've been trying to squeeze them in around normal work. Worse still, because I keep leaving them then coming back to them and re-running
Weblog: Doug's Oracle Blog
Tracked: Apr 22, 11:53
Statistics on Partitioned Tables - Part 6b - COPY_TABLE_STATS - Mistakes
Sigh ... these posts have become a bit of a mess. There are so many different bits and pieces I want to illustrate and I've been trying to squeeze them in around normal work. Worse still, because I keep leaving them then coming back to them and re-running
Weblog: Doug's Oracle Blog
Tracked: Apr 22, 12:25
Statistics on Partitioned Tables - Part 6c - COPY_TABLE_STATS - Bugs and Patches
I wanted to talk about a few of the bugs and patches you need to be aware of if you plan to use DBMS_STATS.COPY_TABLE_STATS. Believe me, when entering the world of stats on (sub-)partitioned objects, you had better be prepared to spend a lot of time on
Weblog: Doug's Oracle Blog
Tracked: May 02, 23:53
PingBack
Weblog: sshailesh.wordpress.com
Tracked: May 08, 17:09

Comments
Display comments as (Linear | Threaded)

#1 - Pete Scott said:
2010-04-13 12:36 - (Reply)

Well worth the wait :-)
I couldn't agree more with the point about plan stability - in production data warehouses (in particular) it is about getting a consistent performance on a query (I will avoid my rant about adaptive parallel - that's one for the pub and not your blog!) and if something always comes back in 3 minutes I get happy users, sometimes 10 seconds or sometimes 2 hours I don't have happy users.

I think that when you copy statistics you need to have good feeling for the data and know what can be copied - some businesses are highly seasonal although May stats might be 'close enough' in June they are hopelessly wrong in December.

There is also another use for stats copying - this is the preparation of test cases for Oracle support - sometimes a problem that occurs with data warehouse sized volumes can be replicated by a couple of rows and the stats from the full size table.

can't wait until the next part...

#1.1 - Doug Burns said:
2010-04-15 20:23 - (Reply)

I will avoid my rant about adaptive parallel - that's one for the pub and not your blog!

Circa 2004, if not earlier ;-)

May stats might be 'close enough' in June they are hopelessly wrong in December.

... and what's good about copy stats is that you can choose the source, rather than setting fixed stats.

#1.1.1 - Pete Scott said:
2010-04-15 23:29 - (Reply)

indeed yes, the key thing is to choose the *right* source for the new partition stats and not to perpetuate bad stats "because you always done them like that"

Making the right choice needs understanding of the data, or at least a bit more that there's 126,000 rows

#2 - maclean said:
2010-08-06 06:39 - (Reply)

DBMS_STATS.COPY_TABLE_STATS may give wrong stats on 10.2.0.4 and 11.1.0.7, I will write note about that.


Add Comment

Standard emoticons like :-) and ;-) are converted to images.
E-Mail addresses will not be displayed and will only be used for E-Mail notifications.
BBCode format allowed
 
 

Statistics on Partitioned Tables

Contents

Part 1 - Default options - GLOBAL AND PARTITION
Part 2 - Estimated Global Stats
Part 3 - Stats Aggregation Problems I
Part 4 - Stats Aggregation Problems II
Part 5 - Minimal Stats Aggregation
Part 6a - COPY_TABLE_STATS - Intro
Part 6b - COPY_TABLE_STATS - Mistakes
Part 6c - COPY_TABLE_STATS - Bugs and Patches
Part 6d - COPY_TABLE_STATS - A Light-bulb Moment
Part 6e - COPY_TABLE_STATS - Bug 10268597

A couple of posts about Incremental Stats confusion

Part 1
Part 2

Comments

personal blog about Moving Sideways
Wed, 01.06.2016 18:34
That is a good tip particularl y to those fresh to the blogos phere. Short [...]
odziezprestige.pl about Moving Sideways
Wed, 01.06.2016 17:07
Please let me know if you're l ooking for a article writer fo r your site. [...]
Doug Burns about Moving Sideways
Tue, 10.05.2016 22:43
Oh, I won't give it that long unless I enjoy it ;-)

Bookmark

Open All | Close All

Syndicate This Blog

  • XML RSS 2.0 feed
  • ATOM/XML ATOM 1.0 feed
  • XML RSS 2.0 Comments
  • Feedburner Feed

Powered by

Serendipity PHP Weblog

Show tagged entries

xml 11g
xml ACE
xml adaptive thresholds
xml ASH
xml Audit Vault
xml AWR
xml Blogging
xml conferences
xml Cuddly Toys
xml Database Refresh
xml dbms_stats
xml Direct Path Reads
xml Fun
xml Grid Control
xml hotsos 2010
xml listener
xml locking
xml oow
xml oow2009
xml optimiser
xml OTN
xml Parallel
xml partitions
xml Patching
xml Swingbench
xml The Reality Gap
xml Time Matters
xml ukoug
xml ukoug2009
xml Unix/Shell
xml Useful Links

Disclaimer

For the avoidance of any doubt, all views expressed here are my own and not those of past or current employers, clients, friends, Oracle Corporation, my Mum or, indeed, Flatcat. If you want to sue someone, I suggest you pick on Tigger, but I hope you have a good lawyer. Frankly, I doubt any of the former agree with my views or would want to be associated with them in any way.

Design by Andreas Viklund | Conversion to s9y by Carl