May 7: Statistics on Partitioned Tables - Part 6d - COPY_TABLE_STATS - A Light-bulb Moment
I'm pretty self-concious of the amount of waffle that surrounds any technical content here, so let's get the technical bit out of the way first, then the waffling can come later ...
I finally tracked down the mistake I didn't make in part 6a, but thought I'd identified and fixed in part 6b! Here are the two sets of subpartitions for the P_20100209 partition that's the source of the statistics and for the new P_20100210 partition that the stats were copied to. This is how part 6a originally looked
and here it is after I'd fixed it.
So originally I had seen no statistics on the P_20100210_GROT partition after the call to COPY_TABLE_STATS but then I did see statistics on a later run so I edited the post to reflect that.
Why the difference? Well it's not a COPY_TABLE_STATS issue and COPY_TABLE_STATS does not copy statistics for one subpartition and not the others. It just doesn't make sense and never did but I somehow convinced myself it did. Here's where those subpartition stats actually came from and you can walk through the script and output I posted earlier to see this for yourself.
(Note - I actually repeat steps 1 and 2 twice - once with _minimal_stats_aggregation set to it's default value of TRUE and then with _minimal_stats_aggregation set to FALSE. But both runs have the same error, so you can refer to either)
1) I create LOAD_TAB1 and re-use it 4 times to load 3 rows into each of the 4 new subpartitions using subpartition exchange. Here's an example of one of those loads.
2) The subpartition stats now look like this
3) Now that I have the P_20100209 subpartition stats, I'll add a new partition, populate the P_20100210_GROT subpartition with 10 rows using the same LOAD_TAB1 and subpartition exchange and then copy the stats from the previous partition.
4) The subpartition stats now look like this
so it looks like COPY_TABLE_STATS copied stats for just the subpartition I just loaded via subpartition exchange, but why didn't it copy the stats for the other subpartitions? The answer is that it didn't copy any stats for subpartitions. The stats for P_20100210_GROT were created because we exchanged LOAD_TAB1 with that subpartition and LOAD_TAB1 still had statistics from the previous time it was used!
If I had added a gather_stats call on LOAD_TAB1 immediately after populating it with data (as I did in earlier parts of the test), the stats would still have appeared on P_20100210_GROT, but NUM_ROWS of 10 would have highlighted that these were not copied stats, but stats from the load table. Alternatively, if I had dropped and recreated LOAD_TAB1 each time (or used a different load table entirely) and not gathered stats on it (as you wouldn't if you were using COPY_TABLE_STATS) then I would have seen what I did at the outset - no subpartition stats at all! That would be a much closer version of how you might actually use this in production.
A final alternative would have been to delete the inappropriate stats from LOAD_TAB1 when I reuse it. That's the approach I've used in a very slightly modified version of stats_5_6a.txt that I've put here (just search for delete_table_stats).
The Waffle
As well as being an illustration of how easy it is to screw up tests when you start cutting and pasting sections from other scripts to (ha!) simplify things, this is a poster-child for discussing things with others. It was only when I sat down with my colleague Jari Kuhanen (excellent Oracle guy - looking for work soon
) to go over another aspect of these tests and he asked some pertinent questions, that the problem suddenly hit me. He didn't need to tell me what the problem was - just him asking good questions made me think about it a different way and I suddenly knew what the issue was.
I keep saying this, but one of my favourite aspects of blogging is making and fixing your mistakes in public. It's often the mistakes that provide the real learning experiences and it's also an extension of discussing your ideas with others. However, the down-side is that you can end up with an unholy, confusing mess of different, contradictory posts and it can be hard to judge what you should edit as a result. I never edit posts meaningfully without making it pretty clear I've done so and, although I'm tempted to tidy up this mess to make all the posts more consistent, I've decided I'd rather leave the meandering 'story' as is. No promises on when this might happen but, when these posts are all done, I think there's no avoiding a white paper about managing statistics on partitioned objects that will be coherent and correct. For now, let's leave the mess as it is
I finally tracked down the mistake I didn't make in part 6a, but thought I'd identified and fixed in part 6b! Here are the two sets of subpartitions for the P_20100209 partition that's the source of the statistics and for the new P_20100210 partition that the stats were copied to. This is how part 6a originally looked
TEST_TAB1 P_20100209_GROT NO 22-APR-2010 11:24:12 3 TEST_TAB1 P_20100209_HALO NO 22-APR-2010 11:24:13 3 TEST_TAB1 P_20100209_JUNE NO 22-APR-2010 11:24:13 3 TEST_TAB1 P_20100209_OTHERS NO 22-APR-2010 11:24:13 3 TEST_TAB1 P_20100210_GROT NO TEST_TAB1 P_20100210_HALO NO TEST_TAB1 P_20100210_JUNE NO TEST_TAB1 P_20100210_OTHERS NO
and here it is after I'd fixed it.
TEST_TAB1 P_20100209_GROT NO 22-APR-2010 11:24:12 3 TEST_TAB1 P_20100209_HALO NO 22-APR-2010 11:24:13 3 TEST_TAB1 P_20100209_JUNE NO 22-APR-2010 11:24:13 3 TEST_TAB1 P_20100209_OTHERS NO 22-APR-2010 11:24:13 3 TEST_TAB1 P_20100210_GROT NO 22-APR-2010 11:24:10 3 TEST_TAB1 P_20100210_HALO NO TEST_TAB1 P_20100210_JUNE NO TEST_TAB1 P_20100210_OTHERS NO
So originally I had seen no statistics on the P_20100210_GROT partition after the call to COPY_TABLE_STATS but then I did see statistics on a later run so I edited the post to reflect that.
Why the difference? Well it's not a COPY_TABLE_STATS issue and COPY_TABLE_STATS does not copy statistics for one subpartition and not the others. It just doesn't make sense and never did but I somehow convinced myself it did. Here's where those subpartition stats actually came from and you can walk through the script and output I posted earlier to see this for yourself.
(Note - I actually repeat steps 1 and 2 twice - once with _minimal_stats_aggregation set to it's default value of TRUE and then with _minimal_stats_aggregation set to FALSE. But both runs have the same error, so you can refer to either)
1) I create LOAD_TAB1 and re-use it 4 times to load 3 rows into each of the 4 new subpartitions using subpartition exchange. Here's an example of one of those loads.
SQL> TRUNCATE TABLE LOAD_TAB1;
Table truncated.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'JUNE', 400, 'U');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'JUNE', 600, 'U');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100209, 'JUNE', 900, 'U');
1 row created.
SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1');
PL/SQL procedure successfully completed.
SQL> ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20100209_JUNE WITH TABLE load_tab1;
Table altered.
SQL> ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20100209_JUNE REBUILD UNUSABLE LOCAL INDEXES;
Table altered2) The subpartition stats now look like this
TABLE_NAME SUBPARTITION_NAME GLO LAST_ANALYZED NUM_ROWS ------------------------------ ------------------------------ --- -------------------- ---------- TEST_TAB1 P_20100209_GROT NO 22-APR-2010 11:24:12 3 TEST_TAB1 P_20100209_HALO NO 22-APR-2010 11:24:13 3 TEST_TAB1 P_20100209_JUNE NO 22-APR-2010 11:24:13 3 TEST_TAB1 P_20100209_OTHERS NO 22-APR-2010 11:24:13 3
3) Now that I have the P_20100209 subpartition stats, I'll add a new partition, populate the P_20100210_GROT subpartition with 10 rows using the same LOAD_TAB1 and subpartition exchange and then copy the stats from the previous partition.
SQL> ALTER TABLE TEST_TAB1
2 ADD PARTITION P_20100210 VALUES LESS THAN (20100211);
Table altered.
SQL> TRUNCATE TABLE LOAD_TAB1;
Table truncated.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 1000, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 30000, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 2000, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 10000, 'N');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 2400, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 500, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 1200, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 400, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 600, 'P');
1 row created.
SQL> INSERT INTO LOAD_TAB1 VALUES (20100210, 'GROT', 700, 'Z');
1 row created.
SQL> COMMIT;
Commit complete.
SQL> ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20100210_GROT WITH TABLE load_tab1;
Table altered.
SQL> ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20100210_GROT REBUILD UNUSABLE LOCAL INDEXES;
Table altered.
SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1',
srcpartname => 'P_20100209', dstpartname => 'P_20100210');
PL/SQL procedure successfully completed.
4) The subpartition stats now look like this
TEST_TAB1 P_20100209_GROT NO 22-APR-2010 11:24:12 3 TEST_TAB1 P_20100209_HALO NO 22-APR-2010 11:24:13 3 TEST_TAB1 P_20100209_JUNE NO 22-APR-2010 11:24:13 3 TEST_TAB1 P_20100209_OTHERS NO 22-APR-2010 11:24:13 3 TEST_TAB1 P_20100210_GROT NO 22-APR-2010 11:24:10 3 TEST_TAB1 P_20100210_HALO NO TEST_TAB1 P_20100210_JUNE NO TEST_TAB1 P_20100210_OTHERS NO
so it looks like COPY_TABLE_STATS copied stats for just the subpartition I just loaded via subpartition exchange, but why didn't it copy the stats for the other subpartitions? The answer is that it didn't copy any stats for subpartitions. The stats for P_20100210_GROT were created because we exchanged LOAD_TAB1 with that subpartition and LOAD_TAB1 still had statistics from the previous time it was used!
If I had added a gather_stats call on LOAD_TAB1 immediately after populating it with data (as I did in earlier parts of the test), the stats would still have appeared on P_20100210_GROT, but NUM_ROWS of 10 would have highlighted that these were not copied stats, but stats from the load table. Alternatively, if I had dropped and recreated LOAD_TAB1 each time (or used a different load table entirely) and not gathered stats on it (as you wouldn't if you were using COPY_TABLE_STATS) then I would have seen what I did at the outset - no subpartition stats at all! That would be a much closer version of how you might actually use this in production.
A final alternative would have been to delete the inappropriate stats from LOAD_TAB1 when I reuse it. That's the approach I've used in a very slightly modified version of stats_5_6a.txt that I've put here (just search for delete_table_stats).
The Waffle
As well as being an illustration of how easy it is to screw up tests when you start cutting and pasting sections from other scripts to (ha!) simplify things, this is a poster-child for discussing things with others. It was only when I sat down with my colleague Jari Kuhanen (excellent Oracle guy - looking for work soon
I keep saying this, but one of my favourite aspects of blogging is making and fixing your mistakes in public. It's often the mistakes that provide the real learning experiences and it's also an extension of discussing your ideas with others. However, the down-side is that you can end up with an unholy, confusing mess of different, contradictory posts and it can be hard to judge what you should edit as a result. I never edit posts meaningfully without making it pretty clear I've done so and, although I'm tempted to tidy up this mess to make all the posts more consistent, I've decided I'd rather leave the meandering 'story' as is. No promises on when this might happen but, when these posts are all done, I think there's no avoiding a white paper about managing statistics on partitioned objects that will be coherent and correct. For now, let's leave the mess as it is


When Jonathan Lewis decided it was time to post a list of the Partition Stats posts on his blog and Noons suggested I made them easier to track down, I listened. So this post will link to the others and, at least in the short term, I've also included li
Tracked: May 07, 05:24
Sigh ... these posts have become a bit of a mess. There are so many different bits and pieces I want to illustrate and I've been trying to squeeze them in around normal work. Worse still, because I keep leaving them then coming back to them and re-running
Tracked: May 07, 22:53