This was first published on https://blog.dbi-services.com/12cr2-new-index-usage-tracking (2016-11-13)
Republishing here for new followers. The content is related to the the versions available at the publication date
A common question is: how to know which indexes are not used, so that I can drop them. If you tried to use index monitoring you probably have seen the limits of it which make it difficult to use. It has been improved in 12.2 so let’s check if it helps to release the stress of performance regression when we drop an index… or not. I’ll check two views here. Here is what documentation says about them:
The documentation about V$INDEX_USAGE_INFO show a column INDEX_STATS_COLLECTION_TYPE where description explains that by default the statistics are collected based on sampling (only a few of the executions are considered when collecting the statistics). The type of collection that collects the statistics for each execution may have a performance overhead.
I’ve found an undocumented to control this collection, which defaults to ‘SAMPLED’ and I’ll set it to ‘ALL’ to get deterministic test case:
17:53:51 SQL> alter session set "_iut_stat_collection_type"=ALL; Session altered.
So this is the first problem with how reliable index usage tracking is. If your boss is running a report once a month which needs a index, you may miss this execution and think that this index is unused and decide to drop it. And you will have a regression. Do you want to take the risk on a sample monitoring?
On the SCOTT schema I’m running a query that uses the index PK_DEPT
17:53:51 SQL> set autotrace on explain Autotrace Enabled Displays the execution plan only. 17:53:51 SQL> select * from emp join dept using(deptno) where ename like 'K%'; DEPTNO EMPNO ENAME JOB MGR HIREDATE SAL COMM DNAME LOC 10 7839 KING PRESIDENT 17-nov 00:00:00 5000 ACCOUNTING NEW YORK Explain Plan ----------------------------------------------------------- Plan hash value: 3625962092 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 117 | 3 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 1 | 117 | 3 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 1 | 117 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL | EMP | 1 | 87 | 2 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 | | 5 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 30 | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("EMP"."ENAME" LIKE 'K%') 4 - access("EMP"."DEPTNO"="DEPT"."DEPTNO") Note ----- - dynamic statistics used: dynamic sampling (level=2) - this is an adaptive plan 17:53:52 SQL> set autotrace off Autotrace Disabled
When I look at the index usage tracking views, I don’t see this usage and the reason is that the last flush is from before the execution:
17:53:52 SQL> select * from v$index_usage_info; INDEX_STATS_ENABLED INDEX_STATS_COLLECTION_TYPE ACTIVE_ELEM_COUNT ALLOC_ELEM_COUNT MAX_ELEM_COUNT FLUSH_COUNT TOTAL_FLUSH_DURATION LAST_FLUSH_TIME STATUS_MSG CON_ID 1 0 2 3 30000 8 30790 13-NOV-16 05.48.12.218000000 PM 3 17:53:52 SQL> select * from dba_index_usage where owner='SCOTT'; no rows selected
The statistics are gathered in memory and are flushed to the dictionary every 15 minutes. For the moment, I’ve not found how to flush them manually, so I just wait 900 seconds:
17:53:52 SQL> host sleep 900 18:10:32 SQL> select * from v$index_usage_info; INDEX_STATS_ENABLED INDEX_STATS_COLLECTION_TYPE ACTIVE_ELEM_COUNT ALLOC_ELEM_COUNT MAX_ELEM_COUNT FLUSH_COUNT TOTAL_FLUSH_DURATION LAST_FLUSH_TIME STATUS_MSG CON_ID 1 0 2 3 30000 9 45898 13-NOV-16 06.03.13.344000000 PM 3 18:10:32 SQL> select * from dba_index_usage where owner='SCOTT'; OBJECT_ID NAME OWNER TOTAL_ACCESS_COUNT TOTAL_EXEC_COUNT TOTAL_ROWS_RETURNED BUCKET_0_ACCESS_COUNT BUCKET_1_ACCESS_COUNT BUCKET_2_10_ACCESS_COUNT BUCKET_2_10_ROWS_RETURNED BUCKET_11_100_ACCESS_COUNT BUCKET_11_100_ROWS_RETURNED BUCKET_101_1000_ACCESS_COUNT BUCKET_101_1000_ROWS_RETURNED BUCKET_1000_PLUS_ACCESS_COUNT BUCKET_1000_PLUS_ROWS_RETURNED LAST_USED 73723 PK_DEPT SCOTT 1 1 1 0 1 0 0 0 0 0 0 0 0 13-nov 18:03:13
Here is my index usage recorded. On execution. One row returned from the index.
One drawback of index monitoring was that the statistics gathering was setting the monitoring to ‘YES’. Let’s see if it’s better in 12.2:
18:10:32 SQL> exec dbms_stats.gather_index_stats('SCOTT','PK_DEPT'); PL/SQL procedure successfully completed.
Again, waiting 15 minutes to get it flushed (and check LAST_FLUSH_TIME):
18:10:32 SQL> host sleep 900 18:27:12 SQL> select * from v$index_usage_info; INDEX_STATS_ENABLED INDEX_STATS_COLLECTION_TYPE ACTIVE_ELEM_COUNT ALLOC_ELEM_COUNT MAX_ELEM_COUNT F LUSH_COUNT TOTAL_FLUSH_DURATION LAST_FLUSH_TIME STATUS_MSG CON_ID 1 0 1 3 30000 1 0 48136 13-NOV-16 06.18.13.748000000 PM 3 18:27:12 SQL> select * from dba_index_usage where owner='SCOTT'; OBJECT_ID NAME OWNER TOTAL_ACCESS_COUNT TOTAL_EXEC_COUNT TOTAL_ROWS_RETURNED BUCKET_0_ACCESS_CO UNT BUCKET_1_ACCESS_COUNT BUCKET_2_10_ACCESS_COUNT BUCKET_2_10_ROWS_RETURNED BUCKET_11_100_ACCESS_CO UNT BUCKET_11_100_ROWS_RETURNED BUCKET_101_1000_ACCESS_COUNT BUCKET_101_1000_ROWS_RETURNED BUCKET_10 00_PLUS_ACCESS_COUNT BUCKET_1000_PLUS_ROWS_RETURNED LAST_USED 73723 PK_DEPT SCOTT 2 2 5 0 1 1 4 0 0 0 0 0 0 13-nov 18:18:13
It seems that the index tracking usage has been incremented here. Total rows returned incremented by 4 which is the number of rows in DEPT, read by dbms_stats. This will be very difficult to use to detect unused index because we can expect that even unused indexes have statistics gathering on them.
There’s another risk we have when we drop an index. It may not be used for access, but to avoid a TM Share lock on a child table when deleting rows from the referenced table. This is again something that was not monitored. When the parent table has few rows, like some lookup tables, the index on the foreign key will probably not be used to access to the child rows, or to check that there are no child rows when you delete a parent one. A full scan will be faster. But an index on it is still required to avoid to lock the whole table when we delete rows from the parent.
Let’s create such an index.
18:27:12 SQL> create index FK_EMP on EMP(DEPTNO); Index FK_EMP created.
I’ll delete DEPTNO=50 and I can verify that checking that there are no child rows is done without the need of the index:
SQL_ID 1v3zkdftt0vv7, child number 0 ------------------------------------- select * from emp where deptno=50 Plan hash value: 3956160932 ------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 | 3 | |* 1 | TABLE ACCESS FULL| EMP | 1 | 1 | 0 |00:00:00.01 | 3 | ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("DEPTNO"=50)
Let’s delete the parent row and see if the index is used or not.
19:19:47 SQL> delete from DEPT where deptno='50'; 0 rows deleted. 19:19:47 SQL> commit; Commit complete.
This do not lock the EMP table because of the presence of the index FK_EMP. If the index were not there, a TM Share lock would have been acquired, which prevent concurreny DML on EMP table (at least).
19:19:48 SQL> host sleep 900 19:34:48 SQL> select * from v$index_usage_info; INDEX_STATS_ENABLED INDEX_STATS_COLLECTION_TYPE ACTIVE_ELEM_COUNT ALLOC_ELEM_COUNT MAX_ELEM_COUNT FLUSH_COUNT TOTAL_FLUSH_DURATION LAST_FLUSH_TIME STATUS_MSG CON_ID 1 0 0 3 30000 12 48152 13-NOV-16 07.24.11.086000000 PM 3 19:34:48 SQL> select * from dba_index_usage where owner='SCOTT'; OBJECT_ID NAME OWNER TOTAL_ACCESS_COUNT TOTAL_EXEC_COUNT TOTAL_ROWS_RETURNED BUCKET_0_ACCESS_COUNT BUCKET_1_ACCESS_COUNT BUCKET_2_10_ACCESS_COUNT BUCKET_2_10_ROWS_RETURNED BUCKET_11_100_ACCESS_COUNT BUCKET_11_100_ROWS_RETURNED BUCKET_101_1000_ACCESS_COUNT BUCKET_101_1000_ROWS_RETURNED BUCKET_1000_PLUS_ACCESS_COUNT BUCKET_1000_PLUS_ROWS_RETURNED LAST_USED 73723 PK_DEPT SCOTT 2 2 5 0 1 1 4 0 0 0 0 0 0 13-nov 18:18:13
No additional index usage has been detected. Do you take the risk to drop the index? Probably not. Even making the index invisible do not lower the risk. You may check DBA_TAB_MODIFICATIONS to know if the parent table is subject of deletes, but what if some transactions are updating the referenced key? This is also a case of TM Share lock, and this happens more that we think (for example when Hibernate updates all columns even those that do not change).
The new index usage tracking in 12.2 is very nice to get statistics on index usage, better than a simple ‘YES/NO’ flag as we have before. But detecting which index is not used and can be safely dropped is still something complex and that requires the application knowledge and comprehensive non-regression testing. There is nothing yet that can tell you than all would have been the same if an index were not there.
Hi Franck, Congratulations for the text. I have a doubt: In Oracle 12.2 Is it necessary configure something for index monitoring or it is automatic ? What is the parameter that shows to me that the index is being monitored ?
Hi, the sampling is there by default. You can query dba_index_usage to see it