12cR2 new index usage tracking

A common question is: how to know which indexes are not used, so that I can drop them. If you tried to use index monitoring you probably have seen the limits of it which make it difficult to use. It has been improved in 12.2 so let’s check if it helps to release the stress of performance regression when we drop an index… or not. I’ll check two views here. Here is what documentation says about them:

DBA_INDEX_USAGE displays cumulative statistics for each index.
V$INDEX_USAGE_INFO keeps track of index usage since the last flush. A flush occurs every 15 minutes. After each flush, ACTIVE_ELEM_COUNT is reset to 0 and LAST_FLUSH_TIME is updated to the current time.

The documentation about V$INDEX_USAGE_INFO show a column INDEX_STATS_COLLECTION_TYPE where description explains that by default the statistics are collected based on sampling (only a few of the executions are considered when collecting the statistics). The type of collection that collects the statistics for each execution may have a performance overhead.

SAMPLED

I’ve found an undocumented to control this collection, which defaults to ‘SAMPLED’ and I’ll set it to ‘ALL’ to get deterministic test case:

17:53:51 SQL> alter session set "_iut_stat_collection_type"=ALL;
Session altered.

So this is the first problem with how reliable index usage tracking is. If your boss is running a report once a month which needs a index, you may miss this execution and think that this index is unused and decide to drop it. And you will have a regression. Do you want to take the risk on a sample monitoring?

Execution using index

On the SCOTT schema I’m running a query that uses the index PK_DEPT

17:53:51 SQL> set autotrace on explain
Autotrace Enabled
Displays the execution plan only.

17:53:51 SQL> select * from emp join dept using(deptno) where ename like 'K%';

DEPTNO  EMPNO  ENAME  JOB        MGR  HIREDATE         SAL   COMM  DNAME       LOC
10      7839   KING   PRESIDENT       17-nov 00:00:00  5000        ACCOUNTING  NEW YORK

Explain Plan
-----------------------------------------------------------
Plan hash value: 3625962092

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     1 |   117 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |         |     1 |   117 |     3   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |         |     1 |   117 |     3   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL         | EMP     |     1 |    87 |     2   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    30 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("EMP"."ENAME" LIKE 'K%')
   4 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - this is an adaptive plan

17:53:52 SQL> set autotrace off
Autotrace Disabled

When I look at the index usage tracking views, I don’t see this usage and the reason is that the last flush is from before the execution:

17:53:52 SQL> select * from v$index_usage_info;
INDEX_STATS_ENABLED  INDEX_STATS_COLLECTION_TYPE  ACTIVE_ELEM_COUNT  ALLOC_ELEM_COUNT  MAX_ELEM_COUNT  FLUSH_COUNT  TOTAL_FLUSH_DURATION  LAST_FLUSH_TIME                  STATUS_MSG  CON_ID
1                    0                            2                  3                 30000           8            30790                 13-NOV-16 05.48.12.218000000 PM              3

17:53:52 SQL> select * from dba_index_usage where owner='SCOTT';

no rows selected

The statistics are gathered in memory and are flushed to the dictionary every 15 minutes. For the moment, I’ve not found how to flush them manually, so I just wait 900 seconds:

17:53:52 SQL> host sleep 900

18:10:32 SQL> select * from v$index_usage_info;
INDEX_STATS_ENABLED  INDEX_STATS_COLLECTION_TYPE  ACTIVE_ELEM_COUNT  ALLOC_ELEM_COUNT  MAX_ELEM_COUNT  FLUSH_COUNT  TOTAL_FLUSH_DURATION  LAST_FLUSH_TIME                  STATUS_MSG  CON_ID
1                    0                            2                  3                 30000           9            45898                 13-NOV-16 06.03.13.344000000 PM              3

18:10:32 SQL> select * from dba_index_usage where owner='SCOTT';
OBJECT_ID  NAME     OWNER  TOTAL_ACCESS_COUNT  TOTAL_EXEC_COUNT  TOTAL_ROWS_RETURNED  BUCKET_0_ACCESS_COUNT  BUCKET_1_ACCESS_COUNT  BUCKET_2_10_ACCESS_COUNT  BUCKET_2_10_ROWS_RETURNED  BUCKET_11_100_ACCESS_COUNT  BUCKET_11_100_ROWS_RETURNED  BUCKET_101_1000_ACCESS_COUNT  BUCKET_101_1000_ROWS_RETURNED  BUCKET_1000_PLUS_ACCESS_COUNT  BUCKET_1000_PLUS_ROWS_RETURNED  LAST_USED
73723      PK_DEPT  SCOTT  1                   1                 1                    0                      1                      0                         0                          0                           0                            0                             0                              0                              0                               13-nov 18:03:13

Here is my index usage recorded. On execution. One row returned from the index.

DBMS_STATS

One drawback of index monitoring was that the statistics gathering was setting the monitoring to ‘YES’. Let’s see if it’s better in 12.2:

18:10:32 SQL> exec dbms_stats.gather_index_stats('SCOTT','PK_DEPT');
PL/SQL procedure successfully completed.

Again, waiting 15 minutes to get it flushed (and check LAST_FLUSH_TIME):

18:10:32 SQL> host sleep 900

18:27:12 SQL> select * from v$index_usage_info;
INDEX_STATS_ENABLED  INDEX_STATS_COLLECTION_TYPE  ACTIVE_ELEM_COUNT  ALLOC_ELEM_COUNT  MAX_ELEM_COUNT  F                                                                                                           LUSH_COUNT  TOTAL_FLUSH_DURATION  LAST_FLUSH_TIME                  STATUS_MSG  CON_ID
1                    0                            1                  3                 30000           1                                                                                                           0           48136                 13-NOV-16 06.18.13.748000000 PM              3

18:27:12 SQL> select * from dba_index_usage where owner='SCOTT';
OBJECT_ID  NAME     OWNER  TOTAL_ACCESS_COUNT  TOTAL_EXEC_COUNT  TOTAL_ROWS_RETURNED  BUCKET_0_ACCESS_CO                                                                                                           UNT  BUCKET_1_ACCESS_COUNT  BUCKET_2_10_ACCESS_COUNT  BUCKET_2_10_ROWS_RETURNED  BUCKET_11_100_ACCESS_CO                                                                                                           UNT  BUCKET_11_100_ROWS_RETURNED  BUCKET_101_1000_ACCESS_COUNT  BUCKET_101_1000_ROWS_RETURNED  BUCKET_10                                                                                                           00_PLUS_ACCESS_COUNT  BUCKET_1000_PLUS_ROWS_RETURNED  LAST_USED
73723      PK_DEPT  SCOTT  2                   2                 5                    0                                                                                                                                 1                      1                         4                          0                                                                                                                                      0                            0                             0                              0                                                                                                                                         0                               13-nov 18:18:13

It seems that the index tracking usage has been incremented here. Total rows returned incremented by 4 which is the number of rows in DEPT, read by dbms_stats. This will be very difficult to use to detect unused index because we can expect that even unused indexes have statistics gathering on them.

Index on Foreign Key to avoid table locks

There’s another risk we have when we drop an index. It may not be used for access, but to avoid a TM Share lock on a child table when deleting rows from the referenced table. This is again something that was not monitored. When the parent table has few rows, like some lookup tables, the index on the foreign key will probably not be used to access to the child rows, or to check that there are no child rows when you delete a parent one. A full scan will be faster. But an index on it is still required to avoid to lock the whole table when we delete rows from the parent.

Let’s create such an index.

18:27:12 SQL> create index FK_EMP on EMP(DEPTNO);
Index FK_EMP created.

I’ll delete DEPTNO=50 and I can verify that checking that there are no child rows is done without the need of the index:

SQL_ID  1v3zkdftt0vv7, child number 0
-------------------------------------
select * from emp where deptno=50

Plan hash value: 3956160932

------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |      0 |00:00:00.01 |       3 |
|*  1 |  TABLE ACCESS FULL| EMP  |      1 |      1 |      0 |00:00:00.01 |       3 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DEPTNO"=50)

Let’s delete the parent row and see if the index is used or not.

19:19:47 SQL> delete from DEPT where deptno='50';
0 rows deleted.
19:19:47 SQL> commit;
Commit complete.

This do not lock the EMP table because of the presence of the index FK_EMP. If the index were not there, a TM Share lock would have been acquired, which prevent concurreny DML on EMP table (at least).

19:19:48 SQL> host sleep 900

19:34:48 SQL> select * from v$index_usage_info;
INDEX_STATS_ENABLED  INDEX_STATS_COLLECTION_TYPE  ACTIVE_ELEM_COUNT  ALLOC_ELEM_COUNT  MAX_ELEM_COUNT  FLUSH_COUNT  TOTAL_FLUSH_DURATION  LAST_FLUSH_TIME                  STATUS_MSG  CON_ID
1                    0                            0                  3                 30000           12           48152                 13-NOV-16 07.24.11.086000000 PM              3

19:34:48 SQL> select * from dba_index_usage where owner='SCOTT';
OBJECT_ID  NAME     OWNER  TOTAL_ACCESS_COUNT  TOTAL_EXEC_COUNT  TOTAL_ROWS_RETURNED  BUCKET_0_ACCESS_COUNT  BUCKET_1_ACCESS_COUNT  BUCKET_2_10_ACCESS_COUNT  BUCKET_2_10_ROWS_RETURNED  BUCKET_11_100_ACCESS_COUNT  BUCKET_11_100_ROWS_RETURNED  BUCKET_101_1000_ACCESS_COUNT  BUCKET_101_1000_ROWS_RETURNED  BUCKET_1000_PLUS_ACCESS_COUNT  BUCKET_1000_PLUS_ROWS_RETURNED  LAST_USED
73723      PK_DEPT  SCOTT  2                   2                 5                    0                      1                      1                         4                          0                           0                            0                             0                              0                              0                               13-nov 18:18:13

No additional index usage has been detected. Do you take the risk to drop the index? Probably not. Even making the index invisible do not lower the risk. You may check DBA_TAB_MODIFICATIONS to know if the parent table is subject of deletes, but what if some transactions are updating the referenced key? This is also a case of TM Share lock, and this happens more that we think (for example when Hibernate updates all columns even those that do not change).

So what?

The new index usage tracking in 12.2 is very nice to get statistics on index usage, better than a simple ‘YES/NO’ flag as we have before. But detecting which index is not used and can be safely dropped is still something complex and that requires the application knowledge and comprehensive non-regression testing. There is nothing yet that can tell you than all would have been the same if an index were not there.

2 Comments

Eriovaldo says:

February 25, 2018 at 11 h 58 min

Hi Franck, Congratulations for the text. I have a doubt: In Oracle 12.2 Is it necessary configure something for index monitoring or it is automatic ? What is the parameter that shows to me that the index is being monitored ?

Reply to Eriovaldo
- Franck Pachot says:
  
  February 28, 2018 at 22 h 10 min
  
  Hi, the sampling is there by default. You can query dba_index_usage to see it
  
  Reply to Franck

Follow: Linkedin, Twitter, Youtube, Mastodon, dev.to