Query performance condundrum. [message #678198] |
Fri, 15 November 2019 16:13 |
|
uman2631
Messages: 16 Registered: November 2011
|
Junior Member |
|
|
I have a table as shown below, where I have a key field, an effective date and a termination date (there are, in reality, many more fields, but these are the ones that matter).
I'm trying to find "bad" data, that is, records where the termination_dt < effective_dt.
The table contains hundreds of millions of records, and I am trying to avoid a full table scan.
I can add indexes, but I cannot add columns to the table.
Less than 1/10 of 1% of the table will meet the query condition, and I'm trying to think of a way to query this table that won't require a full table scan.
Since both termination_dt and effective_dt can be any date and cannot be predetermined, an index on either field doesn't really help. It seems like such a simple problem.
Any ideas?
CREATE TABLE foo
(mykey VARCHAR2(5), EFFECTIVE_DT DATE, TERMINATION_DT DATE);
INSERT INTO foo VALUES
('AAA','01-JAN-2019','01-JAN-2018');
SELECT * FROM foo where TERMINATION_DT < EFFECTIVE_DT;
|
|
|
|
Re: Query performance condundrum. [message #678206 is a reply to message #678198] |
Sat, 16 November 2019 01:41 |
John Watson
Messages: 8937 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
orclz>
orclz> create index fooi on foo(TERMINATION_DT - EFFECTIVE_DT);
Index created.
orclz> set autot on exp
orclz> SELECT * FROM foo where TERMINATION_DT - EFFECTIVE_DT < 0;
MYKEY EFFECTIVE_DT TERMINATION_DT
----- ------------------- -------------------
AAA 0001-01-20:19:00:00 0001-01-20:18:00:00
Execution Plan
----------------------------------------------------------
Plan hash value: 2246344638
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 700 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| FOO | 20 | 700 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | FOOI | 4 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TERMINATION_DT"-"EFFECTIVE_DT"<0)
orclz>
btw, I wish you would not say "record" and "field" when you mean "row" and "column".
|
|
|
Re: Query performance condundrum. [message #678207 is a reply to message #678206] |
Sat, 16 November 2019 07:31 |
Solomon Yakobson
Messages: 3275 Registered: January 2010 Location: Connecticut, USA
|
Senior Member |
|
|
If table is quite large and assuming TERMINATION_DT < EFFECTIVE_DT is a rare case we can save on index size:
SQL> create index foo_idx1
2 on foo(case when TERMINATION_DT < EFFECTIVE_DT then 1 end);
Index created.
SQL> explain plan for
2 SELECT * FROM foo where 1 = case when TERMINATION_DT < EFFECTIVE_DT then 1 end;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------
Plan hash value: 359656133
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| FOO | 1 | 25 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | FOO_IDX1 | 1 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(CASE WHEN "TERMINATION_DT"<"EFFECTIVE_DT" THEN 1 END =1)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
18 rows selected.
SQL>
SY,
|
|
|
|
|
Re: Query performance condundrum. [message #678211 is a reply to message #678210] |
Sat, 16 November 2019 11:41 |
John Watson
Messages: 8937 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
I think "attribute" is fine, as long as we use it with tuple and relation rather than row and table. "Instance" I never understood. Isn't there also some junk about nulls and uniqueness with that terminology? If Ted Codd (RIP) had not invented the relational database, none of us would have jobs. But he did make it complicated.
|
|
|