# The magic LIKE optimization

It appears that there's a special optimization for predicates of the form:

col like 'pq%'

where the number of leading characters is 2. Consider the following example (8174 and 9204 show the same behaviour).

```SQL> create table o3 as
2  select chr(trunc(rownum/1000)+65)||to_char(rownum,'fm000000000000000') x,
4  from all_Objects
5  where rownum <= 26000
6  order by dbms_random.value;

Table created.
```

So we've got a table with 1000 rows for each letter of the alphabet,and we index on col X

```SQL> create index o3x on o3 ( x );

Index created.

SQL> analyze table o3 compute statistics;

Table analyzed.

---------------------------
SQL> set autotrace traceonly explain
SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J%';

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=300 Card=1002 Bytes=34068)
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=300 Card=1002 Bytes=34068)
2    1     INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=5 Card=1002)

---------------------------
SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J0%';

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=5 Bytes=170)
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=3 Card=5  Bytes=170)
2    1     INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=2 Card=5)

Wow! Look at the "brilliant" cost

--------------------------
SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J00%';

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=326 Card=1085 Bytes=36890)
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=326 Card=1085 Bytes=36890)
2    1     INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=6 Card=1085)

--------------------------
SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J000%';

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=173 Card=575 Bytes=19550)
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=173 Card=575 Bytes=19550)
2    1     INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=4 Card=575)

--------------------------
SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J0000%';

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=92 Card=305 Bytes=10370)
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=92 Card=305 Bytes=10370)
2    1     INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=3 Card=305)

--------------------------
SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J00000%';

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=49 Card=162 Bytes=5508)
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=49 Card=162 Bytes=5508)
2    1     INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=2 Card=162)

```

With all the tests, as you increase the number of leading chars, the index gets cheaper which seems reasonable. The exception is the "magical" 2-leading char case which always comes ultra-cheap. A 10053 trace showed the computed cardinality to about spot-on for a single leading char (ie cmptcd=1000), but this drops to 5 for the 2-char case.

A friend has hypothesized that this may be linked to the fact that for years Oracle Apps would do case insensitive searching by generating predicates such as:

```   col1 like 'AB%'
or col1 like 'aB%'
or col1 like 'Ab%'
or col1 like 'ab%'
and upper (col1) like upper('AbcdE%')'
```

in the times before function-based indexes became available. This allowed for the use of four index probes rather than a scan to do case insensitve searching.