Lecture 20
-----------------

Exam#2 grading taking longer, hopefully to be done by tomorrow


Indexing:

- multi-attribute indexes


----------------

SELECT relname, relpages
FROM pg_class pc, pg_user pu
WHERE pc.relowner = pu.usesysid and pu.usename = user
ORDER BY relpages desc;

-------

select * from bakers where baker = 'Briony' ;

partial index on a subset of tuples:

CREATE INDEX access_log_client_ip_ix ON access_log (client_ip)
WHERE NOT (client_ip > inet '192.168.100.0' AND
           client_ip < inet '192.168.100.255');


CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));


----------------

Index on R(X)

X = 4

4 <= X and X < 10

--------------

R(A,B,C,D,E),  Assume: 100,000 tuples  10,000 pages  ( 10 tuples per page)

Index I1 on R(A):  400 pages at leaf, 3 levels (root+internal+leaf)

Index I2 on R(B): 400 pages at leaf, 3 levels (root+internal+leaf)

Index I3 on R(A,B): 1,000 pages at leaf, 3 levels (root+internal+leaf)

Index I4 on R(B,A): 1,000 pages at leaf, 3 levels (root+internal+leaf)

--------------

SELECT A,B,C,D,E FROM R WHERE 10 <= A and A <= 20 ; (returns: 100 tuples)

Query plan 1 -> Scan the whole relation: Cost = 10,000

Query plan 2: Scan I1 (find tuple pointers that match 10<=A<=20) +
Read all tuples that I find based on the index from relation R

100,000/400 = 250 tuples per node

100 tuples for this query will fit in a single leaf node!

Cost to scan the index = 1 (root) + 1 (internal) + 1 or 2 (leaf) = 4 max


To read the matching 100 tuples: I need to read pages of R

How many pages do I need to read?

Best case: 10 pages  (all packed sorted by A)

Worst case: 100

Cost of this query (max) : 4 + 100 = 104 pages

------

Query plan 3: Use index I3:

100,000 (tuples of R) / 1,000 = 100 tuples per node

Cost of index search = 4 (max)

Cost of searching the relation = 100 

Total cost = 104 (max)
-------

SELECT A,B,C,D,E FROM R WHERE 10 <= A and A <= 20 ; (returns: 100 tuples)

Query plan 4: Use index I4:

Index I4 on R(B,A): 1,000 pages at leaf, 3 levels (root+internal+leaf)

Scan all leaf, cost of index search = 1 (root) + 1 (internal) + 1,000
(all leaf) = 1002

Scan the tuples = 100

Cost = 1,102

===========================


SELECT B FROM R WHERE 10 <= A and A <= 20 ; (returns: 100 tuples)

Query plans:

1. Scan all of R = 10,000
2. Use I1 on R(A)  = 104 (max)
3. Use I3 on R(A,B) = 4 (index only search)
4. Use I4 on R(B,A) = 1,002 (index only search)