Lecture 20 ----------------- Exam#2 grading taking longer, hopefully to be done by tomorrow Indexing: - multi-attribute indexes ---------------- SELECT relname, relpages FROM pg_class pc, pg_user pu WHERE pc.relowner = pu.usesysid and pu.usename = user ORDER BY relpages desc; ------- select * from bakers where baker = 'Briony' ; partial index on a subset of tuples: CREATE INDEX access_log_client_ip_ix ON access_log (client_ip) WHERE NOT (client_ip > inet '192.168.100.0' AND client_ip < inet '192.168.100.255'); CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1)); ---------------- Index on R(X) X = 4 4 <= X and X < 10 -------------- R(A,B,C,D,E), Assume: 100,000 tuples 10,000 pages ( 10 tuples per page) Index I1 on R(A): 400 pages at leaf, 3 levels (root+internal+leaf) Index I2 on R(B): 400 pages at leaf, 3 levels (root+internal+leaf) Index I3 on R(A,B): 1,000 pages at leaf, 3 levels (root+internal+leaf) Index I4 on R(B,A): 1,000 pages at leaf, 3 levels (root+internal+leaf) -------------- SELECT A,B,C,D,E FROM R WHERE 10 <= A and A <= 20 ; (returns: 100 tuples) Query plan 1 -> Scan the whole relation: Cost = 10,000 Query plan 2: Scan I1 (find tuple pointers that match 10<=A<=20) + Read all tuples that I find based on the index from relation R 100,000/400 = 250 tuples per node 100 tuples for this query will fit in a single leaf node! Cost to scan the index = 1 (root) + 1 (internal) + 1 or 2 (leaf) = 4 max To read the matching 100 tuples: I need to read pages of R How many pages do I need to read? Best case: 10 pages (all packed sorted by A) Worst case: 100 Cost of this query (max) : 4 + 100 = 104 pages ------ Query plan 3: Use index I3: 100,000 (tuples of R) / 1,000 = 100 tuples per node Cost of index search = 4 (max) Cost of searching the relation = 100 Total cost = 104 (max) ------- SELECT A,B,C,D,E FROM R WHERE 10 <= A and A <= 20 ; (returns: 100 tuples) Query plan 4: Use index I4: Index I4 on R(B,A): 1,000 pages at leaf, 3 levels (root+internal+leaf) Scan all leaf, cost of index search = 1 (root) + 1 (internal) + 1,000 (all leaf) = 1002 Scan the tuples = 100 Cost = 1,102 =========================== SELECT B FROM R WHERE 10 <= A and A <= 20 ; (returns: 100 tuples) Query plans: 1. Scan all of R = 10,000 2. Use I1 on R(A) = 104 (max) 3. Use I3 on R(A,B) = 4 (index only search) 4. Use I4 on R(B,A) = 1,002 (index only search)