Announcements --------------- - Exercise today - Hw given next monday Indexing - Nov 12, 2020 ----------------------------------- 1. Primary vs. Secondary indices 2. Sparse vs Dense indices B-trees --------- B-tree order of n ~~~~~~~~~~~~~~~~~~~ Each node: a disk page Rule: each node should have between n/2 and n entries (except for root) Leaf nodes: point to tuples in the relation Internal nodes: point to nodes in the level below 3. B-tree order example: max/avg/min <#nodes/levels> 8K - a disk page R(A,B,C,D) -> 2,000,000 tuples Index I1 on R(C), and C is 50 bytes long, a page address is 15 bytes long Each index node has 1,200 bytes of header info n -> Maximum number of entries I can store in a single node: n = (8*1024 - 1200 - 15)/(50+15) = 107 53 - 107 entries in each ------- How big is this B-tree? Max//Smallest b-tree: Assume each node has 107 entries ceil(2000000/107) = 18692 leaf nodes ceil(18692/107) = 175 internal nodes above leaf ceil(175/107) = 2 internal nodes 1 root (depth 3, 4 levels) Min//Largest b-tree: Assume each node has 53 entries ceil(2000000/53) = 37736 leaf nodes ceil(37736/53) = 712 internal nodes above leaf ceil(712/53) = 14 internal nodes 1 root (4 levels) number of levels is in the order of log_n(K) where K is the number of tuples in the relation 4. B-tree search/insertion/deletion 5. B-trees with duplicate entries