Announcements
---------------
- Exercise today
- Hw given next monday
Indexing - Nov 12, 2020
-----------------------------------
1. Primary vs. Secondary indices
2. Sparse vs Dense indices
B-trees
---------
B-tree order of n
~~~~~~~~~~~~~~~~~~~
Each node: a disk page
Rule: each node should have between n/2 and n entries (except for root)
Leaf nodes: point to tuples in the relation
Internal nodes: point to nodes in the level below
3. B-tree order example: max/avg/min
<#nodes/levels>
8K - a disk page
R(A,B,C,D) -> 2,000,000 tuples
Index I1 on R(C), and C is 50 bytes long, a page address is 15 bytes long
Each index node has 1,200 bytes of header info
n -> Maximum number of entries I can store in a single node:
n = (8*1024 - 1200 - 15)/(50+15) = 107
53 - 107 entries in each
-------
How big is this B-tree?
Max//Smallest b-tree: Assume each node has 107 entries
ceil(2000000/107) = 18692 leaf nodes
ceil(18692/107) = 175 internal nodes above leaf
ceil(175/107) = 2 internal nodes
1 root
(depth 3, 4 levels)
Min//Largest b-tree: Assume each node has 53 entries
ceil(2000000/53) = 37736 leaf nodes
ceil(37736/53) = 712 internal nodes above leaf
ceil(712/53) = 14 internal nodes
1 root
(4 levels)
number of levels is in the order of log_n(K) where K is the number of
tuples in the relation
4. B-tree search/insertion/deletion
5. B-trees with duplicate entries