- In a binary search tree, each node has key & value, w/ ordering restriction to support efficient search
- Binary search tree (BST): Binary tree where each node has
Comparablekey (& associated value) & satisfies restriction that key in any node is larger than keys in all nodes in node's left subtree & smaller than keys in all nodes in node's right subtree
- BST represents set of keys (& associated values)
-
$$\exists$$ many different BSTs that represent same set
-
- Assume keys (uniformly) random → inserted in random order
- BSTs dual to quicksort
- Node at root of tree = first partitioning item in quicksort (no keys to left larger, no keys to right smaller)
- Subtrees built recursively, corresponds to quicksort's recursive subarray sorts
- Adding depths of all nodes = internal path length of tree
- Search hits in BST built from
$$N$$ random keys require$$\sim 2 \ln{N}$$ compares, on average-
$$C_{N}$$ = total internal path length of BST built from inserting$$N$$ randomly ordered distinct keys $$C_{N} \sim 2 N \ln{N}$$
-
- Insertions & search misses in BST built from
$$N$$ random keys require$$\sim 2 \ln{N}$$ compares, on average- Insertions and search misses take 1 more compare, on average, than search hits
-
$$E = I + 2n$$ -
$$E$$ = external path length -
$$I$$ = internal path length -
$$n$$ = # of internal nodes - Need at least 1 extra up from
$$I$$ for each node w/ at least 1 external child → if is leaf node w/ 2 external children, have to count second child by tracing nodes down from root → need$$2n$$ extra up from$$I$$ - Can consider recursively/inductively as well
-
- # of empty links or external nodes
$$\approx n$$ →$$\frac{C_{N} + n}{n} = \frac{C_{n}}{n} + 1$$ → need to add 1 for root comparison →$$\frac{C_{n}}{n} + 2$$ - Like quicksort, standard deviation of # of compares known to be low → formulas increasingly accurate as
$$N$$ ↑
- Important reason BSTs widely used b/c allow keeping keys in order → basis for implementing ordered symbol-table API
- Floor of key = largest key in BST
$$\leq$$ key- If given key
key$$<$$ key at root of BST → floor ofkeymust be in left subtree - If
key$$>$$ key at root → floor ofkeycould be in right subtree- In right subtree only if
$$\exists$$ key$$\leq$$ keyin right subtree - If not (or if
key= key at root) → key at root = floor ofkey
- In right subtree only if
- If given key
- Interchanging right & left (and
$$<$$ &$$>$$ ,$$\leq$$ &$$\geq$$ ) =ceiling()
- Choice of using only successor as replacement = arbitrary & not symmetric
- Worthwhile to choose at random between predecessor & successor
- Given tree → height determines worst-case of all BST operations (except range search, which incurs additional cost proportional to # of keys returned)
- Average height of BST built from random keys logarithmic → approaches
$$2.99 \lg{N}$$ for large$$N$$ - BST & quicksort both rely on randomization for optimal performance
- Worst case for both when input is in (reverse) sorted order