Tries Guide | CS 61B Spring 2018

Tries Guide

Author: Josh Hug

Overview
Recommended Problems
- C level
- B level
- A level

Overview

Summary. The sort problem is to take a sequence of objects and put them into the correct order. The search problem is to store a collection of objects such that they can be rapidly retrieved (i.e., how do we implement a Map or Set?). We made the observation that BST maps are roughly analogous to comparison-based sorting, and hash maps are roughly analogous to counting-based (a.k.a. integer) sorting. We observed that we have a third type of sort, which involves sorting by digit, which raised the question: What sort of data structure is analogous to LSD or MSD sort?

Terminology.

Length of string key usually represented by L.
Alphabet size usually represented by R.

Tries. Analogous to MSD sort, since keys are examined one character at a time starting from the leftmost character. Know how to insert and search for an item in a trie. Know that trie nodes typically do not contain letters, and that instead letters are stored implicitly on edge links. Know that there are many ways of storing these links, and that the fastest but most memory-hungry way is with an array of size R. We call such tries R-way tries.
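As a rough illustration of insert and search in an R-way trie, here is a minimal sketch. The class name RWayTrieSet, the isKey flag, and the choice of R = 256 (extended ASCII) are assumptions made for this example, not names taken from the course code.

```java
/** Minimal R-way trie storing a set of strings (hypothetical example class). */
class RWayTrieSet {
    private static final int R = 256; // alphabet size; assumes extended-ASCII keys

    private static class Node {
        boolean isKey;               // true if the path from the root spells a stored key
        Node[] next = new Node[R];   // next[c] is the child reached by following character c
    }

    private final Node root = new Node();

    /** Inserts key, creating a node for each character not already on the path. */
    void add(String key) {
        Node cur = root;
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (c >= R) {
                throw new IllegalArgumentException("character outside alphabet: " + c);
            }
            if (cur.next[c] == null) {
                cur.next[c] = new Node();
            }
            cur = cur.next[c];
        }
        cur.isKey = true; // mark the end of a complete key
    }

    /** Returns true only if key was inserted (a mere prefix of a key does not count). */
    boolean contains(String key) {
        Node cur = root;
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (c >= R || cur.next[c] == null) {
                return false; // fell off the trie
            }
            cur = cur.next[c];
        }
        return cur.isKey;
    }
}
```

Note that no node stores a letter: the letter is implied by which array slot the link occupies. Both operations touch at most L nodes, while each node pays for R links whether or not they are used.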

TSTs. Instead of R links, a TST node has only 3 links. Know how to insert and search for an item in a ternary search trie. Be aware that TSTs can become unbalanced. (As an aside that we won’t cover in class, TSTs are analogous to a sort known as 3-way radix quicksort, which is just quicksort applied digit by digit.) Unlike R-way trie nodes, each TST node explicitly stores a character; in the standard implementation this includes the root.
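The following is a minimal sketch of TST insert and search, assuming non-empty string keys. The names TSTSet, isKey, left, mid, and right are chosen for this example. In this particular layout every node, including the root, stores a character; treat that as an assumption of the sketch.

```java
/** Minimal ternary search trie storing a set of non-empty strings (hypothetical example class). */
class TSTSet {
    private static class Node {
        char c;                 // character stored explicitly in this node
        boolean isKey;          // true if a stored key ends at this node
        Node left, mid, right;  // smaller char, next char of key, larger char

        Node(char c) {
            this.c = c;
        }
    }

    private Node root;

    /** Inserts a non-empty key. */
    void add(String key) {
        if (key.isEmpty()) {
            throw new IllegalArgumentException("key must be non-empty");
        }
        root = add(root, key, 0);
    }

    /** Inserts key starting at its d-th character into the subtrie rooted at x. */
    private Node add(Node x, String key, int d) {
        char c = key.charAt(d);
        if (x == null) {
            x = new Node(c);
        }
        if (c < x.c) {
            x.left = add(x.left, key, d);         // same character, smaller branch
        } else if (c > x.c) {
            x.right = add(x.right, key, d);       // same character, larger branch
        } else if (d < key.length() - 1) {
            x.mid = add(x.mid, key, d + 1);       // match: advance to the next character
        } else {
            x.isKey = true;                       // match on the last character
        }
        return x;
    }

    /** Returns true only if key was inserted. */
    boolean contains(String key) {
        if (key.isEmpty()) {
            return false;
        }
        Node x = root;
        int d = 0;
        while (x != null) {
            char c = key.charAt(d);
            if (c < x.c) {
                x = x.left;
            } else if (c > x.c) {
                x = x.right;
            } else if (d < key.length() - 1) {
                x = x.mid;
                d++;
            } else {
                return x.isKey;
            }
        }
        return false;
    }
}
```

Because left and right links behave like BST links over single characters, inserting keys in sorted order can produce long left or right chains, which is one way a TST becomes unbalanced.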

Advantages of Tries and TSTs. Both flavors of tries have very fast lookup times, as we only ever look at as many characters as there are in the key we’re trying to retrieve. However, their chief advantage is the ability to efficiently support various operations not supported by other map/set implementations, including:

longestPrefixOf
prefixMatches
spell checking
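The first two operations can be sketched as follows. This example assumes that longestPrefixOf(query) returns the longest stored key that is a prefix of the query, and that prefixMatches(prefix) returns every stored key beginning with the prefix. The class name PrefixTrie and the use of a TreeMap for child links (one of the many possible link representations, chosen here so results come back in sorted order) are likewise assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Trie with map-based links supporting two prefix operations (hypothetical example class). */
class PrefixTrie {
    private static class Node {
        boolean isKey;                                // true if a stored key ends here
        Map<Character, Node> next = new TreeMap<>();  // child links, sorted by character
    }

    private final Node root = new Node();

    void add(String key) {
        Node cur = root;
        for (char c : key.toCharArray()) {
            cur = cur.next.computeIfAbsent(c, k -> new Node());
        }
        cur.isKey = true;
    }

    /** Returns the longest stored key that is a prefix of query, or null if there is none. */
    String longestPrefixOf(String query) {
        Node cur = root;
        int longest = cur.isKey ? 0 : -1; // length of the best match seen so far
        for (int i = 0; i < query.length(); i++) {
            cur = cur.next.get(query.charAt(i));
            if (cur == null) {
                break; // fell off the trie
            }
            if (cur.isKey) {
                longest = i + 1;
            }
        }
        return longest < 0 ? null : query.substring(0, longest);
    }

    /** Returns all stored keys that start with prefix, in sorted order. */
    List<String> prefixMatches(String prefix) {
        List<String> results = new ArrayList<>();
        Node cur = root;
        for (int i = 0; i < prefix.length() && cur != null; i++) {
            cur = cur.next.get(prefix.charAt(i));
        }
        if (cur != null) {
            collect(cur, new StringBuilder(prefix), results);
        }
        return results;
    }

    /** Depth-first walk below x, recording the path string at every key node. */
    private void collect(Node x, StringBuilder path, List<String> results) {
        if (x.isKey) {
            results.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : x.next.entrySet()) {
            path.append(e.getKey().charValue());
            collect(e.getValue(), path, results);
            path.deleteCharAt(path.length() - 1); // backtrack
        }
    }
}
```

For instance, with the keys "sam", "sad", "sap", and "same" stored, longestPrefixOf("sample") walks s, a, m, remembers that "sam" is a key, and stops when no link for p exists. A hash map or BST of whole strings has no comparable shortcut, because it does not share structure between keys with a common prefix.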

Recommended Problems

C level

B level

A level