2023-08-03 16:59:19 -07:00
..
2023-08-03 16:39:19 -07:00
2023-08-03 16:32:11 -07:00
2023-08-03 16:59:19 -07:00
2023-08-03 16:56:05 -07:00
2023-08-03 16:53:01 -07:00
2023-08-03 16:24:22 -07:00

tries


  • tries, also called prefix tree, are a variant of n-ary tree in which characters are stored in each node.

  • each trie node represents a string (a prefix) and each path down the tree represents a word. note that not all the strings represented by trie nodes are meaningful.

  • the root is associated with the empty string.

  • the * nodes (null nodes) are often used to indicate complete words (usually represented by a special type of child) or a boolean flag that terminates the parent node.

  • a node can have anywhere from 1 through alphabet_size + 1 child.

  • can be used to store the entire english language for quick prefix lookup (O(k), where k is the length of the string). they are also widely used on autocompletes, spell checkers, and ip routing (longest prefix matching).

  • tries structures can be represented by arrays and maps or trees.


class Trie:

    def __init__(self):
        self.root = {}

    def insert(self, word: str) -> None:
        node = self.root
        for c in word:
            if c not in node:
                node[c] = {}
            node = node[c]
        node['$'] = None
        
    def match(self, seq, prefix=False):
        node = self.root
        for c in seq:
            if c not in node:
                return False
            node = node[c]
        return prefix or ('$' in node)
        
    def search(self, word: str) -> bool:
        return self.match(word)
        
    def starts_with(self, prefix: str) -> bool:
        return self.match(prefix, True)


insertion


  • similar to a bst, when we insert a value to a trie, we need to decide which path to go depending on the target value we insert.
  • the root node needs to be initialized before you insert strings.



  • all the descendants of a node have a common prefix of the string associated with that node, so it should be easy to search if there are any words in the trie that starts with the given prefix.
  • we go down the tree depending on the given prefix, once we cannot find the child node, the search fails.
  • we can also search for a specific word rather than a prefix, treating this word as a prefix and searching in the same way as above.
  • if the search succeeds, we need to check if the target word is only a prefix of words in the trie or if it's exactly a word (for example, by adding a boolean flag).


comparison with hash tables


  • hash table wins in terms of time complexity, as its insert is usually O(1) (worst case O(log(N)) and trie's are O(M) (where M is the maximum length of a key).
  • trie wins in terms of space complexity. both O(M *N) in theory, but tries can be much smaller as there will be a lot of words that have similar prefix.