[Leetcode] Trie

Introduction Video: https://www.youtube.com/watch?v=CX777rfuZtM

Problem: find a data structure for storing strings that allows for fast insertion, search, and deletion

Trie is usually used to solve string prefix problems. Insert and find O(n) time complexity.

In a lot of problems, a trie can be represented by a nested dict.

1
2
'dad'
{d: {a: {d: {'#': '#'}}}}

Standard Trie

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
class TrieNode {
Constructor(letter = '') {
this.val = letter;
this.children = {}
this.completesString = false;
}
}

class Trie {
constructor () {
this.rootNode = new TrieNode();
}
insert(word) {
/*
Iterate through word
Map word's characters to a path in the trie
If no path exists, create one
When iteration is complete, indicate that the last character of word completes a string in the trie
*/
let node = this.rootNode;
for (let i = 0; i < word.length; i++) {
let curChar = word[i];
if (rootNode.children(curChar)) {
node = rootNode.children(curChar);
} else {
let newNode = new TrieNode(curChar);
node.children[curChar] = newNode;
node = newNode;
}
}
node.completeString = true;
}

find(word) {
/*Iterate through word
Map words's characters to a path in the trie
If a full path exists, the trie contains word
If not, the trie does not exist
*/
let node = this.rootNode;
for (let i = 0; i < word.length; i++) {
let curChar = word[i];
if (node.children[curChar]) {
node = node.children[curChar]
} else {
return false;
}
}
return true;
}
remove(word) {
/*
Remove only the unique suffixes of word
Be mindful of the cases where
- The last character of word has children
- Characters other than the last have children or complete a string in the trie(other than word)
No characters of word have children(i.e. the entirety of word's path through the trie can be removed)
*/
let node = this.rootNode();
let suffixes = [];

// For case where no part of the word can be removed from trie
for (let i = 0; i < word.length; i++) {
let curChar = word[i];
if (node.children[curChar]) {
node = node.children[curChar];
suffixes.unshift(node);
if (i === word.length && Object.keys(node.children).length) {
throw new Error(`suffixes in trie depend on "${word}`);
}
}
}

// For case where some parts of word can be removed from trie
for (let i = 1; i < suffixes.length; i++) {
let parent = suffixes[i];
let child = word[suffixes.length - i];
delete parent.children[child];
if (parent.completeString || Object.keys(parent.children).length) {
return `some suffixes of "${word}" removed from trie`;
}
}

// For case where all parts of word can be removed from trie
delete this.rootNode.children[word[0]];
return `removed: "${word}; no other "${word[0]}" -words remain`;
}
}

Video: https://www.youtube.com/watch?v=GTJr8OvyEVQ
Good explanation: https://www.geeksforgeeks.org/kmp-algorithm-for-pattern-searching/

Brute Force: Time Complexity: O(mn)
m: length of text
n: length of the pattern

KMP: Time Complexity: O(m + n)
Space Comple: O(n)

1
2
3
4
5
6
7
8
9
Text:    "a b x a b c a b c a b y"
pattern: "a b c a b y"

Build a temp array of pattern on preffix and suffix (arr[i] = x meas the ith character is the xth character that is both in suffix and preffix)

A proper prefix is prefix with whole string not allowed. For example, prefixes of “ABC” are “”, “A”, “AB” and “ABC”. Proper prefixes are “”, “A” and “AB”.

lps[i] = the longest proper prefix of pat[0..i]
which is also a suffix of pat[0..i].

Unlike Naive algorithm, where we slide the pattern by one and compare all characters at each shift, we use a value from lps[] to decide the next characters to be matched. The idea is to not match a character that we know will anyway match.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
# Construct prefix and suffix array

def prefArray(pattern):
"""
Use two pointers i and j
if s[i] = s[j], arr[j] = i + 1, forward i and j
else i = s[i]
"""
cont=[0]
for i in range(1,len(A)):
index=cont[i-1]
while(index>0 and A[index]!=A[i]):
index=cont[index-1]
cont.append(index+(1 if A[index]==A[i] else 0))
return cont

def substringSearch(text, pattern):
"""
Given a text, search the place of the pattern in the text in
O(m + n) time complexity O(n) space complexity
"""
i = 0
j = 0
# Edge case
if len(pattern) == 0:
return 0
arr = prefArray(pattern)
while i < len(text):
if text[i] == pattern[j]:
if j == len(pattern) - 1:
return i - j
j += 1
i += 1
else:
if j > 0:
j = arr[j - 1]
else:
i += 1
return -1

print(substringSearch("abxabcabcaby", "aby"))

214. Shortest Palindrome

Leetcode: https://leetcode.com/problems/shortest-palindrome/

Solution 1 (Brute Force):
Find the longest palindrome substring starting at index 0.
Time Complexity: O(n ^ 2)

1
2
3
4
5
6
7
8
def shortestPalindrome(self, s: str) -> str:
if not s or s == s[::-1]:
return s
longest = [0, 0]
for i in range(len(s)):
if s[:i] == s[:i][::-1] and len(s[:i]) > longest[0]:
longest = [len(s[:i]), i]
return s[longest[1]:][::-1] + s

Solution 2: KMP

  • Match s with reversed s to find the longest palindrome substring starts from index 0
  • Construct let full = s + '#' + s[::-1]
  • Construct a KMP array of full, KMP[-1] is the length of the longest palindrome substring
  • Append reversed rest of string in front of s
1
2
3
4
5
6
7
8
9
10
11
12
13
14
def shortestPalindrome(self, s: str) -> str:
if not s or s == s[::-1]:
return s
full = s + "#" + s[::-1]
kmp = [0]
for i in range(1, len(full)):
index = kmp[i - 1]
while index > 0 and full[index] != full[i]:
index = kmp[index - 1]
if full[index] == full[i]:
kmp.append(index + 1)
else:
kmp.append(0)
return s[kmp[-1]:][::-1] + s

336. Palindrome Pairs

Leetcode: https://leetcode.com/problems/palindrome-pairs/

[Solution]

  • For each word, start from their first character as prefix, and the rest of the characters as suffix, check if the reverse is in the dictionary, if so, append the reverse to the head of the prefix.
    1
    2
    3
    4
    5
    6
    "abcd"
    pref = ""
    looking for "dcba"
    pref = "a"
    looking for "dcb"
    ...
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
class Solution(object):
def palindromePairs(self, words):
"""
:type words: List[str]
:rtype: List[List[int]]
"""
words = {word: i for i, word in enumerate(words)}
res = []

for word in words.keys():
for i in range(len(word) + 1):
pref = word[:i]
suff = word[i:]
if self.isPal(pref):
if suff[::-1] in words and suff[::-1] != word:
res.append([words[suff[::-1]], words[word]])
if i != len(word) and self.isPal(suff):
if pref[::-1] in words and word != pref[::-1]:
res.append([words[word], words[pref[::-1]]])
return res

def isPal(self, s):
return s == s[::-1]

211. Add and Search Word - Data structure design

Leetcode: https://leetcode.com/problems/add-and-search-word-data-structure-design/

[Solution]

  • Add word: use a nested dictionary to represent the Trie structure. Use # to represent the end of the word

    1
    2
    'dad'
    {d: {a: {d: {'#': '#'}}}}
  • Search word: use recursive solution to search word

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
class WordDictionary:

def __init__(self):
self.trie = {}

def addWord(self, word):
trie=self.trie
for c in word:
if c not in trie:
trie[c]={}
trie=trie[c]
trie['#']='#'

def search(self, word: 'str', trie = None) -> 'bool':
if trie == None: # Initialze
trie = self.trie
if not word:
return '#' in trie #Reach to the end of the word

ch = word[0]
if ch == '.':
for node in trie:
if node != '#' and self.search(word[1:], trie[node]):
return True
elif ch in trie:
return self.search(word[1:], trie[ch])
return False

Take Away

  • To make original function recursivable with more arguments, we can set default value to the arguments. trie = None here.

642. Design Search Autocomplete System

Leetcode: https://leetcode.com/problems/design-search-autocomplete-system

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
# Input: <str> end with '#'(# means sentence ends)
# Output: <list> top 3 sentences have same prefix as input
# If >= 3 sentences exist:
# sorted by hot degree: number of times user has typed EXACT SAME sentence before
# Else:
# return as many as I can
# Edge case:
# '#': return []

# i
# ' ' s r
# l l 0
# o(0) a(0) n(0)
# ...

# "i love you" : 5 times
# "island" : 3 times
# "ironman" : 2 times
# "i love leetcode" : 2 times
sentences = ["i love you", "island", "ironman", "i love leetcode"]
times = [5, 3, 2, 2]

class TrieNode:
def __init__(self):
self.children = {}
self.time = 0
self.data = None
self.end = False

class AutocompleteSystem:

def __init__(self, sentences: 'List[str]', times: 'List[int]'):
"""
Operations:
Insert: insert into sentences, with times: 1
Update: if exists, update +1
Search: search by prefix
Sort: Sort by times, and ASCII
"""
self.root = TrieNode()
self.keyword = ""
for index, sentence in enumerate(sentences):
self.insert(sentence, times[index])

def input(self, c: 'str') -> 'List[str]':
"""
INPUT: c <str>: next character typed by user
OUTPUT: res <list[str]>: top 3 historical hot sentences
"""
# Search sentences with prefixes
# '#' Insert sentence into trie
results = []
if c != '#':
self.keyword += c
results = self.search(self.keyword)
else:
self.insert(self.keyword, 1)
self.keyword = ""
return [item[1] for item in sorted(results)[:3]]

def insert(self, s: 'str', num: 'int'):
node = self.root
for ch in s:
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
node.time -= num
node.data = s
node.end = True

def dfs(self, root):
"""
Return a List[str] of all the branches based on the root node
"""
res = []
if root:
if root.time != 0:
res.append((root.time, root.data))
for ch in root.children:
res.extend(self.dfs(root.children[ch]))
return res

def search(self, prefix):
node = self.root
for ch in prefix:
if ch not in node.children:
return []
node = node.children[ch]
# Locate to the last character of prefix
return self.dfs(node) # Return all the branches based on the last character


system = AutocompleteSystem(sentences, times)
print(system.input('i'))
# Your AutocompleteSystem object will be instantiated and called as such:
# obj = AutocompleteSystem(sentences, times)
# param_1 = obj.input(c)

208. Implement Trie (Prefix Tree)

Leetcode: https://leetcode.com/problems/implement-trie-prefix-tree/

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
class Trie:

def __init__(self):
"""
Initialize your data structure here.
"""
self.trie = {}

def insert(self, word: str) -> None:
"""
Inserts a word into the trie.
"""
level = self.trie
for ch in word:
if ch not in level:
level[ch] = {}
level = level[ch]
level['#'] = '#'

def search(self, word: str) -> bool:
"""
Returns if the word is in the trie.
"""
level = self.trie
for ch in word:
if ch not in level:
return False
level = level[ch]
return '#' in level

def startsWith(self, prefix: str) -> bool:
"""
Returns if there is any word in the trie that starts with the given prefix.
"""
level = self.trie
for ch in prefix:
if ch not in level:
return False
level = level[ch]
return True

14. Longest Common Prefix

Leetcode: https://leetcode.com/problems/longest-common-prefix/

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
def longestCommonPrefix(self, strs: List[str]) -> str:
pref = ""
trie = collections.defaultdict(dict)
for s in strs:
if s == "":
return ""
node = trie
for ch in s:
if ch not in node:
node[ch] = {}
node = node[ch]
node['#'] = {}
node = trie
res = ""
while node.keys():
if len(node.keys()) == 1:
if list(node.keys())[0] != "#":
res += list(node.keys())[0]
node = node[list(node.keys())[0]]
else:
break
return res

212. Word Search II

Leetcode: https://leetcode.com/problems/word-search-ii/

421. Maximum XOR of Two Numbers in an Array

Leetcode: https://leetcode.com/problems/maximum-xor-of-two-numbers-in-an-array/