Suffix Tree
It is a data structure that stores all sub-string of given string S.
m (leaves) + m-1 (maximum number of internal nodes) + 1 (root) = 2m.
The tree will have maximum number of nodes when string has all characters same eg aaaa or bbbbb etc
Generalised suffix tree
More than one string stored in one suffix tree. Each ending with different special character.
Applications
It is a data structure that stores all sub-string of given string S.
- It has exactly m leaves for string of length m (one for each suffix).
- All the internal nodes except root have atleast 2 child nodes.
- Each edge is labelled with non empty sub-string of S.
- Any two edges from same node does not have any common prefix.
m (leaves) + m-1 (maximum number of internal nodes) + 1 (root) = 2m.
The tree will have maximum number of nodes when string has all characters same eg aaaa or bbbbb etc
Generalised suffix tree
More than one string stored in one suffix tree. Each ending with different special character.
Applications
- Given String P, find if P is sub-string of S.
- Let ST be suffix Tree of S. Keep searching in tree until either P is exhausted (match found) or no more match is possible (sub-string not found). This can be done in O(m), m being length of P. Reason - every sub-string of S is prefix of some suffix of S, so will be present on the path from root to leaf corresponding to that suffix of S.
- Find Longest repeated string.
- Find deepest internal node.
- If longest string which is repeated at-least k times is required, then find the deepest internal node with at least k descendent leaf. This can be done by preprocessing the tree to count number of leaf descendants for each internal node.