Uses of suffix trees
A node (with contents from root to that node as ‘c’) which has ‘k’ leaf level children indicates that ‘c’ occurs ‘k’ times in the text. For example, in the suffix tree above, ‘e’ has 3 leaf-level-children, so it must occur 3 times in ‘peeper’. Similarly, ‘pe’ has 2 leaf level children, so it occurs 2 times in ‘peeper’.
Hence to find a string that occurs more than ‘m’ times in a text, we find all nodes in the trie which have more than ‘m’ leaf level children. For all those nodes, strings from root to those nodes occur more than ‘m’ times.
We search the pattern normally as we search a given string. If the search terminates at a leaf, then the pattern occurs exactly once, else all its occurrences can be found in the paths leading from that node to all its leaves. To further speed this up, one can link all the leaves of the trie into a linked list and keep first-leaf and last-leaf in every node. To find all occurrences then, we can directly traverse from first-leaf to last-leaf.
To find common substring (not subsequence*) among N strings, following can be done:
1) Append a unique end-of-string marker to each input string like $1, $2, $3 etc.
2) Form a suffix tree by getting suffixes from all the above strings.
3) In this suffix tree, find the node which has the longest depth from the root and its leaf nodes have at least one suffix node in each of the input strings ($1, $2, $3... end-of-string markers help here to determine that the chosen node’s leaves are present in all the strings).
* Elements in a subsequence need not be contiguous like a substring
Palindrome is a string which reads the same from front to end as it reads from end to front. Example: abcdcba
To find the longest palindrome in string S1, following can be done:
1) Reverse the string S1 to get another string S2
This is a variation of #1 above. We need to find the deepest node which has more than 1 children. The characters from root to that deepest node form the longest repeated substring and the leaves form those suffixes where that LRS is present.
A naive LRS finding algorithm is shown below:
( Note the order of searching LRS is O(n)3 )
|Email:||(Your email is not shared with anybody)|