Creation of suffix trees can be best understood by taking a word and creating suffix tree for it by considering
one character at a time.
Some of the important things to be noted in the above are:
A branch once created can never merge with any other branch.
This is so because two branches differ at the root level and once they are separate, no amount of words added to the end will join them.
A branch once created will just grow linearly after that.
It may branch more at any place along its length, but the main branch will continue growing.
When a new character is appended to a suffix tree:
Existing branches just grow by one element and
A new branch may be created only at the top-level.
This new branch is created if there is no branch beginning with the new character.
This implies that for a string having characters only from 'a' to 'z', the maximum number
of children from any node can never be more than 26.
With the above considerations in mind, suffix trees can be created easily with the following algorithm:
When adding new character, search at the root level for its occurrence.
If not found, create a branch at the root level itself.
If its found, create a branch at the next level, as done in figure 3 above.
So new character is always added - either at top-level or at level-one.
Since all branches will grow by the new character, no need to append it at the end of each branch.
Instead just keep a marker (say a pointer or an index) to the current end.
As that traverses the characters of a list, the branches automatically get incremented.
Example application of the above O(n) algorithm:
Got a thought to share or found a bug in the code? We'd love to hear from you: