Aho–Corasick string matching algorithm implementation in C for fun.
-
Build trie by search keys.
-
Find failure node for every nodes in the trie.
The failure node, say B, of a node, say A, must be the longest proper suffix for pattern in node A.
Ex: Given node B has pattern "bcde", node C has pattern "cde" node A has pattern "abcde", then node A's failure node will be node B.However, if longest proper suffix cannot be found, set failure node to root.
While doing searching, we read the input file by each character and, accordingly, transit the DFA state.
Then at every state we transited to, check if the 'current node' or the 'failure nodes tracing back from the current node' are in final state, if so, generate hit event.