|
Aleph-w 3.0
A C++ Library for Data Structures and Algorithms
|
Aho-Corasick multi-pattern automaton. More...
#include <Aho_Corasick.H>
Classes | |
| struct | Match |
| One match occurrence in the searched text. More... | |
| struct | Node |
Public Member Functions | |
| Aho_Corasick () | |
| Build an empty automaton (root only). | |
| void | clear () |
| Clear all patterns and reset to an empty automaton. | |
| size_t | add_pattern (std::string pattern) |
| Add one pattern to the automaton. | |
| void | build () |
| Build failure links and transition completion. | |
| Array< Match > | search (const std::string_view text) const |
| Search all patterns in a text. | |
| bool | contains_any (const std::string_view text) const |
| Return true if at least one pattern appears in the text. | |
| size_t | pattern_count () const noexcept |
| Return the number of inserted patterns. | |
| bool | is_built () const noexcept |
Return whether build() has been executed since last insertion. | |
| const std::string & | pattern (const size_t id) const |
| Return the pattern text by id. | |
Private Member Functions | |
| size_t | create_node () |
Private Attributes | |
| Array< Node > | nodes_ |
| Array< std::string > | patterns_ |
| bool | built_ = false |
Aho-Corasick multi-pattern automaton.
The automaton operates on bytes (unsigned char alphabet size 256). All matches (including overlaps) are reported.
Definition at line 62 of file Aho_Corasick.H.
|
inline |
Build an empty automaton (root only).
Definition at line 111 of file Aho_Corasick.H.
References nodes_.
|
inline |
Add one pattern to the automaton.
| [in] | pattern | Pattern to insert (must be non-empty). |
| std::invalid_argument | If pattern is empty. |
Definition at line 136 of file Aho_Corasick.H.
References ah_invalid_argument_if, Aleph::Array< T >::append(), built_, create_node(), Aleph::divide_and_conquer_partition_dp(), nodes_, pattern(), patterns_, and Aleph::Array< T >::size().
Referenced by main(), TEST(), TEST(), TEST(), TEST(), TEST(), TEST(), TEST(), TEST(), TEST(), TEST(), TEST(), TEST(), TEST(), TEST(), and TEST().
|
inline |
Build failure links and transition completion.
Finalizes the automaton structure. Must be called after all add_pattern() calls and before invoking search().
add_pattern() has been called for all desired patterns and no further modifications (no additional add_pattern() calls) have occurred since.Definition at line 175 of file Aho_Corasick.H.
References Aleph::and, Aleph::Array< T >::append(), built_, Aleph::divide_and_conquer_partition_dp(), Aleph::Array< T >::is_empty(), Aleph::next(), nodes_, and Aleph::Array< T >::size().
|
inline |
Clear all patterns and reset to an empty automaton.
Definition at line 117 of file Aho_Corasick.H.
References built_, Aleph::Array< T >::empty(), nodes_, and patterns_.
Return true if at least one pattern appears in the text.
| [in] | text | Text to scan. |
| std::runtime_error | If build() has not been called. |
Definition at line 282 of file Aho_Corasick.H.
References ah_runtime_error_unless, Aleph::and, built_, Aleph::divide_and_conquer_partition_dp(), Aleph::next(), nodes_, and output.
|
inlineprivate |
|
inlinenoexcept |
Return whether build() has been executed since last insertion.
Definition at line 323 of file Aho_Corasick.H.
References built_.
Return the pattern text by id.
| [in] | id | Pattern id. |
| std::out_of_range | If id >= pattern_count(). |
Definition at line 336 of file Aho_Corasick.H.
References ah_out_of_range_error_if, patterns_, and Aleph::Array< T >::size().
Referenced by add_pattern().
|
inlinenoexcept |
Return the number of inserted patterns.
Definition at line 314 of file Aho_Corasick.H.
References patterns_, and Aleph::Array< T >::size().
Search all patterns in a text.
| [in] | text | Text to scan. |
| std::runtime_error | If build() has not been called. |
Definition at line 234 of file Aho_Corasick.H.
References ah_runtime_error_unless, Aleph::and, Aleph::Array< T >::append(), built_, Aleph::divide_and_conquer_partition_dp(), k, Aleph::next(), nodes_, output, patterns_, and Aleph::Array< T >::size().
Definition at line 101 of file Aho_Corasick.H.
Referenced by add_pattern(), build(), clear(), contains_any(), is_built(), and search().
Definition at line 99 of file Aho_Corasick.H.
Referenced by Aho_Corasick(), add_pattern(), build(), clear(), contains_any(), create_node(), and search().
|
private |
Definition at line 100 of file Aho_Corasick.H.
Referenced by add_pattern(), clear(), pattern(), pattern_count(), and search().