Package org.apache.nutch.util
Class TrieStringMatcher
java.lang.Object
    org.apache.nutch.util.TrieStringMatcher
Direct Known Subclasses:
PrefixStringMatcher, SuffixStringMatcher
public abstract class TrieStringMatcher extends Object
TrieStringMatcher is a base class for simple tree-based string matching. This class is thread-safe during string matching but not when adding strings to the trie.
-
-
Nested Class Summary
protected class TrieStringMatcher.TrieNode
    Node class for the character tree.
-
Field Summary
protected TrieStringMatcher.TrieNode root
-
Constructor Summary
protected TrieStringMatcher()
-
Method Summary
protected void addPatternBackward(String s)
    Adds any necessary nodes to the trie so that the given String can be decoded in reverse and the first character is represented by a terminal node.
protected void addPatternForward(String s)
    Adds any necessary nodes to the trie so that the given String can be decoded and the last character is represented by a terminal node.
abstract String longestMatch(String input)
    Returns the longest substring of input that is matched by a pattern in the trie, or null if no match exists.
protected TrieStringMatcher.TrieNode matchChar(TrieStringMatcher.TrieNode node, String s, int idx)
    Get the next TrieStringMatcher.TrieNode visited, given that you are at node, and that the next character in the input is the idx'th character of s.
abstract boolean matches(String input)
    Returns true if the given String is matched by a pattern in the trie.
abstract String shortestMatch(String input)
    Returns the shortest substring of input that is matched by a pattern in the trie, or null if no match exists.
-
-
-
Field Detail
-
root
protected TrieStringMatcher.TrieNode root
-
-
Method Detail
-
matchChar
protected final TrieStringMatcher.TrieNode matchChar(TrieStringMatcher.TrieNode node, String s, int idx)
Get the next TrieStringMatcher.TrieNode visited, given that you are at node, and that the next character in the input is the idx'th character of s. Can return null.
Parameters:
node - Input TrieStringMatcher.TrieNode containing child nodes
s - String to match character at indexed position
idx - Indexed position in input string
Returns:
child TrieStringMatcher.TrieNode
See Also:
TrieStringMatcher.TrieNode.getChild(char)
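The stepping behaviour described for matchChar can be illustrated with a small, self-contained sketch. It does not use the Nutch classes: Node is a hypothetical stand-in for TrieStringMatcher.TrieNode, and getOrCreateChild is an invented helper used only to build the example trie. The only part taken from the description above is the contract of following the idx'th character of s from the current node and possibly getting null back.

```java
import java.util.HashMap;
import java.util.Map;

class MatchCharSketch {
    // Hypothetical stand-in for TrieStringMatcher.TrieNode (not the Nutch class).
    static final class Node {
        private final Map<Character, Node> children = new HashMap<>();
        boolean terminal;

        // Returns the child reached by following c, or null if there is none.
        Node getChild(char c) {
            return children.get(c);
        }

        // Assumed helper for building the example: returns the child for c, creating it if needed.
        Node getOrCreateChild(char c) {
            return children.computeIfAbsent(c, k -> new Node());
        }
    }

    // Steps from node along the idx'th character of s; the result may be null.
    static Node matchChar(Node node, String s, int idx) {
        return node.getChild(s.charAt(idx));
    }

    public static void main(String[] args) {
        Node root = new Node();
        root.getOrCreateChild('a').getOrCreateChild('b').terminal = true;

        Node afterA = matchChar(root, "ab", 0);
        Node afterB = matchChar(afterA, "ab", 1);
        System.out.println(afterB != null && afterB.terminal); // true
        System.out.println(matchChar(root, "xy", 0) == null);  // true
    }
}
```

Callers in this sketch must check for null before stepping again, which matches the "Can return null" note above.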
-
addPatternForward
protected final void addPatternForward(String s)
Adds any necessary nodes to the trie so that the given String can be decoded and the last character is represented by a terminal node. Zero-length Strings are ignored.
Parameters:
s - String to be decoded.
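As an illustration of forward insertion, the sketch below walks a pattern from its first character to its last, creates missing nodes along the way, and marks the final node as terminal. It is a minimal stand-alone approximation, not the Nutch implementation: the Node class, the HashMap-based child storage, and the containsPattern helper are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class ForwardInsertSketch {
    // Hypothetical node type; the real TrieStringMatcher.TrieNode may store children differently.
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal;
    }

    final Node root = new Node();

    // Adds nodes for s from first to last character; the last character's node becomes terminal.
    void addPatternForward(String s) {
        if (s.isEmpty()) {
            return; // zero-length strings are ignored, as documented above
        }
        Node node = root;
        for (int i = 0; i < s.length(); i++) {
            node = node.children.computeIfAbsent(s.charAt(i), k -> new Node());
        }
        node.terminal = true;
    }

    // Assumed helper for the example: true if s was added as a complete pattern.
    boolean containsPattern(String s) {
        Node node = root;
        for (int i = 0; i < s.length() && node != null; i++) {
            node = node.children.get(s.charAt(i));
        }
        return node != null && node.terminal;
    }

    public static void main(String[] args) {
        ForwardInsertSketch trie = new ForwardInsertSketch();
        trie.addPatternForward("http");
        trie.addPatternForward("https");
        System.out.println(trie.containsPattern("https")); // true
        System.out.println(trie.containsPattern("htt"));   // false
    }
}
```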
-
addPatternBackward
protected final void addPatternBackward(String s)
Adds any necessary nodes to the trie so that the given String can be decoded in reverse and the first character is represented by a terminal node. Zero-length Strings are ignored.
Parameters:
s - String to be decoded.
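Backward insertion is the mirror image: the pattern is walked from its last character to its first, so the node for the first character ends up terminal. The sketch below shows why that layout suits suffix matching, since an input can then be scanned from its end. It is a stand-alone approximation under assumptions: Node and matchesSuffix are hypothetical names, and nothing here is taken from the SuffixStringMatcher source.

```java
import java.util.HashMap;
import java.util.Map;

class BackwardInsertSketch {
    // Hypothetical node type used only for this example.
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal;
    }

    final Node root = new Node();

    // Adds nodes for s from last to first character; the first character's node becomes terminal.
    void addPatternBackward(String s) {
        if (s.isEmpty()) {
            return; // zero-length strings are ignored, as documented above
        }
        Node node = root;
        for (int i = s.length() - 1; i >= 0; i--) {
            node = node.children.computeIfAbsent(s.charAt(i), k -> new Node());
        }
        node.terminal = true;
    }

    // Assumed helper: true if some stored pattern is a suffix of input.
    boolean matchesSuffix(String input) {
        Node node = root;
        for (int i = input.length() - 1; i >= 0; i--) {
            node = node.children.get(input.charAt(i));
            if (node == null) {
                return false; // no pattern continues with this character
            }
            if (node.terminal) {
                return true; // a complete pattern has been read in reverse
            }
        }
        return false;
    }

    public static void main(String[] args) {
        BackwardInsertSketch trie = new BackwardInsertSketch();
        trie.addPatternBackward(".com");
        trie.addPatternBackward(".org");
        System.out.println(trie.matchesSuffix("example.com")); // true
        System.out.println(trie.matchesSuffix("example.net")); // false
    }
}
```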
-
matches
public abstract boolean matches(String input)
Returns true if the given String is matched by a pattern in the trie.
Parameters:
input - A String to be matched by a pattern
Returns:
true if there is a match, false otherwise
-
shortestMatch
public abstract String shortestMatch(String input)
Returns the shortest substring of input that is matched by a pattern in the trie, or null if no match exists.
Parameters:
input - A String to be matched by a pattern
Returns:
shortest string match or null if no match is made
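The difference between the shortest and the longest match is easiest to see on a trie that holds two patterns, one of which is a prefix of the other. The sketch below assumes prefix semantics (patterns are matched against the start of the input), which is one possible reading of the abstract contract. The class and its Node type are hypothetical and do not reproduce any Nutch subclass.

```java
import java.util.HashMap;
import java.util.Map;

class PrefixMatchSketch {
    // Hypothetical node type used only for this example.
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal;
    }

    final Node root = new Node();

    void addPatternForward(String s) {
        if (s.isEmpty()) {
            return;
        }
        Node node = root;
        for (int i = 0; i < s.length(); i++) {
            node = node.children.computeIfAbsent(s.charAt(i), k -> new Node());
        }
        node.terminal = true;
    }

    // Stops at the first terminal node reached while reading input from the start.
    String shortestMatch(String input) {
        Node node = root;
        for (int i = 0; i < input.length(); i++) {
            node = node.children.get(input.charAt(i));
            if (node == null) {
                return null;
            }
            if (node.terminal) {
                return input.substring(0, i + 1);
            }
        }
        return null;
    }

    // Keeps walking and remembers the last terminal node seen.
    String longestMatch(String input) {
        Node node = root;
        String best = null;
        for (int i = 0; i < input.length(); i++) {
            node = node.children.get(input.charAt(i));
            if (node == null) {
                break;
            }
            if (node.terminal) {
                best = input.substring(0, i + 1);
            }
        }
        return best;
    }

    boolean matches(String input) {
        return shortestMatch(input) != null;
    }

    public static void main(String[] args) {
        PrefixMatchSketch m = new PrefixMatchSketch();
        m.addPatternForward("ab");
        m.addPatternForward("abcd");
        System.out.println(m.shortestMatch("abcdef")); // ab
        System.out.println(m.longestMatch("abcdef"));  // abcd
        System.out.println(m.matches("xyz"));          // false
    }
}
```

In this sketch matches is defined in terms of shortestMatch, since any match implies that a shortest one exists.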
-
-