public class IndexedWord extends java.lang.Object implements AbstractCoreLabel, java.lang.Comparable<IndexedWord>
This class provides a CoreLabel that uses its DocIDAnnotation, SentenceIndexAnnotation, and IndexAnnotation to implement Comparable/compareTo, hashCode, and equals. No other annotations, including the identity of the word, are taken into account by these methods. Historically, this class was introduced for, and is mainly used in, the RTE package, and it provides a number of methods that are specific to that use case. A second use case is the Stanford Dependencies code, where this class directly implements the "copy nodes" of section 4.6 of the Stanford Dependencies Manual, rather than these being placed in the backing CoreLabel. This keeps one CoreLabel per token even when several IndexedWord nodes exist for it, the additional ones representing copy nodes.

The actual implementation wraps a CoreLabel. This avoids breaking the equals() and hashCode() contract and also avoids expensive copying when an IndexedWord represents the same data as the original CoreLabel.
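
The position-only identity described above can be illustrated with a small stand-alone sketch. It does not use CoreNLP: `PositionKeyDemo` and `PositionKey` are hypothetical names, and the ordering used by `compareTo` (document, then sentence, then token index) is an assumption made for illustration rather than something this page states.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for IndexedWord; this is not the CoreNLP class.
class PositionKeyDemo {

    static final class PositionKey implements Comparable<PositionKey> {
        // Assumed ordering: document ID, then sentence index, then token index.
        private static final Comparator<PositionKey> ORDER =
            Comparator.comparing((PositionKey k) -> k.docID)
                      .thenComparingInt(k -> k.sentIndex)
                      .thenComparingInt(k -> k.index);

        final String docID;
        final int sentIndex;
        final int index;
        final String word; // carried along, but ignored by equals/hashCode/compareTo

        PositionKey(String docID, int sentIndex, int index, String word) {
            this.docID = Objects.requireNonNull(docID, "docID is required for comparison");
            this.sentIndex = sentIndex;
            this.index = index;
            this.word = word;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof PositionKey)) return false;
            PositionKey other = (PositionKey) o;
            // Only the three position fields take part in equality.
            return sentIndex == other.sentIndex
                && index == other.index
                && docID.equals(other.docID);
        }

        @Override
        public int hashCode() {
            // Hash over the same three fields so equal keys hash alike.
            return Objects.hash(docID, sentIndex, index);
        }

        @Override
        public int compareTo(PositionKey other) {
            return ORDER.compare(this, other);
        }
    }

    public static void main(String[] args) {
        PositionKey a = new PositionKey("doc1", 0, 3, "bank");
        PositionKey b = new PositionKey("doc1", 0, 3, "river"); // same position, other word
        PositionKey c = new PositionKey("doc1", 1, 0, "bank");  // same word, other position

        Set<PositionKey> seen = new HashSet<>();
        seen.add(a);

        System.out.println(a.equals(b));        // true: the word is ignored
        System.out.println(seen.contains(b));   // true: hash and equality agree
        System.out.println(a.equals(c));        // false: the position differs
        System.out.println(a.compareTo(c) < 0); // true: sentence 0 sorts before sentence 1
    }
}
```

A practical consequence is that two nodes with different words but the same docID, sentence index, and index collide in hash-based collections, so all three values should be set before nodes are used as set members or map keys.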

Nested classes/interfaces inherited from interface TypesafeMap: TypesafeMap.Key<VALUE>

| Modifier and Type | Field | Description |
|---|---|---|
| static IndexedWord | NO_WORD | The identifier that points to no word. |

| Constructor | Description |
|---|---|
| IndexedWord() | Default constructor; uses the CoreLabel default constructor. |
| IndexedWord(CoreLabel w) | Construct an IndexedWord from a CoreLabel just as for a CoreMap. |
| IndexedWord(Label w) | Copy constructor; relies on the CoreLabel copy constructor. It sets the value and, if the word is not otherwise set, sets the word to the value. |
| IndexedWord(java.lang.String docID, int sentenceIndex, int index) | Constructor for setting docID, sentenceIndex, and index without any other annotations. |

| Modifier and Type | Method | Description |
|---|---|---|
| java.lang.String | after() | Return the whitespace String after the word. |
| CoreLabel | backingLabel() | Return the CoreLabel behind this IndexedWord. |
| java.lang.String | before() | |
| int | beginPosition() | Return the beginning char offset of the label (or -1 if none). |
| int | compareTo(IndexedWord w) | NOTE: For this compareTo, you must have a DocIDAnnotation, SentenceIndexAnnotation, and IndexAnnotation for it to make sense and be guaranteed to work properly. |
| <VALUE> boolean | containsKey(java.lang.Class<? extends TypesafeMap.Key<VALUE>> key) | Returns true if this map contains the given key. |
| int | copyCount() | |
| java.lang.String | docID() | |
| int | endPosition() | Return the ending char offset of the label (or -1 if none). |
| boolean | equals(java.lang.Object o) | This .equals is dependent only on docID, sentenceIndex, and index. |
| static LabelFactory | factory() | |
| <VALUE> VALUE | get(java.lang.Class<? extends TypesafeMap.Key<VALUE>> key) | Returns the value associated with the given key or null if none is provided. |
| IndexedWord | getOriginal() | |
| <KEY extends TypesafeMap.Key<java.lang.String>> java.lang.String | getString(java.lang.Class<KEY> key) | Return a non-null String value for a key. |
| <KEY extends TypesafeMap.Key<java.lang.String>> java.lang.String | getString(java.lang.Class<KEY> key, java.lang.String def) | |
| int | hashCode() | This hashCode uses only the docID, sentenceIndex, and index. |
| int | index() | |
| boolean | isCopy(IndexedWord otherWord) | |
| java.util.Set<java.lang.Class<?>> | keySet() | Collection of keys currently held in this map. |
| LabelFactory | labelFactory() | Returns a factory that makes labels of the exact same type as this one. |
| java.lang.String | lemma() | Return the lemma value of the label (or null if none). |
| IndexedWord | makeCopy() | This copies the whole IndexedWord, incrementing a copy count. |
| IndexedWord | makeSoftCopy() | |
| IndexedWord | makeSoftCopy(int count) | |
| java.lang.String | ner() | Return the named entity class of the label (or null if none). |
| java.lang.String | originalText() | Return the String which is the original character sequence of the token. |
| double | pseudoPosition() | In most cases, this is just the index of the word. |
| <VALUE> VALUE | remove(java.lang.Class<? extends TypesafeMap.Key<VALUE>> key) | Removes the given key from the map, returning the value removed. |
| int | sentIndex() | |
| <VALUE> VALUE | set(java.lang.Class<? extends TypesafeMap.Key<VALUE>> key, VALUE value) | Associates the given value with the given type for future calls to get. |
| void | setAfter(java.lang.String after) | Set the whitespace String after the word. |
| void | setBefore(java.lang.String before) | Set the whitespace String before the word. |
| void | setBeginPosition(int beginPos) | Set the beginning character offset for the label. |
| void | setCopyCount(int count) | |
| void | setDocID(java.lang.String docID) | |
| void | setEndPosition(int endPos) | Set the ending character offset of the label (or -1 if none). |
| void | setFromString(java.lang.String labelStr) | Set the contents of this label to this String representing the complete contents of the label. |
| void | setIndex(int index) | |
| void | setLemma(java.lang.String lemma) | Set the lemma value for the label (if one is stored). |
| void | setNER(java.lang.String ner) | Set the named entity class of the label. |
| void | setOriginalText(java.lang.String originalText) | Set the String which is the original character sequence of the token. |
| void | setPseudoPosition(double position) | |
| void | setSentIndex(int sentIndex) | |
| void | setTag(java.lang.String tag) | Set the tag value for the label (if one is stored). |
| void | setValue(java.lang.String value) | Set the value for the label (if one is stored). |
| void | setWord(java.lang.String word) | Set the word value for the label (if one is stored). |
| int | size() | Returns the number of keys in the map. |
| java.lang.String | tag() | Return the tag value of the label (or null if none). |
| java.lang.String | toCopyIndex() | |
| java.lang.String | toPrimes() | |
| java.lang.String | toString() | Returns the value-tag of this label. |
| java.lang.String | toString(CoreLabel.OutputFormat format) | Allows choices of how to format Label. |
| java.lang.String | value() | Return a String representation of just the "main" value of this label. |
| java.lang.String | word() | Return the word value of the label (or null if none). |
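
The get, set, containsKey, remove, keySet, and size methods listed above take Class objects as keys, with the value type carried by the key's type parameter. The following is a minimal stand-alone sketch of that pattern, not the CoreNLP implementation: `TypesafeMapSketch`, `Key`, `WordKey`, and `IndexKey` are hypothetical names, and returning the previous value from `set` is an assumption borrowed from `java.util.Map.put`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of class-keyed, type-safe access; not CoreNLP code.
class TypesafeMapSketch {

    /** Marker interface: each key class declares the type of value it maps to. */
    interface Key<VALUE> {}

    // Two example keys, one for a String value and one for an Integer value.
    static final class WordKey implements Key<String> {}
    static final class IndexKey implements Key<Integer> {}

    private final Map<Class<?>, Object> values = new HashMap<>();

    @SuppressWarnings("unchecked")
    <VALUE> VALUE get(Class<? extends Key<VALUE>> key) {
        // The cast is safe because set() only stores a VALUE under a Key<VALUE>.
        return (VALUE) values.get(key);
    }

    @SuppressWarnings("unchecked")
    <VALUE> VALUE set(Class<? extends Key<VALUE>> key, VALUE value) {
        // Assumption: return the previously stored value, as Map.put does.
        return (VALUE) values.put(key, value);
    }

    <VALUE> boolean containsKey(Class<? extends Key<VALUE>> key) {
        return values.containsKey(key);
    }

    @SuppressWarnings("unchecked")
    <VALUE> VALUE remove(Class<? extends Key<VALUE>> key) {
        return (VALUE) values.remove(key);
    }

    Set<Class<?>> keySet() {
        return Collections.unmodifiableSet(values.keySet());
    }

    int size() {
        return values.size();
    }

    public static void main(String[] args) {
        TypesafeMapSketch m = new TypesafeMapSketch();
        m.set(WordKey.class, "bank");
        m.set(IndexKey.class, 3);
        // m.set(IndexKey.class, "three"); // would not compile: IndexKey maps to Integer

        String word = m.get(WordKey.class);    // no cast needed at the call site
        Integer index = m.get(IndexKey.class);
        System.out.println(word + " @ " + index);
        System.out.println(m.size());
    }
}
```

The design choice illustrated here is that the compiler, rather than a runtime check, rejects a value of the wrong type for a given key.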
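
public static final IndexedWord NO_WORD
The identifier that points to no word.

public IndexedWord()
Default constructor; uses the CoreLabel default constructor.

public IndexedWord(Label w)
Copy constructor; relies on the CoreLabel copy constructor. It will set the value, and if the word is not set otherwise, set the word to the value.
Parameters: w - A Label to initialize this IndexedWord from

public IndexedWord(CoreLabel w)
Construct an IndexedWord from a CoreLabel just as for a CoreMap.
Parameters: w - A Label to initialize this IndexedWord from

public IndexedWord(java.lang.String docID, int sentenceIndex, int index)
Constructor for setting docID, sentenceIndex, and index without any other annotations.
Parameters:
docID - The document ID (arbitrary string)
sentenceIndex - The sentence number in the document (normally 0-based)
index - The index of the word in the sentence (normally 0-based)

public IndexedWord makeCopy()
This copies the whole IndexedWord, incrementing a copy count.
See also: makeSoftCopy()

public IndexedWord makeSoftCopy(int count)

public IndexedWord makeSoftCopy()

public IndexedWord getOriginal()

public CoreLabel backingLabel()
Return the CoreLabel behind this IndexedWord.

public <VALUE> VALUE get(java.lang.Class<? extends TypesafeMap.Key<VALUE>> key)
Description copied from interface: TypesafeMap
Returns the value associated with the given key or null if none is provided.
Specified by: get in interface TypesafeMap

public <VALUE> boolean containsKey(java.lang.Class<? extends TypesafeMap.Key<VALUE>> key)
Description copied from interface: TypesafeMap
Returns true if this map contains the given key.
Specified by: containsKey in interface TypesafeMap

public <VALUE> VALUE set(java.lang.Class<? extends TypesafeMap.Key<VALUE>> key, VALUE value)
Description copied from interface: TypesafeMap
Associates the given value with the given type for future calls to get.
Specified by: set in interface TypesafeMap

public <KEY extends TypesafeMap.Key<java.lang.String>> java.lang.String getString(java.lang.Class<KEY> key)
Description copied from interface: AbstractCoreLabel
Return a non-null String value for a key.
Specified by: getString in interface AbstractCoreLabel
Type Parameters: KEY - A key type with a String value
Parameters: key - The key to return the value of.
Returns: the empty String if the key is not in the map or has the value null, and the String value of the key otherwise

public <KEY extends TypesafeMap.Key<java.lang.String>> java.lang.String getString(java.lang.Class<KEY> key, java.lang.String def)
Specified by: getString in interface AbstractCoreLabel

public <VALUE> VALUE remove(java.lang.Class<? extends TypesafeMap.Key<VALUE>> key)
Description copied from interface: TypesafeMap
Removes the given key from the map, returning the value removed.
Specified by: remove in interface TypesafeMap

public java.util.Set<java.lang.Class<?>> keySet()
Description copied from interface: TypesafeMap
Collection of keys currently held in this map.
Specified by: keySet in interface TypesafeMap

public int size()
Description copied from interface: TypesafeMap
Returns the number of keys in the map.
Specified by: size in interface TypesafeMap

public java.lang.String value()
Description copied from interface: Label
Return a String representation of just the "main" value of this label.

public void setValue(java.lang.String value)
Description copied from interface: Label
Set the value for the label (if one is stored).

public java.lang.String tag()
Description copied from interface: HasTag
Return the tag value of the label (or null if none).

public void setTag(java.lang.String tag)
Description copied from interface: HasTag
Set the tag value for the label (if one is stored).

public java.lang.String word()
Description copied from interface: HasWord
Return the word value of the label (or null if none).

public void setWord(java.lang.String word)
Description copied from interface: HasWord
Set the word value for the label (if one is stored).

public java.lang.String lemma()
Description copied from interface: HasLemma
Return the lemma value of the label (or null if none).

public void setLemma(java.lang.String lemma)
Description copied from interface: HasLemma
Set the lemma value for the label (if one is stored).

public java.lang.String ner()
Description copied from interface: HasNER
Return the named entity class of the label (or null if none).

public void setNER(java.lang.String ner)
Description copied from interface: HasNER
Set the named entity class of the label.

public double pseudoPosition()
In most cases, this is just the index of the word.

public void setPseudoPosition(double position)

public void setSentIndex(int sentIndex)
Specified by: setSentIndex in interface HasIndex

public java.lang.String before()
Specified by: before in interface HasContext

public void setBefore(java.lang.String before)
Description copied from interface: HasContext
Set the whitespace String before the word.
Specified by: setBefore in interface HasContext
Parameters: before - the whitespace String before the word

public java.lang.String originalText()
Description copied from interface: HasContext
Return the String which is the original character sequence of the token.
Specified by: originalText in interface HasContext, originalText in interface HasOriginalText

public void setOriginalText(java.lang.String originalText)
Description copied from interface: HasContext
Set the String which is the original character sequence of the token.
Specified by: setOriginalText in interface HasContext, setOriginalText in interface HasOriginalText
Parameters: originalText - The original character sequence of the token

public java.lang.String after()
Description copied from interface: HasContext
Return the whitespace String after the word.
Specified by: after in interface HasContext

public void setAfter(java.lang.String after)
Description copied from interface: HasContext
Set the whitespace String after the word.
Specified by: setAfter in interface HasContext
Parameters: after - The whitespace String after the word

public int beginPosition()
Description copied from interface: HasOffset
Return the beginning char offset of the label (or -1 if none).
Specified by: beginPosition in interface HasOffset

public int endPosition()
Description copied from interface: HasOffset
Return the ending char offset of the label (or -1 if none).
Specified by: endPosition in interface HasOffset

public void setBeginPosition(int beginPos)
Description copied from interface: HasOffset
Set the beginning character offset for the label.
Specified by: setBeginPosition in interface HasOffset
Parameters: beginPos - The beginning position

public void setEndPosition(int endPos)
Description copied from interface: HasOffset
Set the ending character offset of the label (or -1 if none).
Specified by: setEndPosition in interface HasOffset
Parameters: endPos - The end character offset for the label

public int copyCount()

public void setCopyCount(int count)

public java.lang.String toPrimes()

public java.lang.String toCopyIndex()

public boolean isCopy(IndexedWord otherWord)

public boolean equals(java.lang.Object o)
This .equals is dependent only on docID, sentenceIndex, and index.
Overrides: equals in class java.lang.Object

public int hashCode()
This hashCode uses only the docID, sentenceIndex, and index.
Overrides: hashCode in class java.lang.Object

public int compareTo(IndexedWord w)
NOTE: For this compareTo, you must have a DocIDAnnotation, SentenceIndexAnnotation, and IndexAnnotation for it to make sense and be guaranteed to work properly.
Specified by: compareTo in interface java.lang.Comparable<IndexedWord>
Parameters: w - The IndexedWord to compare with

public java.lang.String toString()
Returns the value-tag of this label.

public java.lang.String toString(CoreLabel.OutputFormat format)
Allows choices of how to format Label.
Parameters: format - An instance of the OutputFormat enum (the same as for a CoreLabel)

public void setFromString(java.lang.String labelStr)
Set the contents of this label to this String representing the complete contents of the label. A class implementing Label may throw an UnsupportedOperationException for this method (only). Typically, this method would do some appropriate decoding of the string in a way that sets multiple fields in an inverse of the toString() method.
Specified by: setFromString in interface Label
Parameters: labelStr - the String that translates into the content of the label

public static LabelFactory factory()

public LabelFactory labelFactory()
Returns a factory that makes labels of the exact same type as this one. May return null if no appropriate factory is known.
Specified by: labelFactory in interface Label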
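
The copy-related members (makeCopy, makeSoftCopy, getOriginal, copyCount, backingLabel) are only briefly described on this page. The sketch below shows one way a wrapper can produce copy nodes while keeping a single backing object per token. It is an illustration under stated assumptions, not the CoreNLP implementation: `CopyNodeSketch`, `Node`, and `makeDeepCopy` are hypothetical names, a `Map` stands in for the backing CoreLabel, and the idea that a soft copy shares the backing object while a full copy duplicates it is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of wrapper-based copy nodes; not CoreNLP code.
class CopyNodeSketch {

    static final class Node {
        private final Map<String, String> backing; // stands in for the backing CoreLabel
        private final int copyCount;               // 0 for an original node
        private final Node original;               // null unless this is a soft copy

        Node(Map<String, String> backing) {
            this(backing, 0, null);
        }

        private Node(Map<String, String> backing, int copyCount, Node original) {
            this.backing = backing;
            this.copyCount = copyCount;
            this.original = original;
        }

        /** Assumed behaviour: a soft copy wraps the same backing object. */
        Node makeSoftCopy(int count) {
            return new Node(backing, count, this);
        }

        /** Assumed behaviour: a full copy duplicates the backing data. */
        Node makeDeepCopy(int count) {
            return new Node(new HashMap<>(backing), count, null);
        }

        int copyCount() { return copyCount; }
        Node getOriginal() { return original; }
        Map<String, String> backingLabel() { return backing; }
    }

    public static void main(String[] args) {
        Map<String, String> token = new HashMap<>();
        token.put("word", "bank");

        Node node = new Node(token);
        Node soft = node.makeSoftCopy(1);
        Node deep = node.makeDeepCopy(2);

        // A change through the original is visible through the soft copy only.
        node.backingLabel().put("tag", "NN");
        System.out.println(soft.backingLabel().get("tag")); // NN
        System.out.println(deep.backingLabel().get("tag")); // null
        System.out.println(soft.getOriginal() == node);     // true
    }
}
```

Sharing the backing object keeps the cost of a copy node to one small wrapper, which matches the stated motivation of avoiding expensive copying.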