inpro.sphinx
Class ResultUtil
java.lang.Object
inpro.sphinx.ResultUtil
public class ResultUtil
- extends java.lang.Object
Method Summary |
static java.util.List<edu.cmu.sphinx.decoder.search.Token> |
getTokenList(edu.cmu.sphinx.decoder.search.Token inputToken,
boolean words,
boolean units)
Returns a list of word and/or unit tokens extracted from a linked token list.
If both words and units are requested, each word token precedes the
unit tokens belonging to that word.
The algorithm iterates through the tokens *against* temporal order
(the natural order of the token chain)
and reverses the list before returning it. |
static boolean |
hasWordTokensLast(edu.cmu.sphinx.decoder.search.Token token)
Tries to guess the ordering of tokens.
The ordering of word/segment tokens depends on the Sphinx decoder used;
this operation guesses the decoder from the subclass of the employed
search state and returns the word order for the decoder types it knows. |
private static boolean |
isSegmentToken(edu.cmu.sphinx.decoder.search.Token t)
|
private static boolean |
isWordToken(edu.cmu.sphinx.decoder.search.Token t)
|
static java.lang.String |
stringForSearchState(edu.cmu.sphinx.linguist.SearchState state)
ancient code, but still used in the sphinx-native LabelWriter and TEDviewNotifier |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
ResultUtil
public ResultUtil()
getTokenList
public static java.util.List<edu.cmu.sphinx.decoder.search.Token> getTokenList(edu.cmu.sphinx.decoder.search.Token inputToken,
boolean words,
boolean units)
- Returns a list of word and/or unit tokens extracted from a linked token list.
If both words and units are requested, each word token precedes the
unit tokens belonging to that word.
The algorithm iterates through the tokens *against* temporal order
(the natural order of the token chain)
and reverses the list before returning it.
This means that although word tokens precede their segment tokens in
the final output, the opposite holds while the list is being
constructed.
- Parameters:
inputToken
- the start token (that is, the last in time)
words
- whether word tokens should be returned
units
- whether unit tokens should be returned
- Returns:
- a list of word and/or unit tokens. NOTICE: no provisions are taken to
ensure that the "right" segment tokens are present for every word;
in particular, nothing asserts that each silence segment is
accompanied by a silence word. It turns out that (at least for LexTree)
SIL tokens are usually not preceded by <sil> words.
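The backward walk followed by a reversal can be illustrated with a minimal, self-contained sketch. It does not use the Sphinx classes: Tok, Kind, sampleChain and labels are hypothetical stand-ins introduced only for this example, and the sketch assumes a chain in which each word token comes earlier in time than its unit tokens. It also does not reproduce whatever decoder-specific handling the real method performs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class TokenListSketch {

    /** Hypothetical stand-in for a decoder token; NOT the Sphinx Token class. */
    static final class Tok {
        enum Kind { WORD, UNIT, OTHER }

        final String label;
        final Kind kind;
        final Tok predecessor; // the token one step earlier in time, or null

        Tok(String label, Kind kind, Tok predecessor) {
            this.label = label;
            this.kind = kind;
            this.predecessor = predecessor;
        }
    }

    /** Walks the chain against temporal order, keeps the requested kinds, then reverses. */
    static List<Tok> getTokenList(Tok last, boolean words, boolean units) {
        List<Tok> result = new ArrayList<>();
        for (Tok t = last; t != null; t = t.predecessor) {
            boolean keep = (words && t.kind == Tok.Kind.WORD)
                        || (units && t.kind == Tok.Kind.UNIT);
            if (keep) {
                result.add(t); // at this point units appear before their word
            }
        }
        Collections.reverse(result); // restore temporal order: word first, then its units
        return result;
    }

    /** Helper for printing and checking: extracts the labels of a token list. */
    static List<String> labels(List<Tok> tokens) {
        List<String> out = new ArrayList<>();
        for (Tok t : tokens) {
            out.add(t.label);
        }
        return out;
    }

    /** Builds an illustrative chain (assumed layout) and returns its last token. */
    static Tok sampleChain() {
        Tok t = new Tok("hi", Tok.Kind.WORD, null);
        t = new Tok("HH", Tok.Kind.UNIT, t);
        t = new Tok("state", Tok.Kind.OTHER, t); // neither word nor unit, always skipped
        t = new Tok("AY", Tok.Kind.UNIT, t);
        t = new Tok("you", Tok.Kind.WORD, t);
        t = new Tok("Y", Tok.Kind.UNIT, t);
        t = new Tok("UW", Tok.Kind.UNIT, t);
        return t;
    }

    public static void main(String[] args) {
        // prints [hi, HH, AY, you, Y, UW]
        System.out.println(labels(getTokenList(sampleChain(), true, true)));
    }
}
```

Passing words=true, units=false to the same sketch yields only the word labels, and the reverse combination yields only the unit labels.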
isWordToken
private static boolean isWordToken(edu.cmu.sphinx.decoder.search.Token t)
isSegmentToken
private static boolean isSegmentToken(edu.cmu.sphinx.decoder.search.Token t)
hasWordTokensLast
public static boolean hasWordTokensLast(edu.cmu.sphinx.decoder.search.Token token)
- Tries to guess the ordering of tokens.
The ordering of word/segment tokens depends on the Sphinx decoder used;
this operation guesses the decoder from the subclass of the employed
search state and returns the word order for the decoder types it knows.
- Parameters:
token
- this token's search state will be inspected
- Returns:
- true if word tokens are preceded by their segments
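Guessing the ordering from the search state's subclass might look roughly like the sketch below. Everything in it is an assumption made for illustration: StateLike, FlatStyleState, TreeStyleState and OtherState are hypothetical stand-ins rather than Sphinx types. The mapping from state type to ordering is arbitrary, and throwing on an unknown type is only one possible policy, not necessarily what the real method does.

```java
final class TokenOrderSketch {

    /** Hypothetical stand-in for a search state; NOT the Sphinx SearchState interface. */
    interface StateLike {}

    static final class FlatStyleState implements StateLike {}  // assumed decoder family A
    static final class TreeStyleState implements StateLike {}  // assumed decoder family B
    static final class OtherState implements StateLike {}      // a type the sketch does not know

    /**
     * Returns true if word tokens are preceded by their segments.
     * The mapping below is illustrative only, not taken from the library.
     */
    static boolean hasWordTokensLast(StateLike state) {
        if (state instanceof FlatStyleState) {
            return true;   // assumption: segments first, word token last
        }
        if (state instanceof TreeStyleState) {
            return false;  // assumption: word token first, segments after
        }
        // Assumed policy: refuse to guess for state types the sketch does not know.
        throw new IllegalArgumentException(
                "unknown search state type: " + state.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(hasWordTokensLast(new FlatStyleState())); // prints true
        System.out.println(hasWordTokensLast(new TreeStyleState())); // prints false
    }
}
```

A caller could use such a flag to decide whether a word token has to be moved in front of its segments after the list has been reversed.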
stringForSearchState
public static java.lang.String stringForSearchState(edu.cmu.sphinx.linguist.SearchState state)
- ancient code, but still used in the sphinx-native LabelWriter and TEDviewNotifier