inpro.sphinx
Class ResultUtil

java.lang.Object
  extended by inpro.sphinx.ResultUtil

public class ResultUtil
extends java.lang.Object


Constructor Summary
ResultUtil()
           
 
Method Summary
static java.util.List<edu.cmu.sphinx.decoder.search.Token> getTokenList(edu.cmu.sphinx.decoder.search.Token inputToken, boolean words, boolean units)
          Returns a list of word and/or unit tokens extracted from a linked token list. If both words and units are to be returned, each word token always precedes the unit tokens belonging to that word. The algorithm iterates through the tokens *against* temporal order (as this is the natural order of the tokens) and reverses the list before it is returned.
static boolean hasWordTokensLast(edu.cmu.sphinx.decoder.search.Token token)
          Tries to guess the ordering of tokens. The ordering of word/segment tokens depends on the Sphinx decoder used; this operation guesses the decoder from the employed search state subclasses and returns the word order for the types of decoders it knows.
private static boolean isSegmentToken(edu.cmu.sphinx.decoder.search.Token t)
           
private static boolean isWordToken(edu.cmu.sphinx.decoder.search.Token t)
           
static java.lang.String stringForSearchState(edu.cmu.sphinx.linguist.SearchState state)
          Legacy code that is still used in the sphinx-native LabelWriter and TEDviewNotifier.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ResultUtil

public ResultUtil()
Method Detail

getTokenList

public static java.util.List<edu.cmu.sphinx.decoder.search.Token> getTokenList(edu.cmu.sphinx.decoder.search.Token inputToken,
                                                                               boolean words,
                                                                               boolean units)
Returns a list of word and/or unit tokens extracted from a linked token list. If both words and units are to be returned, each word token always precedes the unit tokens belonging to that word. The algorithm iterates through the tokens *against* temporal order (as this is the natural order of the tokens) and reverses the list before it is returned. This means that even though word tokens precede their segment tokens in the final output, the opposite is the case while the list is being constructed.

Parameters:
inputToken - the start token (that is, the last in time)
words - whether word tokens should be returned
units - whether unit tokens should be returned
Returns:
a list of word and/or unit tokens. Note that no provisions are taken to ensure that the "right" segment tokens are present for every word. In particular, nothing asserts that each silence segment is accompanied by a silence word: it turns out that (at least for LexTree) SIL tokens are usually not preceded by <sil> words.
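
The backward walk followed by a reversal, as described above, can be sketched as follows. This is a minimal illustration and not the actual ResultUtil source: Tok, Kind, collect and labels are hypothetical stand-ins (Tok is not the real edu.cmu.sphinx.decoder.search.Token), and the sketch assumes that in temporal order each word token already comes before its unit tokens.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class TokenListSketch {

    // Hypothetical token categories; the real code inspects Sphinx search states instead.
    enum Kind { WORD, UNIT, OTHER }

    // Hypothetical stand-in for a Sphinx token: a label plus a link to the earlier token.
    static final class Tok {
        final String label;
        final Kind kind;
        final Tok predecessor; // token earlier in time, or null at the start

        Tok(String label, Kind kind, Tok predecessor) {
            this.label = label;
            this.kind = kind;
            this.predecessor = predecessor;
        }
    }

    // Walks from the last token back to the first, keeps the requested kinds,
    // then reverses the collected list so that it is in temporal order.
    static List<Tok> collect(Tok last, boolean words, boolean units) {
        List<Tok> out = new ArrayList<>();
        for (Tok t = last; t != null; t = t.predecessor) {
            boolean keep = (words && t.kind == Kind.WORD) || (units && t.kind == Kind.UNIT);
            if (keep) {
                out.add(t);
            }
        }
        Collections.reverse(out);
        return out;
    }

    // Convenience helper for printing and comparing results.
    static List<String> labels(List<Tok> tokens) {
        List<String> result = new ArrayList<>();
        for (Tok t : tokens) {
            result.add(t.label);
        }
        return result;
    }

    public static void main(String[] args) {
        // Chain in temporal order: word "hi", unit "h", an HMM-state token, unit "ay".
        Tok word = new Tok("hi", Kind.WORD, null);
        Tok h = new Tok("h", Kind.UNIT, word);
        Tok state = new Tok("hmm-state", Kind.OTHER, h);
        Tok ay = new Tok("ay", Kind.UNIT, state);

        System.out.println(labels(collect(ay, true, true)));  // prints [hi, h, ay]
        System.out.println(labels(collect(ay, false, true))); // prints [h, ay]
    }
}
```

While the loop runs, the list holds the tokens in reverse temporal order (ay, h, hi), so the units sit in front of their word until the final reversal.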

isWordToken

private static boolean isWordToken(edu.cmu.sphinx.decoder.search.Token t)

isSegmentToken

private static boolean isSegmentToken(edu.cmu.sphinx.decoder.search.Token t)

hasWordTokensLast

public static boolean hasWordTokensLast(edu.cmu.sphinx.decoder.search.Token token)
Tries to guess the ordering of tokens. The ordering of word/segment tokens depends on the Sphinx decoder used; this operation guesses the decoder from the employed search state subclasses and returns the word order for the types of decoders it knows.

Parameters:
token - this token's search state will be inspected
Returns:
true if word tokens are preceded by their segments
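
Guessing the ordering from the runtime type of a search state could look roughly like the sketch below. Everything in it is an assumption made for illustration: State, TreeState, FlatState and UnknownState are hypothetical stand-ins rather than Sphinx classes, the mapping from state type to ordering is not taken from the ResultUtil source, and throwing on an unknown type is just one possible choice.

```java
class TokenOrderSketch {

    // Hypothetical stand-in for a search state; not the real Sphinx SearchState.
    interface State {}

    // Hypothetical state types standing in for the subclasses of two different decoders.
    static final class TreeState implements State {}
    static final class FlatState implements State {}
    static final class UnknownState implements State {}

    // Returns true if word tokens are assumed to come after their segment tokens.
    static boolean wordTokensLast(State state) {
        if (state instanceof TreeState) {
            return true;  // assumption: this decoder emits the word after its segments
        }
        if (state instanceof FlatState) {
            return false; // assumption: this decoder emits the word before its segments
        }
        // The sketch refuses to guess for state types it does not know.
        throw new IllegalArgumentException("unknown search state type: " + state);
    }

    public static void main(String[] args) {
        System.out.println(wordTokensLast(new TreeState())); // prints true
        System.out.println(wordTokensLast(new FlatState())); // prints false
    }
}
```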

stringForSearchState

public static java.lang.String stringForSearchState(edu.cmu.sphinx.linguist.SearchState state)
Legacy code that is still used in the sphinx-native LabelWriter and TEDviewNotifier.