org.extex.language.word
Interface WordTokenizer

All Superinterfaces:
java.io.Serializable
All Known Subinterfaces:
Language, ManagedLanguage, ModifiableLanguage
All Known Implementing Classes:
BaseHyphenationTable, CompressedLiangsHyphenationTable, ExTeXWords, FutureLanguage, LiangsHyphenationTable, TeXWords

public interface WordTokenizer
extends java.io.Serializable

This interface describes the contract for a tokenizer that splits a list of nodes into words. Implementations may be language-specific.

Version:
$Revision: 4446 $
Author:
Gerd Neugebauer

Method Summary
 int findWord(NodeList nodes, int start, UnicodeCharList word)
          Extract a word from a node list.
 void insertShy(NodeList nodes, int insertionPoint, boolean[] spec, CharNode hyphenNode)
          Insert hyphenation points into a list of nodes.
 UnicodeCharList normalize(UnicodeCharList word, TypesetterOptions options)
          Normalize a word for the lookup.
 

Method Detail

findWord

int findWord(NodeList nodes,
             int start,
             UnicodeCharList word)
             throws HyphenationException
Extract a word from a node list.

Parameters:
nodes - the nodes to extract the word from
start - the start index
word - the target list for the letters of the word
Returns:
the index of the first node beyond the word
Throws:
HyphenationException - in case of an error
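
The contract above can be illustrated with a minimal sketch. It does not use the ExTeX classes: nodes are modelled as plain characters in a java.util.List, and the class FindWordSketch and its helper toNodes are hypothetical names introduced only for this illustration. It also assumes that a word is a maximal run of letters beginning at start, which the interface itself does not specify.

```java
import java.util.ArrayList;
import java.util.List;

final class FindWordSketch {

    /**
     * Hypothetical stand-in for findWord. Nodes are modelled as plain
     * characters; the letters of the word are appended to word.
     * Returns the index of the first node beyond the word.
     */
    static int findWord(List<Character> nodes, int start, List<Character> word) {
        int i = start;
        // Assumption: a word is a maximal run of letters beginning at start.
        while (i < nodes.size()) {
            char c = nodes.get(i);
            if (!Character.isLetter(c)) {
                break;
            }
            word.add(c);
            i++;
        }
        return i;
    }

    /** Helper for the sketch: turns a string into a list of character "nodes". */
    static List<Character> toNodes(String text) {
        List<Character> nodes = new ArrayList<>();
        for (char c : text.toCharArray()) {
            nodes.add(c);
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<Character> nodes = toNodes("ab cd");
        List<Character> word = new ArrayList<>();
        int next = findWord(nodes, 0, word);
        System.out.println(word + " ends before index " + next);
    }
}
```

In this sketch a call with start pointing at a non-letter returns start and leaves word empty, so a caller scanning a whole list would have to advance past such nodes itself.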

insertShy

void insertShy(NodeList nodes,
               int insertionPoint,
               boolean[] spec,
               CharNode hyphenNode)
               throws HyphenationException
Insert hyphenation points into a list of nodes.

Parameters:
nodes - the node list to modify
insertionPoint - the index in nodes from which the positions in spec are counted
spec - the specification of where to insert hyphenation marks. If spec[i] is true, a hyphen is inserted before the i-th character at or after insertionPoint in nodes
hyphenNode - the hyphen as node
Throws:
HyphenationException - in case of an error
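
The meaning of spec can be sketched as follows. This is an illustration under stated assumptions, not the ExTeX implementation: nodes are modelled as plain characters, the hyphen node is a plain '-' character, i is taken to be zero-based, and every node from insertionPoint onwards is assumed to hold exactly one character. The class InsertShySketch and the helper hyphenate are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class InsertShySketch {

    /**
     * Hypothetical stand-in for insertShy: inserts hyphen before the i-th
     * character at or after insertionPoint wherever spec[i] is true.
     */
    static void insertShy(List<Character> nodes, int insertionPoint,
                          boolean[] spec, char hyphen) {
        // Walk backwards so earlier insertions do not shift later positions.
        for (int i = spec.length - 1; i >= 0; i--) {
            if (spec[i]) {
                nodes.add(insertionPoint + i, hyphen);
            }
        }
    }

    /** Helper for the sketch: applies insertShy to a string and renders the result. */
    static String hyphenate(String text, int insertionPoint, boolean[] spec) {
        List<Character> nodes = new ArrayList<>();
        for (char c : text.toCharArray()) {
            nodes.add(c);
        }
        insertShy(nodes, insertionPoint, spec, '-');
        StringBuilder sb = new StringBuilder();
        for (char c : nodes) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        boolean[] spec = {false, false, true, false, false, false};
        System.out.println(hyphenate("hyphen", 0, spec));
    }
}
```

Iterating from the end of spec keeps the indices of the not yet processed positions valid while the list grows.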

normalize

UnicodeCharList normalize(UnicodeCharList word,
                          TypesetterOptions options)
                          throws HyphenationException
Normalize a word for the lookup.

Parameters:
word - the word to normalize
options - the options to use
Returns:
the normalized word
Throws:
HyphenationException - in case of an error
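
The interface does not say what normalization means. One plausible reading, shown here purely as an assumption, is that each character is mapped to a lower-case form before the word is looked up. In the sketch the word is an array of Unicode code points and the options are reduced to a single mapping function; the class NormalizeSketch and the parameter lowerCase are hypothetical and stand in for UnicodeCharList and TypesetterOptions.

```java
import java.util.function.IntUnaryOperator;

final class NormalizeSketch {

    /**
     * Hypothetical stand-in for normalize: applies a per-character mapping
     * to every code point and returns the result as a new array.
     */
    static int[] normalize(int[] word, IntUnaryOperator lowerCase) {
        int[] result = new int[word.length];
        for (int i = 0; i < word.length; i++) {
            result[i] = lowerCase.applyAsInt(word[i]);
        }
        return result;
    }

    /** Convenience wrapper for the sketch, using Java's own lower-casing. */
    static String normalize(String word) {
        int[] codePoints = word.codePoints().toArray();
        int[] normalized = normalize(codePoints, Character::toLowerCase);
        return new String(normalized, 0, normalized.length);
    }

    public static void main(String[] args) {
        System.out.println(normalize("TeX"));
    }
}
```

Passing the mapping in as a function mirrors the options parameter: the caller decides how characters are folded, and the input word is left unmodified.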