I want to generate character n-grams for a string. Below is the Lucene 4.1 code I have used for it.

    import java.io.Reader;
    import java.io.StringReader;
    import org.apache.lucene.analysis.ngram.NGramTokenizer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    Reader reader = new StringReader(text);
    // Emit contiguous sequences of 3, 4 and 5 characters
    NGramTokenizer gramTokenizer = new NGramTokenizer(reader, 3, 5);
    CharTermAttribute charTermAttribute = gramTokenizer.addAttribute(CharTermAttribute.class);
    while (gramTokenizer.incrementToken()) {
        String token = charTermAttribute.toString();
        System.out.println(token);
    }
However, I want to use Lucene 5.0.0 to do this. NGramTokenizer has changed a lot in Lucene 5.0.0 compared with the previous version.
Does anybody know how to use Lucene 5.0.0 to do ngrams?
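The following code:

    import java.io.StringReader;
    import org.apache.lucene.analysis.ngram.NGramTokenizer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    StringReader stringReader = new StringReader("abcd");
    // In Lucene 5 the constructor takes only the gram sizes; the reader is set separately
    NGramTokenizer tokenizer = new NGramTokenizer(1, 2);
    tokenizer.setReader(stringReader);
    tokenizer.reset(); // must be called before the first incrementToken()
    CharTermAttribute termAtt = tokenizer.getAttribute(CharTermAttribute.class);
    while (tokenizer.incrementToken()) {
        String token = termAtt.toString();
        System.out.println(token);
    }
    tokenizer.end();
    tokenizer.close();

will generate:

    a ab b bc c cd d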
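To make the order of that output easier to follow, here is a dependency-free sketch of character n-gram generation. It is not Lucene's implementation: the class name `CharNGrams` and the method `ngrams` are made up for illustration, and it works on Java `char` values rather than code points. It only assumes that grams are emitted by start position first and by length second, which matches the output shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: plain-Java character n-grams.
public class CharNGrams {

    // Returns all substrings of length minGram..maxGram,
    // ordered by start position, then by increasing length.
    static List<String> ngrams(String text, int minGram, int maxGram) {
        if (minGram < 1 || maxGram < minGram) {
            throw new IllegalArgumentException("need 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int n = minGram; n <= maxGram; n++) {
                int end = start + n;
                if (end > text.length()) {
                    break; // longer grams from this position do not fit
                }
                grams.add(text.substring(start, end));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Same parameters as the Lucene 5 snippet: gram sizes 1 and 2 over "abcd"
        System.out.println(String.join(" ", ngrams("abcd", 1, 2)));
    }
}
```

Calling `ngrams(text, 3, 5)` corresponds to the gram sizes used in the Lucene 4.1 snippet from the question.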