Saturday, February 25, 2006

Lucene Term Vectors

Using term vectors, we can offer the users of our application items that are similar to the one they are currently looking at. For instance, on a site that I am developing, I plan to use term vectors to show my site users "related articles" while they are reading an article.
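
To make the "related articles" idea concrete, here is a minimal sketch of one common way to compare two articles once their term frequencies are known: cosine similarity. This is an assumption for illustration, not necessarily the approach used on the site. The class name RelatedArticlesSketch and the sample terms are hypothetical, and the code uses plain Java maps rather than any Lucene class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: compares articles by their term-frequency maps (term -> count).
class RelatedArticlesSketch {

    // Cosine similarity: dot product of the two vectors divided by the product of their lengths.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * (double) other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // an empty vector is not similar to anything
        }
        return dot / (normA * normB);
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int f : v.values()) {
            sum += (double) f * f;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        Map<String, Integer> current = new HashMap<>();
        current.put("lucene", 2);
        current.put("index", 1);

        Map<String, Integer> candidate = new HashMap<>();
        candidate.put("lucene", 1);
        candidate.put("search", 1);

        // A higher score means the candidate shares more of its vocabulary with the current article.
        System.out.println(cosine(current, candidate));
    }
}
```

Scoring every other article against the current one and keeping the top few would give a simple "related articles" list.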

Term vectors are enabled by passing a Field.TermVector value (for example, Field.TermVector.YES) to a Field constructor such as the following:

Field(String, String, Field.Store, Field.Index, Field.TermVector)



If you don't want to store the term vector:

new Field(name, value, Field.Store.YES, Field.Index.TOKENIZED)


If you do want to store the term vector:

new Field(name, value, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES)



Term vectors can also be stored for "Unstored" fields, that is, fields that are tokenized and indexed but whose original value is not stored in the index.

Without a term vector:

new Field(name, value, Field.Store.NO, Field.Index.TOKENIZED)


With a term vector:

new Field(name, value, Field.Store.NO, Field.Index.TOKENIZED, Field.TermVector.YES)




To find out whether a term vector is stored for a field, call isTermVectorStored on the Field:

public final boolean isTermVectorStored()

To read a stored term vector back from the index, use IndexReader.getTermFreqVector(int, String), passing the document number and the field name:

TermFreqVector myTermFreqVector = myreader.getTermFreqVector(id, "field_name");


Note from the manual about the above:
These methods do not provide access to the original content of the field, only to terms used to index it. If the original content must be preserved, use the stored attribute instead.
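
To illustrate the note above, here is a rough sketch of the kind of information a term frequency vector holds: the distinct terms of a field and how often each occurs. It is plain Java, not Lucene's implementation. The class name TermFrequencySketch and the simple lower-casing tokenizer are assumptions made for the example, and a real analyzer may tokenize differently.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of a term-frequency vector; not a Lucene class.
class TermFrequencySketch {

    // Lower-cases the text, splits it on anything that is not a-z or 0-9,
    // and counts how many times each resulting term occurs.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> freqs = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            freqs.merge(token, 1, Integer::sum);
        }
        return freqs;
    }

    public static void main(String[] args) {
        // prints {fox=2, jumps=1, lazy=1, over=1, quick=1, the=2}
        System.out.println(termFrequencies("The quick fox jumps over the lazy fox."));
    }
}
```

The original sentence cannot be rebuilt from this map: word order, capitalization, and punctuation are gone. That is why the field itself has to be stored if the original content is needed.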


Other related functions:

isStoreOffsetWithTermVector


isStorePositionWithTermVector
