Lucene StandardAnalyzer Class

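StandardAnalyzer is the most sophisticated of the analyzers: it can handle names, email addresses, and similar tokens. It lowercases each token and removes common words and punctuation, if any.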
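To make the three steps concrete (tokenize, lowercase, drop common words), here is a minimal plain-Java sketch. It is only an approximation for illustration, not Lucene's implementation: the class name AnalysisStepsSketch, the abbreviated stop word list, and the simple split rule are all assumptions of this example, and unlike the real analyzer it makes no attempt to keep email addresses together.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative only: a plain-Java approximation of the analysis steps,
// not Lucene's actual implementation.
class AnalysisStepsSketch {

    // Hypothetical, abbreviated stop word list chosen for this example.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "and", "is", "of", "the", "to"));

    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        // Step 1: split on runs of characters that are neither letters nor digits,
        // which also discards punctuation.
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield an empty leading element
            }
            // Step 2: lowercase each token.
            String token = raw.toLowerCase(Locale.ROOT);
            // Step 3: drop stop words.
            if (!STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints: [lucene, simple, yet, powerful, java, based, search, library]
        System.out.println(analyze("Lucene is simple yet powerful java based search library."));
    }
}
```

Note that "is" disappears because it is in the stop word list, while "yet" survives because it is not.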

Class Declaration

Following is the declaration of the org.apache.lucene.analysis.standard.StandardAnalyzer class:

public final class StandardAnalyzer extends StopwordAnalyzerBase

Fields

static int DEFAULT_MAX_TOKEN_LENGTH - The default maximum allowed token length.
static Set<?> STOP_WORDS_SET - An unmodifiable set containing some common English words that are usually not useful for searching.

Class Constructors

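S.N.  Constructor and Description

1.  StandardAnalyzer(Version matchVersion)
    Builds an analyzer with the default stop words (STOP_WORDS_SET).

2.  StandardAnalyzer(Version matchVersion, File stopwords)
    Deprecated. Use StandardAnalyzer(Version, Reader) instead.

3.  StandardAnalyzer(Version matchVersion, Reader stopwords)
    Builds an analyzer with the stop words read from the given reader.

4.  StandardAnalyzer(Version matchVersion, Set<?> stopWords)
    Builds an analyzer with the given stop words.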
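The Reader-based and Set-based constructors both take a custom stop word list. The plain-Java sketch below shows one way such a list could be prepared from a character stream. It assumes a one-word-per-line format, which is an assumption of this example rather than something stated above, and the class and method names (StopWordReaderSketch, loadStopWords) are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: reads a stop word list from a Reader using plain Java.
class StopWordReaderSketch {

    // Assumes one stop word per line; surrounding whitespace and blank lines are ignored.
    static Set<String> loadStopWords(Reader reader) {
        Set<String> words = new HashSet<String>();
        try (BufferedReader buffered = new BufferedReader(reader)) {
            String line;
            while ((line = buffered.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        Set<String> stopWords = loadStopWords(new StringReader("yet\nbased\n\n"));
        System.out.println(stopWords.size()); // 2
    }
}
```

A set built this way could then be handed to the Set-based constructor, for example new StandardAnalyzer(Version.LUCENE_36, stopWords).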

Class Methods

S.N.  Method and Description

1.  protected ReusableAnalyzerBase.TokenStreamComponents createComponents(String fieldName, Reader reader)
    Creates a new ReusableAnalyzerBase.TokenStreamComponents instance for this analyzer.

2.  int getMaxTokenLength()
    Returns the maximum allowed token length.

3.  void setMaxTokenLength(int length)
    Sets the maximum allowed token length.
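The effect of a maximum token length can be pictured with a small plain-Java sketch. It assumes that tokens longer than the limit are discarded rather than truncated, which matches the Lucene 3.x documentation as far as I know but should be checked against the version in use. The class MaxTokenLengthSketch and its filter method are hypothetical and only mirror the getter and setter names listed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: mimics a maximum token length setting in plain Java.
class MaxTokenLengthSketch {

    // 255 is used here as the default; treat it as an assumption of this sketch.
    private int maxTokenLength = 255;

    int getMaxTokenLength() {
        return maxTokenLength;
    }

    void setMaxTokenLength(int length) {
        this.maxTokenLength = length;
    }

    // Keeps only tokens whose length does not exceed the limit;
    // longer tokens are dropped, not shortened.
    List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<String>();
        for (String token : tokens) {
            if (token.length() <= maxTokenLength) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        MaxTokenLengthSketch sketch = new MaxTokenLengthSketch();
        sketch.setMaxTokenLength(6);
        // Prints: [lucene, java]
        System.out.println(sketch.filter(Arrays.asList("lucene", "powerful", "java")));
    }
}
```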

Methods Inherited

This class inherits methods from the following classes:

org.apache.lucene.analysis.StopwordAnalyzerBase
org.apache.lucene.analysis.ReusableAnalyzerBase
org.apache.lucene.analysis.Analyzer
java.lang.Object

Usage

private void displayTokenUsingStandardAnalyzer() throws IOException {
   String text = "Lucene is simple yet powerful java based search library.";
   Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);
   TokenStream tokenStream = analyzer.tokenStream(
      LuceneConstants.CONTENTS, new StringReader(text));
   TermAttribute term = tokenStream.addAttribute(TermAttribute.class);
   while (tokenStream.incrementToken()) {
      System.out.print("[" + term.term() + "] ");
   }
}

Example Application

Let us create a test Lucene application that prints the tokens produced by StandardAnalyzer.

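Step  Description

1.  Create a project named LuceneFirstApplication under the package com.yiibai.lucene, as explained in the Lucene - First Application chapter.

2.  Create LuceneConstants.java as explained in the Lucene - First Application chapter. Keep the rest of the files unchanged.

3.  Create LuceneTester.java as described below.

4.  Clean and build the application to make sure the business logic works as required.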

LuceneConstants.java

This class provides the constants used across the sample application.

package com.yiibai.lucene;

public class LuceneConstants {
   public static final String CONTENTS = "contents";
   public static final String FILE_NAME = "filename";
   public static final String FILE_PATH = "filepath";
   public static final int MAX_SEARCH = 10;
}

LuceneTester.java

This class tests the analysis capability of the Lucene library by printing the tokens that StandardAnalyzer produces.

package com.yiibai.lucene;

import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;
import org.apache.lucene.util.Version;

public class LuceneTester {

   public static void main(String[] args) {
      LuceneTester tester = new LuceneTester();
      try {
         tester.displayTokenUsingStandardAnalyzer();
      } catch (IOException e) {
         e.printStackTrace();
      }
   }

   private void displayTokenUsingStandardAnalyzer() throws IOException {
      String text = "Lucene is simple yet powerful java based search library.";
      // Analyzer with the default English stop words.
      Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);
      TokenStream tokenStream = analyzer.tokenStream(
         LuceneConstants.CONTENTS, new StringReader(text));
      // The attribute exposes the text of the current token.
      TermAttribute term = tokenStream.addAttribute(TermAttribute.class);
      // Print each token in brackets.
      while (tokenStream.incrementToken()) {
         System.out.print("[" + term.term() + "] ");
      }
   }
}

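Running the Program:

Once the source files are created, you are ready to compile and run the program. To do this, keep the LuceneTester.java file tab active and use either the Run option of the Eclipse IDE or Ctrl + F11 to compile and run the LuceneTester application. If the application runs successfully, it prints the following message in the Eclipse IDE console: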

[lucene] [simple] [yet] [powerful] [java] [based] [search] [library]
