Eclipse Plug-in Developer Guide

Lucene Analyzer

Identifier:
org.eclipse.help.base.luceneAnalyzer

Since:
3.0 (originally added in release 2.0 as org.eclipse.help.luceneAnalyzer)

Description:
This extension point is used to register text analyzers for use by help when indexing and searching documentation.

Help exploits capabilities of the Lucene search engine, which allows indexing of token streams (streams of words). Analyzers create tokens from the character stream: they examine text content and provide tokens for use with the index. The text stream can be tokenized in many ways. A trivial analyzer can tokenize streams at white space; a different one can also filter tokens, based on the application's needs. Since the documentation is mostly human-readable text, analyzers used by the help system should perform language- and grammar-aware tokenization and normalization of indexed text. For some languages, the quality of search increases significantly if stop word removal and stemming are performed on the indexed text.
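The difference between a trivial analyzer and a language-aware one can be illustrated with a minimal sketch in plain Java. It deliberately does not use the Lucene API; the class name TokenizationSketch, its methods, and the tiny stop word list are assumptions made for illustration only, and stemming is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: plain Java, not the Lucene Analyzer API.
final class TokenizationSketch {

    // Hypothetical, deliberately tiny stop word list for English.
    private static final Set<String> STOP_WORDS =
            Set.of("a", "an", "the", "of", "to", "is");

    // Trivial analysis: split the character stream at white space only.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Language-aware analysis: lowercase each token, then drop stop words.
    static List<String> normalizedTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : whitespaceTokens(text)) {
            String lower = token.toLowerCase(Locale.ENGLISH);
            if (!STOP_WORDS.contains(lower)) {
                tokens.add(lower);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "The Index of the Help System";
        System.out.println(whitespaceTokens(text));  // tokens exactly as written
        System.out.println(normalizedTokens(text));  // lowercased, stop words removed
    }
}
```

With the second method, a query for "index" matches text containing "Index", and common words such as "the" no longer occupy space in the index. A real contribution would implement the same idea as a subclass of org.apache.lucene.analysis.Analyzer.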

The analyzer contributed to this extension point will override the one provided by the Eclipse help system for a given locale.

Configuration Markup:

<!ELEMENT extension (analyzer*)>
<!ATTLIST extension
point CDATA #REQUIRED
id    CDATA #IMPLIED
name  CDATA #IMPLIED
>

<!ELEMENT analyzer EMPTY>
<!ATTLIST analyzer
locale CDATA #REQUIRED
class  CDATA #REQUIRED
>

  • locale - a string identifying the locale for which the supplied analyzer is to be used. If only a two-letter language code is provided, the analyzer will be available to all locales of that language.
  • class - a fully qualified name of the Java class extending org.apache.lucene.analysis.Analyzer.
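How the locale attribute might select an analyzer can be sketched as follows. This is not the help system's actual lookup code: the class AnalyzerLookupSketch, the com.example class names, and the order of lookup (exact locale first, then the two-letter language) are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustration only: a hypothetical registry, not the Eclipse help system's own code.
final class AnalyzerLookupSketch {

    // Maps the "locale" attribute of each analyzer element to its "class" attribute.
    private final Map<String, String> analyzersByLocale = new HashMap<>();

    void register(String locale, String className) {
        analyzersByLocale.put(locale, className);
    }

    // Assumed order: exact locale first (e.g. "de_CH"), then the language alone (e.g. "de").
    // An empty result means no contribution matched, so a default analyzer would apply.
    Optional<String> resolve(String locale) {
        String exact = analyzersByLocale.get(locale);
        if (exact != null) {
            return Optional.of(exact);
        }
        if (locale.length() > 2) {
            return Optional.ofNullable(analyzersByLocale.get(locale.substring(0, 2)));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        AnalyzerLookupSketch registry = new AnalyzerLookupSketch();
        registry.register("de", "com.example.GermanAnalyzer");        // hypothetical class
        registry.register("pt_BR", "com.example.BrazilianAnalyzer");  // hypothetical class

        System.out.println(registry.resolve("de_CH"));  // served by the language-only "de" entry
        System.out.println(registry.resolve("pt_PT"));  // no "pt" entry and no exact match
    }
}
```

In this sketch an analyzer registered for "de" serves de_DE, de_CH, and any other German locale, while one registered for "pt_BR" serves only that locale.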

Examples:
The following is an example of a Lucene analyzer configuration:


 <extension id="com.xyx.XYZ" point="org.eclipse.help.base.luceneAnalyzer">
  <analyzer locale="ll_CC" class="com.xyz.ll_CCAnalyzer"/>
 </extension>

Supplied Implementation:
The Eclipse help system provides analyzers for all languages. For English and German, the analyzers perform stop word filtering, lowercase filtering, and stemming. For all the other languages the supplied analyzer only performs lowercase filtering.


Copyright (c) 2000, 2005 IBM Corporation and others.
All rights reserved. This program and the accompanying materials are made available under the terms of the Eclipse Public License v1.0 which accompanies this distribution, and is available at https://www.eclipse.org/legal/epl-v10.html


 
 