Problem with OODT CAS-FileManager’s Lucene query tool when using underscores

When using the OODT CAS-FileManager “query” tool [1], the standard query format needs to adhere to Lucene’s query syntax.

A simple, commonly used query retrieves all products of a given product type:

cd $FILEMGR_HOME/bin
./query_tool --url http://localhost:9000 --lucene -query CAS.ProductType:GenericFile

The above query works fine and returns the product IDs of all products whose product type is ‘GenericFile’, if any exist. However, if the product type name (or, for that matter, any other metadata key) contains an underscore, the query will not match the intended product type.

For example, the query below comes back empty:

cd $FILEMGR_HOME/bin
./query_tool --url http://localhost:9000 --lucene -query CAS.ProductType:My_Custom_ProductType

The above command fails because Lucene treats keywords separated by underscores “_” as separate search strings, much as if each underscore “_” had been replaced with an “AND”. This issue is discussed at [2].
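
To make the splitting concrete, here is a small illustration in plain shell. It does not call Lucene at all; it only mimics the behavior described above, under the assumption that the underscore acts as a separator. The function name split_on_underscore is made up for this example.

```shell
# Illustration only: mimic an analyzer that treats "_" as a token separator.
# split_on_underscore is a hypothetical helper, not part of OODT or Lucene.
split_on_underscore() {
  printf '%s\n' "$1" | tr '_' '\n'
}

# Prints the three pieces (My, Custom, ProductType), one per line.
split_on_underscore My_Custom_ProductType
```

Seen this way, the query is no longer looking for the single value My_Custom_ProductType but for three separate terms, which is why nothing matches.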

So it seems the problem lies with how Lucene parses the query. Fortunately, there is a way around this issue: use SQL querying instead.

Solution:

cd $FILEMGR_HOME/bin
./query_tool --url http://localhost:9000 --sql -query "SELECT CAS.ProductId FROM My_Custom_ProductType"
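
If you query by product type often, the choice between the two forms can be wrapped in a small shell function. This is only a sketch: the function name query_product_type and the QUERY_TOOL variable are my own inventions, it assumes you run it from the directory that contains query_tool, and it reuses exactly the two command lines shown in this post.

```shell
# Hypothetical wrapper: use the SQL form when the product type contains an
# underscore, and the Lucene form otherwise.
# QUERY_TOOL is an assumed override (defaults to ./query_tool) so the command
# can be previewed with QUERY_TOOL=echo instead of being executed.
query_product_type() {
  product_type=$1
  url=${2:-http://localhost:9000}
  tool=${QUERY_TOOL:-./query_tool}
  case $product_type in
    *_*) "$tool" --url "$url" --sql -query "SELECT CAS.ProductId FROM $product_type" ;;
    *)   "$tool" --url "$url" --lucene -query "CAS.ProductType:$product_type" ;;
  esac
}
```

For example, QUERY_TOOL=echo query_product_type My_Custom_ProductType prints the arguments that would be passed to the tool without contacting the File Manager.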


[1] http://oodt.apache.org/components/maven/apidocs/org/apache/oodt/cas/filemgr/tools/QueryTool.html
[2] http://stackoverflow.com/questions/2520479/lucene-search-and-underscores
