Lucene Search Query Syntax on iExperiment

iExperiment search is based on the open source Apache Lucene search engine library. The video below demonstrates how easy it is to use the Lucene search engine in iExperiment. The table at the bottom of this page summarizes the types of queries that can be performed when searching the records stored in iExperiment’s electronic notebook.

The Lucene Query Syntax table describes the various types of queries that can be submitted. The table is arranged so that each query type is followed by an example in the adjacent column and a discussion of that query type in the last column.

Lucene Query Syntax
| Query Type | Example | Discussion |
| --- | --- | --- |
| Single Term | protein | For a single term, like “protein”, the program searches for the word “protein” in all of the records you have permission to view. |
| Single Character Wildcard | te?t | The single character wildcard is a question mark (?). Our example “te?t” will match both “test” and “text”. A wildcard cannot be the first character of a search term. If you do place a wildcard at the beginning of a word, you will receive a message reminding you of this limitation. |
| Multiple Character Wildcard | carbo* | The multiple character wildcard is an asterisk (*). Our example “carbo*” will match “carbon”, “carbonic”, “carbonyl”, and many other similar words. |
| Fuzzy Searches | carbo~ | Lucene fuzzy search uses the tilde character (~) after a single term to find similarly spelled words. Our example “carbo~” will match both “carbon” and “harbor”. |
| Phrase | "protein purification" | A phrase is a group of words enclosed within quotes. The search returns exact matches to the phrase. Without the quotes, the search looks for any records that contain any of the words. |
| Proximity Searches | "phosphate sodium"~3 | A proximity search contains a group of words in quotes with a tilde character (~) after the last quote mark, followed by an integer. The integer is the maximum number of hops the words may make to match the phrase. Our example will find “phosphate sodium”, “sodium phosphate”, and “sodium dihydrogen phosphate”. |
| Field | author:david | A field search consists of the field name, then a colon (:), then the search term for that field. The field name must match exactly; if it does not, the search will return no matches. We have provided an “Add Field” pull-down menu, so you do not have to remember the field names. |
| Range Searches | [2011-01-01 TO 2011-01-31] | A range search is enclosed in a pair of square brackets ([ ]) and contains a pair of terms separated by “TO”. Range searches are useful for searching dates. Because terms are compared in lexicographical order, the year must precede the month, which must precede the day. Dates in iExperiment are stored as YYYY-MM-DD, where YYYY is the year, MM is the numerical month, and DD is the numerical day. Our example search will return experiments that have dates in January 2011. |
| AND Operator | protein AND purification | The default operator between search terms is OR. Using the AND operator requires that the search return only results that match both terms. |
| Required (+) Operator | protein +purification | The required operator is a plus sign (+) before a search term. Our example will return results that must contain “purification” and may contain “protein”. |
| NOT Operator | purification NOT "lipid purification" | The NOT operator allows you to exclude terms from your searches. A minus sign (-) can also be used as a NOT operator. The example could also be expressed as: purification -"lipid purification" |
| Grouping | title:(carboxypeptidase AND purification) | Terms can be grouped by enclosing them in parentheses (). This is useful both for performing multiple-term field searches, as shown in the example, and for controlling the logic of a search with the AND and OR operators. |
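
To make the two wildcard rows in the table concrete, the sketch below mimics their behavior with Python's standard fnmatch module, whose ? and * happen to carry the same single-character and multiple-character meanings. It is only an illustration and not how iExperiment or Lucene evaluates queries; the word list and the helper name wildcard_matches are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical vocabulary; real searches run against your notebook records.
WORDS = ["test", "text", "tent", "tet", "carbon", "carbonic", "carbonyl", "carbo", "harbor"]


def wildcard_matches(pattern, words=WORDS):
    """Return the words that match a pattern using the ? and * wildcards."""
    if pattern[:1] in ("?", "*"):
        # Mirrors the limitation in the table: no leading wildcard.
        raise ValueError("a wildcard cannot be the first character of a search term")
    return [w for w in words if fnmatchcase(w, pattern)]


single = wildcard_matches("te?t")      # ? stands for exactly one character
multiple = wildcard_matches("carbo*")  # * stands for zero or more characters

print(single)
print(multiple)
```

Note that “tet” is not returned for “te?t”, because the question mark must stand for exactly one character, while “carbo” itself is returned for “carbo*”, because the asterisk may stand for no characters at all.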
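
Lucene's fuzzy search is based on edit distance (Levenshtein distance): the number of single-character insertions, deletions, or substitutions needed to turn one word into another. The sketch below computes that distance for the fuzzy example in the table. The function name edit_distance is ours, and the similarity threshold that Lucene applies when deciding what counts as a match is deliberately not modeled.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row."""
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


# Illustrative words only; "protein" is included as a clearly dissimilar term.
for word in ["carbon", "harbor", "protein"]:
    print(word, edit_distance("carbo", word))
```

“carbon” is one edit away from “carbo” and “harbor” is two, which is why both count as close spellings, whereas an unrelated word such as “protein” needs many more edits.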
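
The range search requires year-month-day order because the range endpoints are compared as text, not as calendar dates. A minimal sketch using plain Python string comparison, with made-up dates and a hypothetical helper named in_range, shows why YYYY-MM-DD works and a day-first format would not:

```python
def in_range(value, lower, upper):
    """Inclusive text comparison, in the spirit of a [lower TO upper] range."""
    return lower <= value <= upper


# Hypothetical experiment dates stored as YYYY-MM-DD.
iso_dates = ["2010-12-31", "2011-01-01", "2011-01-15", "2011-01-31", "2011-02-01"]
january = [d for d in iso_dates if in_range(d, "2011-01-01", "2011-01-31")]
print(january)

# The same dates written day-first (DD-MM-YYYY) no longer sort chronologically,
# so a text range over them picks up a date outside January.
day_first = ["31-12-2010", "01-01-2011", "15-01-2011", "31-01-2011", "01-02-2011"]
wrong = [d for d in day_first if in_range(d, "01-01-2011", "31-01-2011")]
print(wrong)
```

With the year first, text order and calendar order agree, so only the January dates are returned. With the day first, 1 February 2011 slips into the result because “01-02-2011” sorts between the two endpoints as text.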

For more details see http://lucene.apache.org/java/3_0_3/queryparsersyntax.html.
