
Fuzzy search does not work with phrases. If individual terms are indexed, a fuzzy search whose text contains spaces will always fail. For example, if a field contains the phrase "banana split", all of these searches (created with CompassQueryBuilder) succeed:
queryBuilder.fuzzy("zzz-all", "banana");
queryBuilder.fuzzy("zzz-all", "split");
queryBuilder.bool().addMust( queryBuilder.fuzzy("zzz-all", "banana") ).addMust( queryBuilder.fuzzy("zzz-all", "split") ).toQuery();
queryBuilder.queryString("banana split").toQuery();
queryBuilder.queryString("\"banana split\"").toQuery();
However, this does not:
queryBuilder.fuzzy("zzz-all", "banana split");
Needless to say, this is unexpected behavior. Unfortunately, it comes from Lucene itself, where the behavior is intentional because fuzzy phrase searches are extremely expensive to execute. (See discussion at, e.g., http://markmail.org/message/5x6hn65eb22imiol#query:lucene%20fuzzy%20phrase%20search+page:1+mid:2nlztixbe6j2jefk+state:results .) Since we can't change the behavior, I suggest that Compass log at the WARN level when someone tries to execute a fuzzy search that has no chance of success because it contains characters (e.g., spaces) that aren't indexed in one or more of the fields included in the query. (Of course, working that out each time a query is executed might not be feasible. Any other ideas?)
At the very least, the CompassQueryBuilder JavaDocs (http://www.compass-project.org/docs/2.2.0/api/org/compass/core/CompassQueryBuilder.html#fuzzy%28java.lang.String,%20java.lang.String%29 ) should have a warning that fuzzy() only supports searches over individual terms, not phrases.
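To make the suggested warning concrete, here is a rough sketch of the kind of check I have in mind. It uses only the JDK: FuzzyPhraseGuard, containsWhitespace, and splitTerms are hypothetical names, not part of Compass or Lucene, and splitting on whitespace is only an assumption standing in for whatever the field's analyzer actually does at index time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical helper for illustration; not part of Compass or Lucene. */
final class FuzzyPhraseGuard {
    private static final Logger LOG = Logger.getLogger(FuzzyPhraseGuard.class.getName());

    private FuzzyPhraseGuard() {}

    /** Returns true if the fuzzy search text contains any whitespace character. */
    static boolean containsWhitespace(String text) {
        return text.chars().anyMatch(Character::isWhitespace);
    }

    /**
     * Splits the search text on whitespace and logs a warning when it yields
     * more than one term, since a single fuzzy query over the whole phrase
     * cannot match a field that was indexed term by term.
     * Assumption: whitespace splitting approximates the analyzer's tokenization.
     */
    static List<String> splitTerms(String field, String text) {
        List<String> terms = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                terms.add(part);
            }
        }
        if (terms.size() > 1) {
            LOG.warning("Fuzzy search on field '" + field + "' contains "
                    + terms.size() + " terms; fuzzy matching works on single terms only: '"
                    + text + "'");
        }
        return terms;
    }
}
```

Each term returned by splitTerms could then be passed to queryBuilder.fuzzy("zzz-all", term) and combined with bool().addMust(...), which is the form that succeeds in the examples above.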