Added wildcard to bika listing search #8

Open
wants to merge 1 commit into
base: client/uw-wip

mikejmets

Added a wildcard to the listing search so that words starting with the given text are matched. This does not address (a) matching anywhere in the text or (b) numbers in the text.

@mikejmets mikejmets requested a review from rockfruit April 6, 2017 14:51
rockfruit pushed a commit that referenced this pull request May 3, 2017
zylinx pushed a commit that referenced this pull request May 16, 2017
Sample print sheet: new fields added
@@ -594,7 +594,7 @@ def _process_request(self):
                     continue
                 ##logger.info("Or: %s=%s"%(index, value))
                 if idx.meta_type in('ZCTextIndex', 'FieldIndex'):
-                    self.Or.append(MatchRegexp(index, value))
+                    self.Or.append(MatchRegexp(index, '%s*' % value))

@jean jean Jun 15, 2017

How about

\b
Matches the empty string, but only at the beginning or end of a word.

I'm probably missing something, but * doesn't seem to be the right operator for matching at the start of a word:

'*'
Causes the resulting RE to match 0 or more repetitions of the preceding RE

Unless you mean to use MatchGlob. It looks like MatchRegexp wants a re object as its parameter, but I didn't dig to see if it accepts strings too. I'm guessing it does.
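
For illustration only, here is how \b anchors a term at the start of a word using Python's standard re module. This is a sketch of the regex semantics, not of how MatchRegexp evaluates its pattern; starts_word is a hypothetical helper, not part of Bika or AdvancedQuery.

```python
import re


def starts_word(term, text):
    """Return True if some word in `text` starts with `term`.

    Hypothetical helper for illustration; not Bika or AdvancedQuery code.
    """
    # re.escape stops characters in the user's term from acting as operators.
    pattern = re.compile(r"\b" + re.escape(term), re.IGNORECASE)
    return pattern.search(text) is not None


print(starts_word("meth", "1,1,1-Methyl Hydride"))   # True: "Methyl" starts with "meth"
print(starts_word("ethyl", "1,1,1-Methyl Hydride"))  # False: "ethyl" only occurs mid-word
```

Note that \b only behaves this way when the term itself begins with a word character (a letter, digit or underscore).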

Author

Thanks for reviewing this, Jean. MatchRegexp seems to handle * correctly, and it makes no difference to the result if I use MatchGlob instead. BTW, this works well for text but not so well for numbers. This client has strings like "1,1,1-Methyl Hydride", but AdvancedQuery is not a friend of numbers. Any ideas?

Hmm, that sounds surprising. A regexp like term* would match any string that contains ter followed by zero or more m, anywhere in the string (for re.search()) or at the start (for re.match()). That doesn't sound like what was intended.

A glob search for term* would match strings starting with term. (The regexp would also match this, which looks correct, but it would also match terq, which is wrong).
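
The difference can be checked with the standard library alone. This sketch uses Python's re and fnmatch modules as stand-ins; it says nothing about where or how AdvancedQuery evaluates the pattern, and the candidate strings are made up for the example.

```python
import fnmatch
import re

# Made-up sample values, chosen to expose the differences.
candidates = ["term", "terminal", "terq", "ter", "butter", "team"]

# As a regular expression, "term*" means "ter" followed by zero or more "m".
regex = re.compile("term*")
regex_search = [s for s in candidates if regex.search(s)]  # match anywhere
regex_match = [s for s in candidates if regex.match(s)]    # match at start only

# As a glob, "term*" means the literal prefix "term" followed by anything.
glob_hits = [s for s in candidates if fnmatch.fnmatchcase(s, "term*")]

print(regex_search)  # ['term', 'terminal', 'terq', 'ter', 'butter']
print(regex_match)   # ['term', 'terminal', 'terq', 'ter']
print(glob_hits)     # ['term', 'terminal']
```

Only the glob reading gives the "starts with term" behaviour the PR description asks for.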

I had a quick look at AdvancedQuery but it doesn't import re or glob. I don't know where the regexp evaluation actually happens.

Regarding numbers, that's on the level of the index implementation, below AdvancedQuery. Text indexes normally filter out things like numbers and stopwords. To nicely match chemical names, you'll need a custom index. This is probably much better done using collective.solr.
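
To show why a name like "1,1,1-Methyl Hydride" is hard to find once the index has processed it, here is a deliberately naive word splitter. It is a hypothetical stand-in, not the actual splitter or lexicon pipeline used by ZCTextIndex, and the stopword list is invented for the example.

```python
import re

# Invented stopword list, for illustration only.
STOPWORDS = {"the", "of", "and"}


def naive_words(text):
    """Split on non-alphanumerics, lowercase, drop stopwords and pure numbers.

    Hypothetical splitter; real text indexes are configurable and may differ.
    """
    tokens = re.findall(r"[A-Za-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS and not t.isdigit()]


print(naive_words("1,1,1-Methyl Hydride"))  # ['methyl', 'hydride']
```

Under these assumptions the "1,1,1-" prefix never reaches the index, so no query pattern, regexp or glob, can match on it.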

I see Bika doesn't use ManagableIndex:

AdvancedQuery works best when used together with
Products.ManagableIndex and dm.incrementalsearch.
Some of its features depend on these products, e.g. matching
and incremental filtering. Furthermore, these additional
components can speed up queries by several orders of magnitude.


2 participants