Stop Words and Searching

Currently (13 August 2014), common “stop words” are ignored by the Solr search engine.

This sometimes leads to confusing or unexpected results from the user’s perspective. For example, if you search for “internet of things” (with the quotes), you retrieve any documents with the words “internet” and “things” next to each other. That is, the stop word “of” simply does not exist from the search engine’s perspective. If you enter internet of things without the quotes, you retrieve any documents that contain either “internet” or “things”, with those containing both scoring more highly in the results list.
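
The behaviour described above can be imitated with a small Python sketch. This is a toy model, not Solr's actual analysis chain or scoring: the stop-word list and the function names (tokenize, strip_stop_words, phrase_match, term_score) are assumptions made up for illustration, and the "score" is simply a count of matching terms.

```python
import re

# Hypothetical stop-word list for illustration only;
# it is not the list configured in our Solr instance.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def strip_stop_words(tokens):
    """Drop stop words, as if they had never been typed or indexed."""
    return [t for t in tokens if t not in STOP_WORDS]


def phrase_match(query, document):
    """Quoted search: the remaining query terms must sit next to each other, in order."""
    q = strip_stop_words(tokenize(query))
    d = strip_stop_words(tokenize(document))
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


def term_score(query, document):
    """Unquoted search: count how many distinct non-stop query terms the document contains."""
    q = set(strip_stop_words(tokenize(query)))
    d = set(tokenize(document))
    return len(q & d)


if __name__ == "__main__":
    # The quoted phrase matches even when "of" is missing from the document.
    print(phrase_match("internet of things", "Internet things"))  # True
    # Unquoted: a document with both terms outscores one with a single term.
    print(term_score("internet of things", "things on the internet"))  # 2
    print(term_score("internet of things", "internet access"))  # 1
```

In this simplified model the word “of” contributes nothing to either kind of search, which is the source of the surprising matches.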

We have been considering whether to have stop words at all, and at this stage we are of the opinion that there should be none, so as to avoid this sort of search behaviour. We are happy to hear anyone’s opinion on this.
