solr-user.lucene.apache.org


Solr Score threshold 'reasonably', independent of results returned

Ramzi Alqrainy, Tue, 21 Aug 2012 02:59:33 +0000 (UTC)
Usually, search results are sorted by their score (how well the document
matched the query), but it is also common to need to sort on supplied
field data.
Boosting changes the scores of matching documents in order to affect their
ranking in score-sorted search results. Providing a boost value, whether at
the document or field level, is optional.
When the results are returned with scores, we want to "keep" only the
results that are above some score (i.e. results of a certain quality only).
Is it possible to do this when the returned subset could be anything?
I ask because on some queries a score of, say, 0.008 is a decent match,
whereas on other queries a higher score is a poor match.
I have written pseudocode to achieve this.
Note: I have attached my code as a screenshot.

http://lucene.472066.n3.nabble.com/file/n4002... 

https://issues.apache.org/jira/browse/SOLR-37...
Mou, Wed, 22 Aug 2012 15:03:41 +0000 (UTC)
Hi,
I think this depends entirely on your requirements, so it is applicable
only to a particular user scenario. A score does not have any absolute
meaning; it is always relative to the query. If you want to watch some
particular queries and show only results with a score above a previously
set threshold, you can use this.

If I always had that x% threshold in place, there might be many queries
that would not return anything, and I certainly do not want that.
Ravish Bhagdev, Wed, 22 Aug 2012 16:09:40 +0000 (UTC)
Commercial solutions often have a percentage that is meant to signify the
quality of a match. Solr has a relative score, and you cannot tell just by
looking at this value whether a result is relevant enough to be on the
first page or not. The score depends on "what else is in the index", so it
is not easy to normalize in the way you suggest.

Ravish
Ramzi Alqrainy, Sat, 25 Aug 2012 20:31:06 +0000 (UTC)
It will never return an empty result, because the cutoff is relative to the
score of the previous result:

If score < 0.25 * last_score then stop

Since score > 0 and last_score is 0 for the initial hit, it will not stop on
the first result.
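
The rule above can be sketched in Python as follows. This is only an illustration of the relative cutoff described in this message, not the code from the attached screenshot. The function name `truncate_on_score_drop`, the `ratio` parameter, and the sample scores are assumptions, and the input is assumed to be sorted by descending score.

```python
from typing import Iterable, List


def truncate_on_score_drop(scores: Iterable[float], ratio: float = 0.25) -> List[float]:
    """Keep hits until one scores below `ratio` times the previous hit's score.

    Assumes `scores` is sorted in descending order and every score is > 0.
    """
    kept: List[float] = []
    last_score = 0.0  # no previous hit yet, so the first hit can never be cut
    for score in scores:
        if score < ratio * last_score:
            break  # sharp drop relative to the previous hit: stop here
        kept.append(score)
        last_score = score
    return kept


# Hypothetical scores for one query, already sorted by descending score.
sample = [0.9, 0.8, 0.5, 0.1, 0.09]
print(truncate_on_score_drop(sample))
```

Because the first hit is compared against a previous score of 0, any non-empty result list keeps at least one hit, whatever the absolute scale of the scores.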
Ramzi Alqrainy, Sat, 25 Aug 2012 20:38:44 +0000 (UTC)
You are right, Mr. Ravish, because this depends on the formula (ranking and
search fields), but please allow me to say that in some cases the Solr score
can help us decide whether a document is relevant or not.
Lance Norskog, Sun, 26 Aug 2012 21:35:05 +0000 (UTC)
Not really. The percentage given in other search packages is fairly
bogus. You have to do a global batch analysis of all of the index to
get a true scale for relevance.
Chris Hostetter, Tue, 28 Aug 2012 19:04:46 +0000 (UTC)
: Not really. The percentage given in other search packages is fairly
: bogus. You have to do a global batch analysis of all of the index to
: get a true scale for relevance.

Exactly...

https://wiki.apache.org/solr/FAQ#Why_Aren.27t...
https://wiki.apache.org/lucene-java/ScoresAsP...

*you* -- as the person in control of your Solr instance, who knows
everything about every document in the index, and has total control over 
the set of valid queries being executed against the index -- you *MAY* be 
able to compute a meaningful "threshold" of scores, based on the 
constraints you know/enforce.  But Solr can't do this, because in 
general Solr doesn't know those constraints (or if those constraints even 
exist) for an arbitrary index.


-Hoss
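
For readers who do settle on a threshold for their own index and query set, one way to apply it is client-side, on the parsed response. The following is a minimal sketch, assuming the query was sent with `fl=*,score` and `wt=json` so that each document carries a `score` field. The helper name `filter_by_min_score`, the threshold value, and the sample response are hypothetical.

```python
from typing import Any, Dict, List


def filter_by_min_score(solr_response: Dict[str, Any], min_score: float) -> List[Dict[str, Any]]:
    """Return only the documents whose score is at least `min_score`.

    `solr_response` is the parsed JSON body of a Solr select request.
    Documents without a `score` field are treated as scoring 0.
    """
    docs = solr_response.get("response", {}).get("docs", [])
    return [doc for doc in docs if doc.get("score", 0.0) >= min_score]


# Hypothetical parsed response; the ids and scores are made up.
sample_response = {
    "response": {
        "numFound": 3,
        "docs": [
            {"id": "a", "score": 1.2},
            {"id": "b", "score": 0.4},
            {"id": "c", "score": 0.008},
        ],
    }
}

# The threshold only means something for the index and queries it was tuned on.
kept = filter_by_min_score(sample_response, min_score=0.3)
print([doc["id"] for doc in kept])
```

As the thread points out, a value such as 0.3 has no meaning outside the specific index and query constraints it was chosen for, so it should not be reused across query types.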