Probabilistic relevance model
The probabilistic relevance model[1][2] was devised by Stephen E. Robertson and Karen Spärck Jones as a framework for later probabilistic models. It is an information retrieval formalism used to derive ranking functions, which search engines use to rank matching documents by their relevance to a given search query.
It is a theoretical model that estimates the probability that a document d_j is relevant to a query q. The model assumes that this probability of relevance depends on the query and document representations. It further assumes that some subset of all documents is preferred by the user as the answer set for query q. This ideal answer set, called R, should maximize the overall probability of relevance to that user. Documents in R are predicted to be relevant to the query, and documents outside R are predicted to be non-relevant.
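As an illustration only, the following Python sketch ranks documents by their log-odds of relevance. It assumes binary term occurrence and term independence, which is one common way to make the model concrete and not something the description above prescribes. The function name `log_odds_score`, the toy documents, and the per-term probabilities `p_rel` and `p_nonrel` are all hypothetical; in practice these probabilities are unknown and have to be estimated.

```python
import math

def log_odds_score(query_terms, doc_terms, p_rel, p_nonrel):
    """Sum the log-odds contribution of each query term found in the document.

    p_rel[t]    : assumed probability that term t occurs in a relevant document
    p_nonrel[t] : assumed probability that term t occurs in a non-relevant document
    """
    score = 0.0
    for t in set(query_terms) & set(doc_terms):
        p, u = p_rel[t], p_nonrel[t]
        score += math.log(p * (1 - u) / (u * (1 - p)))
    return score

# Hypothetical toy collection: each document is a set of index terms.
docs = {
    "d1": {"probabilistic", "retrieval"},
    "d2": {"retrieval"},
    "d3": {"cooking"},
}
query = ["probabilistic", "retrieval"]

# Made-up probability estimates, chosen only to make the example run.
p_rel = {"probabilistic": 0.8, "retrieval": 0.6}
p_nonrel = {"probabilistic": 0.1, "retrieval": 0.3}

ranked = sorted(
    docs,
    key=lambda name: log_odds_score(query, docs[name], p_rel, p_nonrel),
    reverse=True,
)
print(ranked)
```

Documents that share more of the query terms judged characteristic of relevant documents score higher, and a document that shares no query term scores zero.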
Related models
There are some limitations to this framework that need to be addressed by further development:
- There is no accurate estimate for the first-run probabilities.
- Index terms are not weighted.
- Terms are assumed to be mutually independent.
To address these and other concerns, other models have been developed from the probabilistic relevance framework, among them the Binary Independence Model by the same authors. The best-known derivatives of this framework are the Okapi BM25 weighting scheme and its multifield refinement, BM25F.
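To give a feel for how a BM25-style weighting scheme scores documents, here is a minimal Python sketch of one commonly used formulation. It is a sketch under stated assumptions, not a reference implementation: the function name `bm25_scores` is hypothetical, documents are assumed to be pre-tokenized lists of terms, the parameter values `k1=1.2` and `b=0.75` are typical defaults rather than values given in this article, and the IDF uses the non-negative "+1" variant found in some search libraries.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25-style score per document (docs must be non-empty).

    query : list of query terms
    docs  : list of documents, each a list of tokens
    k1, b : term-frequency saturation and length-normalization parameters
            (assumed defaults, tune for a real collection)
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if tf[t] == 0:
                continue
            # Non-negative IDF variant (an assumption of this sketch).
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            # Term frequency saturates, and longer documents are penalized.
            norm = tf[t] + k1 * (1 - b + b * len(d) / avg_len)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores

# Hypothetical toy collection.
corpus = [
    ["probabilistic", "relevance", "model"],
    ["relevance", "feedback", "in", "search"],
    ["cooking", "recipes"],
]
print(bm25_scores(["probabilistic", "relevance"], corpus))
```

Unlike the binary model, the score here depends on term frequency and document length, which is how this family of schemes addresses the lack of term weighting noted above.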
References
- Robertson, S. E.; Spärck Jones, K. (May 1976). "Relevance weighting of search terms". Journal of the American Society for Information Science. 27 (3): 129–146. doi:10.1002/asi.4630270302.
- Robertson, Stephen; Zaragoza, Hugo (2009). "The Probabilistic Relevance Framework: BM25 and Beyond". Foundations and Trends in Information Retrieval. 3 (4): 333–389. CiteSeerX 10.1.1.156.5282. doi:10.1561/1500000019.