Anserini/target/appassembler/bin/SearchCollection -topicreader Trec -index /path/to/index/ -hits 1000 -topics Anserini/src/main/resources/topics-and-qrels/topics.51-100.txt -bm25 -axiom -axiom.beta 0.4 -output run_axiom_beta_0.4.txt
Anserini/target/appassembler/bin/SearchCollection -topicreader Trec -index /path/to/index/ -hits 1000 -topics Anserini/src/main/resources/topics-and-qrels/topics.51-100.txt -ql -axiom -axiom.beta 0.4 -output run_axiom_beta_0.4.txt
- Rank the documents and pick the top M documents as the reranking documents pool RP
- Randomly select (R-1)*M documents from the index and add them to RP so that we have R*M documents in the reranking pool
- Build the inverted term-docs list RTL for RP
- For each term in RTL, calculate its reranking score as the mutual information between query terms and itself:
s(q,t) = I(X_q, X_t|RP) = SUM(p(X_q,X_t|W) * log(p(X_q,X_t|W) / p(X_q|W) / p(X_t|W)))
where X_q and X_t are two binary random variables that denote the presence/absence of query term q and term t in a document, W denotes the working set of documents (here the pool RP), and the sum runs over the four presence/absence combinations of X_q and X_t.
- The final reranking score of each term t in RTL is calculated by summing up its scores for all query terms:
s(t) = SUM(s(q,t))
- Pick the top K terms from RTL based on their reranking scores, using s(t) as their weights
- Rerank the documents by using the K reranking terms with their weights. In Lucene, it is something like (term1^0.2 term2^0.01 ...)
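
The term-scoring steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Anserini's actual implementation: the function names (`mutual_information`, `score_terms`, `to_lucene_query`) are hypothetical, the pool RP is assumed to be a plain dict from document id to its terms, and probabilities are estimated as simple document-count ratios over RP.

```python
import math
from collections import defaultdict


def mutual_information(docs_with_q, docs_with_t, n_docs):
    """Estimate I(X_q, X_t | RP) from the sets of pool documents containing q and t."""
    both = len(docs_with_q & docs_with_t)
    n_q, n_t = len(docs_with_q), len(docs_with_t)
    # Document counts for the four (X_q, X_t) presence/absence combinations.
    joint = {
        (1, 1): both,
        (1, 0): n_q - both,
        (0, 1): n_t - both,
        (0, 0): n_docs - n_q - n_t + both,
    }
    marg_q = {1: n_q, 0: n_docs - n_q}
    marg_t = {1: n_t, 0: n_docs - n_t}
    mi = 0.0
    for (xq, xt), count in joint.items():
        if count == 0:
            continue  # a zero-probability cell contributes nothing to the sum
        p_joint = count / n_docs
        mi += p_joint * math.log(
            p_joint / (marg_q[xq] / n_docs) / (marg_t[xt] / n_docs)
        )
    return mi


def score_terms(pool, query_terms, k):
    """Return the top-k (term, s(t)) pairs, where s(t) = SUM over q of s(q, t)."""
    # Inverted term-docs list RTL: term -> set of doc ids in RP containing it.
    rtl = defaultdict(set)
    for doc_id, terms in pool.items():
        for term in terms:
            rtl[term].add(doc_id)
    n_docs = len(pool)
    scores = {
        term: sum(
            mutual_information(rtl.get(q, set()), postings, n_docs)
            for q in query_terms
        )
        for term, postings in rtl.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


def to_lucene_query(weighted_terms):
    """Format (term, weight) pairs as a boosted query string, e.g. (a^0.2000 b^0.0100)."""
    return "(" + " ".join(f"{t}^{w:.4f}" for t, w in weighted_terms) + ")"
```

For example, `to_lucene_query(score_terms(pool, ["q"], 3))` would produce a boosted query string over the three highest-scoring terms in `pool`. Note that mutual information is symmetric in sign: a term that is perfectly anti-correlated with a query term scores as high as one that always co-occurs with it.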
The Axiomatic Reranking algorithm is non-deterministic, since it randomly picks (R-1)*M documents as part of the reranking pool (see the algorithm above for details). Here we just list reference performance figures for major TREC collections. The ranking model used is BM25 and the parameter beta is set to 0.4 for all collections, although this is definitely not the optimal value for each individual collection. We report MAP for all collections except the ClueWeb collections, for which NDCG@20 is reported.
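
To make the source of the non-determinism concrete, here is a minimal Python sketch of the pool-building step. It is an illustration only: `build_reranking_pool` and its `seed` parameter are hypothetical names, and excluding the top M documents from the random draw is an assumption rather than something stated above. With a fixed seed the same pool is drawn every time; with `seed=None` the pool, and hence the final scores, can differ between runs.

```python
import random


def build_reranking_pool(ranked_doc_ids, all_doc_ids, m, r, seed=None):
    """Return the top-M ranked docs plus (R-1)*M randomly sampled docs."""
    rng = random.Random(seed)  # seed=None falls back to a non-reproducible seed
    top = list(ranked_doc_ids[:m])
    top_set = set(top)
    # Assumption: random documents are drawn from outside the top-M set.
    candidates = [d for d in all_doc_ids if d not in top_set]
    n_random = min((r - 1) * m, len(candidates))
    return top + rng.sample(candidates, n_random)
```

Because results vary with the sampled pool, small differences between runs of the same configuration are expected unless the sampling is made reproducible.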
For more details, please refer to the paper [Yang et al, 2013]: Yang, P., and Fang, H. (2013). Evaluating the Effectiveness of Axiomatic Approaches in Web Track. In TREC 2013.