1st International Workshop on
LEARning Next gEneration Rankers

co-located with The 3rd ACM International Conference on the Theory of Information Retrieval (ICTIR 2017)
October 1, 2017, Amsterdam, Netherlands

Photo: Baby Steps by Kalexanderson / CC BY-NC-ND 2.0


Ranking has always been at the core of Information Retrieval, since it allows us to sift out non-relevant information and to select a list of items ordered by their estimated relevance to a given query. Documents, information needs, search tasks, and interaction mechanisms between users and information systems are becoming increasingly complex and diversified, which calls for increasingly sophisticated techniques able to cope with this emerging complexity and the high expectations of users.

Learning to Rank (LtR), and machine learning in general, have proven to be very effective methodologies to address these issues, significantly improving over state-of-the-art traditional algorithms. Popular areas of investigation in LtR are related to efficiency, feature selection, and supervised learning, but many new angles are still overlooked. The goal of this workshop is to investigate how to improve ranking, in particular LtR, by bringing in new perspectives which have not yet been explored or fully addressed by our community since the 2011 Yahoo Learning to Rank Challenge.

In particular, we wish to encourage researchers to discuss the opportunities, challenges, and results obtained in the development and evaluation of novel approaches to LtR. New perspectives on LtR may concern innovative models, the study of their formal properties, as well as experimental validation of their efficiency and effectiveness. We are especially interested in proposals dealing with novel LtR algorithms, evaluation of LtR algorithms, LtR dataset creation and curation, and domain-specific applications of LtR.

We invite researchers and practitioners working in Information Retrieval, Machine Learning, and related application areas to submit their original papers to this workshop.

The workshop proceedings are available online as a volume of the CEUR-WS proceedings.

Important Dates

Submission deadline: August 14, 2017

Notification of acceptance: September 4, 2017

Camera ready: October 16, 2017

Workshop day: October 1, 2017

Call for Position Papers

General areas of interest include, but are not limited to, the following topics:

  • Next Generation LtR Algorithms:
    • Unsupervised approaches to LtR, active learning for LtR, transfer learning for LtR;
    • Incremental LtR, online, or personalized LtR;
    • Embedding user behaviour and dynamics in LtR;
    • Cost-Aware LtR;
    • List-based approaches for result list diversification and/or clustering;
    • Bias/Variance and other theoretical characterizations of ranking models;
    • Feature engineering for ranking;
    • Deep neural networks for ranking;
    • Understanding and explaining complex LtR models, also via visual analytics solutions.
  • Evaluation of LtR Algorithms:
    • Quality measures accounting for user behaviour and perceived quality;
    • Quality measures accounting for model failures, redundancy, robustness, sensitivity, etc.;
    • Evaluation of ranking efficiency vs. quality trade-off;
    • Visual analytics solutions for exploring and interpreting experimental data;
    • Reproducibility of LtR experiments.
  • Datasets:
    • Measuring quality of training datasets: noise, contradictory examples, redundancy, difficulty of building a good model, feature quality, coverage of application domain use cases;
    • Creation and curation of datasets: compression, negative sampling, aging, dimensionality reduction;
    • Contributing novel datasets to the community.
  • Applications:
    • Application of LtR to verticals or to other domains (e.g., recommendation, news, product search, social media, job search, ...);
    • LtR beyond documents: keyword-based access to structured data, multimedia, graphs, etc.

Papers should be formatted according to the ACM SIG Proceedings Template.

Papers should be four to six pages in length (six pages maximum).

Papers will be peer-reviewed by members of the program committee through single-blind peer review, i.e. authors do *not* need to be anonymized. Selection will be based on originality, clarity, and technical quality. Papers should be submitted in PDF format to the following address:


Accepted papers are published online at the following link:



Organizers

Nicola Ferro, University of Padua, Italy

Claudio Lucchese, ISTI-CNR, Italy

Maria Maistro, University of Padua, Italy

Raffaele Perego, ISTI-CNR, Italy

Program Committee

Roi Blanco, Amazon, Spain

Jiafeng Guo, Chinese Academy of Sciences, China

Craig Macdonald, University of Glasgow, UK

Fabrizio Silvestri, Facebook, UK

Arjen de Vries, Radboud Universiteit, The Netherlands

Hamed Zamani, University of Massachusetts, Amherst, USA

Accepted Papers

Arpita Das, Saurabh Shrivastava and Manoj Chinnakotla. Discovery and Promotion of Subtopic Level High Quality Domains for Programming Queries in Web Search.

Nicola Ferro, Paolo Picello and Gianmaria Silvello. A Software Library for Conducting Large Scale Experiments on Learning to Rank Algorithms.

Rolf Jagerman, Harrie Oosterhuis and Maarten de Rijke. Query-Level Ranker Specialization.

Or Levi. Online Learning of a Ranking Formula for Revenue and Advertiser ROI Optimization.

Short Presentations

Brian Brost. Multileaving for Online Evaluation of Rankers.

Hui Fang and Chengxiang Zhai. When Learning to Rank Meets Axiomatic Thinking.

Claudio Lucchese, Franco Maria Nardini, Raffaele Perego, and Salvatore Trani. The Impact of Negative Samples on Learning to Rank.

Darío Garigliotti and Krisztian Balog. Learning to Rank Target Types for Entity-Bearing Queries.

Keynote Talks

Craig Macdonald

School of Computing Science, University of Glasgow, UK


Craig Macdonald is a Lecturer at the University of Glasgow, UK. Currently, his main research topics deal with Information Retrieval (IR) in general, for instance in settings such as Web, Enterprise, social media and smart cities. He regularly participates in TREC, and jointly co-ordinated the TREC Blog track (2006-2010), the Microblog track (2011-2012), and the Web track (2014-). He is a lead developer for the Terrier IR platform, and also uses Terrier in his research publications.

Hypothesis Testing for Risk-Sensitive Evaluation and Learning to Rank in Web Search

When a user is unsatisfied with the quality of results of a web search engine, they may switch to another, leading to a loss of ad revenue to the engine. Use of a robust retrieval approach is therefore essential, so that the experience of the users of the search engine is not damaged by poorly-performing queries. For this reason, there has been growing interest in measuring robustness using a new class of risk-sensitive evaluation measures, which assess the extent to which a system exhibits risk, i.e. performs worse than a given baseline system on a set of queries.
In this talk, we describe our recent advances in two families of risk-sensitive evaluation measures both based upon hypothesis testing, and their integration into a state-of-the-art learning to rank algorithm, to create effective yet robust retrieval models.
Firstly, we argue that risk-sensitive evaluation is akin to the underlying methodology of the Student's t-test for matched pairs. Hence, we introduce a risk-reward tradeoff measure TRisk that generalises the existing URisk measure, and which is theoretically grounded in statistical hypothesis testing.
Secondly, we argue that using a single system as the baseline suffers from the fact that retrieval performance varies highly among IR systems across topics. Thus, a single system would in general fail to provide enough information about the real baseline performance for every topic under consideration, and hence cannot in general measure the real risk associated with any given system. Based upon the Chi-squared statistic, we describe a second family of risk-reward tradeoff measures that take into account multiple baselines when measuring risk.
Experiments using 10,000 topics from the MSLR learning to rank dataset from the Bing search engine demonstrate that our proposed t-test and Chi-squared based objective functions reduce the number of poorly performing queries exhibited by a state-of-the-art learning to rank algorithm.
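
The abstract does not give formulas for URisk or TRisk, so the following is only a minimal sketch under an assumed formulation: URisk is taken as the mean per-query difference between system and baseline scores, with losses weighted by (1 + alpha), and TRisk as that mean divided by its standard error, in the manner of a matched-pairs t statistic. The function names, the alpha parameter, and the example scores are illustrative assumptions, not taken from the talk.

```python
import math
from typing import List, Sequence


def risk_weighted_deltas(system: Sequence[float],
                         baseline: Sequence[float],
                         alpha: float) -> List[float]:
    """Per-query differences (system - baseline); losses are scaled by (1 + alpha)."""
    if len(system) != len(baseline):
        raise ValueError("system and baseline must cover the same queries")
    deltas = []
    for s, b in zip(system, baseline):
        d = s - b
        deltas.append(d if d >= 0 else (1.0 + alpha) * d)
    return deltas


def urisk(system: Sequence[float], baseline: Sequence[float],
          alpha: float = 1.0) -> float:
    """Assumed URisk: mean of the risk-weighted per-query differences."""
    deltas = risk_weighted_deltas(system, baseline, alpha)
    return sum(deltas) / len(deltas)


def trisk(system: Sequence[float], baseline: Sequence[float],
          alpha: float = 1.0) -> float:
    """Assumed TRisk: URisk divided by its standard error (t-statistic form)."""
    deltas = risk_weighted_deltas(system, baseline, alpha)
    n = len(deltas)
    if n < 2:
        raise ValueError("need at least two queries to estimate the standard error")
    mean = sum(deltas) / n
    variance = sum((d - mean) ** 2 for d in deltas) / (n - 1)  # sample variance
    std_error = math.sqrt(variance / n)
    if std_error == 0.0:
        raise ValueError("all weighted differences are identical; TRisk is undefined")
    return mean / std_error


if __name__ == "__main__":
    # Hypothetical per-query effectiveness scores (e.g. NDCG) for four queries.
    system_scores = [0.6, 0.4, 0.7, 0.2]
    baseline_scores = [0.5, 0.5, 0.5, 0.5]
    print("URisk:", urisk(system_scores, baseline_scores, alpha=1.0))
    print("TRisk:", trisk(system_scores, baseline_scores, alpha=1.0))
```

Under this formulation, setting alpha to 0 reduces TRisk to the ordinary paired t statistic on the score differences, which is one way to read the claim that risk-sensitive evaluation is akin to the matched-pairs t-test.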

Djoerd Hiemstra

Department of Computer Science, University of Twente, The Netherlands


Djoerd Hiemstra is a part-time associate professor at the University of Twente. He also heads Searsia, a University of Twente spin-off that develops an open source federated search engine. Djoerd contributed to over 200 research papers in the field of information retrieval, covering topics such as language models, structured information retrieval, multimedia retrieval, and federated web search. Djoerd published papers with research labs of Microsoft (where he did an internship in 2000), Yahoo (where he was a visiting researcher in 2008), and Yandex (which he visited in 2011).

The best learning-to-rank method of all: A final verdict?

Like most information retrieval methods, learning-to-rank methods are evaluated on benchmark datasets, such as the many datasets provided by Microsoft and the datasets provided by Yahoo and Yandex. Many of the learning-to-rank datasets offer feature set representations of the to-be-ranked documents instead of the documents themselves. Therefore, any difference in ranking performance is due to the ranking algorithm and not the features used. This opens up a unique opportunity for cross-benchmark comparison of learning-to-rank methods.
In this talk, I propose a way to compare learning-to-rank methods based on a sparse set of evaluation results on many benchmark datasets. Our comparison methodology consists of two components: (1) the Normalized Winning Number, a measure that gives insight into the ranking accuracy of the learning-to-rank method, and (2) the Ideal Winning Number, which gives insight into the degree of certainty concerning the ranking accuracy.
Evaluation results of 87 learning-to-rank methods on 20 well-known benchmark datasets are collected. I report on the best performing methods by Normalized Winning Number and Ideal Winning Number and suggest which methods need more research to make our analysis more robust. Finally, we test the robustness of our results by comparing the results to situations where one of the datasets is not included in the analysis.
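
The abstract names the two measures without defining them, so the sketch below rests on an assumed reading: a method's Winning Number counts the (dataset, other method) pairs on which it scores strictly higher, its Ideal Winning Number counts the pairs where both methods report a result on the same dataset, and the Normalized Winning Number is the ratio of the two. The sketch uses a single evaluation measure, and the method names, dataset names, and scores are hypothetical.

```python
from typing import Dict, Optional, Tuple

# results[method][dataset] = score; missing entries are simply absent (sparse).
Results = Dict[str, Dict[str, float]]


def winning_numbers(results: Results, method: str) -> Tuple[int, int, Optional[float]]:
    """Return (winning number, ideal winning number, normalized winning number)."""
    wins = 0
    comparable = 0
    for dataset, score in results[method].items():
        for other, other_scores in results.items():
            # Only compare against other methods evaluated on the same dataset.
            if other == method or dataset not in other_scores:
                continue
            comparable += 1
            if score > other_scores[dataset]:
                wins += 1
    normalized = wins / comparable if comparable else None
    return wins, comparable, normalized


if __name__ == "__main__":
    # Hypothetical sparse results: not every method is evaluated on every dataset.
    results = {
        "A": {"d1": 0.50, "d2": 0.40},
        "B": {"d1": 0.45, "d2": 0.42, "d3": 0.30},
        "C": {"d2": 0.38, "d3": 0.35},
    }
    for name in results:
        print(name, winning_numbers(results, name))
```

Reading the two numbers together matters: a high Normalized Winning Number backed by a small Ideal Winning Number rests on few comparisons, which is the sense in which the second measure expresses certainty about the first.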

Workshop Schedule

14:00-14:10 - Opening and Welcome
14:10-15:00 - Keynote with Prof. Craig Macdonald, University of Glasgow
15:00-15:30 - Paper Session

Online Learning of a Ranking Formula for Revenue and Advertiser ROI Optimization

Or Levi

Query-Level Ranker Specialization

Rolf Jagerman, Harrie Oosterhuis and Maarten de Rijke
15:30-16:00 - Coffee Break
16:00-16:50 - Keynote with Prof. Djoerd Hiemstra, University of Twente
16:50-17:20 - Paper Session

Discovery and Promotion of Subtopic Level High Quality Domains for Programming Queries in Web Search

Arpita Das, Saurabh Shrivastava and Manoj Chinnakotla

A Software Library for Conducting Large Scale Experiments on Learning to Rank Algorithms

Nicola Ferro, Paolo Picello and Gianmaria Silvello

The Impact of Negative Samples on Learning to Rank

Claudio Lucchese, Franco Maria Nardini, Raffaele Perego, and Salvatore Trani
17:20-17:50 - Short Invited Presentations

Multileaving for Online Evaluation of Rankers

Brian Brost

When Learning to Rank Meets Axiomatic Thinking

Hui Fang and Chengxiang Zhai

Learning to Rank Target Types for Entity-Bearing Queries

Darío Garigliotti and Krisztian Balog
17:50-18:00 - Wrap Up, Discussion and Closing
18:00 - Drinks at CASA 400