(Long Lecture) Introduction to Information Retrieval

Alistair Moffat

   This presentation provides an overview of the whole field of information retrieval, and will set the scene for the rest of the summer school. Topics introduced include set and Boolean retrieval; ranked lists and ranking functions; implementation issues including inverted file indexing, index representations, and query evaluation strategies; test collection construction and batch evaluation methodologies; user-based experimentation and interactive evaluation methodologies; and e-commerce and recommender systems.
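
   As a rough illustration of two of the topics named above, inverted file indexing and Boolean retrieval, the following Python sketch builds a toy in-memory index and answers conjunctive (AND) queries by merging sorted postings lists. It is not taken from the lecture: the function names, the whitespace tokenizer, and the three tiny documents are assumptions made for this example only.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted list of ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: lowercase whitespace tokenization is good enough here.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def intersect(a, b):
    """Merge two sorted postings lists, keeping ids present in both."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def boolean_and(index, query):
    """Return ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    # Process the shortest postings lists first to keep intermediates small.
    lists = sorted((index.get(t, []) for t in terms), key=len)
    result = lists[0]
    for plist in lists[1:]:
        result = intersect(result, plist)
    return result


# Hypothetical toy collection, invented for this sketch.
DOCS = {
    1: "inverted file indexing for ranked retrieval",
    2: "boolean retrieval with an inverted file",
    3: "evaluation of ranked lists",
}
INDEX = build_inverted_index(DOCS)

if __name__ == "__main__":
    print(boolean_and(INDEX, "inverted file"))
    print(boolean_and(INDEX, "ranked retrieval"))
```

   Real systems typically store compressed postings on disk and add ranking on top of this kind of matching; the sketch only shows the basic data structure.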

(Short Lecture) Introduction to IR Evaluation

Mark Sanderson

   Evaluation of information retrieval (IR) systems is a critically important part of reporting in the IR research community. In this tutorial, I will provide an overview of the main offline and online techniques that are used to evaluate search engines. Topics covered will include the classic test collection-based approach, statistical significance testing, user evaluation, ethics approval, A/B testing, and the recent innovation of counterfactual evaluation, where offline collections can be induced from online click data. At the end of this tutorial, students will understand the principles, methods, and resources available for evaluation. I will also briefly discuss how users can get access to data to enable them to evaluate.

(Short Lecture) Recommender systems in E-commerce

Dawei Yin

   Recommender systems in e-commerce can assist users in information-seeking tasks by suggesting items that best fit their needs and preferences. Personalized recommender systems have been highly successful in commercial applications such as Amazon, eBay, and Taobao. In this talk, I will first introduce the design of an e-commerce recommender system, and then zoom in on the research problems in such a commercial recommender system, including candidate retrieval, user behavior understanding, and ranking in recommendation. Finally, I will discuss recent advances and potential solutions for these problems in commercial recommender systems.

(Long Lecture) Unbiased Learning to Rank: Counterfactual, Online, and Reinforcement Learning Approaches

Maarten de Rijke

   Learning to Rank (LTR) has long been a core task in Information Retrieval (IR), as ranking models form the basis of most search and recommendation systems. Traditionally, LTR was approached as a supervised task where there is a dataset with perfect relevance annotations. However, over time the limitations of this approach have become apparent. As a result, interest in LTR from user interactions has increased significantly in recent years. User interactions, often in the form of user clicks, provide implicit feedback, and while cheap to collect, they are also heavily biased. Naively ignoring these biases during the learning process will result in biased ranking models that are not fully optimized for user preferences. Thus the field of LTR from user interactions is mainly focussed on methods that remove biases from the learning process, resulting in unbiased LTR. In the tutorial, we will discuss three approaches to unbiased LTR: Counterfactual Learning to Rank, Online Learning to Rank, and Reinforcement Learning to Rank. We provide an overview of all three approaches and their underlying theory. We discuss the situations for which each approach was designed, and the places where they are applicable. Furthermore, we compare the properties of the three approaches and give guidance on how the decision between them should be made. We aim to provide the IR community with an essential guide to understanding and choosing between unbiased LTR methodologies. The tutorial is based on material jointly developed with Rolf Jagerman and Harrie Oosterhuis.
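
   To make the notion of click bias concrete, here is a minimal Python sketch of inverse propensity scoring (IPS), one common ingredient of counterfactual approaches. It is an illustration only and not the tutorial's own material. It assumes a simple position-based click model in which the probability that a user examines each rank is known in advance; the function name, the log format, and all numbers are hypothetical.

```python
def ips_relevance_estimates(click_log, propensity):
    """Estimate per-document relevance from clicks, correcting for position bias.

    click_log:  iterable of (doc_id, rank, clicked) impressions.
    propensity: dict mapping rank -> assumed probability that the rank is
                examined (must be > 0 for every logged rank).
    """
    totals = {}
    counts = {}
    for doc_id, rank, clicked in click_log:
        # Each click is up-weighted by 1 / P(examined at this rank).
        weight = (1.0 if clicked else 0.0) / propensity[rank]
        totals[doc_id] = totals.get(doc_id, 0.0) + weight
        counts[doc_id] = counts.get(doc_id, 0) + 1
    return {doc_id: totals[doc_id] / counts[doc_id] for doc_id in totals}


def naive_click_through_rates(click_log):
    """Uncorrected click-through rate per document, for comparison."""
    clicks = {}
    counts = {}
    for doc_id, _rank, clicked in click_log:
        clicks[doc_id] = clicks.get(doc_id, 0) + (1 if clicked else 0)
        counts[doc_id] = counts.get(doc_id, 0) + 1
    return {doc_id: clicks[doc_id] / counts[doc_id] for doc_id in clicks}


# Hypothetical examination probabilities: rank 2 is seen half as often.
PROPENSITY = {1: 1.0, 2: 0.5}

# Made-up log: document "a" always shown at rank 1, "b" always at rank 2.
LOG = [
    ("a", 1, True), ("a", 1, True), ("a", 1, False), ("a", 1, False),
    ("b", 2, True), ("b", 2, False), ("b", 2, False), ("b", 2, False),
]

NAIVE = naive_click_through_rates(LOG)
CORRECTED = ips_relevance_estimates(LOG, PROPENSITY)

if __name__ == "__main__":
    print("naive CTR:", NAIVE)
    print("IPS estimate:", CORRECTED)
```

   In this made-up log the raw click-through rate makes "b" look half as relevant as "a", while the propensity-weighted estimate treats the two documents as equally relevant, because "b" was only ever shown in the less examined position.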

(Short Lecture) Introduction to Statistical Tools for IR Experiments

Tetsuya Sakai

   In this 90-minute lecture, I will cover the following topics.
   - How to conduct paired and two-sample t-tests with R;
   - How to conduct ANOVA with R;
   - How to conduct the Tukey HSD test with R;
   - How to conduct the randomised Tukey HSD test;
   - How to use topic set size design tools;
   - How to use power analysis tools.
   More information can be found in my book:
   Extended slide deck available here:
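
   For readers who want a feel for the first topic before the lecture, the sketch below computes a paired t statistic for two systems evaluated on the same topics. It is written in Python using only the standard library rather than R, it is not the lecture's code, and the per-topic scores are invented for illustration. It returns the statistic and the degrees of freedom only; obtaining a p-value additionally requires the t distribution, which a statistics package would supply.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for per-topic scores of two systems.

    Both lists must be aligned so that position i refers to the same topic.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two topics")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical per-topic effectiveness scores for two systems.
SYSTEM_A = [0.2, 0.4, 0.6, 0.8]
SYSTEM_B = [0.1, 0.2, 0.3, 0.4]

T_VALUE, DEGREES_OF_FREEDOM = paired_t_statistic(SYSTEM_A, SYSTEM_B)

if __name__ == "__main__":
    print(f"t = {T_VALUE:.3f}, df = {DEGREES_OF_FREEDOM}")
```

   Four topics is far too few for a real experiment; the small lists are there only to keep the arithmetic easy to check by hand.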

(Short Lecture) Commercial Search Engine system and its technical challenges

Daxin Jiang

   In this tutorial, I will first give an overview of how NLP techniques are applied in the Bing search engine. I will then focus on the question answering task, which has three parts: 1) curated answer triggering; 2) knowledge-based QA; and 3) passage-based QA. I will present several technical challenges, the literature, and our approach. In particular, I will demonstrate how recent DNN models and large-scale pre-trained models contribute to the QA task.

(Long Lecture) Introduction to Semantic Matching in IR

Jun Xu

   Matching, which measures the relevance of a document to a query, is the key problem in information retrieval (IR). It has been observed that a large number of dissatisfaction cases in search are due to mismatch between queries and documents (e.g., the query "ny times" does not match well with a document containing only "New York Times"). Significant effort has been expended to address the problem. Previously, machine learning methods have been exploited that learn a matching function from labeled data, an approach referred to as "learning to match". In recent years, deep learning has been successfully applied to the problem and significant progress has been made. The key to the success of the deep learning approach is its strong ability to learn representations and to generalize matching patterns from raw queries and documents. This lecture aims to give a comprehensive survey of recent progress in deep learning for matching. The lecture consists of three main parts. The first part introduces the general problem of semantic matching in IR. The second part explains how traditional machine learning techniques are utilized to address the problem. The last part elaborates on how deep learning can be effectively used to solve the problem.

(Long Lecture) Interactive Information Retrieval: Models, Algorithms, and Evaluation

Chengxiang Zhai

   Information Retrieval (IR) is, in general, an iterative process with users interacting with a search engine in various ways to complete an information seeking task. As such, it is highly important to study Interactive Information Retrieval (IIR), where we would attempt to model and optimize an entire interactive retrieval process (rather than a single query) with consideration of many different ways a user can potentially interact with a search engine. In this tutorial, I will systematically review the progress of research in IIR with an emphasis on the most recent progress in the development of models, algorithms, and evaluation strategies for IIR. I will start with a broad overview of research in IIR, which will be followed by an introduction to formal models for IIR using a cooperative game framework and covering decision-theoretic models such as the Interface Card Model and Probability Ranking Principle for IIR. I will then review some representative specific techniques and algorithms for IIR, such as various forms of feedback techniques and diversification of search results. Finally, I will discuss how an IIR system should be evaluated and review multiple strategies proposed recently for evaluating IIR using user simulation. I will end the tutorial with a brief discussion of the major open challenges in IIR and some of the most promising future research directions.

(Long Lecture) Introduction to Conversational IR

Jian-Yun Nie

   Information seeking is an interactive process, which often involves multiple interactions between the user and the system. While some types of interaction have been captured and used in current search engines, they remain implicit and limited in coverage. Recently, great progress has been made in conversation systems based on deep neural networks (DNNs). DNNs offer new opportunities for the IR community to implement forms of conversational IR, which may appear more natural to end users. In this talk, I will recall some studies on interactions in IR and their implementations in current search engines. I will then describe approaches to conversation systems, including both traditional rule-based and more recent deep learning approaches. I will present the new opportunities for conversational IR using the existing approaches, and discuss the remaining challenges in adapting the technology to the IR context.