Search Solutions and Tutorials 2023

Innovations in Search & Information Retrieval.

Search Solutions is the BCS Information Retrieval Specialist Group’s annual event focused on practitioner issues in the arena of search and information retrieval.

Search Solutions consists of two parts: a tutorial day and a conference day.

We bring together practitioners, researchers, analysts and end users to discuss the latest developments in the information retrieval (IR) community and to share insights between research and practice.

Tutorials – 21 November

This year there were two tutorials: a half-day tutorial on How Large Language Models Can Improve Your Search Project and a full-day tutorial on Uncertainty Quantification for Text Classification. Both were held on Tuesday, 21 November 2023, at BCS London.

Tutorial 1 - Half day

How Large Language Models Can Improve Your Search Project

Tutor:

Alessandro Benedetti, Sease Ltd. Director and R&D Software Engineer; Apache Lucene/Solr committer and PMC member.

Introduction:

In this tutorial, you will build up your knowledge of information retrieval from the basics up to the latest BERT-based techniques. Moreover, hands-on exercises will give you practical experience using these techniques. By the end of the tutorial, you will be familiar with the latest techniques, including neural language models for re-ranking, learned sparse retrieval, and dense retrieval.

Schedule:

10:00 - 14:00

Tutorial scope:

Large Language Models (LLMs) are becoming ubiquitous: everyone is talking about them, everyone wants to use them and everyone claims to be getting benefits out of them…

But... is it that simple?

This tutorial aims to demystify the Open-Source landscape of large language models, exploring what it means to use them to improve your search engine ecosystem and what the most common pitfalls are.

The talk starts by introducing the reasons to add LLMs to your search application and the complexities it adds (choosing the right model, measuring success, and rabbit holes).

Building upon the introduction, we’ll present how search changes with the advent of this innovative technology, what ‘fine-tuning’ is, and which Open-Source solutions are available in terms of models, components to interact with the models, and search engines that integrate with such technologies.

During the session, we’ll have multiple demos showing effective ways of using LLMs in your search project, using open-source software and publicly available datasets.

Join us as we explore this new exciting Open-Source landscape and learn how you can leverage it to improve your search experience!

Target audience: Software engineers, data scientists, researchers, information retrieval practitioners.

Learning outcomes:

The basics of Large Language Models
How to navigate the Open-Source Sea of LLMs
What Open-Source frameworks and projects to adopt if you want to use/interact with LLMs
How to integrate LLMs with popular Open-Source search engines.

Tutorial logistics/materials: slides and code snippets will be provided. Bring your own laptop.

Tutorial 2 - Full day

Uncertainty Quantification for Text Classification

Tutor(s):

Dell Zhang, Thomson Reuters Labs, London, UK.
Murat Sensoy, Amazon Alexa AI, London, UK.
Lin Gui, King’s College London, London, UK.
Yulan He, King’s College London & Alan Turing Institute, London, UK.

Schedule:

10:00 - 14:30

Tutorial scope:

This full-day tutorial introduces modern techniques for practical uncertainty quantification specifically in the context of multi-class and multi-label text classification. First, we explain the usefulness of estimating aleatoric uncertainty and epistemic uncertainty for text classification models. Then, we describe several state-of-the-art approaches to uncertainty quantification and analyze their scalability to big text data. Next, we talk about the latest advances in uncertainty quantification for pre-trained language models (including asking language models to express their uncertainty, interpreting uncertainties of text classifiers built on large-scale language models, uncertainty estimation in text generation, calibration of language models, and calibration for in-context learning).

After that, we discuss typical application scenarios of uncertainty quantification in text classification (including in-domain calibration, cross-domain robustness, and novel class detection). Finally, we list popular performance metrics for the evaluation of uncertainty quantification effectiveness in text classification. Practical hands-on examples/exercises are provided to the attendees for them to experiment with different uncertainty quantification methods on a few real-world text classification datasets such as CLINC150.

Logistics/Materials:

Bring your own laptop.

Search Solutions conference 2023 – 23 November

The objective of the conference this year is to explore the implications and opportunities of AI-based technologies in enhancing the user experience in enterprise, e-commerce and systematic search. The conference marks the first anniversary of the launch of ChatGPT on 30 November 2022.

Access the full conference playlist on YouTube

Session 1 – 09:55-11:00

09:55 - Understanding the Dangers of using LLMs. Professor Julie Weeds, Professor in Artificial Intelligence, University of Sussex.
10:30 - Using AI tools for discovery. Hong Zhou, Wiley Scientific.

Session 2 – 11:20-13:00

11:20 - IR, AI and ‘search’ – Creating synergy. Steve Zimmerman, Samy Ateia and Martin White consider the opportunities and the issues, with the assistance of the audience!
12:20 - BCS Search Industry Awards – Introduction. Tony Russell-Rose.
12:30 - Panel Session looking back at the morning presentations.

Session 3 – 13:45-14:45

13:45 - Pragmatic AI-powered Search – Keeping it Simple, not Stupid. Charlie Hull, Open Source Connections.
14:15 - Retrieval-augmented text generation (RAG) for legal IR. Grace Lee.
14:45 - Ontologies in the age of AI-based discovery. Peter Winstanley.

Session 4 – 15:50-17:30

15:50 - Presentation of the BCS Search Industry Awards.
16:00 - New technologies for systematic reviews: are large language models really a gamechanger? James Thomas.
16:30 - Integrating ChatGPT into existing applications. Paul Cleverley.
17:00 - Panel session – agreeing the take-aways from the conference.
17:30 - SS2023 Best Paper Award.

Organisers

Ingo Frommholz
Frank Hopfgartner
Udo Kruschwitz
Tony Russell-Rose
Martin White
Haiming Liu (tutorials chair)

Contact

For further details, contact irsg@bcs.org.uk.