Innovations in Search & Information Retrieval.
Search Solutions is the BCS Information Retrieval Specialist Group’s annual event focused on practitioner issues in the arena of search and information retrieval. It is a unique opportunity to bring together academic research and practitioner experience.
The Search Solutions event consists of a Tutorial day and a Conference day, each of which has a separate registration. Information on the Conference day can be found here.
Search Solutions 2023 Tutorials 21 November
This year there are two tutorials: a half-day tutorial, How Large Language Models Can Improve Your Search Project, and a full-day tutorial, Uncertainty Quantification for Text Classification.
Registration form on Eventbrite
Details of the tutorials are given below.
Tutorial fee (for the day)
BCS Members £80
Both the Tutorials and the Conference take place at the BCS London Headquarters, Ground Floor, 25 Copthall Avenue EC2R 7BP. This is a 10-minute walk from Liverpool Street Station (Elizabeth Line and London Underground) and a 20-minute walk from London Bridge Station.
Tutorial 1 How Large Language Models Can Improve Your Search Project (Half-day)
Alessandro Benedetti, Director and R&D Software Engineer at Sease Ltd; Apache Lucene/Solr committer and PMC member.
Tutorial Schedule (10AM – 2PM)
Large Language Models (LLMs) are becoming ubiquitous: everyone is talking about them, everyone wants to use them and everyone claims to be getting benefits out of them…
But… is it that simple?
This tutorial aims to demystify the Open-Source landscape of Large Language Models, exploring what it means to use them to improve your search engine ecosystem and what the most common pitfalls are.
The talk starts by introducing the reasons to add LLMs to your search application and the complexities they introduce (choosing the right model, measuring success, and avoiding rabbit holes).
Building on the introduction, we’ll present how search changes with the advent of this innovative technology, what ‘fine-tuning’ is, and what Open-Source solutions are available in terms of models, components for interacting with the models, and search engines that integrate with such technologies.
During the session, we’ll have multiple demos showing effective ways of using LLMs in your search project, using open-source software and publicly available datasets.
Join us as we explore this new exciting Open-Source landscape and learn how you can leverage it to improve your search experience!
- Target audience: software engineers, data scientists, researchers, and information retrieval practitioners.
- Learning outcomes:
– The basics of Large Language Models
– How to navigate the Open-Source Sea of LLMs
– What Open-Source frameworks and projects to adopt if you want to use/interact with LLMs
– How to integrate LLMs with popular Open-Source search engines.
Tutorial logistics/materials: slides and code snippets will be provided. Bring your own laptop.
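To make the topic concrete before the session, here is a minimal, self-contained sketch of dense (embedding-based) retrieval, the core idea behind using LLM-derived vectors in a search engine. In a real project the vectors would come from an open-source embedding model (for example via the sentence-transformers library) and be stored in a vector-capable engine such as Apache Solr or OpenSearch; the document names and vectors below are made up for illustration.

```python
# Toy dense retrieval: rank documents by cosine similarity between a query
# embedding and precomputed document embeddings. In practice an LLM-based
# embedding model produces these vectors; here they are hand-written.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical document embeddings (normally produced by the model).
docs = {
    "solr_vector_search_guide": [0.9, 0.1, 0.0],
    "cooking_tips": [0.0, 0.2, 0.9],
}

query_vec = [0.8, 0.2, 0.1]  # embedding of the user's query

# Rank documents by similarity to the query, highest first.
ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)
print(ranked[0])  # the semantically closest document
```

The same ranking step is what a vector search engine performs at scale with approximate nearest-neighbour indexes rather than an exhaustive sort.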
Tutorial 2 Uncertainty Quantification for Text Classification (Full Day)
Tutorial Schedule (10.00 – 16.30)
Dell Zhang, Thomson Reuters Labs, London, UK.
Murat Sensoy, Amazon Alexa AI, London, UK.
Lin Gui, King’s College London, London, UK.
Yulan He, King’s College London & Alan Turing Institute, London, UK.
This full-day tutorial introduces modern techniques for practical uncertainty quantification specifically in the context of multi-class and multi-label text classification. First, we explain the usefulness of estimating aleatoric uncertainty and epistemic uncertainty for text classification models. Then, we describe several state-of-the-art approaches to uncertainty quantification and analyze their scalability to big text data. Next, we talk about the latest advances in uncertainty quantification for pre-trained language models (including asking language models to express their uncertainty, interpreting uncertainties of text classifiers built on large-scale language models, uncertainty estimation in text generation, calibration of language models, and calibration for in-context learning).
After that, we discuss typical application scenarios of uncertainty quantification in text classification (including in-domain calibration, cross-domain robustness, and novel class detection). Finally, we list popular performance metrics for the evaluation of uncertainty quantification effectiveness in text classification. Practical hands-on examples/exercises are provided to the attendees for them to experiment with different uncertainty quantification methods on a few real-world text classification datasets such as CLINC150.
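As a small taste of the topic (not the tutorial's own material), the sketch below computes two simple uncertainty signals for a classifier's output: predictive entropy of the softmax distribution, a common aleatoric-style measure, and disagreement across an ensemble of models, a crude proxy for epistemic uncertainty. The probability vectors are invented for illustration.

```python
# Two basic uncertainty signals for a text classifier's predictions.
import math

def entropy(probs):
    """Shannon entropy (nats) of a categorical distribution; higher = more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# One confident and one uncertain prediction over three classes.
confident = [0.97, 0.02, 0.01]
uncertain = [0.40, 0.35, 0.25]
print(entropy(confident) < entropy(uncertain))  # True: lower entropy = more certain

# Epistemic-style proxy: variance of class probabilities across an
# ensemble of three hypothetical models scoring the same input.
ensemble = [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1], [0.5, 0.3, 0.2]]
mean = [sum(m[i] for m in ensemble) / len(ensemble) for i in range(3)]
disagreement = sum(
    sum((m[i] - mean[i]) ** 2 for i in range(3)) for m in ensemble
) / len(ensemble)
print(disagreement > 0)  # the models disagree, signalling epistemic uncertainty
```

The tutorial covers far more principled estimators than these, but the intuition — separating "the input is ambiguous" from "the model hasn't seen data like this" — is the same.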
Tutorial Logistics/Materials: Bring your own laptop.