Spring 2021

In this issue

It has taken me a couple of years to realise that I have never actually written an editorial because when I took over from Udo Kruschwitz I decided to give you a summary of the contents of the issue. So the section has been renamed as ‘In this issue’ and you have no idea how long it took me to come up with that title. My only flight of editorial fancy is the And finally, which in this issue presents some of my thoughts on perceptual speed, scanning and abstract writing.

Also new in this issue is IRSG Committee, a contribution from Steve Zimmerman, the IRSG Secretary. In principle it will record the outcomes of recent Committee meetings, but in so doing the Committee would also like to persuade you to come and be a member. All you need is a passion for information retrieval.

This issue is very much about conferences and awards. It starts by looking forward to Search Solutions 2021, which will take place on 23-24 November, hopefully (please!) on-site at the BCS London offices, although that depends on the Covid distancing rules in force at the time. The event will include the presentation of the BCS Search Industry Awards. Then next year comes ECIR 2022 in Stavanger, with ECIR 2023 following along in Dublin. There are of course many other events in our sector, and as usual Andy Macfarlane has them all listed.

Looking back, there is a brief account of the very successful Fairness and Bias in IR event held in Glasgow in March, postponed from March 2020. There is also a report on the BIRDS workshop – Bridging the Gap between Information Science, Information Retrieval and Data Science, which this year was held in April in conjunction with CHIIR.

Then you will come to the ECIR 2021 Conference Supplement (all good newsletters have Supplements from time to time) for which I have provided a sub-contents list.

The UKeiG Farradane Award for 2020 has been presented to Emeritus Professor Tom Wilson. I asked Tom if he would write a feature article on his work in information behaviour, and it seemed a good opportunity to review his new book on information behaviour, which is in fact work-in-progress. I know he would value your comments and suggestions.

To round off the issue there is a note about a paper on searching for Covid information in the enterprise that was presented by Paul Cleverley at Search Solutions 2020. At the time the paper was under review but it has now been published.  There are also some insights on current issues in open source search, a link to Search Insights 2021 and finally And Finally!

Search Solutions 2021 23-24 November.

A conference committee has been set up for this meeting. Since this was my suggestion I’ve ended up being the Chair. The other members are Tony Russell-Rose, Charlie Hull, Ingo Frommholz and Haiming Liu. We are anticipating that the conference will take place on 24 November with tutorials on 23 November. These dates have been agreed with the BCS but much will depend on how many attendees can be accommodated at the BCS offices in London. We may not have a reliable indication for this for perhaps another month or so. For now please add these dates to your calendar. If you have been impressed by particular speakers over the last year or so and think that they might be a good fit with Search Solutions do please let me know. We will be opening up a Call for Proposals later in the year.

We are planning to reintroduce the BCS Search Industry Awards at this event. They were last given in 2019. More details below from Tony Russell-Rose, who is taking the lead for the Awards.

ECIR 2022 / ECIR 2023

The 44th European Conference on Information Retrieval will take place from 10-14 April 2022 in Stavanger, Norway. The web site is now up, highlighting that this will be the most northerly ECIR ever.

If you are planning your 2023 vacation then you may like to know that ECIR 2023 will be returning to Dublin, the venue for ECIR 2011, which was the first I attended.

Fairness and Bias in Information Retrieval Glasgow 23 March 2021

2020 was a notable year in information retrieval research and development. First there was the realization that AI applications were going to need a level of transparency that would reassure customers that they were supporting, not controlling, the search process. Second there was the realization that the search community had a responsibility to think beyond the Holy Grail of perfect relevance and take into account issues around bias, fairness and reproducibility.

Back in March 2020 I was looking forward to attending a workshop being run by Graham McDonald, Iadh Ounis and Craig Macdonald at the University of Glasgow on the subject of Fair Information Retrieval in Industry. It turned out to be the first of many cancelled events last year! So I was delighted that the team were able to run a virtual version in March 2021 with a slightly different title. The papers are listed below, with video links where available.

Read more…

ECIR 2021 Conference Supplement

ECIR 2021 was an outstanding event from every perspective. In this first-ever Supplement you will find

A report on the Industry Day presentations will be published in the Summer 2021 issue of Informer.

I am very grateful to Raffaele Perego and Fabrizio Sebastiani, the Co-Chairs of the event, for their support in the preparation of this supplement.

The BIRDS were flying again – Bridging Gaps at CHIIR 2021

Can Data Science, Information Science, Information Retrieval and Human-Computer Interaction get together and learn from each other? Bringing together these different communities is the aim of the BIRDS workshop. While in Information Retrieval we learnt over the last decades how to meld user- and system-oriented approaches, one of the questions is how we can make use of the results and experiences gained through this long process in Information Retrieval in a broader Data Science and Data Exploration context. Read more…

Thinking about information behaviour – Professor Tom Wilson

Each year the UK Electronic Information Group, a special interest group of CILIP, makes the Farradane Award. The Award honours Jason Farradane, who first made an impact on the LIS community with a paper on the ‘scientific approach to documentation’ presented at a Royal Society Scientific Information Conference in 1948. He was instrumental in establishing the Institute of Information Scientists in 1958, alongside the first academic information science courses in 1963 at the precursor to City University, London, where he became Director of the Centre for Information Science in 1966.

The winner of the 2020 Jason Farradane Award was Tom D. Wilson, Professor Emeritus, University of Sheffield, UK. I first discovered ‘information behaviour’ during a visit to the University of Sheffield in 2002, when Tom was on the staff team presenting the work of the Department (at that time the Department of Library and Information Studies). As someone with a long interest in information management, I will admit to having a ‘Eureka’ moment as many of the issues I had been grappling with in my work started to fall into place. On learning of this award I asked Tom if he would write a feature article for Informer on his work on information behaviour.

T.D. Wilson Professor Emeritus, University of Sheffield

How people discover and relate to “information” has changed significantly over the last 50 years: when I worked in the nuclear energy industry in the late 1950s, organizing a library and doing literature searches for the scientists, there was nothing digital – everything you needed was in printed form.  Discovering what my clients needed involved searching printed abstracting services such as Chemical Abstracts, Nuclear Science Abstracts, and Metallurgical Abstracts, buying journals, using the inter-library lending system, visiting the local public library and the university library.

My “information behaviour” at that time was, therefore, highly constrained by the printed form of the information resources and their local availability and the same would be true for any of the scientists or engineers who chose to do their own information searching.

Let us call this early information behaviour Search Mode I

Read more…

Book Review  Information Behaviour by Tom Wilson

Nowadays we are very familiar with information retrieval, interactive information retrieval, information risk and information management but information behaviour (not information behaviours – as Tom Wilson emphasizes in this book) is rarely discussed even though thanks to Luciano Floridi we have a good understanding of the philosophy of information.

There is now no excuse for not delving into information behaviour. This open access book by Tom Wilson is also a very accessible read, and what I found especially fascinating was his openness about the changes he has made in his view of information behaviour over the last thirty years or more. The book is a slim one but then so is Floridi’s masterful Information – A Very Short Introduction! The sections (rather than chapters) cover Information Behaviour, Modelling Behaviour, Information Behaviour: A General Model, Models and Theories, Researching Information Behaviour, Using Information Behaviour Research, and Conclusion, and then an excellent bibliography of over 230 citations. A feature of the book is the adroit use of diagrams to pull together some inevitably challenging concepts and directions of travel.

Read more…

Paul Cleverley’s presentation to Search Solutions 2020 has now been published

At Search Solutions 2020 Paul Cleverley presented a paper on the research he had carried out with Fionualla  Cousins and Simon Burnett on the search patterns for Covid-19 within a very large oil and gas company during the initial stages of the lockdown in early 2020. Paul was not able to share his presentation slides as the paper was going through peer review prior to publication in the Journal of Information Science.

The paper has now been published and can be downloaded from Paul’s web site. For me the highlight is Fig.8 which shows a trend from single word queries at the outset of the pandemic to three and even four word queries within a few months as search users gained a wider vocabulary and also had more focus about the information they were seeking. If anyone tells you that typically enterprise search users only use single query terms this is all the evidence you need to show how little they know about enterprise search user behaviour.

Search Insights 2021 is now available

Since 2018 The Search Network has been publishing an annual Search Insights report which is a set of short essays on a wide range of issues broadly (but not totally)  related to enterprise search. Search Insights 2021 can be downloaded free-of-charge, and with no requirement to register, from the web site of The Search Network. As well as the essays the report includes a list of search vendors, an extensive glossary and a list of books and blogs on search topics. [Disclosure – I am a member of the Network]

Open source search and OpenSearch

At the beginning of 2021 Elastic took the open source search community by surprise when it announced some changes to its licensing model and subsequently highlighted the differences between the Elasticsearch and AWS offerings.

Charlie Hull, Managing Consultant at OpenSource Connections, has been tracking the outcomes of this decision, and very courageously set up and chaired a debate between Elasticsearch, Solr and Vespa.

Charlie has published a very good summary of the debate which links to a video recording.

There are two further contributions from Charlie Hull at

OpenSearch – Amazon forks Elasticsearch and the divergence begins

Is Elasticsearch no longer open source software?

IRSG Committee Meeting Highlights (January 26 2021)

Greetings all. As your acting IRSG secretary, I am pleased to kick off the inaugural update from the most recent IRSG committee meeting. This will be a very brief piece highlighting key points from the meeting.

Our last IRSG committee meeting took place on January 26th 2021.

Key action points from this meeting were Read more…

And finally!

As a way of keeping in touch with information retrieval research during the lockdown I started to look at some sections of arXiv on a regular basis. After a few months I homed in on the following sections as being the most fruitful.

Artificial Intelligence

Computation and Language

Computers and Society

Digital Libraries

Human-Computer Interaction

Information Retrieval

Social and Information Networks

Over the last six months in particular the volume of pre-prints seems to have increased substantially, with Artificial Intelligence, Information Retrieval and Computation and Language often exceeding 100 new papers a day. Scanning them has been an interesting exercise because it replicates the challenges of scanning search results, especially when Microsoft 365 decides that life is easier with no snippets.

The first step in the process is the initial read-through to note items that have at least some indication of relevance. This comes down to perceptual speed, and I’m also conscious of the extent to which initial capitalization is helpful in this process. Another factor is the extent to which I can comprehend the title.

Zero-shot Slot Filling with DPR and RAG

BigGreen at SemEval-2021 Task 1: Lexical Complexity Prediction with Assembly Models

These are just two examples of many where the title is only intelligible to a small number of research teams working in that area. Is that a good idea if you have a genuine interest in achieving a high impact with your research? As a result, on many occasions I have to read through the abstract, which for some reason best known to arXiv is presented in a 50-word line length. The sentences below are just a single line in arXiv.

Bayesian optimization is a popular algorithm for sequential optimization of a latent objective function when sampling from the objective is costly. The search path of the algorithm is governed by the acquisition function, which defines the agent’s search strategy. Conceptually, the acquisition function characterizes how the optimizer balances exploration and exploitation when

Scanning these abstracts with such a long line length and minimal inter-line spacing is quite challenging. Even more challenging is that authors often forget that the purpose of the abstract in arXiv is to entice you to click on the link and then read the full paper. Invariably the abstract on arXiv is the same as in the paper but to me the objectives of the pre-print abstract (read me!) and the pre-published paper itself (in case you get lost!) are different.

As breaking news on pre-print servers, at last there is significant progress from Springer Nature towards linking the published (and often somewhat different!) paper with the pre-print.