Autumn 2022

In the Autumn issue

Let me start with IRSG business. Udo has written his last post as Chair, as a new Chair will take over after the AGM on 23 November. There have been no candidates to take on the Editorship of Informer, so next year we are planning to publish just two issues (in April and November) in what we hope will be an interim situation just for the coming year.

The AGM will take place at the end of the Search Solutions Conference, and in this issue you will find the programmes for both the Conference and the Tutorials on 22 November. Details are also on the IRSG web site.

The two feature articles in this issue report on some aspects of the research underway in the DoSSIER project and give an insight into enterprise search developments from a vendor perspective.

I suspect that few readers will be familiar with the work of Carlos Cuadra, who died in August. I knew him quite well and have written a short obituary and listed some of his major achievements in the development of commercial information retrieval services.

Moving on to books, you will find a review of a book published in 2006 that remains one of the best introductions to the technology and business impact of enterprise search. I thought it had vanished but came across it recently as an open access download. Also on open access is my attempt to write a history of enterprise search from 1938 (that is not a misprint!) to 2022. There are of course many books on various aspects of search from both a practitioner and research perspective, and I have listed what I hope is a representative selection of books published since 2010.

Andy Macfarlane provides his usual list of IR-related conferences around the world, many of which you can probably observe from the comfort of your home office.

And finally some thoughts from me about what I regard as a rather substantial gap in IR research, probably because (at least in my opinion) search is a poorly understood wicked problem.

Martin White

From the BCS IRSG Chair

Welcome to this autumn issue of Informer from your Chair. Hopefully you had a spectacular summer (or winter if you happen to be in Australia) with plenty of sun (or snow) and perhaps some conference trips?

A warm welcome to the AGM

Well, if your answer to the last point is ‘unfortunately not’, then maybe you should consider joining us at Search Solutions 2022 in London later this month. We are looking forward to an in-person event and the speakers (what a line-up!) are equally excited to be back in the room with people like you asking questions and discussing the latest in search technology and deployment.

But wait, wasn’t there some other key date in the diary that month … you are right, our AGM will be co-located with Search Solutions and that is your chance to make your voice known. At the AGM we will also welcome new committee members and officers. As I write this we are still open for nominations, including for the role of Chair of the committee (has it really been two years since I took on the role?).

Looking ahead, we also have ECIR 2023 in Dublin nicely shaping up with some record numbers of submissions. Right now the various review processes are well underway, but one thing we know already: workshops have been announced and it is … a record number of accepted submissions.

So who will host ECIR 2024? You will find out at the AGM …

Let’s get back to Informer. I am very pleased that Martin has worked his magic yet again to compile a comprehensive list of contributions (I am amazed every time I check out a new issue and come across surprising stories not covered elsewhere, such as David Hawking’s new book reviewed in the summer edition). Just to pick out something from the current issue, we have two contributions for the newly established Graduate Corner. Wojciech Kusa and Georgios Peikos are both PhD students in the DoSSIER project. And as if that was not all, you even get a report on the DoSSIER summer school that was held recently in Greece at the birthplace of Aristotle …

I hope you enjoy this issue of Informer, and hope even more that I see you at one of the forthcoming events we are involved in. Sunny greetings from Bavaria … (easy to say as it’s always sunny here) …

Udo Kruschwitz

IRSG AGM and Elections

The BCS IRSG Annual General Meeting (AGM) is scheduled to take place on Wednesday November 23rd at 6PM. The AGM will take place immediately following the close of Search Solutions 2022, which is being held at the BCS London office at 25 Copthall Avenue, London EC2R 7BP. As with the Search Solutions conference, the AGM will be run in a hybrid format.

During the AGM, updates will be provided, including the announcement of the ECIR 2024 location and the election results for the new committee members. If you are interested in becoming a committee member please see the election page here. The deadline is officially November 3rd; however, owing to a low response from candidates, the window is being extended.

 

 

Informer in 2023

For most of its history Informer has been published four times a year. In 2023 the plan is to publish just two issues because efforts to find someone to take over the editorship have not been successful. I have agreed to continue to act as Editor but because of other commitments I can only commit to two issues. One of these commitments is specifying a new publishing platform for Informer – if you have any suggestions from your own experience please let me know. I must emphasise the word ‘experience’.

There will be a Spring issue published in late April 2023 that will provide an initial report on ECIR 2023 and a detailed report on Search Solutions 2022. There will be an Autumn issue published in late October that will set out the programme for the Search Solutions Tutorials and Conference in late November.

If a new Editor does come forward then it may be possible to publish additional issues in 2023. We will review the situation towards the end of 2023 and very much hope that it will be possible to revert to the quarterly cycle, or even to a completely different publishing model if someone is willing to take on the task.

Martin White

Search Solutions Conference 2022 – London, 23 November

Search Solutions 2022 will take place at the BCS London office at 25 Copthall Avenue, London EC2R 7BP, which is a 10-minute walk from either Liverpool Street or Moorgate Underground stations.

Session 1: The Search Experience: Focus on the users

  • 10:00 – 10:15 Introduction
  • 10:15 – 10:45 Natasha den Dekker (LexisNexis) “How to conduct empathetic user research to test the search experience of users?”
  • 10:45 – 11:15 Amy Walduck (State Library of Queensland) “The Topography of Searching: Visualising search data”

11:15 – 11:45 Break

Session 2: Beyond keyword search: Semantic/conversational/audio search

  • 11:45 – 12:15 Brammert Ottens (Spotify) “Finding the Right Audio Content for You”
  • 12:15 – 12:45 Mohamed Yahya (Bloomberg) “Taking Question Answering from Research Prototype to Product”
  • 12:45 – 13:15 Filip Radlinski (Google) “Challenges with Really Understanding Natural Language in Conversational Recommendation”

13:15 – 14:15 Lunch

Session 3: Search with an impact: Searching health-related information

  • 14:15 – 14:45 Farhad Shokraneh (Institute of Health Informatics, University College London) “The Futures of Systematic Searching”
  • 14:45 – 15:15 Gavin Moore & Andrew Doyle (University Hospitals Coventry & Warwickshire NHS Trust) “A Programmable Search – A Solution to Finding Guidelines and Patient Information?”

15:15 – 15:30 Break

Session 4: A world beyond web search: enterprise search

  • 15:30 – 16:00 Julien Massiera & Cedric Ulmer (France Labs) “Combining Spacy with Datafari Community Edition to enable semantic Enterprise Search”
  • 16:00 – 16:30 Phil Lewis (Pureinsights) “Practical Applications of Knowledge Graphs and AI in Search”
  • 16:30 – 17:00 Lightning Talks (feel free to step up and present YOUR five-minute talk)

17:00 – 17:30 Our traditional ‘fishbowl’ session

17:30 – 17:45 BCS Search Industry Awards

17:45 Drinks reception / BCS-IRSG AGM starts at 18:00

Registration

BCS Members: £92.00
Non-Members: £110.00
Students: £80.00

https://www.eventbrite.co.uk/e/search-solutions-2022-inc-tutorials-information-retrieval-sg-tickets-433482236037

 

Search Solutions Tutorials 2022 – London, 22 November

The Search Solutions Tutorials will take place at the BCS London office at 25 Copthall Avenue, London EC2R 7BP. This is around a 10-minute walk from either Liverpool Street or Moorgate Underground stations.

Tutorial 1 – Full day
IR From Bag-of-words to BERT and Beyond through Practical Experiments

  • Sean MacAvaney (University of Glasgow)
  • Craig Macdonald (University of Glasgow)
  • Nicola Tonellotto (University of Pisa)

Tutorial 2 – Morning
Approaching Neural Search with Apache Solr and Open-source technologies

  • Alessandro Benedetti (CEO @ Sease Ltd, Apache Lucene/Solr Committer, Apache Solr PMC Member)

Tutorial 3 – Afternoon
Simplifying NLP researchers work with Datafari Open Source

  • Julien Massiera (France Labs)
  • Cedric Ulmer (France Labs)

Tutorial 4 – Full Day
Diverse Approaches to Systematic Searching

  • Dr Farhad Shokraneh (Institute of Health Informatics, University College London)

Ticket costs

(Prices quoted include VAT and booking fees.)

Tuesday 22 November 2022 – Tutorials

BCS Members: £80.00
Non-Members: £95.00
Students: £65.00

https://www.eventbrite.co.uk/e/search-solutions-2022-inc-tutorials-information-retrieval-sg-tickets-433482236037

Refunds/Cancellations

We will issue a refund, excluding fees, if a cancellation is received within 14 days of the booking date or by noon on Monday 21 November 2022; otherwise, name substitutions will be allowed after this date.

BCS is a membership organisation. If you enjoy this event, please consider joining BCS. You’ll be very welcome. You’ll receive access to many exclusive career development tools, an introduction to a thriving professional community and also help us Make IT Good For Society. Join BCS today.

For overseas delegates who wish to attend the event, please note that BCS does not issue invitation letters.

COVID-19

BCS is following government guidelines and we would ask attendees to continue to follow these guidelines as well. Please go to https://www.nhs.uk/conditions/coronavirus-covid-19/ for more information, advice, and instructions.

 

Perspectives on the EU Horizon DoSSIER Project

In this feature item there are three excellent contributions from members of the DoSSIER project. DoSSIER is an acronym for an EU Horizon 2020 ITN/ETN on Domain Specific Systems for Information Extraction and Retrieval.

The three contributions are:
  • A Summary of the First DoSSIER Training School, which took place in September 2022 and is jointly authored by Florina Piroi, Mike Salampasis, and Allan Hanbury
  • A sub-project on the exploration of ‘relevance’ by Georgios Peikos, an early-stage researcher in the project team
  • The application of machine learning in the healthcare and biomedical domain by Wojciech Kusa

Read more…

The evolution of Datafari, a European open source enterprise search application – Cedric Ulmer CEO

I have recently been asked by a customer to deliver an analysis of available open source solutions as a replacement for a decade-old proprietary enterprise search solution. In this article, I want to share the outcome of this analysis and give an outlook on how we perceive the future of such solutions. As a disclaimer, be aware that I am the CEO and cofounder of France Labs, the company developing Datafari. This is important since Datafari was one of the solutions analyzed, which obviously introduces a potential bias into my considerations.

As an introduction, let me define what we mean by Enterprise Search in this particular context: it is about proposing a solution that can index many document sources within an information system, without prior knowledge about the working context (and yes, this is bad for UI optimization and relevancy), that allows employees to type textual queries into a search bar, and that displays the search results as a vertical list together with facets. In addition, this must be done securely, which means it should come with connectivity to AD/LDAP for authentication, and with the capacity to respect document-level permissions when users are searching (i.e. only display what users are allowed to see). Orthogonally to these functional aspects, Enterprise Search solutions must also provide administration capabilities, in terms of configuration and operation.

Read more…

Carlos Cuadra 1925-2022 A pioneer in commercial IR service development

Carlos Cuadra died in August this year. I doubt many readers of Informer will recognize the name but Carlos was a remarkable innovator in information retrieval in the 1960s and 1970s. Almost all his development work was carried out in a commercial environment, notably at System Development Corporation when it was spun out of RAND in 1957 and then at Cuadra Associates from 1978. His early work at SDC was on the development of question-answering applications for the Los Angeles Police Force. In 1979 Carlos released STAR, the first multi-tasking/multi-user information retrieval application that could be run on a PC. It was (and indeed is!) widely used as an archive management solution.

A measure of the contribution that Carlos made to the development of on-line information retrieval applications is that in the definitive A History of Online Information Services 1963-1976 there are more items in the index relating to Carlos than to any other individual.

Read more…

The Book of Search – Book Review

Spoiler alert – this is a review of a book published in 2006!

First a short history lesson. FAST Search and Transfer was founded in 1997 to commercialize search technology developed at the University of Trondheim. The subsequent history of FAST is difficult to untangle, as the company seemed equally adept at delivering some very neat search technology and at raising question marks about its accounting practices. The end-game was the acquisition of the company by Microsoft in 2008 for $1.2 billion, followed by an inquest about the extent of the due diligence that was mirrored a few years later with HP’s acquisition of Autonomy. The core of the company lives on as the Microsoft Technology Centre in Oslo and as a global group of alumni who worked for the company and left either to work for other companies or to start their own business.

FAST was also very good at marketing. One of the outcomes was The Book of Search published in 2006. The quality of the content and the presentation of this 142pp book are both exceptional. To quote from the Introduction

Read more…

A History of Enterprise Search 1938-2022

If you will excuse some self-promotion from the Editor, last month my book A History of Enterprise Search 1938-2022 was published under the PressBooks imprint of the University of Sheffield.

If you have cause to doubt that there were any enterprise search initiatives in 1938 then you should certainly read this book. Enterprise search has been around for a very long time.

https://sheffield.pressbooks.pub/eshistory1/ will take you to the PressBooks site from where you can download both a pdf and e-Pub file under a Creative Commons License.

For more about PressBooks (it is based on WordPress) and my experience of using it, read the story on the Library web site.

Martin White

Books for search researchers and search managers

For some time now I have maintained a list of books on the Intranet Focus web site that I think would be of interest to search researchers and search managers. I am in the process of rebuilding the web site and so many of the internal links to reviews of the books will no longer exist. I have now revised the list so that the links are to the web sites of the publishers and so should stand the test of time.

If there are any books that you feel should be included, please email me with your suggestions.

The list is in reverse chronological order of publication, going back to 2010. There are of course many earlier books, notably Introduction to Information Retrieval by Manning, Raghavan and Schütze, but I had to stop somewhere. The earliest book on my bookshelf is An Introduction to Modern Information Retrieval by none other than Gerard Salton and Michael McGill, published in 1986.

Read more…

And finally….!

As I undertake my daily scanning of the research papers on information retrieval, I continue to be impressed with the level of innovation that I see. Clearly significant advances are being made in many areas. My concern is how these advances are going to feed into search products that are going to make a commensurate advance in the findability experienced by employees in their daily work. To use the metaphor of building a house, there seem to be endless suggestions for the design of kitchens, bedrooms, patios and bathrooms, but no one is considering whether there is any combination that makes a house a home and takes into account that people have different visions for what they regard as an acceptable compromise, given their long-term plans and the current (and continuing) high levels of interest.

Read more…