
In the Autumn issue

Let me start with IRSG business. Udo has written his last post as Chair, as a new Chair will take over after the AGM on 23 November. There have been no candidates to take on the Editorship of Informer, so next year we are planning to publish just two issues (in April and November) in what we hope will be an interim arrangement just for the coming year.

The AGM will take place at the end of the Search Solutions Conference, and in this issue you will find the programmes for both the Conference and the Tutorials on 22 November. Details are also on the IRSG web site.

The two feature articles in this issue report on some aspects of the research underway in the DoSSIER project and give an insight into enterprise search developments from a vendor perspective.

I suspect that few readers will be familiar with the work of Carlos Cuadra, who died in August. I knew him quite well and have written a short obituary and listed some of his major achievements in the development of commercial information retrieval services.

Moving on to books, you will find a review of a book published in 2006 that remains one of the best introductions to the technology and business impact of enterprise search. I thought it had vanished but came across it recently as an open access download. Also on open access is my attempt to write a history of enterprise search from 1938 (that is not a misprint!) to 2022. There are of course many books on various aspects of search from both a practitioner and research perspective, and I have listed what I hope is a representative selection of books published since 2010.

Andy Macfarlane provides his usual list of IR-related conferences around the world, many of which you can probably observe from the comfort of your home office.

And finally some thoughts from me about what I regard as a rather substantial gap in IR research, probably because (at least in my opinion) search is a poorly understood wicked problem.

Martin White

From the BCS IRSG Chair

Welcome to this autumn issue of Informer from your Chair. Hopefully you had a spectacular summer (or winter if you happen to be in Australia) with plenty of sun (or snow) and perhaps some conference trips?

A warm welcome to the AGM

Well, if your answer to the last point is ‘unfortunately not’, then maybe you should consider joining us at Search Solutions 2022 in London later this month. We are looking forward to an in-person event and the speakers (what a line-up!) are equally excited to be back in the room with people like you asking questions and discussing the latest in search technology and deployment.

But wait, wasn’t there some other key date in the diary that month … you are right, our AGM will be co-located with Search Solutions and that is your chance to make your voice heard. At the AGM we will also welcome new committee members and officers. As I write this we are still open for nominations, including for the role of Chair of the committee (has it really been two years since I took on the role?).

Looking ahead, we also have ECIR 2023 in Dublin nicely shaping up with some record numbers of submissions. Right now the various review processes are well underway, but one thing we know already: workshops have been announced and it is … a record number of accepted submissions.

So who will host ECIR 2024? You will find out at the AGM …

Let’s get back to Informer. I am very pleased that Martin has worked his magic yet again to compile a comprehensive list of contributions (I am amazed every time I check out a new issue and come across surprising stories not covered elsewhere, such as David Hawking’s new book reviewed in the summer edition). Just to pick out something from the current issue, we have two contributions for the newly established Graduate Corner. Wojciech Kusa and Georgios Peikos are both PhD students in the DoSSIER project. And as if that were not enough, you even get a report on the DoSSIER summer school that was held recently in Greece at the birthplace of Aristotle …

I hope you enjoy this issue of Informer, and hope even more that I see you at one of the forthcoming events we are involved in. Sunny greetings from Bavaria … (easy to say as it’s always sunny here) …

Udo Kruschwitz

IRSG AGM and Elections

The BCS IRSG Annual General Meeting (AGM) is scheduled to take place on Wednesday November 23rd at 6 pm. The AGM will take place immediately following the close of Search Solutions 2022, which is being held at the BCS London office at 25 Copthall Avenue, London EC2R 7BP. As with the Search Solutions conference, the AGM will be run in a hybrid format.

During the AGM, updates will be provided, including the announcement of the ECIR 2024 location and the election results for the new committee members. If you are interested in becoming a committee member please see the election page here. The deadline is officially November 3rd; however, owing to the low number of candidates, the window is being extended.

Informer in 2023

For most of its history Informer has been published four times a year. In 2023 the plan is to publish just two issues because efforts to find someone to take over the editorship have not been successful. I have agreed to continue to act as Editor but because of other commitments I can only commit to two issues. One of these commitments is specifying a new publishing platform for Informer. If you have any suggestions from your own experience please let me know. I must emphasise the word ‘experience’.

There will be a Spring issue published in late April 2023 that will provide an initial report on ECIR 2023 and a detailed report on Search Solutions 2022. There will be an Autumn issue published in late October that will set out the programme for the Search Solutions Tutorials and Conference in late November.

If a new Editor does come forward then it may be possible to publish additional issues in 2023. We will review the situation towards the end of 2023 and very much hope that it will be possible to revert to the quarterly cycle, or even to a completely different publishing model if someone is willing to take on the task.

Martin White

Search Solutions Conference 2022 – London, 23 November

Search Solutions 2022 will take place at the BCS London office at 25 Copthall Avenue, London EC2R 7BP, which is a 10-minute walk from either Liverpool Street or Moorgate Underground stations.

Session 1: The Search Experience: Focus on the users

  • 10:00 – 10:15 Introduction
  • 10:15 – 10:45 Natasha den Dekker (LexisNexis) “How to conduct empathetic user research to test the search experience of users?”
  • 10:45 – 11:15 Amy Walduck (State Library of Queensland) “The Topography of Searching: Visualising search data”

11:15 – 11:45 Break

Session 2: Beyond keyword search: Semantic/conversational/audio search

  • 11:45 – 12:15 Brammert Ottens (Spotify) “Finding the Right Audio Content for You”
  • 12:15 – 12:45 Mohamed Yahya (Bloomberg) “Taking Question Answering from Research Prototype to Product”
  • 12:45 – 13:15 Filip Radlinski (Google) “Challenges with Really Understanding Natural Language in Conversational Recommendation”

13:15 – 14:15 Lunch

Session 3: Search with an impact: Searching health-related information

  • 14:15 – 14:45 Farhad Shokraneh (Institute of Health Informatics, University College London) “The Futures of Systematic Searching”
  • 14:45 – 15:15 Gavin Moore & Andrew Doyle (University Hospitals Coventry & Warwickshire NHS Trust) “A Programmable Search – A Solution to Finding Guidelines and Patient Information?”

15:15 – 15:30 Break

Session 4: A world beyond web search: enterprise search

  • 15:30 – 16:00 Julien Massiera & Cedric Ulmer (France Labs) “Combining Spacy with Datafari Community Edition to enable semantic Enterprise Search”
  • 16:00 – 16:30 Phil Lewis (Pureinsights) “Practical Applications of Knowledge Graphs and AI in Search”
  • 16:30 – 17:00 Lightning Talks (feel free to step up and present YOUR five-minute talk)

17:00 – 17:30 Our traditional ‘fishbowl’ session

17:30 – 17:45 BCS Search Industry Awards

17:45 Drinks reception / BCS IRSG AGM starts at 18:00

Registration

BCS Members: £92.00
Non-Members: £110.00
Students: £80.00

https://www.eventbrite.co.uk/e/search-solutions-2022-inc-tutorials-information-retrieval-sg-tickets-433482236037

Search Solutions Tutorials 2022 – London 22 November

The Search Solutions Tutorials will take place at the BCS London office at 25 Copthall Avenue, London, EC2R 7BP. This is around a 10-minute walk from either Liverpool Street or Moorgate Underground stations.

Tutorial 1 – Full day
IR From Bag-of-words to BERT and Beyond through Practical Experiments

  • Sean MacAvaney (University of Glasgow)
  • Craig Macdonald (University of Glasgow)
  • Nicola Tonellotto (University of Pisa)

Tutorial 2 – Morning
Approaching Neural Search with Apache Solr and Open-source technologies

  • Alessandro Benedetti (CEO @ Sease Ltd, Apache Lucene/Solr Committer, Apache Solr PMC Member)

Tutorial 3 – Afternoon
Simplifying NLP researchers’ work with Datafari Open Source

  • Julien Massiera (France Labs)
  • Cedric Ulmer (France Labs)

Tutorial 4 – Full Day
Diverse Approaches to Systematic Searching

  • Dr Farhad Shokraneh (Institute of Health Informatics, University College London)

Ticket costs

(Prices quoted include VAT and booking fees.)

Tuesday 22 November 2022 – Tutorials

BCS Members: £80.00
Non-Members: £95.00
Students: £65.00

https://www.eventbrite.co.uk/e/search-solutions-2022-inc-tutorials-information-retrieval-sg-tickets-433482236037

Refunds/Cancellations

We will issue a refund, excluding fees, if a cancellation is received within 14 days of the booking date or by noon on Monday 21 November 2022. After this date, name substitutions will be allowed instead.

BCS is a membership organisation. If you enjoy this event, please consider joining BCS. You’ll be very welcome. You’ll receive access to many exclusive career development tools, an introduction to a thriving professional community and also help us Make IT Good For Society. Join BCS today.

For overseas delegates who wish to attend the event, please note that BCS does not issue invitation letters.

COVID-19

BCS is following government guidelines and we would ask attendees to continue to also follow these guidelines. Please go to https://www.nhs.uk/conditions/coronavirus-covid-19/ for more information, advice, and instructions.

Perspectives on the EU Horizon DoSSIER Project

DoSSIER is an EU Horizon 2020 ITN/ETN on Domain Specific Systems for Information Extraction and Retrieval.

This feature item brings together three excellent contributions from members of the project team:
  • A Summary of the First DoSSIER Training School, which took place in September 2022 and is jointly authored by Florina Piroi, Mike Salampasis, and Allan Hanbury
  • A sub-project on the exploration of ‘relevance’ by Georgios Peikos, an early-stage researcher in the project team
  • The application of machine learning in the healthcare and biomedical domain by Wojciech Kusa


The evolution of Datafari, a European open source enterprise search application – Cedric Ulmer CEO

I was recently asked by a customer to deliver an analysis of available open source solutions as a replacement for a decade-old proprietary enterprise search solution. In this article, I want to share the outcome of this analysis and to give an outlook on how we perceive the future of such solutions. As a disclaimer, be aware that I am the CEO and cofounder of France Labs, the company developing Datafari. This is important since Datafari was one of the solutions analyzed, which obviously introduces a potential bias into my considerations.

As an introduction, let me define what we mean by Enterprise Search in this particular context: it is about proposing a solution that can index many document sources within an information system, without prior knowledge about the working context (and yes, this is bad for UI optimization and relevancy), that allows employees to type textual queries into a search bar, and that displays the search results as a vertical list together with facets. In addition, this must be done securely, which means it should come with connectivity to AD/LDAP for authentication, and with the capacity to respect document-level permissions when users are searching (i.e. only display what users are allowed to see). Orthogonally to these functional aspects, Enterprise Search solutions must also provide administration capabilities, in terms of configuration and operation.


Carlos Cuadra 1925-2022 A pioneer in commercial IR service development

Carlos Cuadra died in August this year. I doubt many readers of Informer will recognize the name but Carlos was a remarkable innovator in information retrieval in the 1960s and 1970s. Almost all his development work was carried out in a commercial environment, notably at System Development Corporation when it was spun out of RAND in 1957 and then at Cuadra Associates from 1978. His early work at SDC was on the development of question-answering applications for the Los Angeles Police Force. In 1979 Carlos released STAR, the first multi-tasking/multi-user information retrieval application that could be run on a PC. It was (and indeed is!) widely used as an archive management solution.

A measure of the contribution that Carlos made to the development of on-line information retrieval applications is that in the definitive A History of Online Information Services 1963-1976 there are more items in the index relating to Carlos than any other individual.


The Book of Search – Book Review

Spoiler alert – this is a review of a book published in 2006!

First a short history lesson.  FAST Search and Transfer was founded in 1997 to commercialize search technology developed at the University of Trondheim. The subsequent history of FAST is difficult to untangle as the company seemed adept at delivering some very neat search technology and raising question-marks about its accounting practices. The end-game was the acquisition of the company by Microsoft in 2008 for $1.2 billion, followed by an inquest about the extent of the due diligence that was mirrored a few years later with HP’s acquisition of Autonomy. The core of the company lives on as the Microsoft Technology Centre in Oslo and a global group of alumni who worked for the company and left to either work for other companies or start their own business.

FAST was also very good at marketing. One of the outcomes was The Book of Search published in 2006. The quality of the content and the presentation of this 142pp book are both exceptional. To quote from the Introduction


A History of Enterprise Search 1938-2022

If you will excuse some self-promotion from the Editor, last month my book A History of Enterprise Search 1938-2022 was published under the PressBooks imprint of the University of Sheffield.

If you have cause to doubt that there were any enterprise search initiatives in 1938 then you should certainly read this book. Enterprise search has been around for a very long time.

https://sheffield.pressbooks.pub/eshistory1/ will take you to the PressBooks site from where you can download both a pdf and e-Pub file under a Creative Commons License.

For more about PressBooks (it is based on WordPress) and my experience of using it, read the story on the Library web site.

Martin White

Books for search researchers and search managers

For some time now I have maintained a list of books on the Intranet Focus web site that I think would be of interest to search researchers and search managers. I am in the process of rebuilding the web site and so many of the internal links to reviews of the books will no longer exist. I have now revised the list so that the links are to the web sites of the publishers and so should stand the test of time.

If there are any books that you feel should be included, please email me with your suggestions.

The list is in reverse chronological order of publication, going back to 2010. There are of course many earlier books, notably Introduction to Information Retrieval by Manning, Raghavan and Schütze, but I had to stop somewhere. The earliest book on my bookshelf is An Introduction to Modern Information Retrieval by none other than Gerard Salton and Michael McGill, published in 1986.


And finally….!

As I undertake my daily scanning of the research papers on information retrieval, I continue to be impressed with the level of innovation that I see. Clearly significant advances are being made in many areas. My concern is how these advances are going to feed into search products that will deliver a commensurate advance in the findability experienced by employees in their daily work. To use the metaphor of building a house, there seem to be endless suggestions for the design of kitchens, bedrooms, patios and bathrooms, but no one is considering whether there is any combination that makes a house a home. Nor is anyone taking into account that people have different visions of what they regard as an acceptable compromise, given their long-term plans and the current (and continuing) high levels of interest.


In the Summer issue

NOTICE: The BCS IRSG is seeking new committee members.  If you are interested to find out more please see details on the nomination and election process.  November 3rd 2022 is the deadline for nominations.

Let me start by highlighting the forthcoming vacancy for an Editor for Informer, to take over at the end of 2022. I might also mention that there will be a number of vacancies on the IRSG Committee for 2023, including the position of Chair, as Udo Kruschwitz has served his two-year term.

This issue contains two superb reports on the LREC 2022 and SIGIR 2022 conferences contributed by Dennis Aumiller. I’ve also linked to reflections on ECIR 2022 (Stavanger) by Krisztian Balog and his colleagues that were published in the SIGIR Forum newsletter in June. As well as highlighting some of the papers and themes of the conferences, all three reports offer many insights into the challenges of managing hybrid conferences, which should be of interest to anyone involved in a conference, no matter what its size and topic. In addition there is a reminder of the Search Solutions 2022 conference on 23/24 November and a list of forthcoming events diligently compiled by Andy Macfarlane. Of course among these events is ECIR 2023 in Dublin in early April.

There are a number of book reviews in this issue covering taxonomy management, the role of digital technologies (including enterprise search) in supporting knowledge management initiatives, the science of reading and a fascinating autobiography from David Hawking.

Two awards are now open for nominations: the Karen Spärck Jones Award from IRSG (sponsored by Microsoft Research) and the UKeiG Tony Kent Strix Award, in which IRSG is a partner. The deadlines for both are by coincidence 9 September.

Also in this issue there are calls for papers for forthcoming special issues in 2023 of JASIS&T on the subject of information retrieval research and of the ACM TOIS journal on efficiencies in neural information retrieval. OpenSource Connections has released a collection of video presentations on a very wide range of relevance topics, and a very challenging paper from Professor Justin Zobel on the questionable value of batch-mode IR testing has just been published in the July issue of SIGIR Forum and deserves the widest possible readership.

And finally I consider the role and future of Informer.

The copy date for the Autumn issue is 31 October and for the Winter issue (my last as Editor) it is 8 January 2023.

Martin White

Vacancy for an Informer Editor from January 2023

I took over the role of Editor of Informer in 2019 and have enjoyed the challenge of publishing a quarterly newsletter that in each issue has something of interest to the very varied IRSG membership. Earlier this year I decided that it was time to hand over the Editorial Desktop as I wanted to gently ease out of my consultancy work on enterprise search and in general anything that had a deadline attached to it. My last issue will be the Winter 2023 issue, which is compiled in December 2022 and published in January 2023. As a result there is a vacancy for an Editor, ideally someone whom I could gradually involve in the next three issues. If you would like to volunteer then email Udo Kruschwitz, IRSG Chair. If you would like to find out more about what the Editorship involves then please contact me. Informer uses WordPress as the publishing platform but with a template that is of unknown heritage. As incoming Editor you would have the opportunity to work with the Committee on what the upgrade should be. This will need to emerge from a careful consideration of the role of Informer, especially now that we have upgraded the IRSG web site.

I offer my personal view on the role and future of Informer in my And finally column in this issue.

Martin White

BCS/IRSG Search Industry Awards 2022

The BCS Search Industry Awards recognise people, projects, and organisations that have excelled in the design of search and information retrieval products and services. If you know of any people, projects, or products that deserve recognition, let us know by submitting a nomination. Alternatively, if you’re involved with something special yourself, you can submit an application today.

Categories

This year we are offering five awards:


The Microsoft BCS/BCS IRSG Karen Spärck Jones Award 2022 – Call for Nominations deadline is 9 September

A pioneer of information retrieval, the computer science sub-discipline that also underpins the technology of modern Web search engines, Karen Spärck Jones was a British professor of Computers and Information at the University of Cambridge. Her contributions to the fields of Natural Language Processing (NLP) and Information Retrieval (IR), especially with regard to experimentation, have been outstanding, highly influential and lasting, and include the introduction of Inverse Document Frequency for relevance ranking.

In order to honour Karen’s achievements, the BCS Information Retrieval Specialist Group (BCS IRSG) in conjunction with the BCS has established an annual award to encourage and promote talented researchers who have endeavoured to advance our understanding of Natural Language Processing or Information Retrieval with significant experimental contributions. The Karen Spärck Jones Award is sponsored by Microsoft Research Cambridge.

The recipient of the 2022 award will be invited to present a keynote lecture at the European Conference on Information Retrieval (ECIR) in Dublin in April 2023.  This forum provides an excellent venue to present and announce the award as the conference attracts many new and young researchers.

For more details on the criteria for nominations and the nomination schedule go to http://jochenleidner.com/?p=67. Jochen Leidner is the Award Chair and can be contacted at Leidner AT acm.org

JASIST special issue on information retrieval research

I was personally delighted to see the announcement from ASIS&T about a Special Issue in September 2023 on the topic of research into information retrieval.

To quote from the announcement

“We are looking for contributions that broaden the respective disciplinary, methodological, or empirical perspectives to identify and explore commercial search engines and their use and role in society from new angles, or that bring together different approaches in original ways. In particular, we would like to encourage information science/information studies, broadly understood, to reposition themselves and contribute the discipline’s expertise to shed light on the ever more powerful role of commercial search engines in almost all areas of society and everyday life, influencing not only how we know and what we know, but increasingly also how knowledge and information are created and communicated, to begin with.


Relevance management – tips, tricks, techniques and tools

When OpenSource Connections started the Haystack conference in 2018, our intention was to bring the search and relevance community together to share tips, tricks, techniques and tools. Although the talks that year weren’t recorded we swiftly realised that we could share them much more widely on video and we made sure to record all the subsequent Haystack events.  Once the pandemic began our events moved online and with the advent of Zoom, recording talks became even easier. Fast forward a few years and our search events are now hybrid events with both in-person attendance and remote audiences watching live from across the globe – video has become an essential, not a nice-to-have. Many other events that our team speak at are also providing recordings of their talks.

We now have nearly 100 videos of talks linked from our YouTube channel at https://www.youtube.com/c/OpenSourceConnections , covering many different search engine platforms, sectors including enterprise search, media search and e-commerce search, techniques from TF/IDF to vector search, and practices including building search teams and search measurement. We’re working on curating these into subject-based playlists to make them even more accessible. Speakers include industry luminaries such as Peter Morville and Ellen Voorhees, startup founders from new search engine companies like Weaviate and Tantivy, search experts from companies including LexisNexis, HomeDepot, Otto Group and EBSCO, and of course OpenSource Connections. This collection is building into a fantastic knowledge resource for anyone working in the search and relevance space and we’re very proud to host it as part of our mission to Empower Search Teams.

Charlie Hull UK Director OSC

Taxonomies – Practical Approaches to Developing and Managing Vocabularies for Digital Information – Book review

There is probably no more difficult task in information management than being the Editor of a multi-author book on any topic, and the level of difficulty goes up by an order of magnitude when the topic is taxonomy management. Taxonomies (and the strap line Practical Approaches to Developing and Managing Vocabularies for Digital Information) has been edited by Helen Lippell, who brings not only many years of experience to the role but also a very strong commitment to ‘getting the message across’. In this she succeeds brilliantly.

Helen has brought together eighteen experienced taxonomy managers with the objective of offering a range of insights and perspectives on taxonomy management. The scope is so broad, and yet so deep in specific topics, as to demand a careful balance of the viewpoints of authors coming from an equally wide range of backgrounds and project experience. This book deliberately focuses on presenting case studies which can be of value in specific situations but also add to a more generic knowledge base about good practice on taxonomy development and implementation. These include Associated Press, Cancer Research, UK Department of Education, Electronic Arts, Getty Images, the Institute of Chartered Accountants for England and Wales, and the National Health Service.


The Limits of Batch Assessment of Retrieval Systems – Justin Zobel

I had the very good fortune to get to know Cyril Cleverdon towards the end of his distinguished career as Librarian at the Cranfield Institute of Technology, where he did invaluable work in creating and promoting the Cranfield Projects on information retrieval performance. These Projects formed the basis for the TREC events in the USA. When we first met in the early 1970s, computers were still somewhat on the distant horizon (especially in the UK), but his insights into the fundamental aspects of information retrieval performance most certainly catalysed my move towards information science and away from chemistry and metallurgy.

I was therefore especially interested to read a paper by Professor Justin Zobel (University of Melbourne) in the Summer issue of SIGIR Forum entitled When Measurement Misleads: The Limits of Batch Assessment of Retrieval Systems.


The Science of Reading – Book review

You may be somewhat surprised to see a review of a book on the science of reading in Informer. It seems to be implicit in information retrieval research that ‘users’ have such a competence in reading that a consideration of reading ability can be discounted from the research analysis. If only that was the case! The reality is that perhaps one in ten employees is on the dyslexia spectrum and in global businesses many employees are writing text and submitting search queries in a second language. I am especially interested in the effect of perceptual speed on the evaluation of search results. Even something as basic as line length needs careful consideration of how people read digital content.

It may therefore come as a surprise that the scope and depth of research into reading can occupy a book of almost 600 pages with contributions from 52 authors. I have found my journey through the 2nd Edition of The Science of Reading to be one of many significant discoveries, shining light onto issues that I had not even considered before reading the book.


Call for nominations for the UKeiG Tony Kent Strix Award 2022 – Deadline 9 September

The Tony Kent Strix Award was inaugurated in 1998 by the Institute of Information Scientists. It is now presented by UKeiG in partnership with the International Society for Knowledge Organisation UK (ISKO UK), the Royal Society of Chemistry Chemical Information and Computer Applications Group (RSC CICAG) and the British Computer Society Information Retrieval Specialist Group (BCS IRSG). The Award is given in recognition of an outstanding practical innovation or achievement in the field of information retrieval in its widest sense. This could take the form of an application or service, or an overall appreciation of past achievements that have led to significant advances. The award is open to individuals or groups from anywhere in the world. There are profiles of the award winners since 2009 on the CILIP web site.
