Informer in 2023

For most of its history Informer has been published four times a year. In 2023 the plan is to publish just two issues because efforts to find someone to take over the editorship have not been successful. I have agreed to continue to act as Editor but because of other commitments I can only commit to two issues. One of these commitments is specifying a new publishing platform for Informer – if you have any suggestions from your own experience please let me know. I must emphasise the word ‘experience’.

There will be a Spring issue published in late April 2023 that will provide an initial report on ECIR 2023 and a detailed report on Search Solutions 2022. There will be an Autumn issue published in late October that will set out the programme for the Search Solutions Tutorials and Conference in late November.

If a new Editor does come forward then it may be possible to publish additional issues in 2023. We will review the situation towards the end of 2023 and very much hope that it will be possible to revert to the quarterly cycle, or even to a completely different publishing model if someone is willing to take on the task.

Martin White

Search Solutions Conference 2022 – London, 23 November

Search Solutions 2022 will take place at the BCS London office at 25 Copthall Avenue, London, EC2R 7BP, which is a 10-minute walk from either Liverpool Street or Moorgate Underground stations.

Session 1: The Search Experience: Focus on the users

  • 10:00 – 10:15 Introduction
  • 10:15 – 10:45 Natasha den Dekker (LexisNexis) “How to conduct empathetic user research to test the search experience of users?”
  • 10:45 – 11:15 Amy Walduck (State Library of Queensland) “The Topography of Searching: Visualising search data”

11:15 – 11:45 Break

Session 2: Beyond keyword search: Semantic/conversational/audio search

  • 11:45 – 12:15 Brammert Ottens (Spotify) “Finding the Right Audio Content for You”
  • 12:15 – 12:45 Mohamed Yahya (Bloomberg) “Taking Question Answering from Research Prototype to Product”
  • 12:45 – 13:15 Filip Radlinski (Google) “Challenges with Really Understanding Natural Language in Conversational Recommendation”

13:15 – 14:15 Lunch

Session 3: Search with an impact: Searching health-related information

  • 14:15 – 14:45 Farhad Shokraneh (Institute of Health Informatics, University College London) “The Futures of Systematic Searching”
  • 14:45 – 15:15 Gavin Moore & Andrew Doyle (University Hospitals Coventry & Warwickshire NHS Trust) “A Programmable Search – A Solution to Finding Guidelines and Patient Information?”

15:15 – 15:30 Break

Session 4: A world beyond web search: enterprise search

  • 15:30 – 16:00 Julien Massiera & Cedric Ulmer (France Labs) “Combining Spacy with Datafari Community Edition to enable semantic Enterprise Search”
  • 16:00 – 16:30 Phil Lewis (Pureinsights) “Practical Applications of Knowledge Graphs and AI in Search”
  • 16:30 – 17:00 Lightning Talks (feel free to step up and present YOUR five-minute talk)

17:00 – 17:30 Our traditional ‘fishbowl’ session

17:30 – 17:45 BCS Search Industry Awards

17:45  Drinks reception/ BCS-IRSG AGM starts at 18:00

Registration

BCS Members: £92.00
Non-Members: £110.00
Students: £80.00

https://www.eventbrite.co.uk/e/search-solutions-2022-inc-tutorials-information-retrieval-sg-tickets-433482236037

 

Search Solutions Tutorials 2022 – London 22 November

The Search Solutions Tutorials will take place at the BCS London office at 25 Copthall Avenue, London, EC2R 7BP. This is around a 10-minute walk from either Liverpool Street or Moorgate Underground stations.

Tutorial 1 – Full day
IR From Bag-of-words to BERT and Beyond through Practical Experiments

  • Sean MacAvaney (University of Glasgow)
  • Craig Macdonald (University of Glasgow)
  • Nicola Tonellotto (University of Pisa)

Tutorial 2 – Morning
Approaching Neural Search with Apache Solr and Open-source technologies

  • Alessandro Benedetti (CEO @ Sease Ltd, Apache Lucene/Solr Committer, Apache Solr PMC Member)

Tutorial 3 – Afternoon
Simplifying NLP researchers work with Datafari Open Source

  • Julien Massiera (France Labs)
  • Cedric Ulmer (France Labs)

Tutorial 4 – Full Day
Diverse Approaches to Systematic Searching

  • Dr Farhad Shokraneh (Institute of Health Informatics, University College London)

Ticket costs

(Prices quoted include VAT and booking fees.)

Tuesday 22 November 2022 – Tutorials

BCS Members: £80.00
Non-Members: £95.00
Students: £65.00

https://www.eventbrite.co.uk/e/search-solutions-2022-inc-tutorials-information-retrieval-sg-tickets-433482236037

Refunds/Cancellations

We will issue a refund, excluding fees, if a cancellation is received within 14 days of the booking date or by noon on Monday 21 November 2022. After this date no refunds will be issued, but name substitutions will be allowed.

BCS is a membership organisation. If you enjoy this event, please consider joining BCS. You’ll be very welcome. You’ll receive access to many exclusive career development tools, an introduction to a thriving professional community and also help us Make IT Good For Society. Join BCS today.

For overseas delegates who wish to attend the event, please note that BCS does not issue invitation letters.

COVID-19

BCS is following government guidelines and we would ask attendees to continue to also follow these guidelines. Please go to https://www.nhs.uk/conditions/coronavirus-covid-19/ for more information, advice, and instructions.

 

Perspectives on the EU Horizon DoSSIER Project

This feature item brings together three excellent contributions from members of the DoSSIER project. DoSSIER is an acronym for an EU Horizon 2020 ITN/ETN on Domain Specific Systems for Information Extraction and Retrieval.

The three contributions are:
  • A Summary of the First DoSSIER Training School, which took place in September 2022 and is jointly authored by Florina Piroi, Mike Salampasis, and Allan Hanbury
  • A sub-project on the exploration of ‘relevance’ by Georgios Peikos, an early-stage researcher in the project team
  • The application of machine learning in the healthcare and biomedical domain by Wojciech Kusa

Continue reading “Perspectives on the EU Horizon DoSSIER Project”

The evolution of Datafari, a European open source enterprise search application – Cedric Ulmer CEO

I was recently asked by a customer to deliver an analysis of available open source solutions as a replacement for a decade-old proprietary enterprise search solution. In this article, I want to share the outcome of this analysis, and to give an outlook on how we perceive the future of such solutions. As a disclaimer, be aware that I am the CEO and cofounder of France Labs, the company developing Datafari. This is important since Datafari was one of the solutions analyzed, which obviously introduces a potential bias into my considerations.

As an introduction, let me define what we mean by Enterprise Search in this particular context: it is about proposing a solution that can index many document sources within an information system, without prior knowledge about the working context (and yes, this is bad for UI optimization and relevancy), that allows employees to type in textual queries in a search bar, and that displays the search results as a vertical list together with facets. In addition, this must be done securely, which means it should come with connectivity to AD/LDAP for authentication, and with the capacity to respect document-level permissions when users are searching (i.e. only display what users are allowed to see). Orthogonally to these functional aspects, Enterprise Search solutions must also provide administration capabilities, in terms of configuration and exploitation.

Continue reading “The evolution of Datafari, a European open source enterprise search application – Cedric Ulmer CEO”

Carlos Cuadra 1925-2022 A pioneer in commercial IR service development

Carlos Cuadra died in August this year. I doubt many readers of Informer will recognize the name but Carlos was a remarkable innovator in information retrieval in the 1960s and 1970s. Almost all his development work was carried out in a commercial environment, notably at System Development Corporation when it was spun out of RAND in 1957 and then at Cuadra Associates from 1978. His early work at SDC was on the development of question-answering applications for the Los Angeles Police Force. In 1979 Carlos released STAR, the first multi-tasking/multi-user information retrieval application that could be run on a PC. It was (and indeed is!) widely used as an archive management solution.

A measure of the contribution that Carlos made to the development of on-line information retrieval applications is that in the definitive A History of Online Information Services 1963-1976 there are more items in the index relating to Carlos than any other individual.

Continue reading “Carlos Cuadra 1925-2022 A pioneer in commercial IR service development”

The Book of Search – Book Review

Spoiler alert – this is a review of a book published in 2006!

First a short history lesson.  FAST Search and Transfer was founded in 1997 to commercialize search technology developed at the University of Trondheim. The subsequent history of FAST is difficult to untangle as the company seemed adept at delivering some very neat search technology and raising question-marks about its accounting practices. The end-game was the acquisition of the company by Microsoft in 2008 for $1.2 billion, followed by an inquest about the extent of the due diligence that was mirrored a few years later with HP’s acquisition of Autonomy. The core of the company lives on as the Microsoft Technology Centre in Oslo and a global group of alumni who worked for the company and left to either work for other companies or start their own business.

FAST was also very good at marketing. One of the outcomes was The Book of Search published in 2006. The quality of the content and the presentation of this 142pp book are both exceptional. To quote from the Introduction

Continue reading “The Book of Search – Book Review”

A History of Enterprise Search 1938-2022

If you will excuse some self-promotion from the Editor, last month my book A History of Enterprise Search 1938-2022 was published under the PressBooks imprint of the University of Sheffield.

If you have cause to doubt that there were any enterprise search initiatives in 1938 then you should certainly read this book. Enterprise search has been around for a very long time.

https://sheffield.pressbooks.pub/eshistory1/ will take you to the PressBooks site from where you can download both a pdf and e-Pub file under a Creative Commons License.

For more about PressBooks (it is based on WordPress) and my experience of using it, read the story on the Library web site.

Martin White

Books for search researchers and search managers

For some time now I have maintained a list of books on the Intranet Focus web site that I think would be of interest to search researchers and search managers. I am in the process of rebuilding the web site and so many of the internal links to reviews of the books will no longer exist. I have now revised the list so that the links are to the web sites of the publishers and so should stand the test of time.

If there are any books that you feel should be included, please email me with your suggestions.

The list is in reverse chronological order of publication, going back to 2010. There are of course many earlier books, notably Introduction to Information Retrieval by Manning, Raghavan and Schütze, but I had to stop somewhere. The earliest book on my bookshelf is An Introduction to Modern Information Retrieval by none other than Gerard Salton and Michael McGill, published in 1986.

Continue reading “Books for search researchers and search managers”

And finally….!

As I undertake my daily scanning of the research papers on information retrieval, I continue to be impressed with the level of innovation that I see. Clearly significant advances are being made in many areas. My concern is how these advances are going to feed into search products that will make a commensurate advance in the findability experienced by employees in their daily work. To use the metaphor of building a house, there seem to be endless suggestions for the design of kitchens, bedrooms, patios and bathrooms but no one is considering whether there is any combination that makes a house a home. Nor is anyone taking into account that people have different visions of what they regard as an acceptable compromise, given their long-term plans and the current (and continuing) high levels of interest.

Continue reading “And finally….!”

In the Summer issue

NOTICE: The BCS IRSG is seeking new committee members. If you are interested in finding out more, please see details on the nomination and election process. November 3rd 2022 is the deadline for nominations.

Can I start by highlighting the forthcoming vacancy for an Editor for Informer, to take over at the end of 2022. I might also mention that there will be a number of vacancies on the IRSG Committee for 2023, including the position of Chair as Udo Kruschwitz has served his two-year term.

This issue contains two superb reports on the LREC 2022 and SIGIR 2022 conferences contributed by Dennis Aumiller. I’ve also linked to reflections on ECIR 2022 (Stavanger) by Krisztian Balog and his colleagues that were published in the SIGIR Forum newsletter in June. As well as highlighting some of the papers and themes of the conferences, all three reports offer many insights into the challenges of managing hybrid conferences, which should be of interest to anyone involved in a conference, no matter what its size and topic. In addition there is a reminder of the Search Solutions 2022 conference on 23/24 November and a list of forthcoming events diligently compiled by Andy Macfarlane. Of course among these events is ECIR 2023 in Dublin in early April.

There are a number of book reviews in this issue covering taxonomy management, the role of digital technologies (including enterprise search) in supporting knowledge management initiatives, the science of reading and a fascinating autobiography from David Hawking.

Two awards are now open for nominations, the Karen Spärck Jones Award from IRSG (sponsored by Microsoft Research) and the UKeiG Tony Kent Strix Award, in which IRSG is a partner. The deadlines for both are by coincidence 9 September.

Also in this issue there are calls for papers for forthcoming special issues in 2023 of JASIS&T on the subject of information retrieval research and also of the ACM TOIS journal on efficiencies in neural information retrieval. Open Source Connections has released a collection of video presentations on a very wide range of relevance topics. A very challenging paper from Professor Justin Zobel on the questionable value of batch-mode IR testing has just been published in the July issue of SIGIR Forum and deserves the widest possible readership.

And finally I consider the role and future of Informer.

The copy date for the Autumn issue is 31 October and for the Winter issue (my last as Editor) it is 8 January 2023.

Martin White

Vacancy for an Informer Editor from January 2023

I took over the role as the Editor of Informer in 2019 and have enjoyed the challenge of publishing a quarterly newsletter that in each issue has something of interest to the very varied IRSG. Earlier this year I decided that it was time to hand over the Editorial Desktop as I wanted to gently ease out of my consultancy work on enterprise search and in general anything that had a deadline attached to it. My last issue will be the Winter 2023 issue, which is compiled in December 2022 and published in January 2023. As a result there is a vacancy for an Editor, ideally someone whom I could gradually involve in the next three issues. If you would like to volunteer then email Udo Kruschwitz, IRSG Chair. If you would like to find out more about what the Editorship involves then please contact me. Informer uses WordPress as the publishing platform but with a template that is of unknown heritage. As incoming Editor you would have the opportunity to work with the Committee on what the upgrade should be. This will need to emerge from a careful consideration of the role of Informer, especially now that we have upgraded the IRSG web site.

I offer my personal view on the role and future of Informer in my And finally column in this issue.

Martin White

BCS/IRSG Search Industry Awards 2022

The BCS Search Industry Awards recognise people, projects, and organisations that have excelled in the design of search and information retrieval products and services. If you know of any people, projects, or products that deserve recognition, let us know by submitting a nomination. Alternatively, if you’re involved with something special yourself, you can submit an application today.

Categories

This year we are offering five awards:

Continue reading “BCS/IRSG Search Industry Awards 2022”

The Microsoft BCS/BCS IRSG Karen Spärck Jones Award 2022 – Call for Nominations deadline is 9 September

A pioneer of information retrieval, the computer science sub-discipline that also underpins the technology of modern Web search engines, Karen Spärck Jones was a British professor of Computers and Information at the University of Cambridge. Her contributions to the fields of Natural Language Processing (NLP) and Information Retrieval (IR), especially with regard to experimentation, have been outstanding, highly influential and lasting, and include the introduction of Inverse Document Frequency for relevance ranking.

In order to honour Karen’s achievements, the BCS Information Retrieval Specialist Group (BCS IRSG) in conjunction with the BCS has established an annual award to encourage and promote talented researchers who have endeavoured to advance our understanding of Natural Language Processing or Information Retrieval with significant experimental contributions. The Karen Spärck Jones Award is sponsored by Microsoft Research Cambridge.

The recipient of the 2022 award will be invited to present a keynote lecture at the European Conference on Information Retrieval (ECIR) in Dublin in April 2023.  This forum provides an excellent venue to present and announce the award as the conference attracts many new and young researchers.

For more details on the criteria for nominations and the nomination schedule go to http://jochenleidner.com/?p=67 . Jochen Leidner is the Award Chair and can be contacted at Leidner AT acm.org

 

 

JASIST special issue on information retrieval research

I was personally delighted to see the announcement from ASIS&T about a Special Issue in September 2023 on the topic of research into information retrieval.

To quote from the announcement

“We are looking for contributions that broaden the respective disciplinary, methodological, or empirical perspectives to identify and explore commercial search engines and their use and role in society from new angles, or that bring together different approaches in original ways. In particular, we would like to encourage information science/information studies, broadly understood, to reposition themselves and contribute the discipline’s expertise to shed light on the ever more powerful role of commercial search engines in almost all areas of society and everyday life, influencing not only how we know and what we know, but increasingly also how knowledge and information are created and communicated, to begin with.

Continue reading “JASIST special issue on information retrieval research”

Relevance management – tips, tricks, techniques and tools

When OpenSource Connections started the Haystack conference in 2018, our intention was to bring the search and relevance community together to share tips, tricks, techniques and tools. Although the talks that year weren’t recorded we swiftly realised that we could share them much more widely on video and we made sure to record all the subsequent Haystack events.  Once the pandemic began our events moved online and with the advent of Zoom, recording talks became even easier. Fast forward a few years and our search events are now hybrid events with both in-person attendance and remote audiences watching live from across the globe – video has become an essential, not a nice-to-have. Many other events that our team speak at are also providing recordings of their talks.

We now have nearly 100 videos of talks linked from our YouTube channel at https://www.youtube.com/c/OpenSourceConnections , covering many different search engine platforms, sectors including enterprise search, media search and e-commerce search, techniques from TF/IDF to vector search and practices including building search teams and search measurement. We’re working on curating these into subject-based playlists to make them even more accessible. Speakers include industry luminaries such as Peter Morville and Ellen Voorhees, startup founders from new search engine companies like Weaviate and Tantivy, search experts from companies including LexisNexis, HomeDepot, Otto Group and EBSCO and of course OpenSource Connections. This collection is building into a fantastic knowledge resource for anyone working in the search and relevance space and we’re very proud to host it as part of our mission to Empower Search Teams.

Charlie Hull UK Director OSC

Taxonomies – Practical Approaches to Developing and Managing Vocabularies for Digital Information – Book review

There is probably no more difficult task in information management than being the Editor of a multi-author book on any topic, and the level of difficulty goes up by an order of magnitude when the topic is taxonomy management. Taxonomies (with the strap line Practical Approaches to Developing and Managing Vocabularies for Digital Information) has been edited by Helen Lippell, who brings not only many years of experience to the role but also a very strong commitment to ‘getting the message across’. In this she succeeds brilliantly.

Helen has brought together eighteen experienced taxonomy managers with the objective of offering a range of insights and perspectives on taxonomy management. The scope is so broad, and yet so deep in specific topics, as to demand a careful balance of the viewpoints of authors coming from an equally wide range of backgrounds and project experience. This book deliberately focuses on presenting case studies which can be of value in specific situations but also add to a more generic knowledge base about good practice on taxonomy development and implementation. These include Associated Press, Cancer Research UK, Department of Education, Electronic Arts, Getty Images, the Institute of Chartered Accountants in England and Wales, and the National Health Service.

Continue reading “Taxonomies – Practical Approaches to Developing and Managing Vocabularies for Digital Information – Book review”

The Limits of Batch Assessment of Retrieval Systems – Justin Zobel

I had the very good fortune to get to know Cyril Cleverdon towards the end of his distinguished career as Librarian at the Cranfield Institute of Technology, where he did invaluable work in creating and promoting the Cranfield Projects on information retrieval performance. These Projects formed the basis for the TREC events in the USA. At the time we first met in the early 1970s computers were still somewhat on the distant horizon (especially in the UK) but his insights into the fundamental aspects of information retrieval performance most certainly catalysed my move towards information science and away from chemistry and metallurgy.

I was therefore especially interested to read a paper by Professor Justin Zobel (University of Melbourne) in the Summer issue of SIGIR Forum entitled When Measurement Misleads: The Limits of Batch Assessment of Retrieval Systems.

Continue reading “The Limits of Batch Assessment of Retrieval Systems – Justin Zobel”

The Science of Reading – Book review

You may be somewhat surprised to see a review of a book on the science of reading in Informer. It seems to be implicit in information retrieval research that ‘users’ have such a competence in reading that a consideration of reading ability can be discounted from the research analysis. If only that was the case! The reality is that perhaps one in ten employees is on the dyslexia spectrum and in global businesses many employees are writing text and submitting search queries in a second language. I am especially interested in the effect of perceptual speed on the evaluation of search results. Even something as basic as line length needs careful consideration of how people read digital content. It may therefore come as a surprise that the scope and depth of research into reading can occupy a book of almost 600 pages with contributions from 52 authors. I have found my journey through the 2nd Edition of The Science of Reading one of many significant discoveries, shining light onto issues that I had not even considered before reading the book.

Continue reading “The Science of Reading – Book review”

Call for nominations for the UKeiG Tony Kent Strix Award 2022 – Deadline 9 September

The Tony Kent Strix Award was inaugurated in 1998 by the Institute of Information Scientists. It is now presented by UKeiG in partnership with the International Society for Knowledge Organisation UK (ISKO UK), the Royal Society of Chemistry Chemical Information and Computer Applications Group (RSC CICAG) and the British Computer Society Information Retrieval Specialist Group (BCS IRSG). The Award is given in recognition of an outstanding practical innovation or achievement in the field of information retrieval in its widest sense. This could take the form of an application or service, or an overall appreciation of past achievements that have led to significant advances. The award is open to individuals or groups from anywhere in the world. There are profiles of the award winners since 2009 on the CILIP web site.

Continue reading “Call for nominations for the UKeiG Tony Kent Strix Award 2022 – Deadline 9 September”

Making knowledge management clickable – Book review

There are many books about knowledge management but very few about what I might refer to as IKM (Information and Knowledge Management) technologies. What is distinctive about this book is that it spans the crevasse between KM and IT and does so with considerable flair. The authors, Zach Wahl and Joseph Hilger, established Enterprise Knowledge (based in Arlington VA) close to a decade ago and since then have built a company with a very high reputation for innovative approaches to solving KM challenges. To quote from the Preface of Making Knowledge Management Clickable, their book “bridges the gap between knowledge management and technology. It embraces the complete lifecycle of knowledge, information, and data from how knowledge flows through an organization to how end users want to handle it and experience it”. The strap line is an excellent summary of the intention of the book – Knowledge Management Systems Strategy, Design and Implementation.

Continue reading “Making knowledge management clickable – Book review”

SIGIR 2022 Annual Conference – a report from Dennis Aumiller

Another month, another conference back in person! The ACM SIGIR 2022 conference – the special interest group’s premier event – was back in person for the first time since 2019, and for the first time ever made its stop in beautiful and sunny Spain. Hosted by general chairs Enrique Amigó, Pablo Castells and Julio Gonzalo, the weather certainly left nothing to be desired in sunny Madrid in the period from 11-15 July. With record temperatures of over 40°C (104°F) reached during the conference week, it was certainly a blessing to enjoy some air-conditioned breezes in the conference venue, Madrid’s historic Círculo de Bellas Artes. Impressing with majestic statues watching over the marble staircase, the venue let participants feel some of Madrid’s vivid history.

But aside from the hot weather, the organizers had a series of issues to worry about, quipped organizer Pablo Castells during his opening remarks: record summer temperatures, re-emerging Covid waves, nearby construction sites, as well as overloaded airlines were all too real concerns that made them originally question whether physical attendance would even make up a significant portion of the first hybrid SIGIR event.

Continue reading “SIGIR 2022 Annual Conference – a report from Dennis Aumiller”

Language Resources and Evaluation Conference (LREC 2022) – a report from Dennis Aumiller

With a delay of over two years, attendees of the biennial Language Resources and Evaluation Conference 2022 (LREC) finally arrived in sunny Marseille. After the conference was originally scheduled for May 2020 and then canceled due to travel restrictions and nationwide lockdowns, the organizers decided to revive the original conference location and agreed on the currently predominant setting of a hybrid conference with both physical and virtual attendance. As the conference venue, the pompous Palais du Pharo was chosen, overlooking the charming ‘vieux port’ of Marseille, which made for an impressive backdrop during the numerous coffee breaks. Its central location also made it easy to move post-conference discussions to restaurants and bars in the nearby port area.

Continue reading “Language Resources and Evaluation Conference (LREC 2022) – a report from Dennis Aumiller”