On November 30th, 2016, the British Computer Society hosted the Search Solutions forum at its London offices.
It was divided into five sessions with the following themes: 1. Understanding users and context, 2. Moving towards question-answering, 3. Beyond web search, 4. New modes of search, and 5. Panel session. The panel session was particularly interesting: attendees and panel participants alike discussed why talent in Information Retrieval (IR) is so hard to find and why its supply does not match the industry’s demand. This article summarises the four sessions preceding the panel, using the issues and solutions that arose during the panel as a framework. First, I list the panel’s main talking points. I then map the talks given at the forum to these points, and conclude with a personal take on the issues.
The panel discussion was dominated by the industry’s unmet demand for IR talent, a problem that some, if not all, of the attendees had observed. It was also described as a lack of interest in IR among working technology professionals and students. With the problem framed this way, the discussion yielded the following possible causes:
1. “Branding” for IR is weak, as shown by some of the academics on the panel reporting that renaming their courses led to a significant increase in enrolled students.
2. Companies do not give enough weight to IR responsibilities, as shown by the absence of any IR-only role recognised by the industry.
3. “Cost-of-entry” into IR is high, as shown by the scarcity of beginner-oriented tutorials and workshops for learning common IR techniques and tools.
Regarding the first cause (branding), professors on the panel noted that there is still no settled way to name IR courses, and that renaming courses from “Search …” into “Information Retrieval …” had some success in attracting students. Experimenting with more mainstream or “buzzworthy” terms was suggested. The second cause (recognition) is directed squarely at the industry. During the panel, it was suggested that IR tasks are often handed to some other role in a company’s tech department as secondary responsibilities. This makes IR-related skills look unimportant to tech professionals and students when they choose what to learn next. It can also be argued that when a set of skills and tasks is treated as secondary, the quality of the work done on them is likely to suffer. Finally, regarding the third cause (cost-of-entry), Charlie Hull used the Lucene4IR workshop as an example of the kind of event the IR community should be organising to make the IR learning path more inviting to new students and existing tech professionals.
First, the “branding” problem identified in the panel was covered indirectly, before the panel, by Jon Brassey from Trip Database. Brassey described his experience improving search in a clinical search engine. From this experience he drew an insight: incorporating data visualisation into IR yields attractive graphs and figures (e.g. graphs showing clickstream data), as well as a better understanding of the data. Data visualisation, being a more popular field, can improve the branding of IR when the two are combined. Also regarding branding, Sessions 2 and 3 included presentations on conversational interfaces by Graham Digby from LexisNexis and Vassilis Plachouras from Thomson Reuters. They talked about domain-specific question answering and about interacting with financial data by generating natural language, respectively. Branding-wise, the IR community can take advantage of the current chatbot and personal assistant trend by structuring common IR courses around these applications. Finally, research in IR is often very interesting and easy to promote. An example is the talk by Frederic Fol Leymarie from DynAikon Ltd, which was recognised as the best presentation of the whole event and deserves to be shared more widely by the IR community.
Second was the issue of the industry not giving enough recognition to IR skills. The first session, which was mostly centered on the users of IR applications, covered the evaluation of ad quality ranking through two kinds of feedback: implicit (dwell time) and explicit. A user-centered approach to evaluating IR techniques would give a more accurate picture of the impact they can have on a business, and therefore help justify hiring IR-only professionals. The panel considered this a big problem: anecdotal evidence suggested that businesses often don’t understand the added value of improving, for example, search on their site, even when it is an e-commerce site where search has a direct impact on sales.
Finally, another issue, which touches on both of the previous points, was the high cost-of-entry into IR. Charlie Hull from Flax, both during the panel and in his Session 3 presentation, encouraged attendees to organise and support more IR workshop events like Lucene4IR. Also relevant to this issue was Digby’s Q/A presentation in Session 2, in which he explained the iterative approach followed in developing a Q/A system. This sort of iterative development fits nicely with such workshop events. Workshops that help people get started with IR, and that build on the fact that IR can be added to a business iteratively, will let people see benefits incrementally, reducing the impact of the high cost-of-entry.
In summary, during the panel session at Search Solutions 2016, attendees listed three possible reasons why the supply of IR talent doesn’t match demand: IR branding, the importance the industry assigns to IR tasks, and the high cost-of-entry into the area. The presentations given at the forum addressed these issues at least partially. User-centered evaluation of IR tasks can help showcase their importance to businesses. Combining existing trends with courses and workshops that teach the basic building blocks of IR can also help attract more professionals and students into the area.
On a personal note, the breadth of the IR field was brought up during the panel. I consider IR to be somewhat similar to web development (also very broad, with many techniques and technologies to learn), if web development had a really active research community. However, unlike web development, IR is not a field in which most aspiring computer engineers and scientists work at least once during their academic careers. The main reason for this, I think, is that web development learning resources match this group’s interests perfectly, while IR learning resources are more business oriented. To solve IR’s “branding” issue, I suggest teaching IR fundamentals through applications that students can use in their own side projects (e.g. chatbots, Amazon Alexa skills, or temporal expression search in instant message streams). This is how my classmates and I got into IR in the first place, and how I think the IR community can grow.