ECIR 2017 Industry Day Review

Industry Day – a (now) traditional feature of ECIR

The tradition of closing out ECIR with Industry Day continued for its 11th year in Aberdeen, Scotland. This year’s event was co-organized and moderated by Udo Kruschwitz and Tony Russell-Rose. Those who made the trip up north were treated to a potpourri of exciting applications of academic research in IR and NLP, ranging from event detection and analytics at major news media outlets to methods for retrieving and identifying non-factual news. A “Fishbowl” discussion, which included ways to run future Industry Day sessions, provided the official end to ECIR…with the unofficial end happening in the wee hours over Scotch whisky.

 

Session one focussed on media providers. Peter Mika, with Schibsted Media Group, discussed the importance and difficulty of balancing user preferences (e.g. I only want this type of news) against editorial control (e.g. all readers should receive this news) over the information passed to readers. Edgar Meij, of Bloomberg, shared that rule-based approaches are still vital to their terminal product and that all published articles are manually annotated by their authors, and he discussed the importance of handling multi-language queries for customers. James McMinn of Scoop Analytics closed out the session by demonstrating how his company (founded with fellow researchers at Glasgow University) provides customers with early detection of breaking news by monitoring Twitter. Their product was able to detect the latest terrorist attack in Sweden 5 minutes before it was first reported by major news outlets.

The second session demonstrated applications of information retrieval methods to government documents. Michaela Regneri, with OTTO, mainly discussed the lessons learned from her work with newsleak.io, which produced a pipeline to help journalists sift through leaked information such as massive sets of government documents. Richard Boulton, with the UK’s Government Digital Service, presented ongoing work to build a taxonomy that will further improve the IR experience for users of GOV.UK (NOTE: it’s already excellent compared to other state sites such as USA.gov). He discussed a recent “tagathon” as a method of getting multiple government agencies to work together and improve content for citizens.

Ehud Reiter on data to text: an excellent addition to the more IR-focussed talks

The post-lunch session demonstrated methods developed locally in Aberdeen to convert numerical data to text and methods developed in Dublin for argument analysis. Ehud Reiter, co-founder of Arria NLG and professor at Aberdeen University, demonstrated work that produces better textual weather forecasts than those written by actual meteorologists. Elizabeth Daly, of IBM Research Ireland, presented work from “The debater challenge”, which aims to identify arguments and claims in unstructured text. Her demonstration of automatically extracted pro and con arguments for vegetarianism was the perfect segue to the last session on fake news and factchecking.

Democracy and other areas, such as science, are being undermined by non-factual claims and news. Thus the last session, on fake news and automated factchecking, was arguably the most important of the day, even though it was sparsely attended. Will Moy from Full Fact and Charlie Hull from Flax combined their presentations into one show. Will presented “the why”, the reasons we need to factcheck statements (e.g. from politicians), and Charlie presented “the how”, open source solutions with which we can perform factchecking. Their organizations worked jointly in a recent hackathon sponsored by Google to implement a pipeline that identifies claims repeated by politicians and also identifies and normalizes numbers (e.g. 1/2, half, 50%). Will closed out the session with a demo of real-time factchecking of BBC television feeds.

A captive audience

 

We wrapped up the day with a brief fishbowl discussion. Our initial focus was the links between academia and industry, and several thought-provoking opinions were offered: 1) academics often have difficulty commercializing their work because of barriers such as a lack of business experience; 2) a significant portion of research is not commercializable; and 3) it is important for academics to see how their work is put to use in industry and to understand some of the issues faced there. We then moved on to a period of reflection on the session overall and ways to improve it.

Wrapping up a very long conference

As with the main conference, the Industry Day event was less well attended than in previous years. The remoteness of the venue may have been a contributing factor. Nevertheless, the presentations were excellent and a treat for those who made the effort to attend. Ideas for future conferences include holding industry sessions on the same days as academic sessions (perhaps in parallel, or split into two half-day sessions).

 

About Steven Zimmerman

Steven Zimmerman is a PhD student at the University of Essex. His research interests are in the areas of IR, NLP and event prediction. He is currently involved with the Human Rights Big Data and Technology project and is focussed on the development of methods to better address discrimination, xenophobia and freedom of speech in online media.
