Conference Review: Search Solutions 2014

Search Solutions 2014 took place on November 27th at the BCS headquarters in Covent Garden. The event attracted a slightly larger crowd than in recent years: it was completely booked out, with 80 attendees from a wide range of companies, research labs and universities. As in previous years, the event was very intimate, which is part of its appeal for many attendees. There were 5 sessions throughout the day with 2-3 speakers each. The pre-lunch sessions focused on more theoretical areas of web and enterprise search. After lunch, the conference transitioned to a broad range of real-world applications.

Morning Sessions: The first morning session featured talks from major web search companies on web search challenges. The second morning session focused on using open-source search to build efficient and scalable search applications.

The first morning session started with a presentation given by Peter Mika from Yahoo! on “Semantic Search at Yahoo”. The first part of his talk reviewed the history of semantic search at Yahoo! as well as developments across the broader industry. Peter described the main tasks of semantic search, such as information extraction, information tracking and query understanding. The second part of the presentation highlighted the potential of semantic search for different search tasks, followed by new applications and research challenges in mobile search and task completion.

The second speaker, Christopher Semturs from Google, introduced the audience to traditional web search, which focused on matching strings to results. He then discussed modern search approaches, which aim to understand the content of web pages and provide answers to user requests. After that, the practical talk described the infrastructure component that enables search “from strings to things”, as well as the techniques used to convert web pages to knowledge (a side note: Google’s Knowledge Graph was reported to now contain 530 million entities and 40 billion facts).

The last presenter of this session was Katja Hofmann from Microsoft Research, who talked about “Learning to interact”. Her talk covered self-learning search solutions, such as contextual bandits, counterfactual reasoning and online learning to rank, which automatically improve search performance by exploring user interactions. Katja gave an overview of the key challenges in this area, such as research on measuring reward and new learning mechanisms.

A row full of keynote speakers

The second morning session of Search Solutions gave interesting introductions to using open-source search to build a scalable and effective search engine. Tom Mortimer from Flax presented a performance comparison between two leading open-source search servers, Apache Lucene/Solr and Elasticsearch. He described studies examining indexing and search performance for various index sizes and query complexities. One of the interesting points was that Solr was 20 times faster than Elasticsearch in a study of filtering millions of documents, and 4 times faster in a project requiring geospatial filtering.

Following this, Iadh Ounis and Craig Macdonald from the University of Glasgow gave an informative talk about the Terrier search engine platform. In the first part of the talk, the presenters focused on the latest additions in the Terrier v4.0 release, such as real-time search functionality and state-of-the-art supervised learning capabilities (e.g., LambdaMART). In the second part, the presenters described how to use these additions for new search scenarios, such as helping doctors find patients with similar medical conditions, smart cities and venue suggestions (i.e., ‘Entertain Me’).

Afternoon Sessions: The afternoon sessions focused more on enterprise applications of search as well as the end user experience.

The first afternoon session included a potpourri of speakers representing major institutions in the UK, with a focus on applications of search. Dan Jackson from UCL started the session with his presentation “Implementing Website Search in a Major University”. His very practical talk was as much about untangling the existing UCL website search system as it was about designing a new one. His talk demonstrated the importance of incorporating user experience into the design process.

Following this, Richard Boulton from the Cabinet Office presented “Tailoring search across government content”. Richard’s talk showed the importance of collecting data on user behaviour for a centralized government search solution, for instance tracking which links people follow and how many times they have visited a page, so that the experience can be improved.

Sample analytics derived from .gov log data

The final presenter for this session was Dominic Oldman from the British Museum, presenting “Contextual Semantic Search of Cultural Data”. Dominic walked the audience through a very specific example, his hometown of Lowestoft, to demonstrate the problem of related artefacts being held at different museums and the difficulty of finding them. He explained how the project he is undertaking will improve this situation.

The second afternoon session gave a nice view of where search is heading and went beyond the notion of “search”. Jussi Karlgren, founder of Gavagai, presented “How to tell what is going on (or not going on) – media monitoring in internet text is not about search but about keeping track”. Jussi’s talk gave an example of how media monitoring can show whether marketing campaigns work well. He also shared his view that sentiment analysis is about much more than just polarity (i.e. positive or negative).

Following Jussi, the audience heard “Exploring a million hours of sounds” presented by Richard Ranft from the British Library. Richard’s discussion focused on the ongoing project to store a collection of sounds (143 years’ worth) and make it searchable and viewable.

Finally, Jochen Leidner, head of the London arm of Thomson Reuters research, presented “Commercial Research, Development and Innovation in Information Access”. His talk demonstrated some of the major products at Thomson Reuters and how search is incorporated into the tools. One focus of his talk was WestlawNext, which is a very popular tool for lawyers in the USA and UK.

The last session of the day was the Fishbowl Hot Topics Session, with some discussion around semantic search. All in all, however, this session was tame compared to previous years’ discussions and ended a bit early at the suggestion of Udo Kruschwitz and others, as there was concern about the wine getting a bit too warm.

Drinks, finally.

All in all, a great event!

P.S.: The slides of the presentations are, as usual, available on the Search Solutions 2014 web page.

 

Steven Zimmerman, University of Essex

Thanh Vu, Open University

Roland Roller, University of Sheffield

About Steven Zimmerman

Steven Zimmerman is a PhD student at the University of Essex. His research interests are in the areas of IR, NLP and event prediction. He is currently involved with the Human Rights Big Data and Technology project and is focussed on developing methods to better address discrimination, xenophobia and freedom of speech in online media.