International Workshop on Profiling and Searching Data on the Web
April 24, 2018, Lyon, France. Co-located with The Web Conference 2018
News
- March 5. Tentative workshop program updated.
- February 14. Accepted papers announced.
  - Emilia Kacprzak, Laura Koesten, Jeni Tennison and Elena Simperl. Characterising Dataset Search Queries (Short paper)
  - Mohamed Ben Ellefi, Odile Papini, Djamal Merad, Jean-Marc Boi, Jean-Philip Royer, Jérôme Pasquet, Jean-Christophe Sourisseau, Filipe Castro, Mohamad Motasem Nawaf and Pierre Drap. Cultural Heritage Resources Profiling: Ontology-based Approach
  - Semih Yumuşak, Andreas Kamilaris, Erdogan Dogdu, Halife Kodaz, Elif Uysal and Riza Emre Aras. A Discovery and Analysis Engine for Semantic Web
  - Sean Soderman, Anusha Kola, Maxim Podkorytov, Michel Geyer and Michael Gubanov. Hybrid.AI: A Learning Search Engine for Large-scale Structured Data
  - Zhiyu Chen, Haiyan Jia, Jeff Heflin and Brian Davison. Generating Schema Labels through Dataset Content Analysis
  - Sebastian Neumaier, Lőrinc Thurnay, Thomas J. Lampoltshammer and Tomáš Knap. Search, Filter, Fork, and Link Open Data - The ADEQUATe platform: data- and community-driven quality improvements (Short paper)
- January 24. The submission deadline is extended to February 6.
- January 10. Keynote speakers announced:
  - Maarten de Rijke, University of Amsterdam
  - Aidan Hogan, Universidad de Chile
Objectives and Goals
The web of data has seen tremendous growth recently. New forms of structured data have emerged in the form of web markup, such as schema.org, and a large amount of data in web tables. Considering these rich, heterogeneous and evolving data sources which cover a wide variety of domains, the exploitation of web data becomes increasingly important in the context of various applications, including (federated) search, question answering and fact verification.
The objective of this workshop is to bring together researchers and practitioners interested in the development of data search techniques, data profiling, and dataset retrieval on the web. This includes looking at the specifics of data-centric information seeking behaviour, understanding interaction challenges in data search on the web, and analysing the cognitive processes involved in the consumption of structured data by users. At the same time, we aim to discuss technologies addressing data search, including semantics and information retrieval for web data (ranking algorithms and indexing), in particular in the context of decentralised and distributed systems such as the web. We are interested in approaches to analyse, characterise and discover data sources. We want to facilitate a discussion around data search across formats and domain-specific applications.
We envision the workshop as a forum for researchers and practitioners to come together and discuss common challenges and identify synergies for joint initiatives. We welcome contributions describing technical approaches, as well as those related to Human Computer Interaction research in data discovery, profiling and retrieval.
Topics and Themes
PROFILES & DATA:SEARCH ’18 seeks application-oriented papers, as well as more theoretical papers and position papers. The workshop proposes a multidisciplinary discussion on the following themes, with a focus on RDF, CSV, JSON and other structured and semi-structured datasets:
Data Search
- Dataset retrieval
- Search results presentation for datasets
- Semantic dataset search
- Evaluation of dataset search tools and algorithms
- Decentralised and distributed architectures and algorithms in data search
- Fusing, cleaning, ranking and refining search results
- Approaches to personalisation in dataset search
- Scalability & performance of distributed data queries
- Query routing taking into account relevance, quality and profiles of distributed datasets
Data Profiling
- Dataset profile representation (vocabularies, schemas)
- Profiling and assessment of novel forms of entity-centric Web data
- Data summarisation
- Data quality analysis for query routing
- Novel applications using dataset profiles
- Topic profiling of datasets
- Dataset indexing and profiling approaches
Human Data Interaction
- Information seeking behaviour for data
- User modelling for data search
- Analysing behavioural traces during data search
- Usability of data portals and data discovery tools
- Data search result presentation to support sense making
We are interested in contributions using a variety of methods. These can include, for example, user studies, lab experiments and system-based evaluations, as well as experiments using gamification and crowdsourcing.
Submission Guidelines
We welcome the following types of contributions: full papers (8 pages), short papers (4 pages) and position papers (2 pages). All submissions must be written in English and must be formatted according to the ACM format. The proceedings of the workshop will be included in the companion proceedings of The Web Conference 2018. Please submit your contributions electronically in PDF format via the EasyChair system: https://easychair.org/conferences/?conf=profiles-datasearch2018
We follow a single-blind process in which each submission is reviewed by at least two members of the program committee. Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop.
Important Dates
Workshop paper submissions due: 6 February 2018 (extended from 24 January)
Workshop paper notifications sent: 14 February 2018
Camera-ready copies due: 01 March 2018
PROFILES & DATA:SEARCH Workshop: 24 April 2018
Tentative Schedule
09:00 – 09:10 | Introduction & welcome |
09:10 – 09:20 | Opening |
09:20 – 10:20 | Keynote talk: Maarten de Rijke, Learning to Search for Datasets |
Over the years, search engines have developed to return a broad range of retrievable items, from documents to people, locations, and products. Research datasets are being turned into retrievable items too. This raises a number of interesting challenges, ranging from the user end (What do users want from datasets?) to increasing the retrievability of datasets (What kind of contextual information is available to enrich datasets so as to make them more easily retrievable?) to optimizing rankers for datasets in the absence of large volumes of interaction data (How can we train learning-to-rank algorithms for datasets in weakly supervised ways?). In the talk I will survey recent progress in these three areas and identify important open problems.
10:20 – 11:00 | Break |
11:00 – 12:20 | Paper presentations |
12:20 – 13:40 | Lunch break |
13:40 – 14:40 | Keynote talk: Aidan Hogan, Profiling Graphs: Order from Chaos |
Graphs are being increasingly adopted as a flexible data model in scenarios (e.g., Google’s Knowledge Graph, Facebook’s Graph API, Wikidata, etc.) where multiple editors are involved in content creation, where the schema is ever changing, where data are incomplete, where the connectivity of resources plays a key role: scenarios where relational models traditionally struggle. But with this flexibility comes a conceptual cost: it can be difficult to summarise and understand, at a high level, the content that a given graph contains. Hence profiling graphs becomes of increasing importance to extract order, a posteriori, from the chaotic processes by which such graphs are often generated. This talk will motivate the use of graphs as a data model, abstract recent trends in graph data management, and then turn to the issue of profiling graphs: what are the goals of such profiling, the principles by which graphs can be summarised, the main techniques by which this can/could be achieved? The talk will emphasise the importance of profiling graphs while highlighting a variety of open research questions yet to be tackled.
14:40 – 15:00 | Paper presentation |
15:00 – 15:40 | Coffee break |
15:40 – 15:55 | Paper presentation |
15:55 – 16:50 | Panel discussion with Paul Groth, Aidan Hogan, Jeni Tennison, Stefan Dietze and Natasha Noy |
16:50 – 17:00 | Summary of discussions, wrap up |
Chairs and Organizers
Program Committee
- Charlie Abela (University of Malta)
- Alessandro Adamou (The Insight Centre, Ireland)
- Marco Antonio Casanova (Pontifical Catholic University of Rio de Janeiro, Brazil)
- Philipp Cimiano (Bielefeld University, Germany)
- Enrico Daga (The Open University, UK)
- Ruslan Fayzrakhmanov (University of Oxford, UK)
- Max Froumentin (Government Digital Service, UK)
- Simon Gottschalk (L3S Research Center, Germany)
- Michael Gubanov (University of Texas at San Antonio, USA)
- Peter Haase (metaphacts, Germany)
- Tom Heath (Arup, UK)
- Luis-Daniel Ibáñez (University of Southampton, UK)
- Emilia Kacprzak (The Open Data Institute, UK)
- Eva Méndez (University Carlos III of Madrid, Spain)
- Stefano Modafferi (University of Southampton, UK)
- Dmitry Mouromtsev (ITMO University, Russia)
- Axel-Cyrille Ngonga Ngomo (University of Paderborn, Germany)
- Natalya Noy (Google, USA)
- Andreas Nuernberger (Otto-von-Guericke University of Magdeburg, Germany)
- Liudmila Ostroumova Prokhorenkova (Yandex, Russia)
- Bernardo Pereira Nunes (Pontifical Catholic University of Rio de Janeiro, Brazil)
- Axel Polleres (Vienna University of Economics and Business - WU, Austria)
- Muhammad Saleem (University of Leipzig, Germany)
- Emanuel Sallinger (University of Oxford, UK)
- Arno Scharl (Modul University, Austria)
- Nicolas Tempelmeier (L3S Research Center, Germany)
- Thanassis Tiropanis (University of Southampton, UK)
- Konstantin Todorov (LIRMM / University of Montpellier, France)
- Nicolas Torzec (Yahoo, USA)
- Raquel Trillo-Lado (Universidad de Zaragoza, Spain)
- Jürgen Umbrich (Vienna University of Economics and Business - WU, Austria)
- Ran Yu (L3S Research Center, Germany)
Organization Committee
Laura Koesten, Open Data Institute and University of Southampton.
Dr. Elena Demidova, L3S Research Center (Hannover, Germany).
Dr. Vadim Savenkov, Vienna University of Economics and Business.
Dr. John Breslin, National University of Ireland Galway.
Prof. Oscar Corcho, Universidad Politécnica de Madrid.
Dr. Stefan Dietze, L3S Research Center (Hannover, Germany).
Prof. Elena Simperl, University of Southampton.