Workshops & Tutorials

Workshops

Half-day

Emoji 2019: 2nd International Workshop on Emoji Understanding and Applications in Social Media

Organizers:

Sanjaya Wijeratne, Horacio Saggion, and Amit Sheth

The ability to automatically process and interpret text fused with emoji will be essential as society embraces emoji as a standard form of online communication. However, processing emoji using traditional natural language processing techniques is a challenging task due to the pictorial nature of emoji and the fact that the same emoji may be used in different contexts and cultures to express different meanings. Emoji 2019 aims to stimulate research on understanding the social, cultural, communicative, and linguistic roles of emoji and on developing novel approaches to analyze, interpret, and understand them in social media. It provides a forum for researchers and practitioners from both academia and industry to discuss high-quality research, to exchange ideas, and to identify new opportunities for collaboration.

Half-day

10th International Workshop on Modeling Social Media: Mining, Modeling and Learning from Social Media (MSM 2019)

Organizers:

Martin Atzmueller, Alvin Chin, and Christoph Trattner

Machine learning and AI techniques are particularly effective in situations where deep and predictive insights need to be uncovered from social media data sets that are large, diverse and fast changing. We aim to focus on how to apply data mining, recommendation, machine learning and AI models, algorithms and systems for analytic and predictive modeling on social media, big data, small data, web and sensor data. We invite researchers who are interested in going beyond standard analytics approaches and in discovering the intelligent information hidden in large and fast-changing social media data.

Half-day

International Workshop on Deep Learning for Graphs and Structured Data Embedding (DL4G-SDE)

Organizers:

Ling Chen, Yuxiao Dong, Bin Li, Fragkiskos D. Malliaros, Jie Tang, Michalis Vazirgiannis, Hui Xiong

The International Workshop on Deep Learning for Graphs and Structured Data Embedding aims to provide a forum for presenting the most recent advances in embedding and representation learning for structured data, as well as deep learning for graphs, to unearth rich knowledge. We expect novel research works that address various aspects and challenges around this topic, including representation learning for large-scale and dynamic networks, heterogeneous network embedding, scalable and efficient algorithms for other structured data embedding, deep learning methodologies for graph-structured data, novel platforms and applications supporting structured data embeddings, and beyond. We hope this dedicated workshop will foster further research discussions and development in this field.

Half-day

5th AW4City - Enhancing Urban Mobility with Web Applications

Organizers:

Leonidas Anthopoulos, Marijn Janssen, and Vishanth Weerakkody

Following the success of the previous events at WWW2015, WWW2016, WWW2017 and WWW2018, the 5th AW4City 2019 aims to continue attracting significant international attention to web applications for smart cities. This year, AW4City emphasizes the contribution of web applications and apps to urban mobility. In the era of cities and under the UN 2030 Agenda for sustainable growth, cities are shifting their thinking toward compact cities and are developing sustainable mobility plans.

Full-day

SAD 2019: Workshop on Subjectivity, Ambiguity and Disagreement on the Web

Organizers:

Christopher Welty, Anca Dumitrache, Alex Quinn, Olivia Rhinehart, Mike Schaekermann, and Michael Tseng

The primary objective of this full-day workshop is to bring together a latent community of researchers who treat disagreement (and subjectivity and ambiguity) as signal, rather than noise. Such researchers use theoretical and empirical methodology to characterize, utilize, mitigate and derive value from subjectivity, ambiguity and disagreement. The workshop will include invited talks, short technical talks and a discussion of medium- and long-term challenges to fuel future work. A central goal for this second workshop edition is to begin to collect data and establish challenges for productively using data sets that exhibit disagreement, using both existing methods and ideas put forward at the workshop.

Full-day

CyberSafety 2019: The Fourth Workshop on Computational Methods in Online Misbehavior

Organizers:

Homa Hosseinmardi, Srijan Kumar, Qin Lv, Neil Shah, and Richard Han

The web provides a valuable space for individuals to interact with each other, and to read, publish and share content. However, in recent times, it has also become a breeding ground for online misbehavior, including fraudulent engagement, user deception and scams, harassment, hate speech, cyberthreats, and cyberbullying. Thus, the aim of CyberSafety 2019 is to improve cybersafety and build a better, safer, and more inclusive web and social media ecosystem for everyone. The workshop provides an interdisciplinary venue for researchers and practitioners to showcase pioneering research, demos, and tools that can improve the way cybersafety is currently practiced. Join us for a day filled with presentations, posters, panels, and demos!

Full-day

Workshop on Knowledge Graph Technology and Applications

Organizers:

Huajun Chen, Ying Ding, Laura Dietz, Wendy Hall, James Hendler, Deborah McGuinness, Edgar Meij, Sam Molyneux, Varish Mulwad, Raghava Mutharaju, Jeff Z. Pan, Xiang Ren, Jie Tang, Alex Wade, Mengting Wan, Chenyan Xiong, Min Zhang

Knowledge Graphs are graph structures that capture knowledge in the form of entities and the relationships between them, and optionally the provenance information. Along with Semantic Web standards such as RDF, OWL, and SPARQL, advances in Machine Learning, Deep Learning, Natural Language Processing, and Information Retrieval have led to the automated construction of knowledge graphs such as DBpedia, YAGO, Wikidata, Google’s and LinkedIn’s Knowledge Graph, Microsoft’s Satori, and Product Knowledge Graph from Amazon and eBay. Knowledge Graphs are used in several applications, such as search, question answering, data integration, and recommendation systems, across several domains such as healthcare, geosciences, manufacturing, aviation, power, oil and gas. There are several challenges related to knowledge graphs from the perspective of both the technology and its applications. This workshop aims to foster discussions along these perspectives.

Full-day

HumBL: Augmenting Intelligence with Bias-aware Humans-in-the-Loop

Organizers:

Lora Aroyo, Alessandro Checco, Gianluca Demartini, Ujwal Gadiraju, Anna Lisa Gentile, Cristina Sarasua, Oana Inel

Human-in-the-loop is a model of interaction where a machine process and one or more humans have an iterative interaction. Computers are fast and accurate in processing vast amounts of data; people are creative and bring in their perspectives. Bringing humans and machines together creates a natural symbiosis for accurate interpretation of data. The goal of this workshop is to bring together researchers and practitioners in various areas of AI (e.g., Machine Learning, NLP, Computational Advertising) to explore new pathways of the human-in-the-loop paradigm to address some of the current concerns in AI, such as being able to explain and understand the results as well as avoiding bias in the underlying data that might lead to unfair or unethical conclusions.

Half-day

Workshop on the Intersection of Machine Learning and Mechanism Design

Organizers:

Ronny Lempel and Aranyak Mehta

The workshop aims to bring together researchers and practitioners from two research domains, mechanism design and machine learning, which interact billions of times per day in practice but are still, for the most part, kept separate in the academic arena. The workshop will focus on motivating, promoting and disseminating interdisciplinary research combining these fields. Specifically, the workshop will tackle topics at the intersection of the two fields, including the increasing use of data and machine learning in designing mechanisms in broad contexts (e.g., mechanisms based on sampled past data) as well as the use of mechanism design techniques in machine learning (e.g., learning in a strategic setting).

Half-day

Women in Web Data Science

Organizers:

Ana Paula Appel, Marisa Vasconcelos, Francesca Spezzano, and Célia Talma Gonçalves

The Third Workshop on Women in Web Data Science brings together female faculty, graduate students, research scientists, and industry researchers for an opportunity to connect, exchange ideas, and learn from each other in the field of Data Science. Underrepresented minorities, graduates, and undergraduates interested in pursuing data science and machine learning research are encouraged to participate. While most presenters should be women, everybody is invited to attend.

Half-day

Workshop on Hypermedia Multi-Agent Systems (HyperAgents)

Organizers:

Simon Mayer, Andrei Ciortea, Fabien Gandon, and Olivier Boissier

Hypermedia is increasingly used in Web service design, particularly in Linked Data and Web of Things systems where the use of static service contracts is not practical. This evolution raises new challenges: to discover, consume, and integrate hypermedia services at runtime, clients have to become increasingly autonomous in pursuit of their design goals. Such autonomous systems have been studied to a large extent in research on multi-agent systems (MAS). To consolidate the evolution of hypermedia services, it is now necessary to have comprehensive discussions on integrating hypermedia systems and MAS. We invite researchers and practitioners to design, build, evaluate, and share their vision on what the future of a hypermedia-driven Web for both people and autonomous agents will be.

Half-day

Ninth International Workshop on Location and the Web (LocWeb 2019)

Organizers:

Dirk Ahlers, Erik Wilde, Rossano Schifanella, and Jalal Alowibdi

LocWeb 2019 lies at the intersection of location-based services and Web architecture. It focuses on Web-scale services and systems facilitating location-aware information access as well as on Spatial Social Behavior Analytics on the Web as part of social computing. LocWeb addresses location as a cross-cutting issue in web research and technology that connects the online world to the physical spatial world. Subtopics include (i) geospatial semantics, systems, and standards; (ii) large-scale geospatial and geo-social ecosystems; (iii) mobility; (iv) location in the Web of Things; and (v) mining and searching geospatial data on the Web. The workshop encourages work that describes Web-mediated or Web-scale approaches and that thoroughly understands and embraces the geospatial dimension.

Half-day

Search-Oriented Conversational AI (SCAI)

Organizers:

Julia Kiseleva, Jeff Dalton, Aleksandr Chuklin, and Mikhail Burtsev

The Search-Oriented Conversational AI workshop brings together researchers and practitioners from NLP, AI/Deep Learning, and the search/IR communities to lay the ground for search-oriented conversational AI and establish future directions and collaborations. The third edition seeks to broaden participation across the research and industry communities. The workshop features a strong program of invited talks from leaders in the field.

Full-day

9th Temporal Web Analytics Workshop (TempWeb)

Organizers:

Marc Spaniol, Ricardo Baeza-Yates, and Julien Masanes

TempWeb focuses on investigating infrastructures, scalable methods, and innovative software for aggregating, querying, and analyzing heterogeneous data at Internet scale. Emphasis will be given to temporal analysis of web data that has been collected over extended time periods. A major challenge in this regard is the sheer size of the data and the ability to make sense of it in a useful and meaningful manner for its users. On the Web, to a large extent, we have also reached this point. Web-scale data analytics therefore needs to develop infrastructures and extended analytical tools to make sense of this data.

Full-day

6th Wiki Workshop

Organizers:

Miriam Redi, Robert West, and Dario Taraborelli

The goal of this workshop is to bring together researchers exploring all aspects of Wikimedia websites such as Wikipedia, Wikidata, and Wikimedia Commons. With members of the Wikimedia Foundation’s Research team in the organizing committee and with the experience of successful workshops in 2015 (at ICWSM), 2016 (at WWW and ICWSM), 2017 (at WWW) and 2018 (at TheWebConf), we aim to continue facilitating a direct pathway for exchanging ideas between the organization that operates Wikimedia websites and the researchers interested in studying them.

Full-day

Workshop on Data Science for Social Good

Organizers:

Natalia Adler, Ciro Cattuto, Daniela Paolotti, Michele Tizzoni, Stefaan Verhulst, and Andrew Young

From climate change to growing inequality, geopolitical upheaval and migrations, the challenges confronting our society are unprecedented, not only in their variety but also in their complexity. Data science and non-traditional data sources are becoming increasingly important to address these challenges and to unlock new opportunities in social innovation, philanthropy, international development and humanitarian aid. Data generated and collected by Web-based systems and communities can inform new models and support global agencies and policy makers to better identify needs, design interventions and evaluate impact. This workshop will showcase innovative contributions in the emerging field of data science for social good, and it will highlight the public interest value of new partnerships for the data age.

Full-day

10th Latin American Web Congress (LA-WEB)

Organizers:

Altigran da Silva and Barbara Poblete

LA-WEB 2019 is the 10th in a series of refereed international conferences that aim to provide a venue to present, demonstrate and discuss the latest advancements in Web research that involves the Latin American Web community in its broadest sense. LA-WEB offers a great venue to show exciting new work that is mature (full papers) and work that is at an early stage and can benefit from discussion with colleagues (short papers). In this edition, LA-WEB will be co-located with The Web Conference 2019. We expect this to create an opportunity for a synergetic atmosphere between the Latin American community and the global community of leading web researchers.

Full-day

Attention from Neuroscience to the Web and Wellbeing

Organizers:

Vidhya Navalpakkam and Laurent Itti

In an age where multiple apps, advertisements and an abundance of social media are vying to get the user’s attention, attention has become an extremely scarce resource. Never before has the need to understand the attention economy been more acute. To raise the community's awareness of this issue, we present a unique, multi-disciplinary workshop that brings together experts offering diverse perspectives on attention. Starting with lessons from 4+ decades of work in Neuroscience / Cognitive Psychology on the science of attention, we will discuss advances in methods for studying attention on the web at scale and across devices, followed by applications of attention across domains ranging from information satisfaction for search/browse experiences on the web; measuring the effectiveness of advertising and web design; helping users with accessibility issues (e.g., ALS patients); to matters of current societal relevance such as designing improved measures and outcomes for digital wellbeing. We believe this workshop will spearhead new research initiatives towards better use of attention for digital wellbeing on the web and apps.

Full-day

International Workshop on Misinformation, Computational Fact-Checking, and Credible Web

Organizers:

Laks V.S. Lakshmanan, Chengkai Li, and Paolo Papotti

Our society is struggling with an unprecedented amount of falsehood which harms wealth, democracy, and health. Debunking misinformation calls for interdisciplinary advancements in multiple social science areas, in addition to computer science. The last few years have witnessed a substantial growth in efforts at data-driven, AI-powered fact-checking. These efforts tackle various fronts, such as the detection of fabricated news, rumors, and spam, automation in fact-checking, flagging clickbait, and discovering fake accounts and malicious social media bots. This workshop aims at bringing together researchers, practitioners, and educators in the aforementioned areas to explore the ongoing challenges, solutions, ethics, and educational approaches in this context, with an emphasis on studies using computational and data-driven methodology.

Full-day

Workshop on Linked Data on the Web and its Relationship to Distributed Ledgers (LDOW/LDDL)

Organizers:

Maribel Acosta, Tim Berners-Lee, Stefan Dietze, Anastasia Dimou, John Domingue, Luis Ibanez-Gonzalez, Krzysztof Janowicz, Maria-Esther Vidal, Amrapali Zaveri

The workshop on Linked Data on the Web and its Relationship to Distributed Ledgers (LDOW/LDDL) aims to stimulate discussion and further research into the challenges of publishing, consuming, and integrating structured data from the Web, covering established topics of the Linked Data on the Web (LDOW) community. As this year’s edition represents the coming together of the established Workshop on Linked Data On the Web (LDOW) with the Workshop on Linked Data and Distributed Ledgers (LDDL), we'll additionally address the question of how distributed ledgers could help towards solving some of these challenges, and how Linked Data technologies may help distributed ledgers to become more open and interoperable.

Full-day

Workshop on Fairness, Accountability, Transparency, Ethics and Society on the Web

Organizers:

Chiara Renso, Daniel Sadoc Menasché, Jonice Oliveira, Lívia Ruback, Carlos Castillo, and Jeanna Matthews

Can we build inclusive and representative machine-learning-based algorithms? Who is responsible for harm when algorithmic decision-making results in discriminatory outcomes? To whom should algorithms be transparent? What approaches to ethics might algorithms require? FATES on the Web 2019 (Fairness, Accountability, Transparency, Ethics, and Society on the Web) is the first edition of a workshop to bring together researchers and enthusiasts concerned with the urgent challenges of algorithmic fairness, accountability, transparency, and ethics in data management and social interaction on the Web.

Full-day

ECNLP: First Workshop on e-Commerce and NLP

Organizers:

Shervin Malmasi, Eugene Agichtein, Oleg Rokhlenko, Ido Guy, and Nicola Ueffing

NLP and IR have been powering e-commerce applications since the early days of the fields. Today, NLP and IR already play a significant role in e-commerce tasks, including product search, recommender systems, product question answering, sentiment analysis, product description and review summarization, and customer review processing, amongst many other tasks. With the exploding popularity of chatbots and shopping assistants – both text- and voice-based – NLP, IR, question answering, and dialogue systems research is poised to transform e-commerce once again, but requires a forum where new and unfinished ideas could be discussed. The ECNLP workshop aims to provide a venue for the dissemination of late-breaking research results and ideas related to e-commerce and online shopping, bringing together researchers from both academia and industry.

Half-day

Managing the Evolution and Preservation of the Data Web (MEPDaW)

Organizers:

Javier D. Fernández, Jeremy Debattista, Fabrizio Orlandi, and Maria-Esther Vidal

There is a vast and rapidly increasing quantity of data published on the emerging Data Web. Knowledge graphs have emerged as scalable knowledge models for integrating data collected from heterogeneous and dynamic data sources. The workshop targets one of the emerging and fundamental problems in the Web, specifically the management and preservation of evolving knowledge graphs. This topic is of particular relevance to The Web Conference since it raises awareness of the many research challenges for preserving and managing knowledge graphs that evolve over time. Fostering active usage of such evolving knowledge graphs requires further research advances on topics such as storage, synchronisation, change representation and querying. Solutions to these problems are the main subjects of interest of the workshop.

Tutorials

Half-day

The Practice of Labeling: everything you always wanted to know about labeling

Organizers:

Omar Alonso

Description:

Many data-intensive applications that use machine learning or artificial intelligence techniques depend on humans providing the initial dataset, enabling algorithms to process the rest or for other humans to evaluate the performance of such algorithms. There are, however, practical issues with the adoption of human computation and crowdsourcing at scale in the real world. Building systems and data processing pipelines that require crowd computing remains difficult. In this tutorial, we present practical considerations for designing and implementing tasks that require the use of humans and machines in combination, with the goal of producing high-quality labels.

Half-day

Deep Chit-Chat: Deep Learning for Chatbots

Organizers:

Wei Wu and Rui Yan

Description:

The tutorial is based on long-term efforts to build conversational models with deep learning approaches for chatbots. We will summarize the fundamental challenges in modeling open domain dialogues, clarify the difference from modeling goal-oriented dialogues, and give an overview of state-of-the-art methods for open domain conversation, including both retrieval-based methods and generation-based methods. In addition, our tutorial will cover some new trends in chatbot research, such as how to design a reasonable evaluation metric and how to conduct dialogue management for conversational systems in the open domain.

Half-day

Sequence-aware Recommender Systems

Organizers:

Paolo Cremonesi, Massimo Quadrana and Dietmar Jannach

Description:

In recent years, more and more recommendation algorithms have been proposed that are based on time-ordered user interaction logs. Algorithms for session-based recommendation tasks are among the most prominent examples of such approaches. Unlike more traditional rating prediction algorithms, sequence-aware algorithms are typically designed to learn sequential patterns from user behavior data. These patterns can then be used to predict the user's next action within an ongoing session or to detect short-term trends in the community. In this tutorial, we first outline the application areas of sequence-aware recommendation. We then focus on sequential and session-based recommendation techniques and discuss algorithmic proposals as well as evaluation challenges. Finally, the tutorial concludes with a hands-on session.

Half-day

Scalable Subgraph Counting: The Methods Behind The Madness

Organizers:

Comandur Seshadhri and Srikanta Tirthapura

Description:

Subgraph counting is a fundamental and widely applied problem in graph analysis that asks to count or approximate the occurrences of a small subgraph (pattern) in a large graph (dataset). The last few years have seen a rich literature develop around scalable solutions for this challenging problem. While research results have so far appeared as a somewhat disconnected set of ideas, we observe a few common algorithmic building blocks that they build on. In this tutorial, we summarize the state-of-the-art in terms of such building blocks, and highlight the practical utility of various approaches. We will also cover methods for subgraph counting in "big data" computational models such as the streaming model and parallel and distributed models.
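To make the problem statement concrete, here is a minimal sketch of exact counting for the simplest non-trivial pattern, the triangle. It is an illustration only and is not taken from the tutorial materials; the function name `count_triangles` and the edge-list input format are assumptions, and node labels are assumed to be mutually comparable (e.g., integers).

```python
def count_triangles(edges):
    """Count triangles in an undirected simple graph given as an edge list.

    Assumes node labels are comparable (e.g., integers). Self-loops and
    duplicate edges are ignored.
    """
    # Build an adjacency map: node -> set of neighbours.
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                # A common neighbour w closes the triangle {u, v, w};
                # requiring u < v < w counts each triangle exactly once.
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


if __name__ == "__main__":
    # Complete graph on 4 nodes.
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(count_triangles(k4))
```

This exact approach intersects neighbour sets for every edge, which becomes expensive on large graphs; that cost is what motivates the sampling, streaming, and distributed methods the tutorial description mentions.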

Half-day

Cloud Economics

Organizers:

Ian Kash

Description:

Current cloud pricing schemes are generally simple utility-style metering where customers pay for what is used. However, this hides substantial complexity as there are many different ways to buy the same underlying resources and subtle but important details to the pricing structure. This tutorial will survey both the resource allocation challenges faced by cloud providers and the economic mechanisms used to resolve them. Topics include spot markets, reservations, storage, resource bundling, and higher-level services such as access to datasets and machine learning models.

Half-day

Designing Equitable Algorithms for the Web

Organizers:

Ricardo Baeza-Yates and Sharad Goel

Description:

Machine learning algorithms increasingly affect both our online and offline experiences. Researchers and policymakers, however, have rightfully raised concerns that these systems might inadvertently exacerbate societal biases. We provide an introduction to fair machine learning, beginning with a general overview of algorithmic fairness, and then discussing these issues specifically in the context of the Web.

Full-day

The Challenge of API Management: API Strategies for Decentralized API Landscapes

Organizers:

Erik Wilde and Mike Amundsen

Description:

The rapidly evolving "Web of Services" is based on a diverse set of approaches and technologies. This can make architectural decisions hard when it comes to choosing how to expose information and services through an API. This challenge becomes more pronounced in organizations with continuously evolving API landscapes. This tutorial takes participants through two different journeys: (1) discussing API styles and API technologies, comparing and contrasting them as a way to highlight the fact that there is no such thing as the best choice; and (2) defining an API strategy that helps teams to make effective choices about APIs in a given context, and managing that context over time in landscapes of thousands of evolving APIs.

Half-day

A/B Testing at Scale: Accelerating Software Innovation

Organizers:

Somit Gupta, Ronny Kohavi, Alex Deng, Jeff Omhover and Pawel Janowski

Description:

Online controlled experiments help make data-driven decisions in a number of products and services like search engines (e.g., Google, Bing), retail services (e.g., Amazon, eBay, Etsy), social networking services (e.g., Facebook, LinkedIn, Twitter), and travel services (e.g., Expedia, Airbnb, Booking.com). The theory of a controlled experiment is simple. In practice, the deployment and evaluation of online controlled experiments at scale (hundreds of concurrently running experiments) presents many pitfalls and challenges. In this tutorial, we will introduce the A/B testing methodology, walk through use cases using real examples, and then focus on practical and research challenges in scaling experimentation. We will share key lessons learned from scaling experimentation at Microsoft to thousands of experiments per year and outline directions for future work.
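As an illustration of how simple the underlying theory can be, the sketch below compares the conversion rates of a control and a treatment group with a standard two-sided two-proportion z-test. It is not taken from the tutorial; the function name, its parameters, and the sample numbers are hypothetical, and it assumes the pooled conversion rate is strictly between 0 and 1.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a, conv_b: number of converting users in control / treatment.
    n_a, n_b: number of users assigned to control / treatment.
    Assumes the pooled rate is strictly between 0 and 1.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical experiment: 10% vs 15% conversion on 1000 users each.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A single test like this is the easy part; running hundreds of experiments concurrently raises issues such as multiple comparisons and interacting treatments, which this sketch does not address.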

Half-day

Crowdsourcing Inclusivity: Dealing with diversity of opinions, perspectives and ambiguity in annotated data - The CrowdTruth Tutorial

Organizers:

Lora Aroyo, Anca Dumitrache, Oana Inel, Zoltán Szlávik, Benjamin Timmermans and Chris Welty

Description:

We introduce the CrowdTruth methodology for crowdsourcing ground truth by harnessing and interpreting inter-annotator disagreement. CrowdTruth is a widely used crowdsourcing methodology adopted by industrial partners and public organizations (Google, IBM, New York Times, The Cleveland Clinic, Crowdynews, The Netherlands Institute for Sound and Vision, Rijksmuseum), in multiple domains (news, medicine, cultural heritage, social sciences). The central characteristic of CrowdTruth is harnessing the diversity in human interpretation to capture the wide range of opinions and perspectives, and thus, provide more reliable and realistic real-world annotated data for training and evaluating machine learning components. This tutorial aims to introduce this novel approach to crowdsourcing that contributes to the larger discussion on how to make the Web more reliable, diverse and inclusive.

Half-day

Socially Responsible NLP

Organizers:

Yulia Tsvetkov, Vinodkumar Prabhakaran and Rob Voigt

Description:

As language technologies have become increasingly prevalent in analyzing online data, there is a growing awareness that decisions we make about our data, methods, and tools often have immense impact on people and societies. This tutorial will provide an overview of real-world applications of NLP technologies and their potential ethical implications. We intend to provide researchers with an overview of tools to ensure that the data, algorithms, and models that they build are socially responsible. These tools will include a checklist of common pitfalls that one should avoid, as well as methods to mitigate these issues. Issues of bias, ethics, and impact are often not clear-cut; this tutorial will also discuss the complexities inherent in this area.

Half-day

Concept to Code: Deep Learning for Fashion Recommendation

Organizers:

Omprakash Sonie, Muthusamy Chelliah and Shamik Sural

Description:

Deep Learning has shown significant results in various domains. In this tutorial, we provide a conceptual understanding of embedding methods, Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). We present a fashion use case and apply these techniques to model image, text and sequence data in order to infer user profiles and give personalized recommendations tailored to changing user taste and interest. Given the image of a fashion item, recommending complementary matches is a challenge. Users’ taste evolves over time and depends on persona. Humans relate objects based on their appearance and on non-visual factors of lifestyle merchandise, which further complicates the recommendation task. Composing outfits additionally requires constituent items to be compatible: similar in some aspects but different in others.

Half-day

Economic Theories of Distributive Justice for Fair Machine Learning

Organizers:

Krishna Gummadi and Hoda Heidari

Description:

Machine Learning is increasingly employed to make consequential decisions for humans. In response to the ethical issues that may ensue, an active area of research in ML has been dedicated to the study of algorithmic unfairness. This tutorial introduces fair-ML to the web conference community and offers a new perspective on it through the lens of the long-established economic theories of distributive justice. Based on our own past and ongoing research, we believe that economic theories of equality of opportunity, inequality measurement, and social choice have a lot to offer, in terms of tools and insights, to data scientists and practitioners interested in understanding the ethical implications of their work. We give an overview of these theories and discuss their connections to fair-ML.

Half-day

Modeling and Mining Feature-Rich Networks

Organizers:

Rushed Kanawati and Martin Atzmueller

Description:

In the fields of web mining and web science, as well as data science and data mining, there has been a lot of interest in the analysis of (social) networks. With the growing complexity of heterogeneous data, feature-rich networks have emerged as a powerful modeling approach: they capture data and knowledge at different scales from multiple heterogeneous data sources, and allow mining and analysis from different perspectives. The challenge is to devise novel algorithms and tools for the analysis of such networks. This tutorial provides a unified perspective on feature-rich networks, focusing on different modeling approaches, in particular multiplex and attributed networks. It outlines important principles, methods, tools and future research directions in this emerging field.

Half-day

Online User Engagement: Metrics and Optimization

Organizers:

Liangjie Hong and Mounia Lalmas

Description:

User engagement plays a central role in online services. The main challenge is to leverage collected knowledge about the daily online behavior of millions of users to understand what engages them in the short term and the long term. Two critical steps in improving user engagement are defining metrics and optimizing them. Engagement is most commonly measured through various online metrics. This tutorial will review these metrics, their advantages and drawbacks, and their appropriateness to various types of online services. Once metrics are defined, how to optimize them becomes the key issue. We will survey methodologies used to optimize these metrics, directly or indirectly, with case studies in the domains of news, search, entertainment, and e-commerce.

Half-day

Continuous Analytics of Web Streams

Organizers:

Riccardo Tommasini, Robin Keskisärkkä, Jean-Paul Calbimonte, Eva Blomqvist, Emanuele Della Valle and Albert Bifet

Description:

We provide a comprehensive introduction to web stream processing, including the fundamental stream reasoning concepts, as well as an introduction to practical implementations and how to use them in concrete web applications. To this end, we intend to 1) survey existing research outcomes from Stream Reasoning / RDF Stream Processing that arise in querying, reasoning on and learning from a variety of highly dynamic data, 2) introduce deductive and inductive stream reasoning techniques as powerful tools for addressing data-centric problems characterized by both variety and velocity, 3) present a relevant use case that requires addressing data velocity and variety simultaneously on the web, and guide the participants in developing a Web stream processing application.

Half-day

From Research Articles to Knowledge Graphs: Methods for ontology-driven knowledge base creation from text

Organizers:

Vayianos Pertsas and Panos Constantopoulos

Description:

We address the challenge of transforming text into knowledge graphs. We will introduce participants to methods for modeling domain knowledge, extracting information from texts using ML techniques and associating this with other information mined from the Web in order to create knowledge graphs according to a domain model. The scholarly domain will serve as the use case: we will show how to model research processes, extract them from research articles, associate them with contextual information from article metadata and other digital repositories, and create knowledge bases available as linked data. Our aim is to show how different methodologies, namely NLP, ML and conceptual modeling, can be combined with Web technologies in a meaningful workflow.

Full-day

Representation Learning on Networks: Theories, Algorithms, and Applications

Organizers:

Jie Tang and Yuxiao Dong

Description:

We will give a systematic introduction to representation learning on large-scale networks, covering theories, algorithms, and applications. We will introduce both the history of and recent advances in network representation learning. Uniquely, this tutorial aims to provide the audience with 1) the underlying theories of network representation learning and 2) our experience in translating network representation learning into real-world applications on the Web, including Alibaba, AMiner, Microsoft Academic Search, as well as WeChat and Tencent. Finally, all the work introduced in the tutorial is accompanied by open code, and we will also take the opportunity to release the Open Challenge on Network Embedding with open datasets and benchmarks.

Full-day

Human Mobility from theory to practice: Data, Models and Applications

Organizers:

Filippo Simini, Gianni Barlacchi, Luca Pappalardo, Roberto Pellungrini

Description:

The rapid inclusion of tracking technologies in personal devices has opened the door to the analysis of large sets of mobility data such as GPS traces and call detail records. This tutorial presents an overview of both the modeling principles of human mobility and the machine learning models applicable to specific problems. We review the state of the art of five main aspects of human mobility: (1) the human mobility data landscape; (2) key measures of individual and collective mobility; (3) generative models at the level of the individual, the population and a mixture of the two; (4) next-location prediction algorithms; (5) applications for social good. For each aspect, we show experiments and simulations using the Python library "scikit-mobility" developed by the presenters of the tutorial.

Full-day

Social audience, under the influence

Organizers:

Augustin Chaintreau, Arthi Ramachandran and Elissa Redmiles

Description:

The increasing availability of knowledge and interconnectivity has brought with it an amplification of propaganda and influence. We must account for the influence that the systems we design may have on audience beliefs and societal outcomes. This tutorial will provide an overview of the current state-of-the-art knowledge about how audiences react to and engage with content and how audience influence has been engineered, both by human and algorithmic intervention. We will provide in-depth discussions and surveys of key research methodologies for each topic (surveying and quantifying perceptions, assessing audience size, reproducing reinforcing dynamics, and computational limits of fair rankings). We will conclude with breakout sessions to develop concrete directions for future work based on current shortcomings and underexplored areas.

Half-day

Privacy-preserving Data Mining in Industry

Organizers:

Krishnaram Kenthapadi, Ilya Mironov and Abhradeep Thakurta

Description:

Preserving the privacy of users is a key requirement of web-scale data mining applications and systems, and has received renewed attention in light of recent data breaches and regulations such as GDPR. We will first present the lessons learned from privacy breaches over the last two decades and an overview of differential privacy. Then, we will focus on the application of privacy-preserving data mining techniques in practice, by presenting case studies such as Apple's differential privacy deployment for iOS / macOS, Google's RAPPOR, LinkedIn Salary, and Microsoft's differential privacy deployment for collecting Windows telemetry. We will conclude with open problems and challenges for the data mining / machine learning community, based on our experiences in industry.

Half-day

Fairness-Aware Machine Learning: Practical Challenges and Lessons Learned

Organizers:

Sarah Bird, Ben Hutchinson, Krishnaram Kenthapadi, Emre Kıcıman and Margaret Mitchell

Description:

Researchers and practitioners from different disciplines have highlighted the ethical and legal challenges posed by the use of machine-learned models and data-driven systems, and the potential for such systems to discriminate against certain population groups because of biases in algorithmic decision-making. This tutorial aims to present an overview of algorithmic bias / discrimination and techniques for achieving fairness in machine learning systems. We will motivate the need for adopting a "fairness-first" approach when developing machine-learning-based models and systems in practice. Based on our experiences in industry, we will present case studies from different technology companies, highlight best practices, and identify open problems and research challenges for the data mining / machine learning community.

Half-day

Explainable Recommendation and Search

Organizers:

Yongfeng Zhang, Jiaxin Mao and Qingyao Ai

Description:

Explainable recommendation and search aims to develop search/recommendation models that are both accurate (i.e., they produce high-quality recommendation or search results) and explainable (i.e., the model itself is explainable, or intuitive explanations of the results can be generated). This can help to improve system transparency, persuasiveness, trustworthiness, and effectiveness. The tutorial focuses on recent research on explainable recommendation and search algorithms, as well as their application in real-world systems such as search engines, e-commerce and social networks. The tutorial aims to introduce explainable recommendation and search methods to the community, and to bring together researchers and practitioners interested in this research direction for discussion, exchange of ideas, and promotion of research.