Computational Linguistics in the Netherlands 30

Detailed CLIN30 programme

09:30 – 10:30 Semantics I – Drift 21 0.32 (chair: Els Lefever)
09:30 – 09:50 Evaluating the consistency of word embeddings from small data
Jelke Bloem1, Antske Fokkens2, Aurélie Herbelot3
1University of Amsterdam, 2VU Amsterdam, 3University of Trento
09:50 – 10:10 Type-Driven Composition of Word Embeddings in the age of BERT
Gijs Wijnholds
Utrecht University
10:10 – 10:30 A diachronic study on the compositionality of English noun-noun compounds using vector-based semantics
Prajit Dhar1, Janis Pagel2, Lonneke van der Plas3, Sabine Schulte im Walde4
1University of Groningen, 2Institute for Natural Language Processing, University of Stuttgart, 3University of Malta, 4University of Stuttgart
09:30 – 10:30 Machine Learning – Drift 21 1.05 (chair: Marco Spruit)
09:30 – 09:50 Investigating The Generalization Capacity Of Convolutional Neural Networks For Interpreted Languages
Daniel Bezema and Denis Paperno
Utrecht University
09:50 – 10:10 Predicting the number of citations of scientific articles with shallow and deep models
Gideon Maillette de Buy Wenniger1, Herbert Teun Kruitbosch2, Lambert Schomaker1, Valentijn A. Valentijn3
1Autonomous Perceptive Systems group – Bernoulli Institute, University of Groningen, 2University of Groningen, 3Kapteyn Astronomical Institute, University of Groningen
10:10 – 10:30 A Non-negative Tensor Train Decomposition Framework for Language Data
Tim Van de Cruys
09:30 – 10:30 Text Analytics I – Drift 25 1.02 (chair: Ineke Schuurman)
09:30 – 09:50 Language features and social media metadata for age prediction using CNN
Abhinay Pandya1, Mourad Oussalah1, Paola Monachesi2, Panos Kostakos1
1University of Oulu, 2Utrecht University
09:50 – 10:10 EventDNA: Identifying event mention spans in Dutch-language news text
Camiel Colruyt, Orphée De Clercq, Véronique Hoste
LT3, Ghent University
10:10 – 10:30 Cross-context News Corpus of Protest Events
Ali Hürriyetoğlu, Erdem Yoruk, Deniz Yuret, Osman Mutlu, Burak Gurel, Cagri Yoltar, Firat Durusan
Koç University
09:30 – 10:30 Syntax & Parsing I – Drift 21 0.05 (chair: Michael Moortgat)
09:30 – 09:50 Linguistic enrichment of historical Dutch using deep learning
Silke Creten1, Peter Dekker2, Vincent Vandeghinste3
1KU Leuven, 2Vrije Universiteit Brussel & Instituut voor de Nederlandse Taal, 3Instituut voor de Nederlandse Taal
09:50 – 10:10 Resolution of morphosyntactic ambiguity in Russian with two-level linguistic analysis
Uliana Petrunina
University of Tromsø
10:10 – 10:30 Task-specific pretraining for German and Dutch dependency parsing
Daniël de Kok and Tobias Pütz
University of Tübingen
09:30 – 10:30 Sentiment Analysis – Drift 21 1.09 (chair: Kalliopi Zervanou)
09:30 – 09:50 An unsupervised aspect extraction method with an application to Dutch book reviews
Stephan Tulkens1 and Andreas van Cranenburgh2
1CLiPS, University of Antwerp, 2University of Groningen
09:50 – 10:10 Improving sentiment analysis
Lorenzo Gatti and Judith van Stegeren
Human Media Interaction, University of Twente
10:10 – 10:30 Dutch language polarity analysis on reviews and cognition description datasets
Gerasimos Spanakis and Josephine Rutten
Maastricht University
11:00 – 12:00 Keynote: Multilingual Dependency Parsing: From Universal Dependencies to Sesame Street – Drift 21 0.32 (chair: Jan Odijk)
While research on dependency parsing has always had a strong multilingual orientation, the lack of standardized annotations for a long time made it difficult both to meaningfully compare results across languages and to develop truly multilingual systems. The Universal Dependencies project has during the last five years tried to overcome this obstacle by developing cross-linguistically consistent morphosyntactic annotation for many languages. During the same period, dependency parsing (like the rest of NLP) has been transformed by the adoption of continuous vector representations and neural network techniques. In this talk, I will introduce the framework and resources of Universal Dependencies, and discuss advances in multilingual dependency parsing enabled by these resources in combination with deep learning techniques, ranging from traditional word and character embeddings to deep contextualized word representations like ELMo and BERT.
Joakim Nivre
12:00 – 13:45 Poster – Drift 21 0.03
BLISS: A collection of Dutch spoken dialogue about what makes people happy
Jelte van Waterschoot1, Iris Hendrickx2, Arif Khan2, Marcel de Korte3
1University of Twente, 2Radboud University Nijmegen, 3ReadSpeaker
Dutch Anaphora Resolution: A Neural Network Approach towards Automatic die/dat Prediction
Liesbeth Allein1, Artuur Leeuwenberg2, Marie-Francine Moens1
1KU Leuven, 2Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht
Computational Model of Quantification
Guanyi Chen and Kees van Deemter
Utrecht University
Parallel corpus annotation and visualization with TimeAlign
Martijn van der Klis and Ben Bonfil
UiL OTS, Utrecht University
Towards a Dutch FrameNet lexicon and parser using the data-to-text method
Gosse Minnema1 and Levi Remijnse2
1University of Groningen, 2VU University Amsterdam
The merits of Universal Language Model Fine-tuning for Small Datasets – a case with Dutch book reviews
Benjamin van der Burgh and Suzan Verberne
LIACS, Leiden University
Low-Resource Unsupervised Machine Translation using Dependency Parsing
Lukas Edman, Gertjan van Noord, Antonio Toral
University of Groningen
Relation extraction for images using the image captions as supervision
Xue Wang1, Youtian Du1, Suzan Verberne2, Fons J. Verbeek2
1Xi’an Jiaotong University, 2LIACS, Leiden University
Evaluating and improving state-of-the-art named entity recognition and anonymisation methods
Chaïm van Toledo and Marco Spruit
Universiteit Utrecht
Automatic extraction of semantic roles in support verb constructions
Ignazio Mauro Mirto
Università di Palermo
Introducing CROATPAS: A digital semantic resource for Croatian verbs
Costanza Marini and Elisabetta Ježek
University of Pavia
SONNET: our Semantic Ontology Engineering Toolset
Maaike de Boer, Jack Verhoosel, Roos Bakker
Stylometric and Emotion-Based Features for Hate Speech Detection
Ilia Markov and Walter Daelemans
University of Antwerp, CLiPS
Evaluating an Acoustic-based Pronunciation Distance Measure Against Human Perceptual Data
Martijn Bartelds and Martijn Wieling
University of Groningen
Examination on the Phonological Rules Processing of Korean TTS
Hyeon-yeol Im
Chung-ang University
12:00 – 13:45 Poster – Drift 21 0.06
How far is “man bites dog” from “dog bites man”? Investigating the structural sensitivity of distributional verb matrices
Luka van der Plas
Utrecht University
The Effect of Vocabulary Overlap on Linguistic Probing Tasks for Neural Language Models
Prajit Dhar and Arianna Bisazza
University of Groningen
Towards automation of language assessment procedures
Sjoerd Eilander and Jan Odijk
Utrecht University, UiL-OTS
A Collection of Side Effects and Coping Strategies in Patient Discussion Groups
Anne Dirkson, Suzan Verberne, Wessel Kraaij
Leiden University
A replication study for better application of text classification in political science.
Hugo de Vos
Leiden University
Article omission in Dutch newspaper headlines
R. van Tuijl and Denis Paperno
Utrecht University
Innovation Power of ESN
Erwin Koens
HU University of Applied Sciences Utrecht
Multi-label ICD Classification of Dutch Hospital Discharge Letters
Ayoub Bagheri1, Arjan Sammani2, Daniel Oberski3, Folkert W. Asselbergs2
1Department of Methodology and Statistics, Utrecht University, 2Department of Cardiology, Division of Heart and Lungs, University Medical Center Utrecht, 3Department of Methodology and Statistics, Faculty of Social Sciences, Utrecht University
Political self-presentation on Twitter before, during, and after elections: A diachronic analysis with predictive models
Harmjan Setz, Marcel Broersma, Malvina Nissim
University of Groningen
Text Processing with Orange
Erik Tjong Kim Sang1, Peter Kok1, Wouter Smink2, Bernard Veldkamp2, Gerben Westerhof2, Anneke Sools2
1Netherlands eScience Center, 2University of Twente
Whose this story? Investigating Factuality and Storylines
Tommaso Caselli, Marcel Broersma, Blanca Calvo Figueras, Julia Meyer
Rijksuniversiteit Groningen
Bootstrapping the extension of an Afrikaans treebank through gamification
Peter Dirix1 and Liesbeth Augustinus2
1Cerence, KU Leuven, 2CCL, KU Leuven
Starting a treebank for Ughele
Peter Dirix1 and Benedicte Haraldstad Frostad2
1Cerence, KU Leuven, 2Norwegian Language Council, Oslo
13:45 – 14:45 Semantics II – Drift 21 0.32 (chair: Malvina Nissim)
13:45 – 14:05 Comparing Frame Membership to WordNet-based and Distributional Similarity
Esra Abdelkareem
Debrecen University
14:05 – 14:25 Representing a concept by the distribution of names of its instances
Matthijs Westera1, Gemma Boleda2, Sebastian Padó3
1Universitat Pompeu Fabra, 2ICREA / Universitat Pompeu Fabra, 3Universität Stuttgart
14:25 – 14:45 Semantic parsing with fuzzy meaning representations
Pavlo Kapustin1 and Michael Kapustin2
1University of Bergen, 2Moscow Institute of Physics and Technology
13:45 – 14:45 Text Analytics II – Drift 25 1.02 (chair: Paola Monachesi)
13:45 – 14:05 Annotating sexism as hate speech: the influence of annotator bias
Elizabeth Cappon1,2, Guy De Pauw1,2, Walter Daelemans2
1TEXTGAIN, 2University of Antwerp
14:05 – 14:25 Accurate Estimation of Class Distributions in Textual Data
Erik Tjong Kim Sang1, Kim Smeenk2, Aysenur Bilgin3, Tom Klaver1, Laura Hollink3, Jacco van Ossenbruggen3,4, Frank Harbers2, Marcel Broersma2
1Netherlands eScience Center, 2University of Groningen, 3CWI, 4VU
14:25 – 14:45 Interpreting Dutch Tombstone Inscriptions
Johan Bos
University of Groningen
13:45 – 14:45 Medical NLP – Drift 21 1.05 (chair: Thierry Declerck)
13:45 – 14:05 Extracting Drug, Reason, and Duration Mentions from Clinical Text Data: A Comparison of Approaches
Jens Lemmens, Simon Suster, Walter Daelemans
University of Antwerp, CLiPS
14:05 – 14:25 Dialogue Summarization for Smart Reporting: the case of consultations in health care.
Sabine Molenaar, Fabiano Dalpiaz, Sjaak Brinkkemper
Utrecht University
14:25 – 14:45 Natural Language Processing and Machine Learning for Classification of Dutch Radiology Reports
Prajakta Shouche1 and Ludo Cornelissen2
1University of Groningen, 2University Medical Center Groningen
13:45 – 14:45 Syntax & Parsing II – Drift 21 0.05 (chair: Gosse Bouma)
13:45 – 14:05 Frequency-tagged EEG responses to grammatical and ungrammatical phrases.
Amelia Burroughs, Nina Kazanina, Conor Houghton
University of Bristol
14:05 – 14:25 Detecting syntactic differences automatically using the minimum description length principle
Martin Kroon1, Sjef Barbiers1, Jan Odijk2, Stéphanie van der Pas1
1Leiden University, 2Utrecht University
14:25 – 14:45 Complementizer Agreement Revisited: A Quantitative Approach
Milan Valadou
KU Leuven
14:45 – 16:15 Poster – Drift 21 0.03
Alpino for the masses
Joachim Van den Bogaert
K.U. Leuven
GrETEL @ INT: Querying Very Large Treebanks by Example
Vincent Vandeghinste and Koen Mertens
Instituut voor de Nederlandse Taal
Convergence in First and Second Language Acquisition Dialogues
Arabella Sinclair and Raquel Fernández
ILLC, University of Amsterdam
ExpReal: A Multilingual Expressive Realiser
Ruud de Jong1, Nicolas Szilas2, Mariët Theune1
1University of Twente, 2University of Geneva
IVESS: Intelligent Vocabulary and Example Selection for Spanish vocabulary learning
Jasper Degraeuwe and Patrick Goethals
Ghent University
Interlinking the ANW Dictionary and the Open Dutch WordNet
Thierry Declerck
Psycholinguistic Profiling of Contemporary Egyptian Colloquial Arabic Words
Bacem Essam1 and Ameni Mejri2
1Peerwith, 2Debrecen University
BERT-NL: a set of language models pre-trained on the Dutch SoNaR corpus
Alex Brandsen1, Anne Dirkson1, Suzan Verberne1, Maya Sappelli2, Dungh Manh Chu3, Kimberly Stoutjesdijk3
1Leiden University, 2FDMG / HAN, 3FD Mediagroep
Dialect-aware Tokenisation for Translating Arabic User Generated Content
Pintu Lohar1, Haithem Afli2, Andy Way3
1Dublin City University, 2ADAPT Centre, Cork Institute of Technology, 3ADAPT Centre, Dublin City University
Literary MT under the magnifying glass: Assessing the quality of an NMT-translated Agatha Christie novel.
Margot Fonteyne, Arda Tezcan, Lieve Macken
LT3, Ghent University, Belgium
WordNet, occupations and natural gender
Ineke Schuurman1, Vincent Vandeghinste2, Leen Sevens1
1KU Leuven, 2Instituut voor de Nederlandse Taal
Mark my Word: A Sequence-to-Sequence Approach to Definition Modeling
Timothee Mickus1, Denis Paperno2, Mathieu Constant3
1Université de Lorraine, ATILF, 2Utrecht University, 3Université de Lorraine, CNRS, ATILF
Nederlab Word Embeddings
Martin Reynaert
KNAW Meertens Institute / Tilburg University
Neural Semantic Role Labeling Using Deep Syntax for French FrameNet
Tatiana Bladier1 and Marie Candito2
1Heinrich Heine University of Düsseldorf, 2LLF (Univ Paris Diderot / CNRS)
14:45 – 16:15 Poster – Drift 21 0.06
Acoustic speech markers for psychosis
Janna de Boer1, Alban Voppel2, Frank Wijnen3, Iris Sommer2
1UMC Utrecht, 2UMC Groningen, 3UiL OTS
Social media candidate generation as a psycholinguistic task
Stephan Tulkens, Dominiek Sandra, Walter Daelemans
University of Antwerp, CLiPS
Evaluating Language-Specific Adaptations of Multilingual Language Models for Universal Dependency Parsing
Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord
University of Groningen
SPOD: Syntactic Profiler of Dutch
Gertjan van Noord, Jack Hoeksema, Peter Kleiweg, Gosse Bouma
University of Groningen
HAMLET: Hybrid Adaptable Machine Learning approach to Extract Terminology
Ayla Rigouts Terryn, Véronique Hoste, Els Lefever
LT3, Ghent University
How Similar are Poodles in the Microwave? Classification of Urban Legend Types
Myrthe Reuver
Radboud University Nijmegen
Identifying Predictors of Decisions for Pending Cases of the European Court of Human Rights
Masha Medvedeva, Michel Vols, Martijn Wieling
University of Groningen
Rightwing Extremism Online Vernacular: Empirical Data Collection and Investigation through Machine Learning Techniques
Pierre Voué
SnelSLiM: a webtool for quick stable lexical marker analysis
Bert Van de Poel
KU Leuven
Syntactic, semantic and phonological features of speech in schizophrenia spectrum disorders; a combinatory classification approach.
Alban Voppel1,2, Janna de Boer1,2, Hugo Schnack2,3, Iris Sommer1
1UMC Groningen, 2UMC Utrecht, 3Universiteit Utrecht
Tracing thoughts – application of “ngram tracing” on schizophrenia data
Lisa Becker1 and Walter Daelemans2
1University of Potsdam, 2Universiteit Antwerpen
Spanish ‘se’ and ‘que’ in Universal Dependencies (UD) parsing: a critical review
Patrick Goethals and Jasper Degraeuwe
Ghent University
16:15 – 17:15 Language Generation – Drift 21 0.05 (chair: Mariët Theune)
16:15 – 16:35 Generating relative clauses from logic
Crit Cremers
LUCL, Leiden University
16:35 – 16:55 Elastic words in English and Chinese: are they the same phenomenon?
Lin Li, Kees van Deemter, Denis Paperno
Utrecht University
16:55 – 17:15 Generation of Image Captions Based on Deep Neural Networks
Shima Javanmardi1, Ali Mohammad Latif1, Fons Verbeek2, Mohammad Taghi Sadeghi1
1Yazd University, 2Leiden University
16:15 – 17:15 Semantics III – Drift 21 0.32 (chair: Suzan Verberne)
16:15 – 16:35 AETHEL: typed supertags and semantic parses for Dutch
Konstantinos Kogkalidis1, Michael Moortgat1, Richard Moot2
1Utrecht University, 2CNRS
16:35 – 16:55 Evaluating character-level models in neural semantic parsing
Rik van Noord, Antonio Toral, Johan Bos
University of Groningen
16:55 – 17:15 Testing Abstract Meaning Representation for Recognizing Textual Entailment
Lasha Abzianidze
CLCG, University of Groningen
16:15 – 17:15 Text & Speech Analytics – Drift 25 1.02 (chair: Erik Tjong Kim Sang)
16:15 – 16:35 Towards Dutch Automated Writing Evaluation
Orphée De Clercq
LT3, Ghent University
16:35 – 16:55 Automatic Analysis of Dutch speech prosody
Aoju Chen, Na Hu, Berit Janssen
Utrecht University
16:55 – 17:15 Hyphenation: from transformer models and word embeddings to a new linguistic rule-set
Francois Remy
16:15 – 17:15 Translation – Drift 21 1.05 (chair: Vincent Vandeghinste)
16:15 – 16:35 Translation mining in the domain of conditionals: first results
Jos Tellings
Utrecht University
16:35 – 16:55 Automatic Detection of English-Dutch and French-Dutch Cognates on the basis of Orthographic Information and Cross-Lingual Word Embeddings
Sofie Labat, Els Lefever, Pranaydeep Singh
LT3, Ghent University
16:55 – 17:15 On the difficulty of modelling fixed-order languages versus case marking languages in Neural Machine Translation
Stephan Sportel and Arianna Bisazza
University of Groningen
16:15 – 17:15 Shared Task – Drift 21 1.09 (chair: Marijn Schraagen)