11 November 2021 - 15:00 to 19:00

The 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

The workshop is co-located with EMNLP 2021. The conference will be held in November 2021 in a hybrid format: in Punta Cana, Dominican Republic, and online. Our workshop, which is fully virtual, will run on November 11.

Workshop Program

11 November 2021, Punta Cana time (AST, or GMT-4)

Q&A for five talks

The Early Modern Dutch Mediascape. Detecting Media Mentions in Chronicles Using Word Embeddings and CRF
Alie Lassche and Roser Morante

Batavia asked for advice. Pretrained language models for Named Entity Recognition in historical texts.
Sophie I. Arnoult, Lodewijk Petram and Piek Vossen

Quantifying Contextual Aspects of Inter-annotator Agreement in Intertextuality Research
Enrique Manjavacas Arevalo, Laurence Mellerin and Mike Kestemont

WMDecompose: A Framework for Leveraging the Interpretable Properties of Word Mover’s Distance in Sociocultural Analysis
Mikael Brunila and Jack LaViolette

Two Lenses on the Fragrant Harbour: Differences in Western and Hong Kong–based Reporting on the 2019–2020 Protests
Arya D. McCarthy, James A. Scharf and Giovanna Maria Dora Dore

Invited talk

Dissecting offensive language detection: does it work, and what can we do with it?
Sara Tonelli

Q&A for five talks

FrameNet-like Annotation of Olfactory Information in Texts
Sara Tonelli and Stefano Menini

End-to-end style-conditioned poetry generation: What does it take to learn from examples alone?
Jörg Wöckener, Thomas Haider, Tristan Miller, The-Khang Nguyen, Thanh Tung Linh Nguyen, Minh Vu Pham, Jonas Belouadi and Steffen Eger

Automating the Detection of Poetic Features: The Limerick as Model Organism
Almas Abdibayev, Yohei Igarashi, Allen Riddell and Daniel Rockmore

Translationese in Russian Literary Texts
Maria Kunilovskaya, Ekaterina Lapshinova-Koltunski and Ruslan Mitkov

Stylometric Literariness Classification: the Case of Stephen King
Andreas van Cranenburgh and Erik Ketzan

Walk around eleven posters

The Multilingual Corpus of Survey Questionnaires Query Interface  -  Danielly Sorato and Diana Zavala-Rojas

The FairyNet Corpus – Character Networks for German Fairy Tales  -  David Schmidt, Albin Zehe, Janne Lorenzen, Lisa Sergel, Sebastian Düker, Markus Krug and Frank Puppe

Emotion Classification in German Plays with Transformer-based Language Models Pretrained on Historical and Contemporary Language  -  Thomas Schmidt, Katrin Dennerlein and Christian Wolff

Unsupervised Adverbial Identification in Modern Chinese Literature  -  Wenxiu Xie, John Lee, Fangqiong Zhan, Xiao Han and Chi-Yin Chow

Data-Driven Detection of General Chiasmi Using Lexical and Semantic Features  -  Felix Schneider, Björn Barz, Phillip Brandes, Sophie Marshall and Joachim Denzler

BAHP: Benchmark of Assessing Word Embeddings in Historical Portuguese  -  Zuoyu Tian, Dylan S. Jarrett, Juan Manuel Escalona Torres and Patricia Amaral

The diffusion of scientific terms — tracing individuals’ influence in the history of science for English  -  Yuri Bizzoni, Stefania Degaetano-Ortlieb, Katrin Menzel and Elke Teich

A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek  -  Pranaydeep Singh, Gorik Rutten and Els Lefever

Zero-Shot Information Extraction to Enhance a Knowledge Graph Describing Silk Textiles  -  Thomas Schleider and Raphael Troncy

‘Tecnologica cosa’: Modeling Storyteller Personalities in Boccaccio’s ‘Decameron’  -  A. Feder Cooper, Maria Antoniak, Christopher De Sa, Marilyn Migiel and David Mimno

Period Classification in Chinese Historical Texts  -  Zuoyu Tian and Sandra Kübler