Stance Detection and Open Research Avenues

Dilek Küçük (dilek.kucuk@tubitak.gov.tr), TÜBİTAK Marmara Research Center, METU Campus, Ankara, Turkey
Fazli Can (canf@cs.bilkent.edu.tr), Bilkent University, Bilkent Campus, Ankara, Turkey

Abstract.

This tutorial aims to cover the state of the art in stance detection and address open research avenues for interested researchers and practitioners. Stance detection is a recent research topic in which the stance towards a given target or target set is determined from the given content, and it offers significant application opportunities in various domains. The tutorial comprises two parts: the first part outlines the fundamental concepts, problems, approaches, and resources of stance detection, while the second part covers open research avenues and application areas of stance detection. The tutorial will be a useful guide for researchers and practitioners of stance detection, social media analysis, information retrieval, and natural language processing.

Key words and phrases:

stance detection, affective computing, sentiment analysis, social media analysis, data streams, stance quantification

CCS Concepts:

Computing methodologies → Natural language processing; Information systems → Information retrieval; Information systems → Web and social media search; Information systems → Sentiment analysis; Computing methodologies → Machine learning; Computing methodologies → Language resources

1. Introduction

Stance detection is a research problem that focuses on people’s positions towards specific targets in natural language texts Küçük and Can (2020, 2021, 2022). It can be considered a subproblem of affective computing, along with related research topics such as sentiment analysis. The terms stance classification, stance analysis, and stance extraction are also used to refer to stance detection in the related literature.

One of the important milestones of stance detection research is the shared task on stance detection in English tweets held in 2016 Mohammad et al. (2016a). In the course of this competition, a stance-annotated tweet dataset was created and publicly shared Mohammad et al. (2016b). This shared task was followed by similar competitions on texts in Chinese Xu et al. (2016), Spanish and Catalan Taulé et al. (2017), Italian Cignarella et al. (2020), and Basque Agerri et al. (2021).

Significant subproblems of stance detection and closely related problems have previously been discussed in detail in Küçük and Can (2020, 2021, 2022). Based on the related figure in Küçük and Can (2020), a revised schematic representation of stance detection, its subproblems, and related problems is given in Figure 1. The newly added problems are contextual stance detection Cignarella et al. (2020); AlDayel and Magdy (2021), intent detection Küçük (2021a), and stance quantification Küçük (2021b), which are shown in blue in the figure.

Figure 1. Stance Detection, Its Subproblems, and Related Research Problems (Revised Version of the Corresponding Figure in Küçük and Can (2020)).

Contextual stance detection uses not only the input text but also contextual information (social media user profiles, interactions between posts, etc.) during the stance detection procedure. Hence, contextual stance detection is a subproblem of generic stance detection. Intent detection (together with slot filling), on the other hand, is a research problem of dialogue systems and natural language understanding, where the goals of the users are extracted from their utterances Niu et al. (2019). Stance quantification aims to determine the percentages of the textual items belonging to distinct stance classes, instead of labeling each item with its stance label Küçük (2021b). We aim to cover all of the research problems in Figure 1 during our tutorial.
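To make the output of stance quantification concrete, the following minimal sketch computes class percentages from a list of per-item stance labels. It is a naive counting baseline written for illustration only, not the method of Küçük (2021b); the function name `quantify_stance` and the three-class label set are assumptions introduced here.

```python
from collections import Counter

# Assumed label set for illustration; real datasets may use other classes.
STANCE_CLASSES = ("favor", "against", "none")


def quantify_stance(labels, classes=STANCE_CLASSES):
    """Return the percentage of items that fall into each stance class."""
    labels = list(labels)
    unknown = set(labels) - set(classes)
    if unknown:
        raise ValueError(f"unknown stance labels: {sorted(unknown)}")
    if not labels:
        # No items: report 0% for every class instead of dividing by zero.
        return {c: 0.0 for c in classes}
    counts = Counter(labels)
    return {c: 100.0 * counts[c] / len(labels) for c in classes}


# Hypothetical labels, e.g. produced by any stance classifier.
predicted = ["favor", "against", "favor", "none"]
print(quantify_stance(predicted))
# {'favor': 50.0, 'against': 25.0, 'none': 25.0}
```

The point of the sketch is only the shape of the result: one percentage per stance class for the whole collection, rather than one label per item.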

Most of the recent work on stance detection employs deep learning approaches Zhao and Yang (2020); Schiller et al. (2021). Yet, ensemble methods are also commonly observed in stance detection research Chen et al. (2021).

It has previously been reported that stance detection studies have been published on several languages, including English, Chinese, Spanish, Catalan, Italian, Japanese, Turkish, Czech, Russian, and Arabic Küçük and Can (2020). Recent work also reveals that related studies have been performed on Basque Agerri et al. (2021), German Mascarell et al. (2021), Portuguese Won and Fernandes (2022), and Persian Nasiri and Analoui (2022).

This stance detection tutorial will consist of two main parts. The first part will be devoted to the presentation of basic concepts, related research problems, stance detection competitions, machine learning and deep learning based approaches, and stance detection resources such as datasets, with particular emphasis on publicly shared datasets. The second part will mostly cover open research topics related to stance detection. The following open research topics will be considered in the second part:

(1) Stance Detection in Data Streams: Data streams constitute a significant research area Bonab and Can (2016); Gözüaçık and Can (2021); Gulcan and Can (2022), and stance detection can also be applied to large volumes of posts, particularly social media posts, provided as streams. Hence, stance detection in data streams is one of the fruitful open research areas of stance detection Bechini et al. (2021).
(2) Finer-Grained Stance Detection: Most of the work on stance detection performs three-way or two-way classification using the stance classes of favor, against, none, and neutral. However, in related work on sentiment analysis, the polarity classes occasionally include classes that possess additional semantic information, such as strongly positive, strongly negative, weakly positive, etc. To extract richer stance-related information from the underlying texts, finer-grained stance detection can be employed similarly, by extending the basic stance label set with new labels such as strongly favor, weakly favor, strongly against, etc.
(3) Stance Detection on Legal Texts and Other Text Genres: Recently, most of the stance detection research has been conducted on social media texts, particularly on tweets. However, stance detection can also be performed on legal texts and on other text genres such as generic news articles Mascarell et al. (2021), which may lead to fruitful results for the corresponding domains.
(4) Cross-lingual and Multilingual Stance Detection: In cross-lingual stance detection, a stance-annotated dataset in one language is used to improve stance detection in another language Mohtarami et al. (2019); Hardalov et al. (2022), where the latter language is usually a low-resource one. In multilingual stance detection Lai et al. (2020), on the other hand, a stance detection approach is usually applied to datasets in different languages to test the portability of the employed approach. Both cross-lingual and multilingual stance detection are important open research topics of stance detection.
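As an illustration of the finer-grained label set mentioned in the second topic, the sketch below defines an extended set of stance labels and maps each one back to the basic three-way scheme, so that fine-grained outputs stay comparable with existing three-way results. The names `FineStance`, `COARSE_OF`, and `to_coarse` are hypothetical, and the label "weakly against" is an assumption added by symmetry with the labels named in the text.

```python
from enum import Enum


class FineStance(Enum):
    """Hypothetical finer-grained stance label set."""
    STRONGLY_FAVOR = "strongly favor"
    WEAKLY_FAVOR = "weakly favor"
    NONE = "none"
    WEAKLY_AGAINST = "weakly against"  # assumed by symmetry, not named in the text
    STRONGLY_AGAINST = "strongly against"


# Each fine-grained label collapses to one of the basic stance classes.
COARSE_OF = {
    FineStance.STRONGLY_FAVOR: "favor",
    FineStance.WEAKLY_FAVOR: "favor",
    FineStance.NONE: "none",
    FineStance.WEAKLY_AGAINST: "against",
    FineStance.STRONGLY_AGAINST: "against",
}


def to_coarse(label: FineStance) -> str:
    """Map a fine-grained stance label to its basic three-way class."""
    return COARSE_OF[label]


fine_predictions = [
    FineStance.STRONGLY_FAVOR,
    FineStance.WEAKLY_AGAINST,
    FineStance.NONE,
]
print([to_coarse(p) for p in fine_predictions])
# ['favor', 'against', 'none']
```

Keeping an explicit mapping of this kind is one possible design choice; it lets a finer-grained system be evaluated on datasets annotated only with the basic classes.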

The second part of the tutorial will also cover important application areas of stance detection, such as topics in information retrieval Al-Ghadir et al. (2021), fake news detection Shu et al. (2019), and rumour detection and resolution Yang et al. (2022); Huang et al. (2022).

2. Target Audience, Prerequisites, and Benefits

The two main benefits of this tutorial are listed below:

• The tutorial will first cover the basic concepts, approaches, and resources for stance detection. While covering approaches and resources such as datasets, more emphasis will be given to recent research on stance detection.
• Second, significant open research topics related to stance detection will be outlined, along with the main application areas of stance detection. Although these two aspects can be thought of as overlapping, here we will put more emphasis on open research avenues of stance detection.

The target audience of this tutorial includes researchers and practitioners interested in stance detection, social media analysis, affective computing, natural language processing, and information retrieval. We believe that the tutorial will be beneficial both to researchers and practitioners who have prior knowledge of stance detection and to those who do not, since it will cover the basics, the current state of the art, and significant future research opportunities.

There are no prerequisites for this tutorial.

3. Tutorial Outline

The outline of our half-day tutorial proposal on stance detection and related open research problems is provided below.

(1) Tutorial Part I: Basics, Competitions, Approaches, and Datasets
  (a) Basic concepts of stance detection
    (i) Stance detection: problem definition
    (ii) Subproblems of stance detection
    (iii) Problems related to stance detection
  (b) Stance detection competitions (shared tasks)
    (i) Shared task on English tweets (2016)
    (ii) Shared task on Chinese microblogs (2016)
    (iii) Shared task on Spanish and Catalan tweets (2017)
    (iv) Shared task on Italian tweets (2020)
    (v) Shared task on Basque tweets (2021)
  (c) Approaches to stance detection
  (d) Stance detection datasets
(2) Tutorial Part II: Open Research Avenues and Application Areas
  (a) Open research avenues
    (i) Stance detection in data streams
    (ii) Finer-grained stance detection
    (iii) Stance detection on legal documents and on other text genres
    (iv) Cross-lingual and multilingual stance detection
  (b) Application areas
    (i) Information retrieval
    (ii) Rumour classification
    (iii) Fake news detection

4. Previous Related Tutorials

4.1. Detection and Characterization of Stance on Social Media (ICWSM-2020)

This stance detection tutorial was presented as part of the ICWSM-2020 conference and particularly considers stance detection on social media (http://smash.inf.ed.ac.uk/files/Part2_phase2.pdf). Our current stance detection tutorial covers more recent work, is not limited to stance detection on social media (though we acknowledge that most of the related work is performed on social media), and allots almost half of the tutorial duration to open research avenues. Therefore, we believe that our tutorial will be very beneficial for researchers and graduate students who are about to start stance detection research.

4.2. Stance Detection: Concepts, Approaches, Resources, and Outstanding Issues (SIGIR-2021)

This tutorial was presented at the SIGIR-2021 conference and constitutes the initial form of our tutorials on stance detection Küçük and Can (2021). It is mostly built upon the work presented in Küçük and Can (2020), and the current tutorial overlaps with it, particularly regarding the basic concepts of stance detection. Yet, as emphasized in the previous sections, the current tutorial covers research problems closely related to stance detection that were not considered in this preceding tutorial. Being more recent, the current tutorial also includes more recent studies on stance detection and emphasizes open research topics.

4.3. A Tutorial on Stance Detection (WSDM-2022)

This is the second version of our tutorials on stance detection and was presented at WSDM-2022 Küçük and Can (2022). It is an extended form of the initial tutorial presented at SIGIR-2021, including more recent related studies compared to the previous one. The current tutorial differs from this second version in that it covers more recent research and additionally emphasizes important open research avenues.

5. Short Biographies of Presenters

5.1. Dilek Küçük

Dilek Küçük is an associate professor and senior chief researcher at TÜBİTAK Marmara Research Center (MRC) in Ankara, Turkey. She received her B.Sc., M.Sc., and Ph.D. degrees in Computer Engineering, all from Middle East Technical University (Ankara, Turkey), in 2003, 2005, and 2011, respectively. From May 2013 to May 2014, she worked as a post-doctoral researcher at the European Commission’s Joint Research Centre in Italy. Her research interests include stance detection, social media analysis, and energy informatics. She is the author or co-author of 16 papers published in SCI-indexed journals (including ACM-CSUR and IEEE transactions) and 42 papers presented at international conferences/workshops. She was a joint tutorial presenter at the SIGIR-2021 and WSDM-2022 conferences. Her personal Web page is available at https://dkucuk.github.io/en.html.

5.2. Fazli Can

Fazli Can received the B.Sc. and M.Sc. degrees in Electrical and Electronics and Computer Engineering and the Ph.D. degree in Computer Engineering from Middle East Technical University, Ankara, Turkey, in 1976, 1979, and 1985, respectively. He conducted his Ph.D. research under the supervision of Prof. E. Ozkarahan at Arizona State University, Tempe, AZ, USA, and Intel, Chandler, AZ, USA, as a part of the RAP Database Machine Project. He is currently a faculty member at Bilkent University, Ankara. Before joining Bilkent, he was a tenured full professor at Miami University, Oxford, OH, USA. He co-edited ACM SIGIR Forum from 1995 to 2002 and is a co-founder of the Bilkent Information Retrieval Group, Bilkent University. His interest in dynamic information processing dates back to his 1993 incremental clustering paper in ACM Transactions on Information Systems and some earlier work with Prof. E. Ozkarahan on dynamic cluster maintenance. His current research interests include information retrieval and data mining. His personal Web page is available at http://www.cs.bilkent.edu.tr/~canf.

References

Agerri et al. (2021) Rodrigo Agerri, Roberto Centeno, María Espinosa, Joseba Fernandez de Landa, and Alvaro Rodrigo. 2021. VaxxStance@IberLEF-2021: Overview of the task on going beyond text in cross-lingual stance detection. Procesamiento del Lenguaje Natural 67 (2021), 173–181.
Al-Ghadir et al. (2021) Abdulrahman I Al-Ghadir, Aqil M Azmi, and Amir Hussain. 2021. A novel approach to stance detection in social media tweets by fusing ranked lists and sentiments. Information Fusion 67 (2021), 29–40.
AlDayel and Magdy (2021) Abeer AlDayel and Walid Magdy. 2021. Stance detection on social media: State of the art and trends. Information Processing & Management 58, 4 (2021), 102597.
Bechini et al. (2021) Alessio Bechini, Alessandro Bondielli, Pietro Ducange, Francesco Marcelloni, and Alessandro Renda. 2021. Addressing event-driven concept drift in twitter stream: A stance detection application. IEEE Access 9 (2021), 77758–77770.
Bonab and Can (2016) Hamed R Bonab and Fazli Can. 2016. A theoretical framework on the ideal number of classifiers for online ensembles in data streams. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management. 2053–2056.
Chen et al. (2021) Pengyuan Chen, Kai Ye, and Xiaohui Cui. 2021. Integrating N-gram features into pre-trained model: a novel ensemble model for multi-target stance detection. In International Conference on Artificial Neural Networks. Springer, 269–279.
Cignarella et al. (2020) Alessandra Teresa Cignarella, Mirko Lai, Cristina Bosco, Viviana Patti, Paolo Rosso, et al. 2020. SardiStance@EVALITA2020: Overview of the task on stance detection in Italian tweets. In EVALITA 2020 Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. 1–10.
Gözüaçık and Can (2021) Ömer Gözüaçık and Fazli Can. 2021. Concept learning using one-class classifiers for implicit drift detection in evolving data streams. Artificial Intelligence Review 54, 5 (2021), 3725–3747.
Gulcan and Can (2022) Ege Berkay Gulcan and Fazli Can. 2022. Unsupervised concept drift detection for multi-label data streams. Artificial Intelligence Review (2022), 1–34.
Hardalov et al. (2022) Momchil Hardalov, Arnav Arora, Preslav Nakov, and Isabelle Augenstein. 2022. Few-shot cross-lingual stance detection with sentiment-based pre-training. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 10729–10737.
Huang et al. (2022) Weidong Huang, Yuan Wang, Jinyuan Yang, and Yijun Xu. 2022. Stance Detection Based on User Feature Fusion. Computational Intelligence and Neuroscience 2022 (2022).
Küçük (2021a) Dilek Küçük. 2021a. Sentiment, stance, and intent detection in Turkish tweets. In New Opportunities for Sentiment Analysis and Information Processing. IGI Global, 206–217.
Küçük (2021b) Dilek Küçük. 2021b. Stance quantification: Definition of the problem. arXiv preprint arXiv:2112.13288 (2021).
Küçük and Can (2020) Dilek Küçük and Fazli Can. 2020. Stance detection: A survey. ACM Computing Surveys (CSUR) 53, 1 (2020), 1–37.
Küçük and Can (2021) Dilek Küçük and Fazli Can. 2021. Stance detection: Concepts, approaches, resources, and outstanding issues. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2673–2676.
Küçük and Can (2022) Dilek Küçük and Fazli Can. 2022. A tutorial on stance detection. In Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining. 1626–1628.
Lai et al. (2020) Mirko Lai, Alessandra Teresa Cignarella, Delia Irazú Hernández Farías, Cristina Bosco, Viviana Patti, and Paolo Rosso. 2020. Multilingual stance detection in social media political debates. Computer Speech & Language 63 (2020), 101075.
Mascarell et al. (2021) Laura Mascarell, Tatyana Ruzsics, Christian Schneebeli, Philippe Schlattner, Luca Campanella, Severin Klingler, and Cristina Kadar. 2021. Stance detection in German news articles. In Proceedings of the Fourth Workshop on Fact Extraction and VERification (FEVER). Association for Computational Linguistics, 66–77.
Mohammad et al. (2016a) Saif Mohammad, Svetlana Kiritchenko, Parinaz Sobhani, Xiaodan Zhu, and Colin Cherry. 2016a. SemEval-2016 Task 6: Detecting stance in tweets. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016). 31–41.
Mohammad et al. (2016b) Saif M Mohammad, Svetlana Kiritchenko, Parinaz Sobhani, Xiaodan Zhu, and Colin Cherry. 2016b. A dataset for detecting stance in tweets. In Proceedings of the Language Resources and Evaluation Conference. 3945–3952.
Mohtarami et al. (2019) Mitra Mohtarami, James Glass, and Preslav Nakov. 2019. Contrastive language adaptation for cross-lingual stance detection. arXiv preprint arXiv:1910.02076 (2019).
Nasiri and Analoui (2022) Homa Nasiri and Morteza Analoui. 2022. Persian stance detection with transfer learning and data augmentation. In Proceedings of the 27th International Computer Conference, Computer Society of Iran (CSICC). 1–5.
Niu et al. (2019) Peiqing Niu, Zhongfu Chen, Meina Song, et al. 2019. A novel bi-directional interrelated model for joint intent detection and slot filling. arXiv preprint arXiv:1907.00390 (2019).
Schiller et al. (2021) Benjamin Schiller, Johannes Daxenberger, and Iryna Gurevych. 2021. Stance detection benchmark: How robust is your stance detection? KI-Künstliche Intelligenz 35, 3 (2021), 329–341.
Shu et al. (2019) Kai Shu, Suhang Wang, and Huan Liu. 2019. Beyond news contents: The role of social context for fake news detection. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. 312–320.
Taulé et al. (2017) Mariona Taulé, M Antonia Martí, Francisco Rangel, Paolo Rosso, Cristina Bosco, and Viviana Patti. 2017. Overview of the task on stance and gender detection in tweets on Catalan independence at IberEval 2017. In Proceedings of the Second Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2017).
Won and Fernandes (2022) Miguel Won and Jorge Fernandes. 2022. SS-PT: A stance and sentiment data set from Portuguese quoted tweets. In Proceedings of the International Conference on Computational Processing of the Portuguese Language. Springer, 110–121.
Xu et al. (2016) Ruifeng Xu, Yu Zhou, Dongyin Wu, Lin Gui, Jiachen Du, and Yun Xue. 2016. Overview of NLPCC shared task 4: Stance detection in Chinese microblogs. In Natural Language Understanding and Intelligent Applications. 907–916.
Yang et al. (2022) Ruichao Yang, Jing Ma, Hongzhan Lin, and Wei Gao. 2022. A weakly supervised propagation model for rumor verification and stance detection with multiple instance learning. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1761–1772.
Zhao and Yang (2020) Guangzhen Zhao and Peng Yang. 2020. Pretrained embeddings for stance detection with hierarchical capsule network on social media. ACM Transactions on Information Systems (TOIS) 39, 1 (2020), 1–32.