Bridging Classification and Visualization

Derya Ipek Eroglu, Onur Seref, Michelle Seref

WHAT IS DONE

· Collected text data from a Reddit discussion community using the Reddit API in Python, and preprocessed the data with Python NLP libraries

· Represented the collected data numerically using the Doc2Vec method and dimensionality reduction techniques, and built an interactive visualization of the entire corpus

· Implemented a deep learning method (BiLSTM) to detect persuasive content in the community

· Found that the deep learning implementation has limitations on this dataset, and analyzed those limitations using the previously built vector representation to extract patterns and understand the behavior of the black-box model

ABSTRACT

Digital platforms have become an indispensable part of people’s lives in many ways, and they have changed the way people interact with each other. These platforms serve as public spheres where people hold discussions, share information and persuade each other. In this study, we analyze discussions from an online platform in the context of racial injustice. We seek answers to two questions: (1) Can we effectively capture characteristic textual features in these discussions using distributed vector representations? and (2) Can we effectively detect persuasion with the help of these vectors? To answer the first question, we use sentence-level vectors, dimensionality reduction and clustering techniques to visualize sentence distributions, followed by a qualitative analysis of sentences whose representations lie in close proximity. To answer the second, we build a context-aware deep learning approach using a stacked Bidirectional Long Short-Term Memory (BiLSTM) architecture to detect persuasion, and we compare its performance with a support vector machine (SVM) classifier combined with distributed vectors and topic modeling vectors. Finally, we bridge the classification and visualization results to analyze the classification performance in terms of semantic relationships.

Keywords: Distributed Vector Representation, Visualization, Deep Learning, Persuasion