Using Neural Networks To Detect Examples of Ad Hominem In Politics and Microblogging Platforms – Tensorflow – Model






Presentation for MICS 2016

On GitHub: jghibiki/mics2016

Using Neural Networks To Detect Examples of Ad Hominem In Politics and Microblogging Platforms

By Jordan Goetze
Computer Science, North Dakota State University
Fargo, North Dakota 58103
jordan.goetze@ndsu.edu

Introduction

  • Ad Hominem
  • Tensorflow
  • Model
  • Dataset and Experimental Setup
  • Results and Observations
  • Future Work

Ad Hominem

Attacking a speaker's character rather than their argument.

@RealBenCarson take your common sense BS and stick it. What this country needs is a laxative. Your a doctor and can't see that? #Trump2016

Tensorflow

What is Tensorflow?

  • Open source library by Google
  • Python and C++ APIs
  • CPU or GPU architectures
  • Represents models as graphs
  • Recently added distributed support

Model

Yoon Kim's Convolutional Neural Networks for Sentence Classification

Embedding Layer

word2vec
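The embedding layer maps each token ID to a dense vector by row lookup. A NumPy sketch with a hypothetical vocabulary, using the 20-dimension randomly initialized embeddings the baseline model uses (the `vocab` contents are illustrative, not the talk's actual vocabulary):

```python
import numpy as np

# Hypothetical vocabulary; the baseline uses 20-dim random embeddings
vocab = {"<AT_NAME/>": 0, "why": 1, "always": 2, "so": 3, "#smug": 4, "<URL/>": 5}
embed_dim = 20
rng = np.random.default_rng(0)
embeddings = rng.normal(scale=0.1, size=(len(vocab), embed_dim))

def embed(tokens):
    # Look up each token's row: result is a (sentence_len, embed_dim) matrix
    return embeddings[[vocab[t] for t in tokens]]

sentence = embed(["<AT_NAME/>", "why", "always", "so", "#smug", "<URL/>"])
print(sentence.shape)  # (6, 20)
```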

Convolutional Layers
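In Kim's architecture, each filter spans the full embedding width and slides over a window of consecutive words, followed by max-over-time pooling. A NumPy sketch using the hyperparameters from the tuning slide (windows 3, 4, 5; 50 filters per size); this illustrates the computation, not the actual TensorFlow implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
sent_len, embed_dim = 10, 20
sentence = rng.normal(size=(sent_len, embed_dim))  # embedded sentence

def conv_max_pool(sentence, window, n_filters=50):
    """One convolutional block: each filter covers `window` consecutive
    word vectors, with ReLU activation and max-over-time pooling
    (one scalar per filter)."""
    filters = rng.normal(size=(n_filters, window * embed_dim))
    n_positions = sentence.shape[0] - window + 1
    feats = np.empty((n_positions, n_filters))
    for i in range(n_positions):
        patch = sentence[i:i + window].ravel()     # window of word vectors
        feats[i] = np.maximum(filters @ patch, 0.0)  # ReLU
    return feats.max(axis=0)                       # max-over-time pooling

# Windows of 3, 4, 5 words, 50 filters each -> 150-dim feature vector
features = np.concatenate([conv_max_pool(sentence, w) for w in (3, 4, 5)])
print(features.shape)  # (150,)
```

The pooled features from all window sizes are concatenated and fed to the output layer.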

Output Layer

Loss

Accuracy

Sensitivity

Specificity
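The metrics reported later (accuracy, sensitivity, specificity) all derive from confusion-matrix counts; a quick sketch with illustrative counts (not the talk's actual confusion matrix):

```python
def metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (recall on positive examples), and
    specificity (recall on negative examples) from confusion-matrix
    counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

# Illustrative counts only: high specificity, low sensitivity,
# the same shape the results below show
acc, sens, spec = metrics(tp=35, fp=5, tn=95, fn=65)
print(round(acc, 3), round(sens, 3), round(spec, 3))  # 0.65 0.35 0.95
```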

Dataset and Experimental Setup

Data Set

Total Tweets       5808
Negative Examples  4155
Positive Examples  1653

Data Preprocessing

Original Tweet     || @HillaryClinton why always so #smug? https://t.co/eOU1rOaOlR
Preprocessed Tweet || <AT_NAME/> why always so #smug <URL/>
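The preprocessing shown above can be sketched with a few regular expressions: replace @mentions and URLs with placeholder tokens and strip trailing punctuation. The exact rules used for the talk's dataset may differ; this is an approximation that reproduces the example:

```python
import re

def preprocess(tweet):
    """Approximate preprocessing: placeholder tokens for @mentions and
    URLs, then strip sentence punctuation."""
    tweet = re.sub(r"@\w+", "<AT_NAME/>", tweet)           # @mentions
    tweet = re.sub(r"https?://\S+", "<URL/>", tweet)       # URLs
    tweet = re.sub(r"[?!.,]+(\s|$)", r"\1", tweet)         # punctuation
    return tweet.strip()

print(preprocess("@HillaryClinton why always so #smug? https://t.co/eOU1rOaOlR"))
# <AT_NAME/> why always so #smug <URL/>
```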

Hyperparameter Tuning

Filter Windows              3, 4, 5
Dropout Rate                0.5
L2 Constraint               3
Mini-batch Size             64
Filters per Size            50
Word Embedding Dimensions*  20

Model Variations

N-T-Names

  • Baseline model
  • Twitter usernames are replaced with a single non-unique token (example: <AT_NAME/>)
  • Word embeddings randomly initialized and trained with the model

U-T-Names

  • Based on the baseline model (N-T-Names)
  • Twitter usernames are each replaced with a unique token (example: <AT_NAME_123/>)

G-N-Vecs

  • Based on the baseline model (N-T-Names)
  • Initialized with 300-dimension pre-trained word embeddings:
      • Publicly available model
      • Trained using word2vec
      • Trained on 100 billion words from Google News articles
  • If a word is not included in the 300-dimension pre-trained embeddings, it is initialized randomly.
Total Words                      8313  100% of words
Pre-trained Embeddings           5841  70.3% of words
Randomly Initialized Embeddings  2472  29.7% of words
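The mixed initialization can be sketched as follows: copy a pre-trained vector when the word is covered, otherwise draw a random one. The tiny `vocab` and `pretrained` dict are stand-ins for the real vocabulary and the Google News word2vec model:

```python
import numpy as np

embed_dim = 300
vocab = ["the", "laxative", "<AT_NAME/>", "#smug"]
# Hypothetical stand-in for the pre-trained Google News word2vec model;
# only some words are covered (70.3% in the talk's dataset)
pretrained = {"the": np.ones(embed_dim)}

rng = np.random.default_rng(0)
matrix = np.empty((len(vocab), embed_dim))
n_pretrained = 0
for i, word in enumerate(vocab):
    if word in pretrained:
        matrix[i] = pretrained[word]                       # copy pre-trained vector
        n_pretrained += 1
    else:
        matrix[i] = rng.uniform(-0.25, 0.25, embed_dim)    # random init
print(n_pretrained, len(vocab) - n_pretrained)  # 1 3
```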

Results and Observations

Run Time: 500 epochs/35.4K runs

Model      Accuracy  Specificity  Sensitivity
N-T-Names  87.4%     95.6%        35.5%
U-T-Names  87.3%     95.1%        38.2%
G-N-Vecs   87.5%     95.8%        34.2%

N-T-Names vs U-T-Names

Model      Accuracy  Specificity  Sensitivity
N-T-Names  87.4%     95.6%        35.5%
U-T-Names  87.3%     95.1%        38.2%

N-T-Names vs G-N-Vecs

Model      Accuracy  Specificity  Sensitivity
N-T-Names  87.4%     95.6%        35.5%
G-N-Vecs   87.5%     95.8%        34.2%

Future Work

Early Stopping

(Training-curve figure: green = N-T-Names, yellow = U-T-Names, brown = G-N-Vecs)
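Early stopping halts training once the validation metric stops improving, instead of always running the full 500 epochs. A minimal sketch, assuming a validation callback and a patience threshold (both invented names; the talk does not specify an implementation):

```python
def train_with_early_stopping(epochs, validate, patience=10):
    """Sketch of early stopping: halt when validation accuracy has not
    improved for `patience` consecutive epochs. `validate` is any
    callable returning the validation accuracy for an epoch."""
    best, best_epoch = -1.0, 0
    for epoch in range(epochs):
        acc = validate(epoch)
        if acc > best:
            best, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs
    return best, best_epoch

# Toy validation curve that peaks at epoch 5 and then plateaus
curve = lambda e: min(e, 5) / 10
best, when = train_with_early_stopping(500, curve)
print(best, when)  # 0.5 5
```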

Data Ratio

Total Tweets       5808
Negative Examples  4155
Positive Examples  1653

Ratio: 5 Negative Examples : 2 Positive Examples
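One possible remedy for the roughly 5:2 class imbalance is oversampling the positive class; this is a sketch of that option, not necessarily the approach the talk intends. The counts match the dataset slide:

```python
import random

# Counts from the dataset slide: 4155 negative, 1653 positive
random.seed(0)
negatives = ["neg"] * 4155
positives = ["pos"] * 1653

# Duplicate positive examples until the classes are balanced
oversampled = positives * (len(negatives) // len(positives))
oversampled += random.sample(positives, len(negatives) % len(positives))
balanced = negatives + oversampled
print(len(negatives), len(oversampled))  # 4155 4155
```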

Questions

References

Mikolov et al. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of NIPS 2013, October 2013.
Britz, D. Implementing a CNN for Text Classification in TensorFlow. WildML, December 2015.
Kim, Y. Convolutional Neural Networks for Sentence Classification. In Proceedings of EMNLP 2014, September 2014.


Abadi et al. Vector Representations of Words. 2015.
Goldberg, Y. and Levy, O. Word2Vec Explained: Deriving Mikolov et al.'s Negative-Sampling Word-Embedding Method. February 2014.


Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Rafal Jozefowicz, Yangqing Jia, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Mike Schuster, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.
