Kalina Bontcheva, University of Sheffield

Title: Natural Language Processing for Social Media: Are We There Yet?

Abstract: Social media poses three major computational challenges, dubbed by Gartner the 3Vs of big data: volume, velocity, and variety. NLP methods in particular face further difficulties arising from the short, noisy, and strongly contextualised nature of social media content. As a result, they tend to perform significantly worse on social media than on longer, cleaner texts such as news. After a short introduction to the challenges of processing social media, the talk will cover key NLP methods (corpus annotation, linguistic pre-processing, information extraction, and opinion mining) adapted to social content, discuss available evaluation datasets, and outline remaining challenges. Since the lack of human-annotated NLP corpora of social media content is one of the key challenges, the talk will also cover crowdsourcing approaches used to collect training and evaluation data (including paid-for crowdsourcing with CrowdFlower and its combination with expert-sourcing). I will also briefly discuss practical and ethical considerations arising from gathering and mining social media content.