Uncategorized - SEDA Lab

New paper: Classification and event identification using word embedding

24th Jul, 2019 Tristan Cann

A new paper “Classification and event identification using word embedding” is now available online.

This paper presents our contribution to the CLEF 2019 ProtestNews Track, which aims to classify and identify protest events in English-language news from India and China. We used traditional classification models, namely, support vector machines and XGBoost classifiers, combined with various word embedding approaches. Multiple models were tested for experimental purposes, in addition to the two models evaluated within the official campaign. Results show promising performance, especially in terms of precision on both document and sentence classification tasks.
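As a rough illustration of how word embeddings can feed a traditional classifier, the sketch below averages the vectors of the words in a text to get one fixed-length feature vector. This is only one common approach and is an assumption here, not necessarily what the paper does; the function name `average_embedding` and the toy two-dimensional vectors are made up for the example.

```python
def average_embedding(tokens, embeddings, dim):
    """Return the mean vector of all tokens found in the embedding table.

    Tokens without an embedding are skipped; if none are known,
    a zero vector of length `dim` is returned.
    """
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    # zip(*vectors) walks the vectors one dimension at a time.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Toy embedding table (hypothetical values, for illustration only).
toy_embeddings = {
    "protest": [1.0, 0.0],
    "march": [0.8, 0.2],
    "market": [0.0, 1.0],
}

doc_vector = average_embedding(
    "workers march in protest".split(), toy_embeddings, dim=2
)
empty_vector = average_embedding(["unknown"], toy_embeddings, dim=2)
```

A vector like `doc_vector` could then be passed as a feature row to a classifier such as a support vector machine or XGBoost.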

Twitter experiment at Royal Meteorological Society Conference

10th Jul, 2019 Michelle Spruce

Michelle Spruce recently attended the Royal Meteorological Society (RMetS) Student & Early Career Researcher conference at the University of Birmingham on 4-5 July 2019.

Our first speaker this morning is Michelle Spruce from @UniofExeter on using social media for monitoring and assessing weather events #RMetSStudent pic.twitter.com/3GBtwx5RTM

— Royal Met. Society (@RMetS) July 4, 2019

Great start to #RMetSStudents learning about social sensing of extreme weather with Michelle Spruce. pic.twitter.com/sPDA19HXbF

— Sally Woodhouse (@sallywoodhouse_) July 4, 2019

Michelle Spruce @UniofExeter uses social media data to study the impacts weather events.
The @RMetS #RMetSstudents conference just started! pic.twitter.com/qpIk9Lsh70

— Caroline Coch (@PolarCaro) July 4, 2019

Really interesting talk by Michelle Spruce on using Twitter to track the social impacts of storms #RMetSStudents @RMetS

— Chris Manktelow (@chrismanktelow3) July 4, 2019

Michelle opened the conference by presenting her research on the social sensing of extreme weather events. She also encouraged attendees to use Twitter during the conference as part of a social sensing experiment to understand the impact of ‘tweeting’ at an academic conference.

Social experiment at #RMetSStudents, in the capacity of using social media to infer weather impacts.

(This might mean I get rather academic these two days)

— Sleepy Tom (@Claxtneph) July 4, 2019

Over the two days of the conference, attendees tweeted news and updates using the conference hashtag #RMetSStudents. By lunchtime on the second day, with just 162 tweets, Michelle was able to demonstrate the wider impact of these tweets:

Results of the #RMetSStudents Social Sensing experiment – good work everyone! pic.twitter.com/l9gxEr3cnO

— Royal Met. Society (@RMetS) July 5, 2019

Having a great time today at the @RMetS Student conference at @unibirmingham. The twitter world is buzzing with news from the conference #rmetsstudents pic.twitter.com/fxZBMRv953

— Amanda Maycock (@acmaycock) July 5, 2019

The power of #twitter, our little experiment at #RMetSStudents shows how many people you can reach with just a few tweets. pic.twitter.com/Fytm1QQmHn

— Sally Woodhouse (@sallywoodhouse_) July 5, 2019

By the end of the conference, 203 tweets including this hashtag had been posted by 44 users in 6 countries and 13 cities. Although this is a seemingly small amount of data, these tweets generated a potential reach of over 32,000 Twitter users and over 500,000 impressions (individual views of these tweets). This simple experiment demonstrated the power of Twitter as a source of information, even for small-scale events such as this.

Conference paper accepted: Classification and Event Identification Using Word Embedding

18th Jun, 2019 Tristan Cann

Our new paper has just been accepted for presentation at CLEF 2019 in September.

Classification and Event Identification Using Word Embedding

This paper presents our contribution to the CLEF 2019 ProtestNews Track, which aims to classify and identify protest events in English-language news from India and China. We used traditional classification models, namely, support vector machines and XGBoost classifiers, combined with various word embedding approaches. Multiple models were tested for experimental purposes, in addition to the two models evaluated within the official campaign. Results show promising performance, especially in terms of precision on both document and sentence classification tasks.

Come and talk to us if you would like to know more.

New paper: Communities of online news exposure during the UK General Election 2015

7th Jun, 2019 Tristan Cann

New paper available in Online Social Networks and Media

Communities of online news exposure during the UK General Election 2015

Media exposure has become increasingly complex and hard to measure with the rise in online news consumption. Furthermore, since many people now routinely access news via social media, questions arise as to whether social news-sharing is affected by the polarization and partisan echo chambers that are often observed in social media communication. This study considers news-sharing on Twitter during the UK General Election in 2015, using the act of sharing as an indicator that the sharer has been exposed to that online news content. Analysis of the network structure of users and the news articles they share identifies multiple distinct user communities, which are characterized by analysis of the articles shared within them. Communities are characterised by news article sources (web domains), geographical origin and content; time of article publication was also considered but showed no significant relationships. There is evidence for ideologically biased audiences that predominantly share content from either left-leaning or right-leaning news sources, but these audiences also see content from opposing viewpoints. Other audiences are characterized by geography and/or specialised on particular news topics. Overall these findings suggest that many people consume a diverse range of news content over the election period and that the level of political bias in content exposure varies widely across the Twitter user population.
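To make the idea of a user-article sharing network concrete, here is a minimal sketch that links users who shared the same article and then groups users who are connected. The share records are invented, and connected components are used only as the crudest possible stand-in for community detection; the method actually used in the paper is not described in this post.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (user, article) share records; not data from the study.
shares = [
    ("alice", "guardian.example/a1"),
    ("bob", "guardian.example/a1"),
    ("bob", "guardian.example/a2"),
    ("alice", "guardian.example/a2"),
    ("carol", "telegraph.example/b1"),
    ("dave", "telegraph.example/b1"),
]


def co_sharing_weights(records):
    """Count, for each pair of users, how many articles both shared."""
    article_to_users = defaultdict(set)
    for user, article in records:
        article_to_users[article].add(user)
    weights = defaultdict(int)
    for users in article_to_users.values():
        for u, v in combinations(sorted(users), 2):
            weights[(u, v)] += 1
    return dict(weights)


def connected_groups(weights):
    """Group users linked by at least one co-shared article."""
    neighbours = defaultdict(set)
    for u, v in weights:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first walk over one component
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


weights = co_sharing_weights(shares)
groups = connected_groups(weights)
```

On real data, a dedicated community detection algorithm would replace `connected_groups`, since almost all users end up in one large connected component.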

New paper: Scaling Laws in Geo-located Twitter Data

7th Jun, 2019 Tristan Cann

New paper accepted for publication in PLOS One

Scaling Laws in Geo-located Twitter Data

We observe and report on a systematic relationship between population density and Twitter use. Number of tweets, number of users and population per unit area are related by power laws, with exponents greater than one, that are consistent with each other and across a range of spatial scales. This implies that population density can accurately predict Twitter activity. Furthermore, this trend can be used to identify ‘anomalous’ areas that deviate from the trend. Analysis of geo-tagged and place-tagged tweets shows that geo-tagged tweets are different with respect to user type and content. Our findings have implications for the spatial analysis of Twitter data and for understanding demographic biases in the Twitter user base.
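A power law of the form tweets = c × density^β becomes a straight line on log-log axes, so the exponent β can be estimated with an ordinary least-squares fit on the logged values. The sketch below shows this on synthetic numbers; the data, the function names and the choice of a simple log-log regression are assumptions for illustration, not the fitting procedure from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = c * x**beta by least squares on log-transformed data."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    beta = sxy / sxx                       # slope of the log-log line
    c = math.exp(mean_y - beta * mean_x)   # intercept, back-transformed
    return c, beta


def log_residuals(xs, ys, c, beta):
    """Log-scale deviation of each area from the fitted trend."""
    return [math.log(y) - math.log(c * x ** beta) for x, y in zip(xs, ys)]


# Synthetic example: densities and tweet counts generated with beta = 1.2.
population_density = [10.0, 100.0, 1000.0, 10000.0]
tweets_per_area = [0.5 * d ** 1.2 for d in population_density]

c, beta = fit_power_law(population_density, tweets_per_area)
residuals = log_residuals(population_density, tweets_per_area, c, beta)
```

Areas with large positive or negative residuals would be candidates for the ‘anomalous’ areas mentioned above; here the residuals are essentially zero because the synthetic data follow the trend exactly.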