At the 2012 Annual Meeting of the Association of American Geographers, I presented during the session ‘Information Geographies: Online Power, Representation and Voice’, which was organised by Mark Graham (Oxford Internet Institute) and Matthew Zook (University of Kentucky). For an early-morning slot on a Saturday, the session was well attended, and the papers were very interesting.

My presentation, titled ‘“Nobody wants to do council estates” – digital divide, spatial justice and outliers’, was the result of thinking about the nature of social information that is available on the Web, which I partially articulated in a response to a post on the GeoIQ blog. When Mark and Matt asked for an abstract, I provided the following:

The understanding of the world through digital representation (digiplace) and VGI is frequently carried out with the assumption that these are valid, comprehensive and useful representations of the world. A common practice throughout the literature on these issues is to mention the digital divide and, while accepting it as a social phenomenon, either ignore it for the rest of the analysis or expect that it will solve itself over time through technological diffusion. The almost deterministic belief in technological diffusion absolves the analyst from fully confronting the political implication of the divide.

However, what VGI and social media analysis reveals is that the digital divide is part of deep and growing social inequalities in Western societies. Worse still, digiplace amplifies and strengthens them.

In digiplace the wealthy, powerful, educated and mostly male elite is amplified through multiple digital representations. Moreover, the frequent decision of algorithm designers to highlight and emphasise those who submit more media, and the level of ‘digital cacophony’ that more active contributors create, means that a very small minority – arguably outliers in every analysis of normal distribution of human activities – are super empowered. Therefore, digiplace power relationships are arguably more polarised than outside cyberspace due to the lack of social checks and balances. This makes the acceptance of the disproportionate amount of information that these outliers produce as reality highly questionable.

The following notes might help in making sense of the slides.

Slide 2 takes us back 405 years to Mantua, Italy, where Claudio Monteverdi has just written one of the very first operas – L’Orfeo – as an after-dinner entertainment piece for Duke Vincenzo Gonzaga. Leaving aside the wonderful music – my personal recommendation is for Emmanuelle Haïm’s performance and I used the opening toccata in my presentation – there is a serious point about history. For a large portion of human history, and as recently as 400 years ago, we knew only about the rich and the powerful. We ignored everyone else because they ‘were not important’.

Slide 3 highlights two points about modern statistics. First, that it is a tool to gain an understanding about the nature of society as a whole. Second, when we look at the main body of society, it is within the first 2 standard deviations of a normalised distribution. The Index of Deprivation of the UK (Slide 4) is an example of this type of analysis. Even though it was designed to direct resources to the most needy, it analyses the whole population (and, by the way, is normalised).
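As a rough illustration of the second point, the share of a normal distribution that lies within a given number of standard deviations of the mean can be computed directly. This is only a sketch that assumes an idealised normal distribution; the function name `share_within` is introduced here for illustration and is not taken from the slides.

```python
import math


def share_within(k: float) -> float:
    """Fraction of an ideal normal distribution within k standard deviations of the mean."""
    # For a normal distribution, P(|X - mean| <= k * sd) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


if __name__ == "__main__":
    for k in (1, 2, 3):
        inside = share_within(k)
        # Print the share inside the band and the remainder in the tails.
        print(f"within {k} sd: {inside:.4f}   outside: {1 - inside:.4f}")
```

Under this assumption, roughly 95% of the population sits within 2 standard deviations, and only a fraction of a percent lies beyond 3.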

Slide 5 points out that on the Web, and in social media in particular, the focus is on ‘long tail’ distributions. My main issue is not with the pattern but with what it means in terms of analysing the information. This is where participation inequality (Slide 6) matters, and the point of Nielsen’s analysis is that outlets such as Wikipedia (and, as we will see, OpenStreetMap) suffer from even worse inequality than other communication media. Nielsen’s recent analysis in his newsletter (Slide 7) demonstrates how this is playing out on Facebook (FB). Notice the comment ‘these people have no life’ or, as Sherry Turkle put it, they got life on the screen.

Slides 8 and 9 demonstrate that participation inequality is strongly represented in OpenStreetMap, and we can expect it to play out in FourSquare, Google Map Maker, Waze and other GeoWeb social applications. Slide 10 focuses on other characteristics of the people who are involved in the contribution of content: men, highly educated, aged 20-40. Similar characteristics have been shown in other social media and the GeoWeb by Monica Stephens & Antonella Rondinone, and by many other researchers.
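One simple way to put a number on participation inequality is to ask what share of all contributions comes from the most active fraction of contributors. The sketch below is only an illustration: the function `top_share` and the contribution counts are hypothetical, made up for the example, and are not OpenStreetMap figures.

```python
import math


def top_share(contributions, fraction):
    """Share of all contributions made by the most active `fraction` of contributors."""
    if not contributions:
        raise ValueError("need at least one contributor")
    # Rank contributors from most to least active.
    ranked = sorted(contributions, reverse=True)
    # Always count at least one contributor in the top group.
    n_top = max(1, math.ceil(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:n_top]) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical edit counts for ten contributors (illustrative only).
    edits = [90, 5, 1, 1, 1, 1, 1, 0, 0, 0]
    print(f"top 10% of contributors: {top_share(edits, 0.1):.0%} of edits")
```

Applied to real contribution logs, a value close to 1 for a small fraction would indicate the kind of skew discussed above.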

In Slides 11-14, observed spatial biases in OpenStreetMap are noted – concentration on highly populated places, gap between rich and poor places (using the Index of Deprivation from Slide 4), and difference between rural and urban areas. These differences were also observed in other sources of Volunteered Geographic Information (VGI) such as photo sharing sites (in Vyron Antoniou’s PhD).

Taken together, participation inequality, demographic bias and spatial bias point to a very skewed group that is producing most of the content that we see on the GeoWeb. Look back at Slide 3, and it is a good guess that this minority falls around 3 standard deviations away from the centre. They are outliers – not representative of anything other than themselves. Of course, given the large number of people online and the ability of outliers to ‘shout’ louder than anyone else, and converse among themselves, it is tempting to look at them as a population worth listening to. But it is, similarly to the opening point, a look at the rich and powerful (or super enthusiastic) and not the mainstream.

Strangely, when such a small group controls the economy, we see it as a political issue (Slide 15, which was produced by Mother Jones as part of the response to the Occupy movement). We should be just as concerned when it happens with digital content and sets the agenda of what we see and how we understand the world.

Now to the implication of this analysis, and the use of the GeoWeb and social media to understand society. Slide 17 provides the link to the GeoIQ post that argued that these outliers are worth listening to. They might be, but the issue is what you are trying to find out by looking at the data:

The first option is to ask questions about the resulting data such as ‘can it be used to update national datasets?’ – accepting the biases in the data collection as they are and exploring whether anything useful comes out of the outcomes (Slides 19-21, from the work of Vyron Antoniou and Thomas Koukoletsos). This should be fine as long as the researchers don’t try to state something general about the way society works from the data. Even so, researchers ought to analyse and point to biases and shortcomings (Slides 11-14 are doing exactly that).

The second option is to start claiming that we can learn something about social activities (Slides 22-23, from the work of Eric Fischer and Daniel Gayo-Avello, as well as Sean Gorman in the GeoIQ post). In this case, it is wrong to read too much into the data – as Gayo-Avello noted – as the outliers’ bias renders the analysis unrepresentative of society. Notice, for example, the huge gap between the social media noise during the Egyptian revolution and the outcomes of the elections, or the political differences that Gayo-Avello noted.

The third option is to find data that is representative (Slide 24, from the MIT Senseable City Lab), which looks at the ‘digital breadcrumbs’ that we leave behind on a large scale – phone calls, SMS, travel cards, etc. This data is representative, but provides observations without context. There is no qualitative or contextual information that comes with it and, because of the biases that are noted above, it is wrong to integrate it with the digital cacophony of the outliers. It is most likely to lead to erroneous conclusions.

Therefore, the understanding of the concept of digiplace (Slide 25) – the ordering of digital representation through software algorithms and GeoWeb portals – is, in fact, double filtered. The provision of content by outliers means that the algorithms will tend to amplify their point of view and biases. Not only that, digital inequality, which is happening on top of social and economic inequality, means that more and more of our views of the world are being shaped by this tiny minority.

When we add to the mix aspects of digital inequalities (some people can only afford a pay-as-you-go feature phone, while a tiny minority consumes a lot of bandwidth over multiple devices), we should stop talking about the ‘digital divide’ as something that will close over time. This is some sort of imaginary trickle-down theory that is being proven not to withstand the test of reality. If anything, the divide grows as the ‘haves’ are using multiple devices to shape digiplace in their own image.

This is actually one of the core problems that differentiates two approaches to engagement in data collection. There is the laissez-faire approach to engaging society in collecting information about the world (Slides 27-28, showing OpenStreetMap mapping parties), which does not confront the biases. Opposite it, there are participatory approaches (Slides 29-30, showing participatory mapping exercises from the work of Mapping for Change), where the effort goes into making the activity inclusive.

This point about the biases, inequality and influence on the way we understand the world is important to repeat – as it is too often ignored by researchers who deal with these data.

The London Citizen Cyberscience Summit ran in the middle of February, from 16th (Thursday) to 18th (Saturday). It marked the launch of the UCL Extreme Citizen Science (ExCiteS) group, while providing an opportunity for people who are interested in different aspects of citizen science to come together, discuss, share ideas, consider joint projects and learn from other people. The original idea for the summit, when the first organisational meeting took place in October last year, was to set a programme that would include academics who research citizen science or develop citizen science projects; practitioners and enthusiasts who are developing technologies for citizen science activities; and people who are actively engaged in citizen science. Therefore, we included a mix of talks, workshops and hack days and started approaching speakers who would cover the range of interests, backgrounds and knowledge.

The announcement about the summit came out only in late December, so it was somewhat surprising to see the level of interest in the topic of citizen science. Considering that the previous summit, in 2010, attracted about 60 or 70 participants, it was pleasing to see that the second summit attracted more than 170 people.

To read about what happened in the summit there is plenty of material online. Nature news reported it as ‘Citizen science goes extreme’. The New Scientist blog post discussed the ‘Intelligent Maps’ project of ExCiteS in ‘Interactive maps help pygmy tribes fight back’, which was also covered by the BBC World Service Newshour programme (around 50 minutes in) and the Canadian CBC Science Shift programme. Le Monde also reported on ‘Un laboratoire de l’extrême’.

Another report in New Scientist focused on the Public Laboratory for Open Technology and Science (PLOTS) development of a thermal flashlight in ‘Thermal flashlight “paints” cold rooms with colour’. The China Dialogue article ‘Scientists and Citizens’ provided a broader review of the summit.

In terms of blogs, there are summaries on the GridCast blog (including some video interviews), and a summary by one of the speakers, Andrea Wiggins, of day 1, day 2 and day 3. Nicola Triscott from the Arts Catalyst provides another account of the summit and her Arctic Perspective Initiative linkage. Another participant, Célya Gruson-Daniel, discussed the summit in French at MyScienceWork, which also provided a collection of social media from the first day at http://storify.com/mysciencework/london-citizen-cyberscience-summit-16-18th-februar.

The talks are available to view again on the LiveStream account of ExCiteS at http://www.livestream.com/excites and there are also summaries on the ExCiteS blog http://uclexcites.wordpress.com/ and on the conference site http://cybersciencesummit.org/blog/ . Flickr photos from MyScienceWork and UCL Engineering (where the image on the right is from) are also available.

For me, the highlights of the conference included the impromptu integration of different projects during the summit. Ellie D’Hondt and Matthias Stevens from BrusSense and NoiseTube used the opportunity of the PLOTS balloon mapping demonstration to extend it to noise mapping; Darlene Cavalier from SciStarter discussed with the Open Knowledge Foundation people how to use data about citizen science projects; and the people behind Xtribe at the University of Rome considered how their application can be used for Intelligent Maps – all these are synergies, new connections and new experimentation that the summit enabled. The enthusiasm of people who came to the summit contributed significantly to its success (as did the hard work of the ExCiteS team).

Especially interesting, because of the wide-ranging overview of examples and case studies, is how the activity is conceptualised in different ways across the spectrum from DIY citizen science to structured observations that are managed by professional scientists. This is also apparent in the reports about the summit. I have commented in earlier blog posts about the need to understand citizen science as a different way of producing scientific knowledge. What might be helpful is a clear ‘code of ethics’ or ‘code of conduct’ for scientists who are involved in such projects. As Francois Taddei highlighted in his talk at the summit, there is a need to value the shared learning among all the participants, and not to keep the rigid hierarchies of university academics/public in place. There is also a need to allow the creativity, exploration and development of ideas that we have seen during the summit to blossom – but this can only happen when all the sides that are involved are open to such a process.
