At the end of 2010, Matt Wilson (University of Kentucky) and Mark Graham (Oxford Internet Institute) started coordinating a special issue of Environment and Planning A dedicated to ‘Situating Neogeography’, asking ‘How might we situate neogeography? What are the various assemblages, networks, ecologies, configurations, discourses, cyborgs, alliances that enable/enact these technologies?’

My response to this call is a paper titled ‘Neogeography and the delusion of democratisation’, which has finally been accepted for publication. I am providing below an excerpt from the introduction, to give a flavour of the discussion:

“Since the emergence of the World Wide Web (Web) in the early 1990s, claims about its democratic potential and practice have been a persistent feature of the discourse about it. While the potential of ‘anyone, anytime, anywhere’ access to and use of information was extolled for a long while (for an early example see Batty 1997), the emergence of Web 2.0 in the mid-2000s (O’Reilly 2005) strengthened this notion. In the popular writing of authors such as Friedman (2006), these sentiments are amplified by highlighting the ability of anyone to ‘plug into the flat earth platform’ from anywhere and at any time.

Around the middle of the decade, the concept of neogeography appeared and the ability to communicate geographic information over the Web (in what is termed the GeoWeb) gained prominence (see Haklay et al. 2008). Neogeography extended the notion of participation and access to geographic information, now amplified through the use of the political term democratisation. The following citations provide a flavour of the discourse within academic and popular writing – for example, Mike Goodchild’s declaration that ‘Just as the PC democratised computing, so systems like Google Earth will democratise GIS’ (quoted in Butler 2006), or Turner’s (2006) definition of neogeography: ‘Essentially, Neogeography is about people using and creating their own maps, on their own terms and by combining elements of an existing toolset. Neogeography is about sharing location information with friends and visitors, helping shape context, and conveying understanding through knowledge of place’. This definition emphasises the wide access to the technology in everyday practice. Similar and stronger statements can be found in Warf and Sui (2010), who clarify that ‘neogeography has helped to foster an unprecedented democratization of geographic knowledge’ (p. 200) and, moreover, ‘Wikification represents a significant step forward in the democratization of geographic information, shifting control over the production and use of GIS data from a handful of experts to large groups of users’ (ibid.). Even within international organisations this seems to be the accepted view, as Nigel Snoad, strategy adviser for the communications and information services unit of the United Nations Office for the Coordination of Humanitarian Affairs (OCHA), stated: ‘On the technology side, Google, Microsoft and OpenStreetMap have really democratized mapping’ (cited in Lohr 2011).

However, what is the nature of this democratisation and what are its limits? To what extent do the technologies that mediate the access to, and creation of, geographic information allow and enable such democratisation?

To answer these questions, we need to explore the meaning of democratisation and, more specifically, within the context of interaction between people and technology. According to the Oxford English Dictionary, democratisation is ‘the action of rendering, or process of becoming, democratic’, and democracy is defined as ‘Government by the people; that form of government in which the sovereign power resides in the people as a whole, and is exercised either directly by them (as in the small republics of antiquity) or by officers elected by them. In modern use often more vaguely denoting a social state in which all have equal rights, without hereditary or arbitrary differences of rank or privilege’ [emphasis added]. A more colloquial notion of democratisation, and a much weaker one, is making a process or activity that used to be restricted to an elite or privileged group available to a wider group in society and potentially to all. For example, with mobile telephony now available across the globe, the statement ‘mobile telephony has been democratised’ aims to express the fact that, merely three decades ago, only the rich and powerful members of Western society had access to this technology.

Therefore, it is accepted from the start that the notion of democratisation cited above is more about the potential of neogeography to make the ability to assemble, organise and share geographical information accessible to anyone, anywhere, anytime and for a variety of purposes than about advancing the specific concept of democracy. And yet, it would be wrong to ignore the fuller meaning of the concept. Democratisation has a deeper meaning in respect of making geographic information technologies more accessible to hitherto excluded or marginalised groups in a way that assists them to make a change in their life and environment. Democratisation evokes ideas about participation, equality, the right to influence decision making, support for individual and group rights, access to resources and opportunities, etc. (Doppelt 2006). Using this stronger interpretation of democratisation reveals the limitations of current neogeographic practices and opens up the possibility of considering alternative developments of technologies that can, indeed, be considered democratising.

To explore this juncture of technology and democratisation, this paper relies on Andrew Feenberg’s critical philosophy of technology, especially as explored in his Questioning Technology (1999) and Transforming Technology (2002), which is useful as he addresses issues of democratisation and technology directly. For readers who are not familiar with the main positions within philosophy of technology, a very brief overview – based on Feenberg’s interpretation (1999) – is provided. This will help to explain his specific critique and suggestion for ‘deep democratisation’ of technology.

Equipped with these concepts, attention is turned to the discussion about the democratic potential of Geographic Information Systems (GIS), which appears in early discussions about GIS and society in the 1990s, and especially to the discussions within the literature on (Public) Participatory GIS (PPGIS/PGIS – assumed to be interchangeable here) and critical GIS. As we shall see, discussions about empowerment, marginalisation and governance are central to this literature from its inception and provide the foundations to build a deeper concept of democratisation when considering neogeographic practices.

Based on this historical understanding, the core of the paper explores why it is that neogeographic practices are assumed to be democratising and, more importantly, what the limitations are on their democratic potential. To do that, a hierarchy of ‘hacking’ – that is, the artful alteration of technology beyond the goals of its original design or intent – is suggested. Importantly, here ‘hacking’ does not mean the malicious alteration of technology or unauthorised access to computer systems, or the specific culture of technology enthusiasts (‘hacker culture’). The term is used to capture the primary and secondary instrumentalisation that Feenberg (1996, 2002) describes. As we shall see, by exploring the ability to alter systems, there is some justification in the democratisation claims of neogeography as it has, indeed, improved the outreach of geographic technologies and opened up the potential of their use in improving democratic processes, but in a much more limited scope and extent. The paper concludes with observations on the utilisation of neogeographic technologies within participatory processes that aim to increase democratisation in its deeper sense.”

The paper’s concepts are based on a talk that I originally gave in 2008 as part of the World University Network seminar on Neogeography. A final note is about the length of time that some ideas need from first emerging until publication – even with the current imagination of ‘fast-moving technology’, there is value in thinking through an idea over four years.

At the 2012 Annual Meeting of the Association of American Geographers, I presented during the session ‘Information Geographies: Online Power, Representation and Voice’, which was organised by Mark Graham (Oxford Internet Institute) and Matthew Zook (University of Kentucky). For an early morning session on a Saturday, it was well attended – and the papers in the session were very interesting.

My presentation, titled ‘Nobody wants to do council estates’ – digital divide, spatial justice and outliers, was the result of thinking about the nature of social information that is available on the Web, which I partially articulated in a response to a post on the GeoIQ blog. When Mark and Matt asked for an abstract, I provided the following:

The understanding of the world through digital representation (digiplace) and VGI is frequently carried out with the assumption that these are valid, comprehensive and useful representations of the world. A common practice throughout the literature on these issues is to mention the digital divide and, while accepting it as a social phenomenon, either ignore it for the rest of the analysis or expect that it will solve itself over time through technological diffusion. This almost deterministic belief in technological diffusion absolves the analyst from fully confronting the political implications of the divide.

However, what VGI and social media analysis reveals is that the digital divide is part of deep and growing social inequalities in Western societies. Worse still, digiplace amplifies and strengthens them.

In digiplace the wealthy, powerful, educated and mostly male elite is amplified through multiple digital representations. Moreover, the frequent decision of algorithm designers to highlight and emphasise those who submit more media, and the level of ‘digital cacophony’ that more active contributors create, mean that a very small minority – arguably outliers in every analysis of a normal distribution of human activities – are super-empowered. Therefore, digiplace power relationships are arguably more polarised than those outside cyberspace due to the lack of social checks and balances. This makes the acceptance of the disproportionate amount of information that these outliers produce as reality highly questionable.

The following notes might help in making sense of the slides.

Slide 2 takes us back 405 years to Mantua, Italy, where Claudio Monteverdi has just written one of the very first operas – L’Orfeo – as an after-dinner entertainment piece for Duke Vincenzo Gonzaga. Leaving aside the wonderful music – my personal recommendation is for Emmanuelle Haïm’s performance, and I used the opening toccata in my presentation – there is a serious point about history. For a large portion of human history, and as recently as 400 years ago, we knew only about the rich and the powerful. We ignored everyone else because they ‘were not important’.

Slide 3 highlights two points about modern statistics. First, that it is a tool to gain an understanding of the nature of society as a whole. Second, that the main body of society falls within the first two standard deviations of a normal distribution. The Index of Deprivation of the UK (Slide 4) is an example of this type of analysis. Even though it was designed to direct resources to the most needy, it analyses the whole population (and, by the way, is normalised).
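To make the point about the ‘main body of society’ concrete, here is a minimal sketch (not part of the original slides) of how much of a normal distribution lies within one, two and three standard deviations of the mean:

```python
# Illustration only: share of a normal distribution within +/- k standard deviations.
from math import erf, sqrt

def fraction_within(k_sigma: float) -> float:
    """Fraction of a normal distribution within +/- k_sigma of the mean."""
    return erf(k_sigma / sqrt(2))

for k in (1, 2, 3):
    print(f"within +/-{k} sd: {fraction_within(k):.1%}")
# within +/-1 sd: 68.3%
# within +/-2 sd: 95.4%
# within +/-3 sd: 99.7%
```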

Slide 5 points out that on the Web, and in social media in particular, the focus is on ‘long tail’ distributions. My main issue is not with the pattern but with what it means in terms of analysing the information. This is where participation inequality (Slide 6) matters, and the point of Nielsen’s analysis is that outlets such as Wikipedia (and, as we will see, OpenStreetMap) suffer from even worse inequality than other communication media. Nielsen’s recent analysis in his newsletter (Slide 7) demonstrates how this is playing out on Facebook (FB). Notice the comment ‘these people have no life’ or, as Sherry Turkle put it, they have ‘life on the screen’.

Slides 8 and 9 demonstrate that participation inequality is strongly represented in OpenStreetMap, and we can expect it to play out in Foursquare, Google Map Maker, Waze and other GeoWeb social applications. Slide 10 focuses on other characteristics of the people who contribute content: male, highly educated, aged 20-40. Similar characteristics have been shown in other social media and the GeoWeb by Monica Stephens & Antonella Rondinone, and by many other researchers.

Slides 11-14 note observed spatial biases in OpenStreetMap – concentration on highly populated places, a gap between rich and poor places (using the Index of Deprivation from Slide 4), and differences between rural and urban areas. These differences were also observed in other sources of Volunteered Geographic Information (VGI), such as photo-sharing sites (in Vyron Antoniou’s PhD).

Taken together, participation inequality, demographic bias and spatial bias point to a very skewed group that is producing most of the content that we see on the GeoWeb. Look back at Slide 3, and it is a good guess that this minority falls two to three standard deviations from the centre. They are outliers – not representative of anything other than themselves. Of course, given the large number of people online and the ability of outliers to ‘shout’ louder than anyone else, and to converse among themselves, it is tempting to look at them as a population worth listening to. But it is, as with the opening point, a look at the rich and powerful (or super-enthusiastic) and not at the mainstream.

Strangely, when such a small group controls the economy, we see it as a political issue (Slide 15, which was produced by Mother Jones as part of the response to the Occupy movement). We should be just as concerned when it happens with digital content and sets the agenda of what we see and how we understand the world.

Now to the implications of this analysis, and the use of the GeoWeb and social media to understand society. Slide 17 provides the link to the GeoIQ post that argued that these outliers are worth listening to. They might be, but the issue is what you are trying to find out by looking at the data:

The first option is to ask questions about the resulting data, such as ‘can it be used to update national datasets?’ – accepting the biases in the data collection as they are and exploring whether anything useful comes out of the outcomes (Slides 19-21, from the work of Vyron Antoniou and Thomas Koukoletsos). This should be fine as long as the researchers don’t try to state something general about the way society works from the data. Even so, researchers ought to analyse and point to biases and shortcomings (Slides 11-14 do exactly that).

The second option is to start claiming that we can learn something about social activities (Slides 22-23, from the work of Eric Fischer and Daniel Gayo-Avello, as well as Sean Gorman in the GeoIQ post). In this case, it is wrong to read too much into the data, as Gayo-Avello noted, since the outliers’ bias renders the analysis unrepresentative of society. Notice, for example, the huge gap between the social media noise during the Egyptian revolution and the outcomes of the elections, or the political differences that Gayo-Avello noted.

The third option is to find data that is representative (Slide 24, from the MIT Senseable City Lab), which looks at the ‘digital breadcrumbs’ that we leave behind on a large scale – phone calls, SMS, travel cards, etc. This data is representative, but provides observations without context. There is no qualitative or contextual information that comes with it and, because of the biases noted above, it is wrong to integrate it with the digital cacophony of the outliers; doing so is most likely to lead to erroneous conclusions.

Therefore, the understanding of the concept of digiplace (Slide 25) – the ordering of digital representation through software algorithms and GeoWeb portals – is, in fact, double filtered. The provision of content by outliers means that the algorithms will tend to amplify their point of view and biases. Not only that, digital inequality, which happens on top of social and economic inequality, means that more and more of our views of the world are being shaped by this tiny minority.

When we add to the mix aspects of digital inequalities (some people can only afford a pay-as-you-go feature phone, while a tiny minority consumes a lot of bandwidth over multiple devices), we should stop talking about the ‘digital divide’ as something that will close over time. This is a sort of imaginary trickle-down theory that does not withstand the test of reality. If anything, the divide grows as the ‘haves’ use multiple devices to shape digiplace in their own image.

This is actually one of the core problems that differentiates two approaches to engagement in data collection. There is the laissez-faire approach to engaging society in collecting information about the world (Slides 27-28, showing OpenStreetMap mapping parties), which does not confront the biases; opposite it, there are participatory approaches (Slides 29-30, showing participatory mapping exercises from the work of Mapping for Change), where the effort is on making the activity inclusive.

This point about the biases, inequality and influence on the way we understand the world is important to repeat – as it is too often ignored by researchers who deal with these data.

The Eye on Earth Summit took place in Abu Dhabi on 12-15 December 2011 and focused on ‘the crucial importance of environmental and societal information and networking to decision-making’. The summit was an opportunity to evaluate the development of Principle 10 of the Rio Declaration of 1992, as well as Chapter 40 of Agenda 21, both of which focus on environmental information and decision making. The summit’s many speakers gave inspirational talks – an impressive list including Jane Goodall highlighting the importance of information for education; Mathis Wackernagel updating on developments in the Ecological Footprint; Rob Swan on the importance of Antarctica; Sylvia Earle on how we should protect the oceans; Mark Plotkin, Rebecca Moore and Chief Almir Surui on indigenous mapping in the Amazon; and many others. The white papers that accompanied the summit can be found in the Working Groups section of the website and are very helpful updates on the development of environmental information issues over the past 20 years, as well as on emerging issues.

Interestingly, Working Group 2 on Content and User Needs mentions the conceptual framework of Environmental Information Systems (EIS) that I started developing in 1999. After discussing it at the GIS and Environmental Modelling conference in 2000, I published it as the paper ‘Public access to environmental information: past, present and future’ in the journal Computers, Environment and Urban Systems in 2003.

Discussing environmental information for a week made me revisit the framework and review the changes that have occurred over the past decade.

First, I’ll present the conceptual framework, which is based on six assertions. The framework was developed on the basis of a lengthy review, in early 1999, of the available information on environmental information systems (the review was published as CASA Working Paper 7). While synthesising all the information that I had found, some underlying assumptions started to emerge; by articulating them, putting them together and showing how they were linked, I could make more sense of the material. This helped in answering questions such as ‘Why do environmental information systems receive so much attention from policy makers?’ and ‘Why does GIS appear in so many environmental information systems?’. I used the word ‘assertions’ because the underlying principles seem to be universally accepted and taken for granted. This is especially true for the three core assertions (1-3 below).

The framework offers the following assertions:

  1. Sound knowledge, reliable information and accurate data are vital for good environmental decision making.
  2. Within the framework of sustainable development, all stakeholders should take part in the decision making processes. A direct result of this is a call for improved public participation in environmental decision making.
  3. Environmental information is exceptionally well suited to GIS (and vice versa). GIS development is closely related to developments in environmental research, and GIS output is considered to be highly advantageous in understanding and interpreting environmental data.
  4. (Notice that this emerges from combining 1 and 2) To achieve public participation in environmental decision making, the public must gain access to environmental information, data and knowledge.
  5. (Based on 1 and 3) GIS use and output is essential for good environmental decision making.
  6. (Based on all the others) Public Environmental Information Systems should be based on GIS technologies. Such systems are vital for public participation in environmental decision making.

Intriguingly, the Eye on Earth White Paper notes: ‘This is a very “Geospatial” centric view; however it does summarise the broader principles of Environmental Information and its use’. Yet my intention was not to develop a ‘Geospatial’-centric view – I was synthesising what I had found, and the keywords that I used in the search did not include GIS. Therefore, the framework should be seen as an attempt to explain why GIS is so prominent.

With this framework in mind, I have noticed a change over the past decade. Throughout the summit, GIS and ‘Geospatial’ systems were central – they were mentioned and demonstrated many times. I was somewhat surprised at how prominent they were in Sha Zukang’s speech (he is the Under-Secretary-General of the United Nations and Secretary-General of the Rio+20 Summit). They are much more central than they were when I carried out the survey, and I left the summit feeling that, for many speakers, presenters and delegates, it is now expected that GIS will be at the centre of any EIS. This wide acceptance does mean that initiatives such as the ‘Eye on Earth Network’, which is based on geographic information sharing, are now possible. In the past, because of very different data structures and conceptual frameworks, it was more difficult to suggest such integration. The use of GIS as a lingua franca for people who deal with environmental information is surely helpful in creating an integrative picture of the situation at a specific place, across multiple domains of knowledge.

However, I see a cause for concern in equating GIS with EIS. As the GIScience literature has discussed over the years, GIS is good at providing snapshots but less effective at modelling processes or interpolating in both time and space and, most importantly, it has a specific way of creating and processing information. For example, while GIS can be coupled with system dynamics modelling (which has been used extensively in environmental studies – most notably in ‘Limits to Growth’), it is also possible to run such models and simulations in packages that don’t use geographic information – for example, in the STELLA package for system dynamics, or in bespoke models created with dedicated data models and algorithms. Importantly, the issue is not the technical one of coupling different software packages such as STELLA or agent-based modelling tools with GIS. Some EIS and environmental challenges might benefit from different people thinking in different ways about various problems and solutions, not always forced to consider how a GIS plays a part in them.

This post continues the theme of the previous one, and is also based on the chapter that will appear next year in the book:

Sui, D.Z., Elwood, S. and Goodchild, M.F. (eds.), 2013. Crowdsourcing Geographic Knowledge. Berlin: Springer. Here is a link to the chapter.

The post focuses on the participatory aspect of different Citizen Science modes:

Against the technical, social and cultural aspects of citizen science, we offer a framework that classifies the level of participation and engagement of participants in citizen science activity. While there is some similarity between Arnstein’s (1969) ‘ladder of participation’ and this framework, there is also a significant difference. The main thrust in creating a spectrum of participation is to highlight the power relationships that exist within social processes such as urban planning, or in participatory GIS use in decision making (Sieber 2006). In citizen science, the relationship exists in the form of the gap between professional scientists and the wider public. This is especially true in environmental decision making, where there are major gaps between the public’s and the scientists’ perceptions of each other (Irwin 1995).

In the case of citizen science, the relationships are more complex, as many of the participants respect and appreciate the knowledge of the professional scientists who lead the project and can explain how a specific piece of work fits within the wider scientific body of work. At the same time, as volunteers build their own knowledge through engagement in the project, using the resources that are available on the Web and through the specific project to improve their own understanding, they are more likely to suggest questions and move up the ladder of participation. In some cases, the participants will want to volunteer in a passive way, as is the case with volunteered computing, without full understanding of the project, as a way to engage with and contribute to a scientific study. An example of this is the many thousands of people who volunteered for the Climateprediction.net project, where their computers were used to run global climate models. Many would like to feel that they are engaged in one of the major scientific issues of the day, but would not necessarily want to fully understand the science behind it.

Levels of Participation in Citizen Science

Therefore, unlike Arnstein’s ladder, there shouldn’t be a strong value judgement about the position that a specific project takes. At the same time, in terms of participants’ engagement and involvement, there are likely benefits in trying to move to the highest level that is suitable for the specific project. Thus, we should see this framework as a typology that focuses on the level of participation.

At the most basic level, participation is limited to the provision of resources, and the cognitive engagement is minimal. Volunteered computing relies on many participants who are engaged at this level and, following Howe (2006), this can be termed ‘crowdsourcing’. In participatory sensing, a similar level of engagement would see participants asked to carry sensors around and bring them back to the experiment organiser. The advantage of this approach, from the perspective of scientific framing, is that, as long as the characteristics of the instrumentation are known (e.g. the accuracy of a GPS receiver), the experiment is controlled to some extent, and some assumptions about the quality of the information can be made. At the same time, running projects at the crowdsourcing level means that, despite the willingness of the participants to engage with a scientific project, their most valuable input – their cognitive ability – is wasted.

The second level is ‘distributed intelligence’, in which the cognitive ability of the participants is the resource being used. Galaxy Zoo and many of the ‘classic’ citizen science projects work at this level. The participants are asked to take some basic training and then collect data or carry out a simple interpretation activity. Usually, the training activity includes a test that provides the scientists with an indication of the quality of the work that the participant can carry out. With this type of engagement, there is a need to be aware of the questions that volunteers will raise while working on the project, and of how to support their learning beyond the initial training.

The next level, which is especially relevant in ‘community science’, is a level of participation in which the problem definition is set by the participants and, in consultation with scientists and experts, a data collection method is devised. The participants are then engaged in data collection, but require the assistance of the experts in analysing and interpreting the results. This method is common in environmental justice cases, and goes towards Irwin’s (1995) call for science that matches the needs of citizens. However, participatory science can occur in other types of projects and activities – especially when considering the volunteers who become experts in data collection and analysis through their engagement. In such cases, the participants can suggest new research questions that can be explored with the data they have collected. The participants are not involved in the detailed analysis of the results of their effort – perhaps because of the level of knowledge that is required to infer scientific conclusions from the data.

Finally, collaborative science is a completely integrated activity, as it is in parts of astronomy where professional and non-professional scientists are involved in deciding on which scientific problems to work and the nature of the data collection so it is valid and answers the needs of scientific protocols while matching the motivations and interests of the participants. The participants can choose their level of engagement and can be potentially involved in the analysis and publication or utilisation of results. This form of citizen science can be termed ‘extreme citizen science’ and requires the scientists to act as facilitators, in addition to their role as experts. This mode of science also opens the possibility of citizen science without professional scientists, in which the whole process is carried out by the participants to achieve a specific goal.

This typology of participation can be used across the range of citizen science activities, and a single project should not be classified in only one category. For example, in volunteered computing projects most of the participants will be at the bottom level, while participants who become committed to the project might move to the second level and assist other volunteers when they encounter technical problems. Highly committed participants might move to a higher level and communicate with the scientist who coordinates the project to discuss the results of the analysis and suggest new research directions.

As part of the Volunteered Geographic Information (VGI) workshop that was held in Seattle in April 2011, Daniel Sui, Sarah Elwood and Mike Goodchild announced that they would be editing a volume dedicated to the topic, published as ‘Crowdsourcing Geographic Knowledge’ (here is a link to the chapter in Crowdsourcing Geographic Knowledge).

My contribution to this volume focuses on citizen science, and shows the links between it and VGI. The chapter is currently under review, but the following excerpt discusses different types of citizen science activities, and I would welcome comments:

“While the aim here is not to provide a precise definition of citizen science, a clarification of its core characteristics is unavoidable. Therefore, it is defined as scientific activities in which non-professional scientists volunteer to participate in data collection, analysis and dissemination of a scientific project (Cohn 2008; Silvertown 2009). People who participate in a scientific study without playing some part in the study itself – for example, volunteering in a medical trial or participating in a social science survey – are not included in this definition.

While it is easy to identify a citizen science project when the aim of the project is the collection of scientific information, as in the recording of the distribution of plant species, there are cases where the definition is less clear-cut. For example, the process of data collection in OpenStreetMap or Google Map Maker is mostly focused on recording verifiable facts about the world that can be observed on the ground. The tools that OpenStreetMap mappers use – such as remotely sensed images, GPS receivers and map editing software – can all be considered scientific tools. In their attempt to locate observed objects and record them accurately on a map, they follow in the footsteps of surveyors such as Robert Hooke, who also carried out an extensive survey of London using scientific methods – although, unlike OpenStreetMap volunteers, he was paid for his effort. Finally, cases where facts are collected in a participatory mapping activity, such as the one that Ghose (2001) describes, should probably be considered citizen science only if the participants decide to frame them as such. For the purpose of the discussion here, such a broad definition is more useful than a limiting one that tries to reject certain activities.

Notice also that, by definition, citizen science can only exist in a world in which science is socially constructed as the preserve of professional scientists in academic institutions and industry, because otherwise any person who is involved in a scientific project would simply be considered a contributor and potentially a scientist. As Silvertown (2009) noted, until the late 19th century, science was mainly developed by people who had additional sources of employment that allowed them to spend time on data collection and analysis. Famously, Charles Darwin joined the Beagle voyage not as a professional naturalist but as a companion to Captain FitzRoy. Thus, in that era, almost all science was citizen science, albeit mostly carried out by affluent gentlemen scientists and gentlewomen. While the first professional scientist was likely Robert Hooke, who was paid to work on scientific studies in the 17th century, the major growth in the professionalisation of scientists occurred mostly in the latter part of the 19th century and throughout the 20th century.

Even with the rise of the professional scientist, the role of volunteers has not disappeared, especially in areas such as archaeology, where it is common for enthusiasts to join excavations, or in natural science and ecology, where they collect and send samples and observations to national repositories. These activities include the Christmas Bird Watch, which has been ongoing since 1900, and the British Trust for Ornithology Survey, which has collected over 31 million records since its establishment in 1932 (Silvertown 2009). Astronomy is another area where amateurs and volunteers have been on a par with professionals in observing the night sky and identifying galaxies, comets and asteroids (BBC 2006). Finally, meteorological observations have also relied on volunteers since the start of systematic measurements of temperature, precipitation and extreme weather events (WMO 2001).

This type of citizen science provides the first, ‘classic’ form of citizen science – the persistent parts of science where the resources, geographical spread and nature of the problem mean that volunteers sometimes predate the professionalisation and mechanisation of science. These research areas usually require a large but sparse network of observers who carry out their work as part of a hobby or leisure activity. This type of citizen science has flourished in specific enclaves of scientific practice, and the progressive development of modern communication tools has made the process of collating the results from the participants easier and cheaper, while inherently keeping many of the characteristics of the data collection processes close to their origins.

A second set of citizen science activities falls within environmental management and, more specifically, within the context of environmental justice campaigns. Modern environmental management includes strong technocratic and science-oriented management practices (Bryant & Wilson 1998; Scott & Barnett 2009), and environmental decision making is heavily based on scientific environmental information. As a result, when an environmental conflict emerges – such as a community protest over a noisy local factory or the planned expansion of an airport – the valid evidence needs to be based on scientific data collection. This aspect of environmental justice struggles encourages communities to carry out ‘community science’, in which scientific measurements and analysis are carried out by members of local communities so they can develop an evidence base and set out action plans to deal with problems in their area. A successful example of such an approach is the ‘Global Community Monitor’ method for allowing communities to deal with air pollution issues (Scott & Barnett 2009). This is performed through a simple method of sampling air using plastic buckets, followed by analysis in an air pollution laboratory and, finally, the community being provided with instructions on how to understand the results. This activity is termed the ‘Bucket Brigade’ and has been used across the world in environmental justice campaigns. In London, community science was used to collect noise readings in two communities that are affected by airport and industrial activities, and the outputs were effective in bringing environmental problems to the policy arena (Haklay, Francis & Whitaker 2008). As in ‘classic’ citizen science, the growth in electronic communication has enabled communities to identify potential methods – e.g. through the ‘Global Community Monitor’ website – as well as to find international standards, regulations and scientific papers that can be used together with the local evidence.
However, the emergence of the Internet and the Web as a global infrastructure has enabled a new incarnation of citizen science: the realisation by scientists that the public can provide free labour, skills, computing power and even funding, together with growing demands from research funders for public engagement, all contribute to the motivation of scientists to develop and launch new and innovative projects (Silvertown 2009; Cohn 2008). These projects utilise the ability of personal computers, GPS receivers and mobile phones to double as scientific instruments.

This third type of citizen science has been termed ‘citizen cyberscience’ by Francois Grey (2009). Within it, it is possible to identify three sub-categories: volunteered computing, volunteered thinking and participatory sensing.

Volunteered computing was first developed in 1999, with the foundation of SETI@home (Anderson et al. 2002), which was designed to distribute the analysis of data collected from a radio telescope in the search for extra-terrestrial intelligence. The project utilises the unused processing capacity of personal computers, using the Internet to send and receive ‘work packages’ that are analysed automatically and sent back to the main server. Over 3.83 million downloads were registered on the project’s website by July 2002. The system on which SETI@home is based, the Berkeley Open Infrastructure for Network Computing (BOINC), is now used for over 100 projects, covering physics (processing data from the Large Hadron Collider through LHC@home), climate science (running climate models in Climateprediction.net) and biology (calculating the shape of proteins in Rosetta@home).
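As a rough illustration of the volunteered computing pattern described above – a client that repeatedly fetches a ‘work package’, analyses it locally and returns the result – here is a conceptual sketch. The server address, endpoints and payload format are hypothetical; this is not BOINC’s actual protocol.

```python
# Conceptual sketch of a volunteered-computing client loop.
# The server URL, endpoints and payload fields below are invented for illustration.
import json
import time
import urllib.request

SERVER = "https://example.org/project"  # hypothetical project server

def fetch_work_unit() -> dict:
    """Receive a 'work package' from the project server."""
    with urllib.request.urlopen(f"{SERVER}/work") as resp:
        return json.load(resp)          # e.g. {"id": 42, "samples": [...]}

def analyse(work_unit: dict) -> dict:
    """Stand-in for the real analysis routine run on the volunteer's machine."""
    return {"id": work_unit["id"], "score": sum(work_unit["samples"])}

def report(result: dict) -> None:
    """Send the result back to the main server."""
    req = urllib.request.Request(
        f"{SERVER}/result",
        data=json.dumps(result).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

while True:
    unit = fetch_work_unit()
    report(analyse(unit))
    time.sleep(60)  # idle politely between work units
```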

While volunteered computing requires very little from the participants, apart from installing software on their computers, in volunteered thinking the volunteers are engaged at a more active and cognitive level (Grey 2009). In these projects, the participants are asked to use a website on which information or an image is presented to them. When they register with the system, they are trained in the task of classifying the information. After the training, they are exposed to information that has not yet been analysed and are asked to carry out classification work. Stardust@home (Westphal et al. 2006), in which volunteers were asked to use a virtual microscope to try to identify traces of interstellar dust, was one of the first projects in this area, together with NASA ClickWorkers, which focused on the classification of craters on Mars. Galaxy Zoo (Lintott et al. 2008), a project in which volunteers classify galaxies, is now one of the most developed, with over 100,000 participants and a range of applications that are included in the wider Zooniverse set of projects (see http://www.zooniverse.org/).

Participatory sensing is the final and most recent type of citizen science activity. Here, the capabilities of mobile phones are used to sense the environment. Some mobile phones have up to nine sensors integrated into them, including different transceivers (mobile network, WiFi, Bluetooth), FM and GPS receivers, a camera, an accelerometer, a digital compass and a microphone. In addition, they can link to external sensors. These capabilities are increasingly used in citizen science projects, such as Mappiness, in which participants are asked to provide behavioural information (their feeling of happiness) while the phone records their location, allowing different locations to be linked to wellbeing (MacKerron 2011). Other activities include the sensing of air quality (Cuff 2007) or noise levels (Maisonneuve et al. 2010) using the mobile phone’s location and the readings from its microphone.”
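To illustrate the kind of record a participatory sensing project might collect – combining location, a sensor reading and a self-report – here is a minimal, hypothetical sketch. The field names and the 1-5 happiness scale are assumptions for illustration, not Mappiness’s actual data model.

```python
# Hypothetical participatory-sensing record and a simple place-to-wellbeing link.
from dataclasses import dataclass
from datetime import datetime
from statistics import mean

@dataclass
class SensingReading:
    timestamp: datetime
    lat: float
    lon: float
    noise_db: float   # from the phone's microphone
    happiness: int    # self-reported, e.g. on a 1-5 scale

readings = [
    SensingReading(datetime(2011, 6, 1, 8, 30), 51.524, -0.134, 72.5, 2),
    SensingReading(datetime(2011, 6, 1, 18, 10), 51.507, -0.128, 58.0, 4),
]

# Average self-reported happiness by (coarsely rounded) location cell.
by_cell: dict[tuple[float, float], list[int]] = {}
for r in readings:
    by_cell.setdefault((round(r.lat, 2), round(r.lon, 2)), []).append(r.happiness)
print({cell: mean(vals) for cell, vals in by_cell.items()})
```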

At the State of the Map (EU) 2011 conference, held in Vienna from 15-17 July, I gave a keynote talk on the relationships between the OpenStreetMap (OSM) community and the GIScience research community. Of course, the relationships are especially important for those researchers who are working on Volunteered Geographic Information (VGI), due to the major role of OSM in this area of research.

The talk included an overview of what researchers have discovered about OpenStreetMap over the five years since we started to pay attention to it. One striking result is that the issue of positional accuracy does not require much more work by researchers. Another important outcome of the research is the understanding that quality is influenced by the number of mappers, and that the data can be used with confidence for mainstream geographical applications when certain conditions are met. These results are both useful and of interest to a wide range of groups, but there remain key areas that require further research – for example, specific facets of quality, community characteristics and how OSM data is used.

Reflecting on the body of research, we can start to form a ‘code of engagement’ for both academics and mappers who are engaged in researching or using OpenStreetMap. One such guideline would be that it is both prudent and productive for any researcher to do some mapping herself, and to understand the process of creating OSM data, if the research is to be relevant and accurate. Other aspects of the proposed ‘code’ are covered in the presentation.

The talk is also available as a video from the TU Wien Matterhorn server.


GIS Research UK (GISRUK) is a long-running conference series, and the 2011 instalment was hosted by the University of Portsmouth at the end of April.

During the conference, I was asked to give a keynote talk about Participatory GIS. I decided to cover the background of Participatory GIS in the mid-1990s, and the transition to more advanced web mapping applications from the mid-2000s. Of special importance are the systems that allow user-generated content, and the geographical types of systems that are now leading to the generation of Volunteered Geographic Information (VGI).

The next part of the talk focused on Citizen Science, culminating with the ideas that are the basis for Extreme Citizen Science.

Interestingly, as in previous presentations, one of the common questions about Citizen Science came up. Professional scientists seem to have a problem with the suggestion that citizens are as capable as scientists in data collection and analysis. While there is acceptance of the concept, the idea that participants can suggest problems, collect data rigorously and analyse it seems to be too radical – or worrying.

What is important to understand is that the ideas of Extreme Citizen Science are not about replacing the role of scientists, but are a call to rethink the roles of the participants and the scientists in cases where Citizen Science is used. It is a way to consider science as a collaborative process of learning and exploration of issues. My own experience is that participants have a lot of respect for the knowledge of the scientists, as long as the scientists have a lot of respect for the knowledge and ability of the participants. The participants would like to learn more about the topic that they are exploring and are keen to know: ‘what does the data that I collected mean?’ At the same time, some of the participants can become very serious about data collection, reading about the specific issues and using the resources that are available online today to learn more. At some point, they become knowledgeable participants, and it is worth seeing them as such.

The slides below were used for this talk, and include links to the relevant literature.

Yesterday, for the first time, I came across the phrase ‘GIS Systems’ in an academic paper written by geographers (not GIS experts). I have also noticed that the term is being used more often recently when people talk about packages such as ArcGIS or MapInfo.

On the face of it, talking about a ‘GIS System’ is ridiculous – how can you say ‘geographic information system system’? However, people have a reason for using this phrase and it makes some sense to them.

Maybe the reason is that GIS now stands for a class or type of computer software that can manage, manipulate and visualise geographic information, so a ‘GIS system’ is the specific hardware and software that is used. Personally, I’ll continue to find it odd and use GIS for what it is…

The slides below are from my presentation at State of the Map 2010 in Girona, Spain. While the conference is about OpenStreetMap, the presentation covers a range of spatially implicit and explicit crowdsourcing projects, as well as activities that we carried out in Mapping for Change, all of which show that, unlike in other crowdsourcing activities, geography (and place) both limits and motivates contribution.

In many ways, OpenStreetMap is similar to other open source and open knowledge projects, such as Wikipedia. These similarities include the patterns of contribution and the importance of participation inequality, in which a small group of participants contributes very significantly, while a very large group contributes only occasionally; the general demographics of participants, with strong representation from educated young males; and the temporal patterns of engagement, in which some participants go through a peak of activity and lose interest, while a small group joins and continues to invest its time and effort to help the progress of the project. These aspects have been identified by researchers who explored volunteering and leisure activities, and crowdsourcing, as well as those who explored commons-based peer production networks (Benkler & Nissenbaum 2006).

However, OpenStreetMap is a project about geography, and deals with the shape of features and information about places on the face of the Earth. Thus, the emerging question is ‘what influence does geography have on OSM?’ Does geography make some fundamental changes to the basic principles of crowdsourcing, or should OSM be treated as ‘Wikipedia for maps’?

In the presentation, which is based on my work, as well as the work of Vyron Antoniou and Nama Budhathoki, we argue that geography plays a ‘tyrannical’ role in OSM and other projects that are based on crowdsourced geographical information, and shapes the nature of these projects beyond what is usually acknowledged.

The first influence of geography is on motivation. A survey of OSM participants shows that specific geographical knowledge, acquired at first hand, and the wish to use this knowledge and see it mapped well, are important factors in participation in the project. We found that participants are driven to mapping activities by their desire to represent the places they care about and to fix errors on the map. Both of these motives require local knowledge.

A second influence is on the accuracy and completeness of coverage, with places that are highly populated, and therefore have a larger pool of potential participants, showing better coverage than suburban areas of well-mapped cities. Furthermore, there is an ongoing discussion within the OSM community about the value of mapping without local knowledge and the impact of such action on the willingness of potential contributors to fix errors and contribute to the map.

A third, and somewhat surprising, influence is the impact of mapping places that the participants haven’t visited or can’t visit, such as Haiti after the earthquake or Baghdad in 2007. Despite the willingness of participants to join in and help in the data collection process, the details that can be captured without being on the ground are fairly limited, even when multiple sources such as Flickr images, Google Street View and paper maps are used. The details are limited to what was captured at a certain point in time and by the limitations of the sensing device, so the mapping is, by necessity, incomplete.

We will demonstrate these and other aspects of what we have termed ‘the tyranny of place’ and its impact on which areas can be covered by OSM without much effort, and which locations will not be covered without a concerted effort that requires some planning.

After the publication of the comparison of OpenStreetMap and Google Map Maker coverage of Haiti, Nicolas Chavent from the Humanitarian OpenStreetMap Team contacted me and drew my attention to the geographical dataset of the UN Stabilization Mission in Haiti (known as MINUSTAH), which is seen as the core dataset for the post-earthquake humanitarian effort; a comparison with this dataset might therefore be helpful, too. The comparison of the two Volunteered Geographic Information (VGI) datasets of OpenStreetMap and Google Map Maker with this core dataset also exposed an aspect of the usability of geographical information in emergency situations that is worth commenting on.

For the purpose of the comparison, I downloaded two datasets from GeoCommons – the detailed maps of Port-au-Prince and the Haiti road network. Both are reported on GeoCommons as originating from MINUSTAH. I combined them and then carried out the comparison. As in the previous case, the comparison focused only on the length of the roads, with the hypothesis that, if there is a significant difference in the length of roads in a given grid square, the dataset with the greater length is likely to be more complete. Other comparisons between established and VGI datasets lend support to this hypothesis, although caution must be applied when the differences are small. The following maps show the differences between the MINUSTAH and OpenStreetMap datasets, and between the MINUSTAH and Google Map Maker datasets. I have also reproduced the original map that compares OpenStreetMap and Map Maker, for the purpose of comparison and consistency, as well as for cartographic quality.
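For readers who would like to reproduce this kind of analysis, the sketch below shows one way such a grid-based length comparison could be carried out with GeoPandas. The file names are placeholders, and it assumes all layers share a projected coordinate system so that lengths are in metres; it is an illustration of the approach, not the exact workflow used here.

```python
# Sketch of a grid-square road-length comparison (placeholder file names;
# assumes all layers are in the same projected CRS, so lengths are in metres).
import geopandas as gpd

grid = gpd.read_file("haiti_grid.shp")          # analysis grid squares
minustah = gpd.read_file("minustah_roads.shp")  # combined MINUSTAH road layers
osm = gpd.read_file("osm_roads.shp")            # OpenStreetMap roads

def length_per_cell(roads: gpd.GeoDataFrame, grid: gpd.GeoDataFrame) -> list:
    """Total road length (metres) falling inside each grid square."""
    lengths = []
    for cell in grid.geometry:
        clipped = roads.geometry.intersection(cell)  # parts of roads inside the cell
        lengths.append(clipped.length.sum())
    return lengths

grid["minustah_m"] = length_per_cell(minustah, grid)
grid["osm_m"] = length_per_cell(osm, grid)

# Where one dataset is markedly longer in a cell, it is likely more complete there.
grid["diff_m"] = grid["osm_m"] - grid["minustah_m"]
grid.to_file("haiti_grid_comparison.shp")
```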

OpenStreetMap and Google Map Maker - Haiti - 18 January 2010

MINUSTAH and OpenStreetMap - Haiti - 18 January 2010

MINUSTAH and Google Map Maker - Haiti - 18 January 2010

The maps show that MINUSTAH does provide fairly comprehensive coverage across Haiti (as expected) and that the volunteered efforts of OpenStreetMap and Map Maker provide further details in urban areas. There are areas that are covered by only one of the datasets, so they all have value.

The final comparison uses the three datasets together, with the same criterion as in the previous maps – the dataset with the longest length of roads is considered the most complete.

MINUSTAH, OpenStreetMap and Google Map Maker - Haiti - 18 January 2010

It is interesting to note the south/north divide between OpenStreetMap and Google Map Maker, with Google Map Maker providing more detail in the north and OpenStreetMap in the south (closer to the earthquake epicentre). When compared over the areas in which there is at least 100 metres of MINUSTAH coverage, OpenStreetMap is, overall, 64.4% complete, while Map Maker is 41.2% complete. Map Maker covers a further 354 square kilometres that are not covered by MINUSTAH or OpenStreetMap, and OpenStreetMap covers a further 1,044 square kilometres that are missing from the other datasets, so there is clearly a benefit in integrating them. The grid that includes the analysis of the integrated datasets is available here in shapefile format, in case it is of any use or if you would like to carry out further analysis or visualise it.

While working on this comparison, it was interesting to explore the data fields in the MINUSTAH dataset, some of which are included to provide operational information, such as road condition, the length of time it takes to travel along a road, etc. These are the hallmarks of practical, operational geographical information, with details that are directly relevant to end-users in their daily tasks. The other two datasets have been standardised for universal coverage and delivery, and this is apparent in their internal data structure. The Google Map Maker schema is closer to traditional geographical information products in field names and semantics, exposing the internal engineering of the system – for example, including a country code, which is clearly meaningless when you are downloading a single country! OpenStreetMap (as provided by either CloudMade or GeoFabrik) keeps to the simplicity mantra and is fairly basic, yet the schema is the same in Haiti as in England or any other place. So, just like Google, it takes a system view of the data and its delivery.

This means that, from an end-user perspective, while these VGI data sources were produced in a radically different way to traditional GI products, their delivery is similar to the way in which traditional products were delivered, burdening the user with the need to understand the semantics of the different fields before using the data.

In emergency situations, this is likely to present an additional hurdle to the use of any data, as it is not enough to provide the data for download through GeoCommons, GeoFabrik or Google – what matters is how it is going to be used. Notice that the maps tell a story in which an end-user who wants full coverage of Haiti has to combine three datasets, so semantic interpretation can be an issue for such a user.

So what should a user-centred design of GI for an emergency situation look like? The general answer is ‘find the core dataset that is used by the first responders, and adapt your data to this standard’. In the case of Haiti, the MINUSTAH dataset is a template for such a thing. Users of GI on the ground are more likely to be already exposed to, and familiar with, this core dataset. Its fields are relevant and operational, showing that it is more ‘user-centred’ than the other two. Therefore, it would be beneficial for VGI providers who want to help in an emergency situation to ensure that their data comply with the local de facto standard – the dataset being used on the ground – and to bring their schema into line with it.
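To illustrate what bringing a VGI schema into line with the on-the-ground dataset might look like in practice, here is a minimal sketch; the field names on both sides are invented for illustration and are not the actual OSM or MINUSTAH attributes.

```python
# Illustrative only: renaming VGI attribute fields to match a (hypothetical)
# operational schema used on the ground, so end-users see familiar field names.
FIELD_MAP = {            # source field -> target (operational) field
    "highway": "road_class",
    "surface": "road_condition",
    "name": "road_name",
}

def translate_record(record: dict, field_map: dict = FIELD_MAP) -> dict:
    """Keep only the fields the operational schema expects, renamed accordingly."""
    return {target: record.get(source) for source, target in field_map.items()}

osm_record = {"highway": "primary", "surface": "paved", "name": "Route Nationale 1"}
print(translate_record(osm_record))
# {'road_class': 'primary', 'road_condition': 'paved', 'road_name': 'Route Nationale 1'}
```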

Of course, this is what GI ontologies are for: to allow for semantic interoperability. The issue with them is that they add at least two steps – defining the ontology and figuring out the process for translating the dataset that you have acquired into the required format. Therefore, this is something that should be done by data providers, not by end-users who are dealing with the real situation on the ground. They have more important things to do than find a knowledge engineer who understands semantic interoperability…
