Citizen Science and Ethics session (British Ecological Society – Citizen Science SIG)

As part of the activities of the Citizen Science Special Interest Group of the British Ecological Society (BES), Michael Pocock organised “A training event for citizen science: What you need to know, but no one told you!”. I was asked to lead a 30-minute discussion on ethics and citizen science. This is a wide area, and some discussion of it is already under way. In addition, there is an emerging working group within the Citizen Science Association (CSA) dedicated to this issue, and I have summarised the ethics session at the CSA conference in another post.

For the training event, and especially considering that most participants were likely to have a background in ecology, I decided to focus on four documents containing ‘codes of ethics’ that are the most relevant to ecology and citizen science, with two more for comparison. Three of these are official: the codes of ethics of the Ecological Society of America (ESA, available here), the Chartered Institute of Ecology and Environmental Management (CIEEM, available here), and the International Society of Ethnobiology (ISE, available here). The fourth is the European Citizen Science Association (ECSA) principles of citizen science (the latest draft is available here). For the comparative group, I used the codes of the Royal Geographical Society and the Institution of Civil Engineers.

What is noticeable in the professional codes of ethics (ESA, CIEEM) is that the profession, its reputation and the relationships between members are the top priority. This is common to almost all professional codes of ethics, and it demonstrates that ethics is partly about self-preservation. Only later come the responsibilities to other stakeholders, the wider public, and the non-humans that the profession’s activities can affect. Commonly, wider issues are covered in the principles or in a preamble, but not within the code itself – although the Royal Geographical Society actually codified “due regard to the need to protect the environment, human rights, and to ensure efficient use of natural resources”, and the Institution of Civil Engineers codified “due regard for the environment and for the sustainable management of natural resources”. It is somewhat ironic that ecologists have not codified this aspect.

The two other documents are especially interesting from the point of view of citizen science. First, the ISE code of ethics is mostly not about the researchers and their professional standing, but aims “to facilitate ethical conduct and equitable relationships, and foster a commitment to meaningful collaboration and reciprocal responsibility by all parties.” It continues: “The fundamental value underlying the Code of Ethics is the concept of mindfulness – a continual willingness to evaluate one’s own understandings, actions, and responsibilities to others. The Code of Ethics acknowledges that biological and cultural harms have resulted from research undertaken without the consent of Indigenous peoples.” It also takes a much stronger stance on the duty of care of the researcher as the powerful actor in the situation.

The ISE code is especially relevant to bottom-up citizen science activities, but much of it seems to match the concepts behind the ECSA principles of citizen science. The principles call for meaningful activities with mutual respect and recognition between scientists and the volunteers who work with them.

Will the ethics of citizen science evolve along these more inclusive lines, with an understanding that following them will also help to grow and preserve the field as a whole?


UCL Fossil Fuel Divestment debate

UCL organised a debate about fossil fuel divestment, with seven knowledgeable speakers (all professors) raising arguments for and against the suggestion that UCL should divest from fossil fuels and sell the £21 million it has invested in the industry. In the room and on the panel there were more people who supported the motion than opposed it, and interestingly, by the end of the discussion more people had switched to support divestment. I took notes of the positions and of what people mentioned, as a way of mapping the different views that were expressed. So here are my notes, the tl;dr point of each argument, and something about my own view at the end of this longish post.

Anthony Costello opened the debate, noting that research from UCL provided evidence to justify the Guardian’s ‘keep it in the ground‘ campaign. The aim of the debate was to explore different views and to see what the general views of the attendees were.

Richard Horton, the editor of the Lancet, chaired the debate and opened with some comments: there is a movement around divestment from fossil fuels that is growing very rapidly across society. Universities have a special role in society – they are creators of knowledge about public policy issues, but they are also a moral space within society, where positions can be taken. Some UK universities have decided to divest – Glasgow, SOAS – while others, such as Oxford, have not decided. It is therefore appropriate to ask what UCL should do, as it is leading on considering the impacts of climate change on society at large – for example, the risks to health.

Chris Rapley opened by noting that we are the first humans to breathe air with a basic composition of 400 ppm of CO2 – above the levels of the last 800,000 years. The 40% rise is the same increment as between an ice age and an interglacial, and the change is taking place 100 times faster than anything natural. The conclusion is that it is unwise to increase temperatures by more than 2°C above pre-industrial levels, and we have very little left to burn: 80% of coal and 50% of oil are unburnable, and we don’t yet have a solution for carbon capture and storage. The first reason to divest is that it is prudent – fossil fuel is the energy of the past, and renewables are the future; the valuations are a bubble, so it is best to put the money elsewhere. Second, we need to be put on a trajectory away from fossil fuels by this December – much of what plays out in Paris will not be ratified until 2020, so we need to connect the trajectory that we are currently on with the future one, and do it properly. The CEOs of BP and Shell suggested business as usual, and the recent budget gave £1.2 billion to North Sea oil, so the government is not following its own statements. UCL, as a radical thinker, needs to make a gesture in the right direction. We are all part of a web of a carbon-intensive world, and we need to manage the transition.

My TL;DR Rapley: the science shows the need, combined with the need to change our trajectory.

Jane Holder argued for divestment from the point of view of a teacher of environmental law, focusing on its meaning for teaching and learning – the movement that has gone on for more than 20 years to increase environmental education at university level, helping students to deal with contested knowledge and uncertainty in environmental issues. UCL has done a lot of work over the years on its estate and its curriculum, and from this perspective the UCL campaign makes a connection between the estate, the curriculum and the university’s finances. There are linkages between environmental education and the learning and teaching of UCL. She noted the significance of the informal curriculum – the intangible ways in which an institution instils values in students, such as the publicness of university buildings and the way it treats staff. Secondly, there are broader changes in universities: students explained that since tuition fees were introduced, the student is viewed as a consumer, not as a citizen of the university community. The divestment campaign allows students to act as citizens and not consumers. The university is a site for environmental activity, and these roles should be combined.

My TL;DR Holder: there are teaching and learning imperatives, and finance is part of them.

Anthony Finkelstein argued that this is not a question of expertise; his starting point is accepting the need for change, but he expects changes in energy sources to happen through technological advances, and the speed and extent of change involve complex feedback systems. Generally, he adopts a precautionary view. However, fossil fuels will be part of our future because of their properties, so we need to deal with consumption, not production. Consumption sits within a political context, and the condemnation of fossil fuel companies is really about political failure. UCL should invest according to regulatory requirements. Ownership of an asset can be used to exert influence. His concern is about research and UCL strategy – it is hypocritical to use money from fossil fuel companies for research but not to invest in them, and it sends the wrong message; a lot of research in engineering is supported by fossil fuel companies. He also raised the issue of academic freedom: it is not right to question the ethics of people you disagree with. (See the full argument on Anthony’s blog.)

My TL;DR Finkelstein: deal with consumption, not production; we use a lot of funding from fossil fuel companies, and there is a risk to academic freedom.

Hugh Montgomery – fossil fuel has helped humanity, but it needs to stop. The energy gain to the planet is equivalent to five Hiroshima bombs a second, and 7% will stay for 100,000 years. Health impacts will come from all sorts of directions – and these concerns are shared by military bodies and the WHO, not only by ‘extreme left wing’ organisations. Even to stay within 2°C we have 27 years – or more likely 19 – so if we are to act, we have to keep two-thirds of fossil fuels in the ground. There is over-exposure to stranded assets. Why divestment? It is not a ‘rabid anti-capitalist agenda’ – we should change market forces. The aim of divestment is to force fossil fuel companies to go through transformative change. UCL should do what is right – not only because of the amount of money, but as a statement. The stigmatisation will be significant for fossil fuel companies.

My TL;DR Montgomery: it’s not only left-wing politics; even if you are fairly conservative in outlook, this makes sense.

Jane Rendell stated that she is concerned about the environment and stood down from the Bartlett research vice-dean position because of BHP Billiton funding; she leads on ethics and the built environment at the Bartlett. In her view, investment in fossil fuels is not compatible with UCL’s research strategy of the judicious application of knowledge to improve humanity. The investment is incompatible with UCL’s own ethical guidelines and its environmental strategy, and also with UCL research itself about the need to leave fossil fuels in the ground. The most profound change will come from breaking down the practices of finance – it is not acceptable for fund managers to hide behind claims that allow them to ignore their responsibility to everyone else. The only argument against is shareholder engagement, and there is no evidence for its effectiveness – as Porritt declared recently about the uselessness of engagement.

My TL;DR Rendell: incompatibility with UCL policies, and there is no point in engagement. 

Alan Penn – universities should concentrate on their place in society: they are relatively new institutions, important in the generation of knowledge, in passing it to future generations, and in the ability to critically question the world. We are all invested in these companies – we benefit from tax from North Sea oil and from pension funds. He argued that money is just transactional property and therefore doesn’t hold values, and that people should invest and force companies to change through engagement.

My TL;DR Penn: don’t mix money with values, and if you want change, buy a controlling stake in shares.

After the discussion (with Anthony Finkelstein having to defend his position more than anyone else), there was more support for divestment, although most of the room started from that point of view.

Finally, my view. I started my university studies in October 1991, and as I was getting interested in environment and society issues in the second year of my combined Computer Science and Geography degree, the Earth Summit in Rio (June 1992) was all the rage. The people who taught me had been at the summit – that also explains how I got interested in Principle 10 of the Rio Declaration, which is central to my work. This biographical note is to point out that the Earth Summit was also the starting point for the Framework Convention on Climate Change, which opens with

The Parties to this Convention,
Acknowledging that change in the Earth’s climate and its adverse effects are a common concern of humankind,
Concerned that human activities have been substantially increasing the atmospheric concentrations of greenhouse gases, that these increases enhance the natural greenhouse effect, and that this will result on average in an additional warming of the Earth’s surface and atmosphere and may adversely affect natural ecosystems and humankind …

So for the past 23 years, I have been watching from the sidelines as decision makers fail to get to the heart of the matter, and fail to act although they are told about the urgency. The science was clear then. Had the actions of governments and industries started in 1992, we could all have been well on the route to sustainability (at least in energy consumption) – it was absolutely clear that the necessary technologies were already around. I therefore find the argument for shareholder engagement unrealistic at this stage, nor do I see the link between investment, where you don’t have control over the actions of the company, and careful decisions about which research projects to carry out in collaboration, and under which conditions. This is why I have supported the call for UCL to divest.


I still need to find the time to write the academic paper that follows my blog post about the role of my research area in fossil fuel exploration.

Geoweb, crowdsourcing, liability and moral responsibility

Yesterday, Tenille Brown led a Twitter discussion as part of the Geothink consortium. Tenille opened with a question about liability and wrongful acts that can harm others.

If you follow the discussion (search in Twitter for #geothink) you can see how it evolved and which issues were covered.

At one point, I asked the question:

It is always intriguing and frustrating, at the same time, when a discussion on Twitter takes on a life of its own and moves away from the context in which the topic was originally brought up. At the same time, this is the nature of the medium. Here are the answers that came up to this question:

You can see that the only legal expert around said that it’s a tough question, while everyone else shared their (lay) view on the basis of moral judgement and their own worldview rather than legality – and that’s also valuable. The reason I brought up the question was that during the discussion we started exploring a duality in the digital technology area between ownership and responsibility – or rights and obligations. It seems that technology companies are very quick to emphasise ownership (expressed in strong intellectual property rights arguments) without taking responsibility for the consequences of technology use (as expressed in EULAs and the general attitude towards users). So the nub of the issue for me was agency. Software does have agency of its own, but that doesn’t absolve the human agents – be they software developers or companies – from responsibility for what it is doing.

In ethics discussions with engineering students, the cases of the Ford Pinto or the Thiokol O-rings in the Challenger shuttle disaster come up as useful examples for exploring the responsibility of engineers towards end users. Ethical codes exist for GIS – e.g. the code of ethics of URISA, the material online about ethics for GIS professionals, and Esri publications. Somehow, the growth of the geoweb has taken us backwards. The degree to which awareness of ethics is internalised within a discourse of ‘move fast and break things‘, a software/hardware development culture of perpetual beta, a lack of duty of care, and a search for a fast ‘exit’ (and therefore IBG-YBG) makes me wonder about which mechanisms we need to put in place to ensure the reintroduction of strong ethical notions into the geoweb. As some of the responses to my question demonstrate, people will accept the changes in societal behaviour and view them as normal…

Update: Tenille posted a detailed answer to this post at http://geothink.ca/torts-of-the-geoweb-or-the-liability-question-part-i/

Geographic Information Science and Citizen Science

Thanks to invitations from UNIGIS and Edinburgh Earth Observatory / AGI Scotland, I had an opportunity to reflect on how Geographic Information Science (GIScience) can contribute to citizen science, and what citizen science can contribute to GIScience.

Despite the fact that it is 8 years since the term Volunteered Geographic Information (VGI) was coined, I didn’t assume that all the audience was aware of how it came about or of the range of sources of VGI. Nor did I assume knowledge of citizen science, which is a far less familiar term for a GIScience audience. Therefore, before going into a discussion about the relationship between the two areas, I opened with a short introduction to both, starting with VGI and then moving to citizen science. After introducing the two areas, I suggested the relationships between them: some types of citizen science overlap with VGI – biological recording and environmental observations, as well as community (or civic) science – while other types, such as volunteer thinking, include many projects that are non-geographical (think EyeWire or Galaxy Zoo).

However, I didn’t just list a catalogue of VGI and citizen science activities. Personally, I find trends a useful way to make sense of what is happening. I learned this from the writing of Thomas Friedman, who used it in several of his books to help the reader understand where the changes that he covers came from. Trends are, of course, speculative, as it is very difficult to demonstrate causality or to be certain about the contribution of each trend to the end result. With these caveats in mind, there are several technological and societal trends that I used in the talk to explain where VGI (and the VGI element of citizen science) came from.

Of all these trends, I keep coming back to one technical and one societal trend that I see as critical. The removal of the selective availability of GPS in May 2000 is my top technical change, as the cascading effects from it led to the deluge of good-enough location data that is behind VGI and citizen science. On the societal side, it is the Flynn effect, as a signifier of the educational shift of the past 50 years, that explains how the ability to participate in scientific projects has increased.

In terms of the reciprocal contributions between the fields, I suggest the following:

GIScience can support citizen science by offering the data quality assurance methods that are emerging in VGI; there are also plenty of spatial analysis methods that take heterogeneity into account and are therefore useful for citizen science data. The areas of geovisualisation and human-computer interaction studies in GIS can assist in developing more effective and useful applications for citizen scientists and the people who use their data. There is also plenty to do in considering semantics, ontologies, interoperability and standards. Finally, since critical GIScientists have long been looking into the societal aspects of geographical technologies – such as privacy, trust, inclusiveness and empowerment – they have plenty to contribute to citizen science activities, particularly in how to run them in more participatory ways.

On the other hand, citizen science can contribute to GIScience, and especially to VGI research, in several ways. First, citizen science can demonstrate the longevity of VGI data sources, with some projects going back hundreds of years. It provides challenging datasets in terms of their complexity, ontology, heterogeneity and size. It can raise questions about scale and how to deal with large, medium and local activities while merging them into a coherent dataset. It also provides opportunities for GIScientists to contribute to critical societal issues such as climate change adaptation or biodiversity loss. It provides some of the most interesting usability challenges, such as tools for non-literate users, and, finally, plenty of opportunities for interdisciplinary collaborations.

The slides from the talk are available below.

Long-running citizen science and Flynn effect

If you have been reading the literature on citizen science, you must have noticed that many papers that describe citizen science start with an historical narrative, something along the lines of:

As Silvertown (2009) noted, until the late 19th century, science was mainly developed by people who had additional sources of employment that allowed them to spend time on data collection and analysis. Famously, Charles Darwin joined the Beagle voyage, not as a professional naturalist but as a companion to Captain FitzRoy[*]. Thus, in that era, almost all science was citizen science albeit mostly by affluent gentlemen and gentlewomen scientists[**]. While the first professional scientist is likely to be Robert Hooke, who was paid to work on scientific studies in the 17th century, the major growth in the professionalisation of scientists was mostly in the latter part of the 19th and throughout the 20th century.
Even with the rise of the professional scientist, the role of volunteers has not disappeared, especially in areas such as archaeology, where it is common for enthusiasts to join excavations, or in natural science and ecology, where they collect and send samples and observations to national repositories. These activities include the Christmas Bird Watch that has been ongoing since 1900 and the British Trust for Ornithology Survey, which has collected over 31 million records since its establishment in 1932 (Silvertown 2009). Astronomy is another area in which amateurs and volunteers have been on a par with professionals when observation of the night sky and the identification of galaxies, comets and asteroids are considered (BBC 2006). Finally, meteorological observations have also relied on volunteers since the early start of systematic measurements of temperature, precipitation or extreme weather events (WMO 2001). (Haklay 2013, emphasis added)

The general messages of this historical narrative are: first, that citizen science is a legitimate part of scientific practice, since it was always there and we have just ignored it for 50+ years; second, that some citizen science is exactly as it was – continuous participation in ecological monitoring or astronomical observations, only that now we use smartphones or the Met Office WOW website rather than pen, paper and postcards.

The second aspect of this argument is one that I was wondering about as I wrote a version of the historical narrative for a new report, within a discussion of how the educational and technological transitions over the past century reshaped citizen science. I argued that the demographic and educational transitions in many parts of the world – especially the rapid growth in the percentage and absolute numbers of people with higher education degrees who are potential participants – are highly significant in explaining the popularity of citizen science. To demonstrate that this is a large-scale and consistent change, I used the evidence of the Flynn effect, the rapid increase in IQ test scores across the world during the 20th century.

However, while looking at the issue recently, I came across Jim Flynn’s TED talk ‘Why our IQ levels are higher than our grandparents’ (below). At 3:55, he raises a very interesting point, which also appears on pages 24-26 of his 2007 What is Intelligence?. In essence, Flynn argues that the use of cognitive skills has changed dramatically over the last century: from thinking that treats connections to concrete, everyday relationships as the main way of understanding the world, to thinking that emphasises scientific categories and abstractions. He uses the example of a study from the early 20th century in which participants were asked about commonalities between fish and birds. He highlights that it was not that people with a ‘pre-scientific’ worldview didn’t know that both are animals, but rather that this categorisation was not helpful for dealing with concrete problems, and therefore not common sense. Today, with a scientific worldview, categorisations such as ‘these are animals’ come first.

This point of view has implications for the way we interpret and understand the historical narrative. If correct, then the people who participated in William Whewell’s tide measurement work (see Caren Cooper’s blog post about it) cannot be expected to have thought of it as contributing to science, but they could systematically observe concrete events in their area. While Whewell’s view of participants as ‘subordinate labourers’ is still elitist and class-based, it is somewhat understandable. Moreover, when talking about projects that show continuity over the 20th century – such as the Christmas Bird Count or phenology projects – we have to consider the option that the worldview of the person doing it in 1910 was ‘how many birds are there in my area?’, while in 2010 the framing is ‘in order to understand the impact of climate change, we need to watch out for bird migration patterns’. Maybe we can explore historical material to check for this change in framing? I hope that projects such as Constructing Scientific Communities, which looks at citizen science in the 19th and 21st centuries, will shed light on such differences.


[*] Later I found that this is not such a simple fact – see van Wyhe 2013 “My appointment received the sanction of the Admiralty”: Why Charles Darwin really was the naturalist on HMS Beagle

[**] And we shouldn’t forget that this was to the exclusion of people such as Mary Anning


International Encyclopedia of Geography – Quality Assurance of VGI

The Association of American Geographers is coordinating an effort to create an International Encyclopedia of Geography. Plans started in 2010, with the aim of seeing the 15-volume project published in 2015 or 2016. Interestingly, this shows that publishers and scholars still see value in creating subject-specific encyclopedias. On the other hand, the odd decision by Wikipedians that Geographic Information Science doesn’t exist outside GIS shows that geographers need a place to define their practice by themselves. You can find more information about the AAG International Encyclopedia project in an interview with Doug Richardson from 2012.

As part of this effort, I was asked to write an entry on ‘Volunteered Geographic Information, Quality Assurance‘ as a short piece of about 3,000 words. To do this, I looked around for the mechanisms that are used in VGI and in citizen science. These are covered in OpenStreetMap studies and similar work in GIScience; in the area of citizen science, there are reviews such as the one by Andrea Wiggins and colleagues of mechanisms to ensure data quality in citizen science projects, which clearly demonstrated that projects use multiple methods to ensure data quality.

Below you’ll find an abridged version of the entry (but still long). The citation for this entry will be:

Haklay, M., forthcoming. Volunteered geographic information, quality assurance. In D. Richardson, N. Castree, M. Goodchild, W. Liu, A. Kobayashi, & R. Marston (Eds.), The International Encyclopedia of Geography: People, the Earth, Environment, and Technology. Hoboken, NJ: Wiley/AAG.

In the entry, I identified six types of mechanisms that are used to ensure quality when the data has a geographical component, in either VGI or citizen science. If I have missed a type of quality assurance mechanism, please let me know!

Here is the entry:

Volunteered geographic information, quality assurance

Volunteered Geographic Information (VGI) originates outside the realm of professional data collection by scientists, surveyors and geographers. Quality assurance of such information is important for people who want to use it, as they need to identify whether it is fit for purpose. Goodchild and Li (2012) identified three approaches for VGI quality assurance: a ‘crowdsourcing’ approach that relies on the number of people who have edited the information; a ‘social’ approach that is based on gatekeepers and moderators; and a ‘geographic’ approach that uses broader geographic knowledge to verify that the information fits existing understanding of the natural world. In addition to the approaches that Goodchild and Li identified, there are also a ‘domain’ approach that relates to understanding the knowledge domain of the information, an ‘instrumental observation’ approach that relies on technology, and a ‘process-oriented’ approach that brings VGI closer to industrialised procedures. First, however, we need to understand the nature of VGI and the source of the concern with quality assurance.
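For readers who think in code, the six approaches could be labelled along the following lines. This is a minimal, hypothetical sketch – the enum and its member names are my own illustration for this entry, not an established schema:

```python
from enum import Enum

class QualityApproach(Enum):
    """Quality assurance approaches for VGI: the three identified by
    Goodchild and Li (2012), plus the three additional ones described here."""
    CROWDSOURCING = "number of contributors who edited the information"
    SOCIAL = "gatekeepers and moderators"
    GEOGRAPHIC = "consistency with broader geographic knowledge"
    DOMAIN = "consistency with the knowledge domain of the information"
    INSTRUMENTAL_OBSERVATION = "reliance on the capturing technology"
    PROCESS_ORIENTED = "industrialised collection procedures"

# The subset that Goodchild and Li (2012) identified.
goodchild_li_2012 = {QualityApproach.CROWDSOURCING,
                     QualityApproach.SOCIAL,
                     QualityApproach.GEOGRAPHIC}
```

In practice, a single project usually combines several of these approaches rather than relying on one alone.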

While the term volunteered geographic information (VGI) is relatively new (Goodchild 2007), the activities that it describes are not. Another relatively recent term, citizen science (Bonney 1996), which describes the participation of volunteers in collecting, analysing and sharing scientific information, provides the historical context. While that term, too, is relatively new, the collection of accurate information by non-professional participants turns out to have been an integral part of scientific activity since the 17th century, and likely before (Bonney et al. 2013). Therefore, when approaching the question of quality assurance of VGI, it is critical to see it within the wider context of scientific data collection, and not to fall into the trap of novelty by assuming that it is without precedent.

Yet this integration needs to take into account the insights that have emerged from geographic information science (GIScience) research over the past decades. Within GIScience, it is the body of research on spatial data quality that provides the framing for VGI quality assurance. Van Oort’s (2006) comprehensive synthesis of various quality standards identifies the following elements of spatial data quality discussions:

  • Lineage – a description of the history of the dataset.
  • Positional accuracy – how well the coordinate value of an object in the database relates to the reality on the ground.
  • Attribute accuracy – objects in a geographical database are represented not only by their geometrical shape but also by additional attributes.
  • Logical consistency – the internal consistency of the dataset.
  • Completeness – how many objects that are expected to be found in the database are missing, as well as an assessment of excess data that should not be included.
  • Usage, purpose and constraints – a fitness-for-purpose declaration that should help potential users decide how the data should be used.
  • Temporal quality – a measure of the validity of changes in the database in relation to real-world changes, and also the rate of updates.

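To make the list above concrete, these elements could be recorded as metadata attached to a dataset. The sketch below is purely illustrative – the field names, units and example values are mine, not Van Oort's or any standard's:

```python
from dataclasses import dataclass

@dataclass
class SpatialDataQuality:
    """Quality metadata for a spatial dataset, loosely following Van Oort's
    (2006) elements; field names and units are illustrative only."""
    lineage: str                    # history of the dataset
    positional_accuracy_m: float    # typical coordinate error, in metres
    attribute_accuracy: float       # share of attributes verified correct (0..1)
    logical_consistency: bool       # passes internal consistency checks?
    completeness: float             # share of expected objects present (0..1)
    usage_purpose_constraints: str  # fitness-for-purpose declaration
    temporal_quality: str           # validity of changes and rate of updates

# A hypothetical record for a volunteer-collected street-light dataset.
quality = SpatialDataQuality(
    lineage="digitised from volunteer GPS traces, 2012-2015",
    positional_accuracy_m=8.0,
    attribute_accuracy=0.92,
    logical_consistency=True,
    completeness=0.85,
    usage_purpose_constraints="suitable for neighbourhood-level analysis only",
    temporal_quality="rolling updates; most features re-surveyed within 2 years",
)
```

Note that several of these fields (completeness, attribute accuracy) only become meaningful once a context of use is specified, which is exactly the point made next.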
While some of these quality elements might seem independent of any specific application, in reality they can only be evaluated within a specific context of use. For example, when carrying out an analysis of street lighting in a specific part of town, the question of completeness becomes a specific one: have all street-light objects within the bounds of the area of interest been recorded? Whether the dataset is complete for another part of the settlement is irrelevant to the task at hand. This scrutiny of information quality within a specific application, to ensure that it is good enough for the needs, is termed ‘fitness for purpose’. As we shall see, fitness for purpose is a central issue with respect to VGI.
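The street-lighting example can be sketched as a small completeness check: given an area of interest and an expected number of objects, only records inside the area matter. The function, coordinates and counts below are hypothetical illustrations, not part of any dataset or standard:

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in some projected coordinate system

def completeness_in_area(objects: List[Point],
                         bounds: Tuple[float, float, float, float],
                         expected_count: int) -> float:
    """Share of expected objects actually recorded inside the area of
    interest; records outside the bounds are ignored entirely."""
    xmin, ymin, xmax, ymax = bounds
    inside = sum(1 for (x, y) in objects
                 if xmin <= x <= xmax and ymin <= y <= ymax)
    return inside / expected_count

# Hypothetical street-light records; two fall outside the area of interest.
street_lights = [(1.0, 1.0), (2.0, 3.0), (4.0, 4.0), (9.0, 9.0), (-1.0, 2.0)]
aoi = (0.0, 0.0, 5.0, 5.0)  # the part of town being analysed
ratio = completeness_in_area(street_lights, aoi, expected_count=4)
# 3 of the 4 expected lights are recorded inside the area -> 0.75
```

The same dataset could score very differently for a different area of interest, which is why completeness is a fitness-for-purpose judgement rather than a fixed property of the data.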

To understand why geographers are concerned with quality assurance of VGI, we need to recall the historical development of geographic information, and especially the historical context of geographic information systems (GIS) and GIScience since the 1960s. For most of the 20th century, geographic information production became professionalised and institutionalised. The creation, organisation and distribution of geographic information was done by official bodies, such as national mapping agencies or national geological surveys, that were funded by the state. As a result, the production of geographic information became an industrial, scientific process whose aim was to produce a standardised product – commonly a map. Due to financial, skills and process limitations, products were engineered carefully so they could be used for multiple purposes. Thus, a topographic map can be used for navigation, but also for urban planning and many other purposes. Because the products were standardised, detailed specifications could be drawn up, against which the quality elements could be tested and quality assurance procedures developed. This was the backdrop to the development of GIS, and to the conceptualisation of spatial data quality.

The practices of centralised, scientific and industrialised geographic information production lend themselves to quality assurance procedures that are deployed through organisational or professional structures, and explain the perceived challenges with VGI. Centralised practices also supported employing people with a focus on quality assurance – for example, going to the field with a map and testing that it complies with the specifications that were used to create it. In contrast, most collection of VGI is done outside organisational frameworks. The people who contribute the data are not employees, and seemingly cannot be put through training programmes, asked to follow quality assurance procedures, or expected to use standardised equipment that can be calibrated. The lack of coordination and of top-down forms of production raises questions about ensuring the quality of the information that emerges from VGI.

Considering quality assurance within VGI requires an understanding of some underlying principles that are common to VGI practices and that differentiate it from organised and industrialised geographic information creation. For example, some VGI is collected under conditions of scarcity or abundance in terms of data sources, the number of observations, or the amount of data that is being used. As noted, the conceptualisation of geographic data collection before the emergence of VGI was one of scarcity, where data is expensive and complex to collect. In contrast, in many applications of VGI the situation is one of abundance. For example, in applications based on micro-volunteering, where each participant invests very little time in a fairly simple task, it is possible to give the same mapping task to several participants and statistically compare their independent outcomes as a way of ensuring the quality of the data. Another way in which abundance operates as a framework is in the development of software for data collection. While in previous eras there was inherently one application that was used for data capture and editing, in VGI there is a need to consider multiple applications, as different designs and workflows can appeal to, and be suitable for, different groups of participants.

Another underlying principle of VGI is that, since the people who collect the information are not remunerated or in contractual relationships with the organisation that coordinates data collection, a more complex relationship between the two sides is required, with consideration of incentives, motivations to contribute, and the tools that will be used for data collection. Overall, VGI systems need to be understood as socio-technical systems in which the social aspect is as important as the technical part.

In addition, VGI is inherently heterogeneous. In large-scale data collection activities such as a census of population, there is a clear attempt to capture all the information about the population over a relatively short time and in every part of the country. In contrast, because of its distributed nature, VGI will vary across space and time, with some areas and times receiving more attention than others. An interesting example appears at the temporal scale, where some citizen science activities exhibit a ‘weekend bias’, as these are the days when volunteers are free to collect more information.

Because of the difference in the organisational settings of VGI, a different approach to quality assurance is required, although, as noted, such approaches have in general been used in many citizen science projects. Over the years, several approaches have emerged; these include the ‘crowdsourcing’, ‘social’, ‘geographic’, ‘domain’, ‘instrumental observation’ and ‘process-oriented’ approaches. We now turn to describe each of them.

The ‘crowdsourcing’ approach builds on the principle of abundance. Since there is a large number of contributors, quality assurance can emerge from repeated verification by multiple participants. Even in projects where the participants actively collect data in an uncoordinated way, such as the OpenStreetMap project, it has been shown that with enough participants actively collecting data in a given area, the quality of the data can be as good as authoritative sources. The limitation of this approach arises when local knowledge or verification on the ground (‘ground truth’) is required. In such situations, the crowdsourcing approach will work well in central, highly populated or popular sites where there are many visitors, and therefore the probability that several of them will be involved in data collection rises. Even so, it is possible to encourage participants to record less popular places through a range of suitable incentives.
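A minimal sketch of the repeated-verification idea: issue the same classification task to several participants independently, and accept the majority answer only when agreement is strong enough. The 0.6 agreement threshold is an arbitrary illustrative choice, not a recommended value:

```python
from collections import Counter

def crowd_verdict(responses, min_agreement=0.6):
    """Majority vote over repeated, independent answers to the same task.

    Returns the winning label if it reaches the agreement threshold,
    otherwise None - signalling that the task should be re-issued
    to further participants rather than accepted.
    """
    counts = Counter(responses)
    label, votes = counts.most_common(1)[0]
    return label if votes / len(responses) >= min_agreement else None

# Five participants classify the same feature independently:
print(crowd_verdict(["pond", "pond", "lake", "pond", "pond"]))  # pond
print(crowd_verdict(["pond", "lake", "river"]))  # None: no consensus yet
```

Note that this only works under abundance: with one or two responses per task, there is nothing to compare, which is why the approach suits micro-volunteering and popular locations.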

The ‘social’ approach also builds on the principle of abundance in terms of the number of participants, but with a more detailed understanding of their knowledge, skills and experience. In this approach, some participants are asked to monitor and verify the information that was collected by less experienced participants. The social method is well established in citizen science programmes such as bird watching, where participants who are more experienced in identifying bird species help to verify observations by other participants. To deploy the social approach, there is a need for a structured organisation in which some members are recognised as more experienced, and are given the appropriate tools to check and approve information.

The ‘geographic’ approach uses known geographical knowledge to evaluate the validity of the information that is received from volunteers. For example, by using existing knowledge about the distribution of streams around a river, it is possible to assess whether the mapping of a new river contributed by volunteers is comprehensive or not. A variation of this approach is the use of previously recorded information, even if it is out of date, to verify new contributions by comparing how much of the information that is already known also appears in the VGI source. Geographic knowledge can potentially be encoded in software algorithms.
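One way such geographic knowledge might be encoded in an algorithm is a proximity test against known features. The sketch below, with made-up coordinates and an arbitrary tolerance, checks whether a volunteered stream point lies close to a known river segment:

```python
import math

def dist_point_segment(p, a, b):
    """Distance from point p to the line segment a-b (all (x, y) tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment, clamped to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def plausible_stream_point(point, river_segments, max_dist=0.5):
    """A contributed stream point is plausible if it lies near the known river."""
    return any(dist_point_segment(point, a, b) <= max_dist
               for a, b in river_segments)

river = [((0, 0), (10, 0))]  # known river: a single straight segment
print(plausible_stream_point((5, 0.2), river))  # True: close to the river
print(plausible_stream_point((5, 4.0), river))  # False: too far away
```

A real implementation would work against the full recorded network and in projected coordinates, but the principle is the same: prior geographic knowledge sets the bounds of what a plausible contribution looks like.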

The ‘domain’ approach is an extension of the geographic one: in addition to geographical knowledge, it uses specific knowledge that is relevant to the domain in which the information is collected. For example, in many citizen science projects that involve collecting biological observations, there will be some body of knowledge about species distribution, both spatially and temporally. A new observation can therefore be tested against this knowledge, again algorithmically, helping to ensure that new observations are accurate.
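A sketch of such an algorithmic domain test: comparing a new biological observation against a knowledge base of species ranges and active months. The species entry, its bounding box and its season below are invented purely for illustration:

```python
# Hypothetical knowledge base: the bounding box of a species' known range
# (lat_min, lon_min, lat_max, lon_max) and the months it is normally seen.
SPECIES_RANGES = {
    "swallow": {"bbox": (49.0, -8.0, 61.0, 2.0),
                "months": set(range(4, 10))},  # April to September
}

def plausible_observation(species, lat, lon, month):
    """Flag observations outside the known spatial and temporal range."""
    known = SPECIES_RANGES.get(species)
    if known is None:
        return False  # unknown species: route to an expert instead
    lat_min, lon_min, lat_max, lon_max = known["bbox"]
    in_range = lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
    in_season = month in known["months"]
    return in_range and in_season

print(plausible_observation("swallow", 52.0, -1.0, 6))  # True: summer, in range
print(plausible_observation("swallow", 52.0, -1.0, 1))  # False: out of season
```

A failed check need not mean the observation is wrong – it may be a genuinely unusual record – but it marks the contribution for expert review, which is how such tests are typically combined with the social approach.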

The ‘instrumental observation’ approach removes some of the subjective aspects of data collection by a human, who might make an error, and relies instead on the equipment that the person is using. Because of the increased availability of accurate-enough equipment, such as the various sensors that are integrated into smartphones, many people carry in their pockets mobile computers with the ability to record location, direction, imagery and sound. For example, image files captured on smartphones include the GPS coordinates and a time-stamp, which the vast majority of people are unable to manipulate. Thus, the automatic, instrumental recording of information provides evidence for the quality and accuracy of the information.
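A small illustration of how such instrumental records can support quality checks, assuming the GPS coordinates and time-stamps have already been extracted from the files: an internal-consistency test that flags sequences of fixes whose implied speeds are humanly impossible. The 40 m/s speed threshold is an illustrative assumption:

```python
import math
from datetime import datetime, timedelta

def plausible_track(fixes, max_speed_ms=40.0):
    """Check that a sequence of (timestamp, lat, lon) fixes is internally
    consistent: timestamps increase and implied speeds are physically
    plausible. Uses a rough equirectangular approximation for distance,
    which is adequate over short distances."""
    for (t1, la1, lo1), (t2, la2, lo2) in zip(fixes, fixes[1:]):
        dt = (t2 - t1).total_seconds()
        if dt <= 0:
            return False  # clock went backwards: suspicious metadata
        # Approximate metres between the two fixes.
        x = math.radians(lo2 - lo1) * math.cos(math.radians((la1 + la2) / 2))
        y = math.radians(la2 - la1)
        metres = 6371000 * math.hypot(x, y)
        if metres / dt > max_speed_ms:
            return False  # implied speed too high to be a genuine track
    return True

t0 = datetime(2015, 6, 1, 10, 0, 0)
walk = [(t0, 51.5000, -0.1000),
        (t0 + timedelta(minutes=5), 51.5020, -0.1000)]  # ~220 m in 5 minutes
print(plausible_track(walk))  # True
```

The instrument supplies trustworthy raw values; checks like this one guard against the residual failure modes, such as faulty clocks or corrupted fixes.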

Finally, the ‘process-oriented’ approach brings VGI closer to traditional industrial processes. Under this approach, the participants go through some training before collecting information, and the process of data collection or analysis is highly structured to ensure that the resulting information is of suitable quality. This can include the provision of standardised equipment, online training or instruction sheets, and a structured data recording process. For example, volunteers who participate in the US Community Collaborative Rain, Hail & Snow network (CoCoRaHS) receive a standardised rain gauge, instructions on how to install it, and online resources to learn about data collection and reporting.

Importantly, these approaches are not used in isolation, and in any given project a combination of them is likely to be in operation. Thus, an element of training and guidance for users can appear in a downloadable application that is distributed widely, and the method used in such a project will then be a combination of the process-oriented and crowdsourcing approaches. Another example is the OpenStreetMap project, which in general provides only limited guidance to volunteers in terms of the information that they collect or the locations in which they collect it. Yet a subset of the information in the OpenStreetMap database – about wheelchair access – is gathered through the highly structured process of the WheelMap application, in which the participant is required to select one of four possible settings that indicate accessibility. Another subset of the information, recorded for humanitarian efforts, follows the social model: tasks are divided between volunteers using the Humanitarian OpenStreetMap Team (H.O.T) task manager, and the data that is collected is verified by more experienced participants.

The final, and critical, point for quality assurance of VGI noted above is fitness for purpose. In some VGI activities the information has a direct and clear application, in which case it is possible to define specifications for the quality elements that were listed above. However, one of the core aspects noted above is the heterogeneity of the information that is collected by volunteers. Therefore, before using VGI for a specific application, there is a need to check its fitness for that specific use. While this is true for all geographic information – even so-called ‘authoritative’ data sources can suffer from hidden biases (e.g. a lack of updates in rural areas) – the situation with VGI is that variability can change dramatically over short distances: while the centre of a city may be mapped by many people, a deprived suburb near the centre may not be mapped or updated. There are also limitations caused by the instruments in use – for example, the GPS positional accuracy of the smartphones involved. Such aspects should also be taken into account, ensuring that the quality assurance is itself fit for purpose.

References and Further Readings

Bonney, Rick. 1996. Citizen science – a lab tradition. Living Bird, Autumn 1996.
Bonney, Rick, Shirk, Jennifer and Phillips, Tina B. 2013. Citizen science. In Encyclopaedia of Science Education. Berlin: Springer-Verlag.
Goodchild, Michael F. 2007. Citizens as sensors: the world of volunteered geography. GeoJournal, 69(4), 211–221.
Goodchild, Michael F. and Li, Linna. 2012. Assuring the quality of volunteered geographic information. Spatial Statistics, 1, 110–120.
Haklay, Mordechai. 2010. How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets. Environment and Planning B: Planning and Design, 37(4), 682–703.
Sui, Daniel, Elwood, Sarah and Goodchild, Michael F. (eds). 2013. Crowdsourcing Geographic Knowledge. Berlin: Springer-Verlag.
Van Oort, Pepijn A.J. 2006. Spatial Data Quality: From Description to Application. PhD thesis, Wageningen: Wageningen Universiteit, p. 125.