Citizen Science & Scientific Crowdsourcing – week 5 – Data quality

This week, in the “Introduction to Citizen Science & Scientific Crowdsourcing” course, our focus was on data management, which completes the first part of the course (the second part starts in a week’s time, since we have a mid-term “Reading Week” at UCL).

The part that I most enjoyed developing was the segment that addresses the data quality concerns that are frequently raised about citizen science and geographic crowdsourcing. Here are the slides from this segment, and below them a rationale for the content and detailed notes.

I’ve written a lot on this blog about data quality, and in many talks that I have given about citizen science and crowdsourced geographic information, the question about data quality is the first one to come up. It is a valid question, and it has led to useful research, for example on OpenStreetMap. I recall the early conversations, 10 years ago, during a journey to the Association for Geographic Information (AGI) conference, about the quality and the longevity potential of OSM.

However, when you are being asked the same question again, and again, and again, at some point, you start considering “why am I being asked this question?”. Especially when you know that it’s been over 10 years since it was demonstrated that the quality is beyond “good enough”, and that there are over 50 papers on citizen science quality. So why is the problem so persistent?

Therefore, the purpose of the segment was to explain the concerns about citizen science data quality and their origin, then to explain a core misunderstanding (that the same quality assessment methods that are used in “scarcity” conditions work in “abundance” conditions), and then to cover the main approaches to ensuring quality (based on my article for the International Encyclopedia of Geography). The aim is to equip the students with a suitable explanation of why you need to approach citizen science projects differently, and then to inform them of the available methods. Quite a lot for 10 minutes!

So here are the notes from the slides:

[Slide 1] When it comes to citizen science, it is very common to hear suggestions that the data is not good enough and that volunteers cannot collect data of good quality because, unlike with trained researchers, we don’t understand who they are – a perception that we know little about the people who are involved and therefore we don’t know about their ability. There is also a perception that, like Wikipedia, it is all very loosely coordinated and therefore there are no strict data quality procedures. However, even in the Wikipedia case, the scientific journal Nature showed over a decade ago (2005) that Wikipedia achieves a quality similar to that of Encyclopaedia Britannica, and we will see that OpenStreetMap is producing data of a similar quality to professional services.
In citizen science where sensing and data collection from instruments is included, there are also concerns over the quality of the instruments and their calibration – the ability to compare the results with high-end instruments.
The opening of the Hunter et al. paper (which offers some solutions) summarises the concerns that are raised over data.

[Slide 2] Based on conversations with scientists and concerns that appear in the literature, there is also a cultural aspect at play, which is expressed in many ways – with data quality being used as an outlet to express them. This can be similar to the concerns that were raised in The Cult of the Amateur (which we’ve seen in week 2 regarding the critique of crowdsourcing): to protect the position of professional scientists and to avoid the need to change practices. There are also special concerns when citizen science is connected to activism, as this seems to “politicise” science or make the data suspicious – we will see in the next lecture that the story is more complex. Finally, and more kindly, we can also notice that because scientists are used to top-down mechanisms, they find alternative ways of doing data collection and ensuring quality unfamiliar and untested.

[Slide 3] Against this background, it is not surprising to see that checking data quality in citizen science is a popular research topic. Caren Cooper has identified over 50 papers that compare citizen science data with data collected by professionals. As she points out: “To satisfy those who want some nitty gritty about how citizen science projects actually address data quality, here is my medium-length answer, a brief review of the technical aspects of designing and implementing citizen science to ensure the data are fit for intended uses. When it comes to crowd-driven citizen science, it makes sense to assess how those data are handled and used appropriately. Rather than question whether citizen science data quality is low or high, ask whether it is fit or unfit for a given purpose. For example, in studies of species distributions, data on presence-only will fit fewer purposes (like invasive species monitoring) than data on presence and absence, which are more powerful. Designing protocols so that citizen scientists report what they do not see can be challenging which is why some projects place special emphasize on the importance of “zero data.”
It is a misnomer that the quality of each individual data point can be assessed without context. Yet one of the most common way to examine citizen science data quality has been to compare volunteer data to those collected by trained technicians and scientists. Even a few years ago I’d noticed over 50 papers making these types of comparisons and the overwhelming evidence suggested that volunteer data are fine. And in those few instances when volunteer observations did not match those of professionals, that was evidence of poor project design. While these studies can be reassuring, they are not always necessary nor would they ever be sufficient.” (http://blogs.plos.org/citizensci/2016/12/21/quality-and-quantity-with-citizen-science/)

[Slide 4] One way to examine the issue of data quality is to think of the clash between two concepts and systems of thinking about how to address quality issues. We can consider the conditions of standard scientific research as ones of scarcity: limited funding, a limited number of people with the necessary skills, limited laboratory space, and expensive instruments that need to be used in a very specific way – sometimes unique instruments.
The conditions of citizen science, on the other hand, are of abundance – we have a large number of participants, with multiple skills, but the cost per participant is low, they bring their own instruments, use their own time, and are also distributed in places that we usually don’t get to (backyards, across the country – we talked about it in week 2). Conditions of abundance are different and require different thinking for quality assurance.

[Slide 5] Here are some of the differences. Under conditions of scarcity, it is worth investing in long training to ensure that the data collection is as good as possible the first time it is attempted, since time is scarce. Also, we would try to maximise the output from each activity that our researcher carries out, and we will put procedures and standards in place to ensure “once & good” or even “once & best” optimisation. We can also force all the people in the study to use the same equipment and software, as this streamlines the process.
On the other hand, in abundance conditions we need to assume that people are coming with a whole range of skills and that training can be variable – some people will get trained on the activity over a long time, while to start the process we would want people to have light training and join it. We also think of activities differently – e.g. conceiving the data collection as micro-tasks. We might also have multiple procedures and even different ways to record information, to cater for different audiences. We will also need to expect a whole range of instrumentation, sometimes with limited information about the characteristics of the instruments.
Once we understand the new condition, we can come up with appropriate data collection procedures that ensure data quality that is suitable for this context.

[Slide 6] There are multiple ways of ensuring data quality in citizen science data. Let’s briefly look at each one of these. The first 3 methods were suggested by Mike Goodchild and Linna Li in a paper from 2012.

[Slide 7] The first method for quality assurance is crowdsourcing – the use of multiple people who carry out the same work, in effect doing peer review or replication of the analysis, which is desirable across the sciences. As Watson and Floridi argued, using the example of Zooniverse, the approaches that are used in crowdsourcing give these methods a stronger claim to accuracy and scientifically correct identification, because they compare multiple observers who work independently.
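
To make the idea of comparing independent observers concrete, here is a minimal sketch in Python. It is only an illustration: the function name, the 60% agreement threshold and the labels are my own assumptions, not the procedure of Zooniverse or any other specific project.

```python
from collections import Counter


def consensus_label(labels, min_agreement=0.6):
    """Return (label, share) if enough independent observers agree, else (None, share).

    The 0.6 threshold is an arbitrary, illustrative choice.
    """
    if not labels:
        return None, 0.0
    label, votes = Counter(labels).most_common(1)[0]
    share = votes / len(labels)
    return (label if share >= min_agreement else None), share


# Five hypothetical volunteers classify the same image independently.
classifications = ["spiral", "spiral", "elliptical", "spiral", "spiral"]
print(consensus_label(classifications))
```

A record that does not reach the threshold is not necessarily wrong; it is simply a candidate for more classifications or for an expert look.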

[Slide 8] The social form of quality assurance uses more and less experienced participants as a way to check the information and ensure that the data is correct. This is fairly common in many areas of biodiversity observation and is integrated into iSpot, but it also exists in other areas, such as mapping, where some information gets moderated (we’ve seen that in Google Local Guides, when a place is deleted).

[Slide 9] The geographical rules are especially relevant to information about mapping and locations. Because we know things about the nature of geography – the most obvious is land and sea in this example – we can use this knowledge to check that the information that is provided makes sense, such as this sample of two bumble bees that are recorded in OPAL in the middle of the sea. While it might be the case that someone saw them while sailing or on some other vessel, we can integrate a rule into our data management system and ask for more details when we get observations in such a location. There are many other such rules – about streams, lakes, slopes and more.
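
A land/sea rule of this kind can be sketched in a few lines of Python. This is a hedged illustration only: the “land” outline is a made-up rectangle rather than a real coastline, the record structure is hypothetical, and this is not how OPAL manages its data.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) is inside the polygon.

    The polygon is a list of (lon, lat) vertices.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical, heavily simplified "land" outline (a rectangle), not a real coastline.
LAND = [(-5.0, 50.0), (1.5, 50.0), (1.5, 55.0), (-5.0, 55.0)]


def needs_follow_up(observation, land=LAND):
    """Flag a land-species record that falls outside the land polygon."""
    return not point_in_polygon(observation["lon"], observation["lat"], land)


print(needs_follow_up({"species": "bumble bee", "lon": -0.13, "lat": 51.52}))  # on land
print(needs_follow_up({"species": "bumble bee", "lon": 3.0, "lat": 53.0}))  # at sea
```

Note that the rule only asks for more details; it does not delete the record, since the observer may genuinely have been on a boat.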

[Slide 10] The ‘domain’ approach is an extension of the geographic one, and in addition to geographical knowledge it uses specific knowledge that is relevant to the domain in which information is collected. For example, in many citizen science projects that involve collecting biological observations, there will be some body of information about species distribution, both spatially and temporally. Therefore, a new observation can be tested against this knowledge, again algorithmically, which helps to ensure that new observations are accurate. If we see a monarch butterfly within the marked area, we can assume that it will not harm the dataset even if it was a mistaken identity, while an outlier (temporally, geographically, or in other characteristics) should stand out.
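
The same algorithmic testing can be sketched for domain knowledge. In the Python example below, the species profile (a bounding box and a set of expected months) is entirely hypothetical and is not real distribution data; a real project would use proper range maps and phenology records.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SpeciesProfile:
    name: str
    lat_range: tuple  # (min_lat, max_lat)
    lon_range: tuple  # (min_lon, max_lon)
    months: frozenset  # months in which records are expected


# Hypothetical profile for illustration only, not real distribution data.
PROFILE = SpeciesProfile(
    name="example butterfly",
    lat_range=(25.0, 50.0),
    lon_range=(-125.0, -65.0),
    months=frozenset(range(4, 11)),  # April to October
)


def outlier_reasons(lat, lon, when, profile=PROFILE):
    """Return a list of reasons why an observation stands out (empty if none)."""
    reasons = []
    in_lat = profile.lat_range[0] <= lat <= profile.lat_range[1]
    in_lon = profile.lon_range[0] <= lon <= profile.lon_range[1]
    if not (in_lat and in_lon):
        reasons.append("outside known range")
    if when.month not in profile.months:
        reasons.append("outside expected season")
    return reasons


print(outlier_reasons(40.0, -100.0, date(2017, 6, 1)))
print(outlier_reasons(60.0, -100.0, date(2017, 1, 15)))
```

An observation with no reasons attached is accepted as it is, while one that stands out can be routed to a more experienced participant for checking.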

[Slide 11] The ‘instrumental observation’ approach removes some of the subjective aspects of data collection by a human who might make an error, and relies instead on the equipment that the person is using. Because of the increase in availability of accurate-enough equipment, such as the various sensors that are integrated in smartphones, many people keep in their pockets mobile computers with the ability to collect location, direction, imagery and sound. For example, image files that are captured on smartphones include the GPS coordinates and a time-stamp in the file, which for the vast majority of people are beyond their ability to manipulate. Thus, the automatic instrumental recording of information provides evidence for the quality and accuracy of the information. This is where the metadata of the information becomes very valuable, as it provides the necessary evidence.
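
One way to use such metadata as evidence is to compare what the participant reported with what the device recorded. The Python sketch below assumes that the coordinates and time-stamp have already been read from the image file into a dictionary (that extraction step is not shown), and the 100 metre and one hour tolerances are arbitrary values chosen for illustration.

```python
import math
from datetime import datetime, timedelta


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def metadata_supports(report, metadata, max_dist_m=100.0, max_gap=timedelta(hours=1)):
    """True if the device metadata is consistent with the participant's report.

    Both arguments are hypothetical dictionaries with 'lat', 'lon' and 'time' keys.
    """
    dist = haversine_m(report["lat"], report["lon"], metadata["lat"], metadata["lon"])
    gap = abs(report["time"] - metadata["time"])
    return dist <= max_dist_m and gap <= max_gap


report = {"lat": 51.5246, "lon": -0.1340, "time": datetime(2017, 2, 10, 14, 5)}
photo = {"lat": 51.5247, "lon": -0.1338, "time": datetime(2017, 2, 10, 14, 2)}
print(metadata_supports(report, photo))
```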

[Slide 12] Finally, the ‘process oriented’ approach brings citizen science closer to traditional industrial processes. Under this approach, the participants go through some training before collecting information, and the process of data collection or analysis is highly structured to ensure that the resulting information is of suitable quality. This can include the provision of standardised equipment, online training or instruction sheets and a structured data recording process. For example, volunteers who participate in the US Community Collaborative Rain, Hail & Snow network (CoCoRaHS) receive a standardised rain gauge, instructions on how to install it and online resources to learn about data collection and reporting.

[Slide 13] What is important to be aware of is that these methods are not used alone but in combination. The analysis by Wiggins et al. in 2011 offers a framework with 17 different mechanisms for ensuring data quality. It is therefore not surprising that, with appropriate design, citizen science projects can provide high-quality data.

 

 


Usability of VGI in Haiti earthquake response and the 2nd workshop on usability of geographic information

On the 23rd March 2010, UCL hosted the second workshop on usability of geographic information, organised by Jenny Harding (Ordnance Survey Research), Sarah Sharples (Nottingham), and myself. This workshop extended the range of topics that we covered in the first one, on which we reported during the AGI conference last year. This time, we had about 20 participants and it was an excellent day, covering a wide range of topics – from a presentation by Martin Maguire (Loughborough) on the visualisation and communication of Climate Change data, to a discussion by Johannes Schlüter (Münster) on the use of XO computers with schoolchildren, to a talk by Richard Treves (Southampton) on the impact of Google Earth tours on learning. Especially interesting was the combination of sound and other senses in the work of Nick Bearman (UEA) and Paul Kelly (Queen’s University Belfast).

Jenny’s introduction highlighted the different aspects of GI usability, from those that are specific to data to issues with application interfaces. The integration of data with software that creates the user experience in GIS was discussed throughout the day, and it is one of the reasons that the usability of the information itself is important in this field. The Ordnance Survey is currently running a project to explore how they can integrate usability into the design of their products – Michael Brown’s presentation discussed the development of a survey as part of this project. The integration of data and application was also central to the presentation by Philip Robinson (GE Energy) on the use of GI by utility field workers.

My presentation focused on some preliminary thoughts that are based on an analysis of the response of the OpenStreetMap and Google Map communities to the earthquake in Haiti at the beginning of 2010. The presentation discussed a set of issues that, if explored, will provide insights that are relevant beyond the specific case and that can illuminate the daily production and use of geographic information. One example is the very basic metadata that was provided on portals such as GeoCommons, and what users can do to evaluate the fitness for use of a specific data set (see also the discussion by Barbara Poore (USGS) on the metadata crisis).

Interestingly, the day after giving this presentation I had a chance to discuss GI usability with MapAction volunteers who gave a presentation at GEO-10. Their presentation filled in some gaps, but also reinforced the value of researching GI usability for emergency situations.

For a detailed description of the workshop and abstracts – see this site. All the presentations from the conference are available on SlideShare and my presentation is below.

OpenStreetMap Quality evaluation and other comparisons

A comparison of my analysis of OpenStreetMap (OSM) quality evaluation to other examples of quality evaluation brings up some core issues about the nature of the new GeoWeb and the use of traditional sources. The examples that I’m referring to are from Etienne Cherdlu’s SOTM 2007 ‘OSM and the art of bicycle maintenance’, Dair Grant’s comparison of OSM to Google Maps and reality, Ed Johnson’s analysis this summer and Steven Feldman’s brief evaluation in Highgate.

Meridian 2 and OSM in the area of Highgate, North London

The first observation is of the importance and abundance of well georeferenced, vector-derived public mapping sites, which make several of these comparisons possible (Cherdlu, Grant and Feldman). The previous generation of stylised street maps is not readily available for a comparison. In addition to their availability, the ease with which they can be mashed-up is also a significant enabling factor. Without this comparable geographical information, the evaluation would be much more difficult.

Secondly, when a public mapping website was used, it was Google Maps. If Microsoft’s Virtual Earth had also been used, it would arguably have allowed a three-way comparison, as the Microsoft site uses Navteq information while Google uses TeleAtlas information. Ordnance Survey (OS) OpenSpace is also a natural candidate for comparison. Was it familiarity that led to the selection of Google Maps? Or is it because the method of comparison is visual inspection, so adding a third source makes it more difficult? Notice that Google has the cachet of being a correct depiction of reality, which Etienne, Dair and Bob Barr demonstrated not to be the case!

Thirdly, and most significantly, only when vector data was used – in our comparison and in parts of what Ed Johnson has done – did a comprehensive analysis of large areas become possible. This shows the importance of formats in the GeoWeb – raster is fabulous for the delivery of cartographic representations, but it is vector that is suitable for analytical and computational work. Only OSM allows the user easy download of vector data – no other mass provider of public mapping does.

Finally, there is the issue of access to information, tools and knowledge. As a team that works at a leading research university (UCL), I and the people who worked with me got easy access to detailed vector datasets and the OS 1:10,000 raster. We also have at our disposal multiple GIS packages, so we can use whichever one performs the task with the least effort. The other comparisons had to rely on publicly available datasets and software. In such unequal conditions, it is not surprising that I will argue that the comparison that we carried out is more robust and consistent. The issue that is coming up here is the balance between amateurs and experts, which is quite central to Web 2.0 in general. Should my analysis be more trusted than Dair’s or Etienne’s, both of whom are very active in OSM? Does Steven’s familiarity with Highgate, which is greater than mine, make him more of an expert in that area than my consistent application of analysis?

I think that the answer is not clear cut; academic knowledge entails the consistent scrutiny of the data, and I do have the access and the training to conduct a very detailed geographical information quality assessment. In addition, my first job in 1988 was in geographical data collection and GIS development, so I also have professional knowledge in this area. Yet, local knowledge is just as valuable in a specific area and is much better than a mechanical, automatic evaluation. So what is happening is an exchange of knowledge, methods and experiences between the two sides in which both, I hope, can benefit.

OSM quality evaluation

In the past year I have worked on the evaluation of OpenStreetMap data. I was helped by Patrick Weber, Claire Ellul, and especially Naureen Zulfiqar, who carried out part of the analysis of motorways. The OSM data was compared against Ordnance Survey Meridian 2 and the 1:10,000 raster, as they have enough similarity to justify a comparison. Now, as the fourth birthday of OSM is approaching, it is a good time to evaluate what was achieved. The analysis shows that, where OSM was collected by several users and benefited from some quality assurance, the quality of the data is comparable and can be fit for many applications. The positional accuracy is about 6 metres, which is expected for the data collection methods that are used in OSM. The comparison of motorways shows about 80% overlap between OSM and OS – but more research is required. The challenges are the many areas that are not covered – currently, OSM has good coverage for only 25% of the land area of England. In addition, in areas that are covered well, quality assurance procedures should be considered – and I’m sure that the OSM crowd will find great ways to make these procedures fun. OSM also doesn’t cover areas at the bottom of the deprivation scale as well as it covers areas that are wealthier. The map below shows the quality of coverage of the two datasets for England, with blue marking areas where OSM coverage is good and red where it is poor.

Difference between OSM and OS Meridian for England
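
For readers who wonder what an “overlap” figure between two line datasets can mean in practice, here is a minimal Python sketch of one common idea: count the share of points sampled along the volunteered line that fall within a buffer around the reference line. The coordinates (in metres on a flat grid), the 20 metre buffer and the function names are all hypothetical, and this is not a reproduction of the procedure or the data used in the analysis above.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b (all (x, y) in metres)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_line(p, line):
    """Shortest distance from p to a polyline given as a list of vertices."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


def share_within_buffer(test_points, reference_line, buffer_m):
    """Share of test points that lie within buffer_m of the reference line."""
    if not test_points:
        return 0.0
    hits = sum(1 for p in test_points if distance_to_line(p, reference_line) <= buffer_m)
    return hits / len(test_points)


# Hypothetical reference road and points sampled along a volunteered version of it.
reference = [(0.0, 0.0), (100.0, 0.0)]
volunteer_points = [(0.0, 3.0), (25.0, 5.0), (50.0, 8.0), (75.0, 25.0), (100.0, 4.0)]
print(share_within_buffer(volunteer_points, reference, buffer_m=20.0))
```

With real data, the choice of buffer width matters a great deal, so any percentage should be reported together with the width that was used.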

The full report is available here, and if someone is willing to sponsor further analysis – please get in touch!

The paper itself has been published – Haklay M, 2010, “How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets” Environment and Planning B: Planning and Design 37(4) 682 – 703