New book: European Handbook of Crowdsourced Geographic Information

COST ENERGIC is a network of researchers across Europe (and beyond) who are interested in researching crowdsourced geographic information, also known as Volunteered Geographic Information (VGI). The acronym stands for ‘Co-Operation in Science & Technology’ (COST) through ‘European Network Researching Geographic Information Crowdsourcing’ (ENERGIC). I have written about this programme before, covering events such as Twitter chats, meetings, summer schools and publications. We started our activities in December 2012, and now, 4 years later, the funding is coming to an end.

One of the major outcomes of the COST ENERGIC network is an edited book dedicated to research on VGI. In keeping with the openness of the field, in which many researchers use open sources to analyse locations, places, and movement, we decided that the publication should be open access: free to download and reuse. To achieve that, we approached Ubiquity Press, who specialise in open access academic publishing, and set up a process for organising the writing of short and accessible chapters from across the spectrum of research interests and topics covered by members of the network. Dr Haosheng Huang (TU Wien) volunteered to assist with the editing and management of the process. The chapters then went through internal peer review, and another cycle of peer review following Ubiquity Press’s own process, so it is thoroughly checked!

The book includes 31 chapters with relevant information about applications of VGI and citizen science, management of data, examples of projects, and high-level concepts in this area.

The book is now available for download here. Here is the description of the book:

This book focuses on the study of the remarkable new source of geographic information that has become available in the form of user-generated content accessible over the Internet through mobile and Web applications. The exploitation, integration and application of these sources, termed volunteered geographic information (VGI) or crowdsourced geographic information (CGI), offer scientists an unprecedented opportunity to conduct research on a variety of topics at multiple scales and for diversified objectives.
The Handbook is organized in five parts, addressing the fundamental questions:

  • What motivates citizens to provide such information in the public domain, and what factors govern/predict its validity?
  • What methods might be used to validate such information?
  • Can VGI be framed within the larger domain of sensor networks, in which inert and static sensors are replaced or combined by intelligent and mobile humans equipped with sensing devices?
  • What limitations are imposed on VGI by differential access to broadband Internet, mobile phones, and other communication technologies, and by concerns over privacy?
  • How do VGI and crowdsourcing enable innovation applications to benefit human society?

Chapters examine how crowdsourcing techniques and methods, and the VGI phenomenon, have motivated a multidisciplinary research community to identify both fields of applications and quality criteria depending on the use of VGI. Besides harvesting tools and storage of these data, research has paid remarkable attention to these information resources, in an age when information and participation is one of the most important drivers of development.
The collection opens questions and points to new research directions in addition to the findings that each of the authors demonstrates. Despite rapid progress in VGI research, this Handbook also shows that there are technical, social, political and methodological challenges that require further studies and research.



Notes from ICCB/ECCB 2015 (Day 2) – Citizen Science data quality

Poster session at ICCB/ECCB 2015

The second day of ICCB/ECCB 2015 started with a session that focused on the use and interpretation of citizen science data. The symposium ‘Citizen Science in Conservation Science: the new paths, from data collection to data interpretation’ was organised by Karine Princé and included the following talks:

Bias, information, signal and noise in citizen science data – Nick Isaac. The information content of a dataset depends on the question, on what was captured and how, and on survey effort. Data come in different ways from a range of people who collect them for different purposes. Biological records are unstructured: they don’t address a specific question, so we need to know how they came about, and information about the data collection protocols is important for making sense of the data. If you are collecting data through citizen science, remember that the data will outlive the project, so you need good metadata and data standards to ensure that others can use it. There are powerful statistical tools, and we should use them to model the bias rather than try to avoid it. A little bit of metadata goes a long way, so it is worth recording.

Conservation management prioritization with citizen science data and species abundance models – Alison Johnston (BTO/Cornell Lab of Ornithology). The distributions of species are dynamic and change with the seasons. This is especially important for migratory birds, which need conservation at specific times (wintering, breeding or migrating). The BirdReturns programme in California is a way to flood rice fields to provide habitat for water-birds, and it is effective and not hugely costly. However, dynamic conservation needs precise information. Citizen science data can support occurrence models, but they also wanted to estimate abundance, as this helps to prioritise the activities. They used eBird data. In California there are 230,000 checklists, but there are biases in the data: variable effort and expertise, and bias in sites, seasons and time. There are also different relationships with habitat, and it is difficult to identify extreme abundance. They used Spatio-Temporal Exploratory Models (STEM), which allow modelling with random grids, averaging across cells that have different origins (Fink et al 2010 Ecological Applications). Using the models, they identified areas of high activity, especially with the abundance model. Of the two models, the abundance model seems more suitable for using citizen science data in dynamic conservation. The results were used with a reverse auction to maximise the use of the available funds to provide large areas of temporary wetland.
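To make the idea of averaging across randomly placed grids more concrete, here is a minimal sketch in Python. It is a deliberate simplification and not the STEM implementation from Fink et al.: STEM fits a local model within each cell, whereas this sketch only takes the mean count of the checklists that share a cell with the query point. The function names, parameters and toy data are all hypothetical.

```python
import math
import random


def cell_index(x, y, origin_x, origin_y, cell_size):
    """Return the (column, row) of the grid cell that contains point (x, y)."""
    return (
        math.floor((x - origin_x) / cell_size),
        math.floor((y - origin_y) / cell_size),
    )


def ensemble_grid_mean(observations, query, cell_size, n_grids=50, seed=0):
    """Average local estimates over grids with randomly shifted origins.

    observations: list of (x, y, count) tuples (hypothetical checklist data).
    query: (x, y) location for which an estimate is wanted.
    Returns None if no grid places any observation in the query's cell.
    """
    rng = random.Random(seed)
    qx, qy = query
    estimates = []
    for _ in range(n_grids):
        # Each grid gets its own random origin, so cell boundaries differ.
        ox = rng.uniform(0, cell_size)
        oy = rng.uniform(0, cell_size)
        q_cell = cell_index(qx, qy, ox, oy, cell_size)
        local = [
            count
            for x, y, count in observations
            if cell_index(x, y, ox, oy, cell_size) == q_cell
        ]
        if local:
            # Stand-in for a local model: the mean count within the cell.
            estimates.append(sum(local) / len(local))
    if not estimates:
        return None
    return sum(estimates) / len(estimates)


# Toy data: three checklists with bird counts at made-up coordinates.
toy_checklists = [(0.5, 0.5, 4.0), (1.0, 1.2, 6.0), (5.5, 5.5, 20.0)]
print(ensemble_grid_mean(toy_checklists, query=(0.8, 0.8), cell_size=2.0))
```

Because the cell boundaries move from grid to grid, no single arbitrary boundary decides which checklists inform the estimate at a given location.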

Citizen sciences for monitoring biodiversity in habitat structured spaces – Camille Coron (Paris Sud) described a model for estimating several species and their abundances. They wanted to use several datasets from citizen science projects that follow different types of protocol, some strong and some without any. They assume that space is covered with different types of habitat, but the habitat itself is not known. They looked at 34 bird species in Aquitaine. Two datasets come from precise protocols and the third is opportunistic. They developed a statistical model for the data that uses a detection probability, abundance, and the intensity of the observation activity. In the opportunistic dataset the effort is not known. The model brings important gains when species are rare, when the species considered is hardly detected in the data, and when there are many species. By combining the datasets with the robust-protocol projects, the estimation of species distribution is improved.

Can opportunistic occurrence records improve the large-scale estimation of abundance trends? – Joern Pagel. There is a lack of comprehensive data on large-scale variation in abundance, and he described a model that deals with this. The model is based on the assumption that population density is a main driver of variation in species detectability. They tested the model on UK butterfly data, combining the very detailed local transects (140 with weekly monitoring) with opportunistic presence records (over 500K records) on a 10×10 km grid. The transects were used to estimate the abundance (described in a paper in Methods in Ecology and Evolution). They found that opportunistic occurrence records can carry a signal of population density, but we need to be careful about the assumptions, and there are high uncertainties associated with the estimates.
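One common way to express the assumption that detectability rises with population density is p = 1 - (1 - r)^N, where r is the chance of detecting any single individual and N is the local abundance. These notes do not record the exact form used in the talk, so the sketch below is only an illustration of the general idea, and the function name and parameter values are hypothetical.

```python
def detection_probability(abundance, per_individual_rate):
    """Probability of detecting a species at least once.

    Assumes each of `abundance` individuals is detected independently
    with probability `per_individual_rate` (an illustrative assumption,
    not necessarily the link used in the talk).
    """
    if abundance < 0:
        raise ValueError("abundance must be non-negative")
    if not 0.0 <= per_individual_rate <= 1.0:
        raise ValueError("per_individual_rate must be between 0 and 1")
    return 1.0 - (1.0 - per_individual_rate) ** abundance


# Detection probability grows with abundance and saturates near 1.
for n in (0, 1, 5, 20):
    print(n, round(detection_probability(n, 0.1), 3))
```

Under this form, a presence record is more likely to be made where the species is abundant, which is why occurrence records can carry a signal of density. The saturation near 1 also hints at why the uncertainty is high at large abundances.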

When do occupancy models produce reliable inferences from opportunistic data? – Arco Van Strien (Statistics Netherlands). Statistics Netherlands is involved in butterfly and dragonfly monitoring, using transects and also opportunistic data. Opportunistic data is unstandardised and can show artificial trends if effort varies over time, so the idea was to use occupancy models to account for changes in recorder effort. They coupled two logistic regression models, one modelling the ecological process and the other the observation process. They wanted to explore the usefulness of opportunistic data and occupancy models, and used a Bayesian model, evaluating the results against standardised data. They looked at four inferences: phenology (trying to find the peak date in detection), national trend in distribution, species richness per site, and local trends in distribution. For the peak date, they found a 0.9 correlation between opportunistic data and standardised data. For national trends there is also a strong correlation (0.8/0.9). For species richness the correlation is also over 0.9, but for local trends the correlation drops to 0.4-0.5 for both butterflies and dragonflies. The conclusion: opportunistic data is great, but we need to be careful about the inferences drawn from it.
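The coupling of an ecological model and an observation model can be sketched with the standard single-season occupancy likelihood. This is a generic textbook form, not the Bayesian model from the talk; the covariates, coefficients and function names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def logistic(z):
    """Inverse logit: maps any real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def site_likelihood(history, psi, detect_probs):
    """Likelihood of one site's detection history.

    history: list of 0/1 values, one per visit (1 = species recorded).
    psi: occupancy probability (the ecological process).
    detect_probs: per-visit detection probability given occupancy
        (the observation process).
    """
    prob_if_occupied = 1.0
    for seen, p in zip(history, detect_probs):
        prob_if_occupied *= p if seen == 1 else (1.0 - p)
    if any(history):
        # At least one detection: the site must be occupied.
        return psi * prob_if_occupied
    # No detections: either occupied but missed, or truly unoccupied.
    return psi * prob_if_occupied + (1.0 - psi)


# Hypothetical coefficients: occupancy has only an intercept, and
# detection depends on a recorder-effort covariate for each visit.
psi = logistic(0.0)
detect_probs = [logistic(-1.0 + 0.5 * effort) for effort in (2, 2)]

print(site_likelihood([0, 0], psi, detect_probs))
print(site_likelihood([1, 0], psi, detect_probs))
```

The second branch is what lets the model separate a real absence from a species that was present but missed, so that a change in recorder effort is not mistaken for a change in distribution.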

Making sense of citizen science data: A review of methods – Olivier Gimenez (CNRS). His interest is in large terrestrial and marine mammals, which are difficult to monitor in the field, so he is considering whether citizen science data can be used for that. He looked at all the papers on citizen science, and specifically at those that examine the data, with the aim of building a taxonomy of the methods used to handle citizen science data. He identified five methods. The first is the filtering and correction approach: you know, or assume you know, the bias and try to correct for it, e.g. list length analysis. These methods are highly sensitive to specific biases. The second is the simulation approach: simulate the bias and check how your favourite method behaves given this bias. The third is the regression approach: use relevant variables to account for biases, e.g. ecological variables that are used to build and predict models, together with observer bias variables such as distance from cities. The fourth is the combination approach: combine citizen science data with data from a standard protocol to understand and correct the data. The last is the occupancy approach: correction for false negatives and for temporal and spatial variation in detection, which can also be extended to deal with false positives and with multiple species. Conclusion: we should focus more on the citizens when describing the models. We need to understand more about them (e.g. record the data and the people who collected it), and social science has a major role to play.


In the session ‘Paths for the future: building conservation leadership capacity’, Krithi Karanth (Wildlife Conservation Society) looked at ‘Citizen Scientists as agents for conservation’. In the 1980s WCS started monitoring tigers, and some people who were not trained scientists wanted to join in. What drew people in was an interest in tigers, and that was the start of their citizen science journey. They walked 5000 km along 229 transects in the forest. It started with ecological surveys across entire regions, from charismatic species to rare species. Current projects have 40-50 volunteers in amphibian and bird surveys outside protected areas. The volunteers identify rare species. As the project grew, so did the challenges, e.g. around human-wildlife conflicts, and this helped in getting over 5000 villages and 7000 households surveyed in their area. Through the fieldwork, people understand conservation better. Another project recruited 75 volunteers to document tourism impact, and the results were used in a Supreme Court decision on how to regulate tourism. They have over 5000 citizen scientists, with an active group of 1000 at any moment. The impact over 30 years: over 10,00 surveys in 15 states in India, with over 250 academic publications and 300 popular articles. Many of the people who volunteered evolved into educators, film-makers, conservationists, activists and academics, and they also share information through blogs, articles and films. Recognition is also increasing in graduate programmes, with professional master's programmes. Some of the volunteers (10%) become fully committed to conservation, but the other 90% are critical to wider engagement in society.