30 June, 2014
Today marks the publication of the report ‘crowdsourced geographic information in government’. The report is the result of a collaboration that started in the autumn of last year, when the World Bank Global Facility for Disaster Reduction and Recovery (GFDRR) requested a study of the way crowdsourced geographic information is used by governments. The identification of barriers and success factors was especially needed, since GFDRR invests in projects across the world that use crowdsourced geographic information to help in disaster preparedness, through activities such as the Open Data for Resilience Initiative. By providing an overview of factors that can help those who implement such projects, either in governments or in the World Bank, we can increase the chances of successful implementations. To develop the ideas of the project, Robert Soden (GFDRR) and I ran a short workshop during State of the Map 2013 in Birmingham, which helped in shaping the details of the project plan as well as some preliminary information gathering. The project team included myself, Vyron Antoniou, Sofia Basiouka, and Robert Soden (GFDRR). Later on, Peter Mooney (NUIM) and Jamal Jokar (Heidelberg) volunteered to help us – demonstrating the value of research networks such as COST ENERGIC, which linked us.
The general methodology that we decided to use was the identification of case studies from across the world, at different scales of government (national, regional, local) and in different domains (emergency, environmental monitoring, education). We expected that with a large group of case studies, it would be possible to analyse common patterns and hopefully reach conclusions that can assist future projects. In addition, this would also allow us to identify common barriers and challenges.
We paid special attention to information flows between the public and the government, looking at cases where the government absorbed information that was provided by the public, and also cases where two-way communication happened.
Originally, we were aiming to ‘crowdsource’ the collection of the case studies. We identified the information that is needed for the analysis by using a few case studies that we knew about, and constructing the way in which they would be represented in the final report. After constructing these ‘seed’ case studies, we aimed to open the questionnaire to other people who would submit case studies. Unfortunately, the development of a case study proved to be too much effort, and we received only a small number of submissions through the website. However, throughout the study we continued to look out for cases and to gather the information needed to compile them. By the end of April 2014 we had identified about 35 cases, but found clear and useful information for only 29 (which are all described in the report). The cases range from basic mapping to citizen science. The analysis workshop was especially interesting, as it was carried out over a long Skype call, with members of the team in Germany, Greece, UK, Ireland and US (Colorado) while working together using Google Docs collaborative editing functionality. This approach proved successful and allowed us to complete the report.
18 June, 2014
The INSPIRE 2014 conference marks the middle of the implementation process of the INSPIRE directive (Infrastructure for Spatial Information in the European Community). The directive is aimed at establishing a pan-European Spatial Data Infrastructure (SDI), and that means lots of blueprints, pipes, machine rooms and protocols for enabling the sharing of geographic information. In GIS jargon, blueprints translate to metadata, which is a standardised way to describe a GIS dataset; pipes and machine rooms translate to data portals and servers; and the protocols translate to web services that use known standards (here you’ll have a real acronym soup of WMS, WCS, WFS and OGC). It is all aimed at allowing people across Europe to share data in an efficient way so data can be found and used. In principle, at least!
This is the stuff of governmental organisations that are producing the data (national mapping agencies, government offices, statistical offices etc.) and the whole INSPIRE language and aims are targeted at the producers of the information, encouraging them to publish information about their data and share it with others. It is a domain of well established bureaucracies (in the positive sense of the word) and organisations that are following internal procedures in producing, quality checking and distributing their information products. At first sight, this seems like the opposite of the world of ‘upscience’, where sometimes there are only ad-hoc structures and activities.
That is why giving a talk in the plenary session that was dedicated to Governance and Information, and aimed to “assess how INSPIRE is contributing to a more effective and participated environmental policy in Europe, and how it provides connectivity with other policies affecting our environment, society, and the economy” was of concern. So where are the meeting points of INSPIRE and citizen science?
One option is to try a top-down approach and force those who collect data to provide it in an INSPIRE-compliant way. Of course this is destined to fail. So the next option is to force the intermediaries to do the translation – and projects such as COBWEB are doing that, although it remains to be seen what compromises will be needed. Finally, there is an option to adapt and change procedures such as INSPIRE to reflect the change in the way the world works.
To prepare the talk, I teamed up with Dr Claire Ellul, who specialises in metadata (among many other things) and knows more about INSPIRE than I do.
The talk started with my previous work about the three eras of environmental information, noting the move from data by experts, and for experts (1969-1992) to by experts & the public, for experts & the public (2012 on).
As the diagrams show, a major challenge of INSPIRE is that it is a regulation that was created on the basis of the “first era” and “second era” and it inherently assumes stable institutional practices in creating, disseminating and sharing environmental information.
Alas, the world has changed – and one particular moment of change is August 2004 when OpenStreetMap started, so by the time INSPIRE came into force, crowdsourced geographic information and citizen science had become a legitimate part of the landscape. These data sources are coming from a completely different paradigm of production and management, and now, with 10 years of experience in OSM and a growing understanding of citizen science data, we can notice the differences in production, organisation and practices. For example, while being a very viable source of geographic information, OSM still doesn’t have an office and ‘someone to call’.
Furthermore, data quality methods also require different framing for these data. We have metadata standards and quality standards that are assuming the second era, but we need to find ways to integrate into sharing frameworks like INSPIRE the messy, noisy but also rich and important data from citizen science and crowdsourcing.
Claire provided a case study that analyses the challenges in the area of metadata in particular. The case looks at different noise mapping sources and how they can be understood. Her analysis demonstrates how the ‘producer centric’ focus of INSPIRE is challenging when trying to create systems that record and use metadata for crowdsourced information. The case study is based on our own experiences over the past 6 years and in different projects, so there is information that is explicit in the map, some in documentation – but some that is only hidden (e.g. calibration and quality of smart phone apps).
We conclude with the message that the INSPIRE community needs to start noticing these sources of data and consider how they can be integrated in the overall infrastructure.
The slides from the talk are provided below.
Following the two previous assertions, namely that:
‘you can be supported by a huge crowd for a very short time, or by few for a long time, but you can’t have a huge crowd all of the time (unless data collection is passive)’ (original post here)
‘All information sources are heterogeneous, but some are more honest about it than others’ (original post here)
The third assertion is about the pattern of participation. It is one that I’ve mentioned before and in some ways it is a corollary of the two assertions above.
‘When looking at crowdsourced information, always keep participation inequality in mind’
Because crowdsourced information, either Volunteered Geographic Information or Citizen Science, is created through a socio-technical process, all too often it is easy to forget the social side – especially when you are looking at the information without the metadata of who collected it and when. So when working with OpenStreetMap data, or viewing the distribution of bird species in eBird (below), even though the data source is expected to be heterogeneous, each observation is treated as similar to other observations and assumed to be produced in a similar way.
Yet, data is not only heterogeneous in terms of consistency and coverage, it is also highly heterogeneous in terms of contribution. One of the most persistent findings from studies of various systems – for example in Wikipedia, OpenStreetMap and even in volunteer computing – is that there is a very distinctive heterogeneity in contribution. The phenomenon was termed ‘Participation Inequality’ by Jakob Nielsen in 2006 and it is summarised succinctly in the diagram below (from Visual Liberation blog) – a very small number of contributors add most of the content, while most of the people that are involved in using the information will not contribute at all. Even when examining only those that actually contribute, in some projects over 70% contribute only once, with a tiny minority contributing most of the information.
Therefore, when looking at sources of information that were created through such a process, it is critical to remember the nature of contribution. This has far-reaching implications for quality, as it is dependent on the expertise of the heavy contributors, on their spatial and temporal engagement, and even on their social interaction and practices (e.g. abrasive behaviour towards other participants).
Because of these factors, it is critical to remember the impact and implications of participation inequality on the analysis of the information. There will be some analyses on which it will have less impact and some where it will have a major one. In either case, it needs to be taken into account.
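One simple way to keep participation inequality in mind is to measure it before analysing a dataset. The following is only a minimal sketch in Python, assuming that each contribution carries some contributor identifier; the function name, the 10% threshold and the small edit log are all hypothetical and made up for illustration, not taken from any real project.

```python
from collections import Counter
import math


def participation_inequality(contributor_ids, top_fraction=0.1):
    """Summarise how unevenly contributions are spread across contributors.

    contributor_ids: one entry per contribution (e.g. one per edit or
    observation), holding the identifier of whoever made it.
    top_fraction: share of contributors treated as 'heavy' contributors
    (an arbitrary choice made for this illustration).
    """
    counts = Counter(contributor_ids)
    if not counts:
        raise ValueError("no contributions supplied")

    # Contributions per contributor, heaviest contributors first.
    per_contributor = sorted(counts.values(), reverse=True)
    n_contributors = len(per_contributor)
    total = sum(per_contributor)

    # Contributors who appear exactly once in the log.
    one_time = sum(1 for c in per_contributor if c == 1)

    # Size of the 'heavy contributor' group, at least one person.
    top_n = max(1, math.ceil(top_fraction * n_contributors))

    return {
        "contributors": n_contributors,
        "contributions": total,
        "one_time_share": one_time / n_contributors,
        "top_share": sum(per_contributor[:top_n]) / total,
    }


# Hypothetical edit log: two heavy contributors, one occasional
# contributor and seven people who contributed only once.
edits = ["a"] * 50 + ["b"] * 20 + ["c"] * 3 + list("defghij")
summary = participation_inequality(edits)
print(summary)
```

In this made-up log, seven of the ten contributors appear only once, while a single contributor accounts for 50 of the 80 contributions. Real data would also need the spatial and temporal side of contribution to be examined, which this sketch ignores.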
Following the last post, which focused on an assertion about crowdsourced geographic information and citizen science, I continue with another observation. As was noted in the previous post, these can be treated as ‘laws’ as they seem to emerge as common patterns from multiple projects in different areas of activity – from citizen science to crowdsourced geographic information. The first assertion was about the relationship between the number of volunteers who can participate in an activity and the amount of time and effort that they are expected to contribute.
This time, I look at one aspect of data quality, which is about consistency and coverage. Here the following assertion applies:
‘All information sources are heterogeneous, but some are more honest about it than others’
What I have in mind is the on-going argument about authoritative and crowdsourced information sources (Flanagin and Metzger 2008 frequently comes up in this context), which was also at the root of the Wikipedia vs. Britannica debate, and the mistrust of citizen science observations and the constant questioning of whether citizen scientists can do ‘real research’.
There are many aspects to these concerns, so the assertion deals with the aspects of comprehensiveness and consistency, which are used as a reason to dismiss crowdsourced information when comparing it to authoritative data. However, at a closer look we can see that all these information sources are fundamentally heterogeneous. Despite all the effort to define precise standards for data collection in authoritative data, heterogeneity creeps in because of budget and time limitations, decisions about what is worth collecting and how, and the clash between reality and the specifications. Here are two examples:
Take one of the Ordnance Survey Open Data sources – the maps present themselves as consistent and covering the whole country in an orderly way. However, dig into the details of the mapping, and you discover that the Ordnance Survey uses different standards for mapping urban, rural and remote areas. Yet, the derived products that are generalised and manipulated in various ways, such as Meridian or Vector Map District, do not provide a clear indication of which parts originated from which scale – so the heterogeneity of the source disappears in the final product.
The census is also heterogeneous, and it is a good case of specifications vs. reality. Not everyone fills in the forms and even with the best effort of enumerators it is impossible to collect all the data, and therefore statistical analysis and manipulation of the results are required to produce a well reasoned assessment of the population. This is expected, even though it is not always understood.
Therefore, even the best information sources that we accept as authoritative are heterogeneous, but as I’ve stated, they are just not completely honest about it. The ONS doesn’t release the full original set of data before all the manipulations, nor completely disclose all the assumptions that went into reaching the final value. The Ordnance Survey doesn’t tag every line with metadata about the date of collection and scale.
Somewhat counter-intuitively, exactly because crowdsourced information is expected to be inconsistent, we approach it as such and ask questions about its fitness for use. So in that way it is more honest about the inherent heterogeneity.
Importantly, the assertion should not be taken to be dismissive of authoritative sources, or to ignore that the heterogeneity within crowdsourced information sources is likely to be much higher than in authoritative ones. Of course all the investment in making things consistent and the effort to get universal coverage is indeed worth it, and it would be foolish and counterproductive to consider that such sources of information can be replaced, as is suggested for the census, or that it’s not worth investing in the Ordnance Survey to update the authoritative data sets.
Moreover, when commercial interests meet crowdsourced geographic information or citizen science, the ‘honesty’ disappears. For example, even though we know that Google Map Maker is now used in many parts of the world (see the figure), even in cases when access to vector data is provided by Google, you cannot find out who contributed, when and where. It is also presented as an authoritative source of information.
Despite the risk of misinterpretation, the assertion can be useful as a reminder that the differences between authoritative and crowdsourced information are not as big as they may seem.
Looking across the range of crowdsourced geographic information activities, some regular patterns are emerging and it might be useful to start noticing them as a way to think about what is possible or not possible to do in this area. Since I don’t like the concept of ‘laws’ – as in Tobler’s first law of geography which is stated as ‘Everything is related to everything else, but near things are more related than distant things.’ – I would call them assertions. There is also something nice about using the word ‘assertion’ in the context of crowdsourced geographic information, as it echoes Mike Goodchild’s differentiation between asserted and authoritative information. So not laws, just assertions or even observations.
The first one rephrases a famous quote:
‘you can be supported by a huge crowd for a very short time, or by few for a long time, but you can’t have a huge crowd all of the time (unless data collection is passive)’
So the Christmas Bird Count can have tens of thousands of participants for a short time, while the number of people who operate weather observation stations will be much smaller. The same is true for OpenStreetMap – for crisis mapping, which is a short term task, you can get many contributors but for the regular updating of an area under usual conditions, there will be only a few.
The exception to the assertion is the case of passive data collection, where information is collected automatically through the logging of information from a sensor – for example the recording of GPS tracks to improve navigation information.
20 October, 2012
The Spatial Data Infrastructure Magazine (SDIMag.com) is a relatively new e-zine dedicated to the development of spatial data infrastructures around the world. Roger Longhorn, the editor of the magazine, conducted an email interview with me, which is now published.
In the interview, we cover the problematic terminology used to describe a wide range of activities; the need to consider social and technical aspects as well as goals of the participants; and, of course, the role of the information that is produced through crowdsourcing, citizen science and VGI within spatial data infrastructures.
The previous post focused on citizen science as participatory science. This post discusses the meaning of this differentiation. It is the final part of the chapter that will appear in the book:
The typology of participation can be used across the range of citizen science activities, and one project should not be classified only in one category. For example, in volunteer computing projects most of the participants will be at the bottom level, while participants that become committed to the project might move to the second level and assist other volunteers when they encounter technical problems. Highly committed participants might move to a higher level and communicate with the scientist who coordinates the project to discuss the results of the analysis and suggest new research directions.
This typology exposes how citizen science integrates and challenges the way in which science discovers and produces knowledge. Questions about the way in which knowledge is produced and truths are discovered are part of the epistemology of science. As noted above, throughout the 20th century, as science became more specialised, it also became professionalised. While certain people were employed as scientists in government, industry and research institutes, the rest of the population – even if they graduated from a top university with top marks in a scientific discipline – were not regarded as scientists or as participants in the scientific endeavour unless they were employed professionally to do so. In rare cases, and following the tradition of ‘gentlemen/women scientists’, wealthy individuals could participate in this work by becoming an ‘honorary fellow’ or affiliated to a research institute that, inherently, brought them into the fold. This separation of ‘scientists’ and ‘public’ was justified by the need to access specialist equipment, knowledge and other privileges such as a well-stocked library. It might be the case that the need to maintain this separation is a third reason that practising scientists shy away from explicitly mentioning the contribution of citizen scientists to their work in addition to those identified by Silvertown (2009).
However, similarly to other knowledge professionals who operate in the public sphere, such as medical experts or journalists, scientists need to adjust to a new environment that is fostered by the Web. Recent changes in communication technologies, combined with the increased availability of open access information and the factors that were noted above, mean that processes of knowledge production and dissemination are opening up in many areas of social and cultural activities (Shirky 2008). Therefore, some of the elitist aspects of scientific practice are being challenged by citizen science, such as the notion that only dedicated, full-time researchers can produce scientific knowledge. For example, surely it should be professional scientists who can solve complex scientific problems such as long-standing protein-structure prediction of viruses. Yet, this exact problem was recently solved through a collaboration of scientists working with amateurs who were playing the computer game Foldit (Khatib et al. 2011). Another aspect of the elitist view of science can be witnessed in interaction between scientists and the public, where the assumption is of unidirectional ‘transfer of knowledge’ from the expert to lay people. Of course, as in the other areas mentioned above, it is a grave mistake to argue that experts are unnecessary and can be replaced by amateurs, as Keen (2007) eloquently argued. Nor is it suggested that, because of citizen science, the need for professionalised science will diminish, as, in citizen science projects, the participants accept the difference in knowledge and expertise of the scientists who are involved in these projects. At the same time, the scientists need to develop respect towards those who help them beyond the realisation that they provide free labour, which was noted above.
Given this tension, the participation hierarchy can be seen to be moving from a ‘business as usual’ scientific epistemology at the bottom, to a more egalitarian approach to scientific knowledge production at the top. The bottom level, where the participants are contributing resources without cognitive engagement, keeps the hierarchical division of scientists and the public. The public is volunteering its time or resources to help scientists while the scientists explain the work that is to be done but without expectation that any participant will contribute intellectually to the project. Arguably, even at this level, the scientists will be challenged by questions and suggestions from the participants and, if they do not respond to them in a sensitive manner, they will risk alienating participants. Intermediaries such as the IBM World Community Grid, where a dedicated team is in touch with scientists who want to run projects and a community of volunteered computing providers, are cases of ‘outsourcing’ the community management and thus allowing, to an extent, the maintenance of the separation of scientists and the public.
As we move up the ladder to a higher level of participation, the need for direct engagement between the scientist and the public increases. At the highest level, the participants are assumed to be on equal footing with the scientists in terms of scientific knowledge production. This requires a different epistemological understanding of the process, in which it is accepted that the production of scientific insights is open to any participant while maintaining scientific standards and practices such as systematic observations or rigorous statistical analysis to verify that the results are significant. The belief that, given suitable tools, many lay people are capable of such endeavours is challenging to some scientists who view their skills as unique. As the case of the computer game that helped in the discovery of new protein formations (Khatib et al. 2011) demonstrated, such collaboration can be fruitful even in cutting-edge areas of science. However, it can be expected that the more mundane and applied areas of science will lend themselves more easily to the fuller sense of collaborative science in which participants and scientists identify problems and develop solutions together. This is because the level of knowledge required in cutting-edge areas of science is so demanding.
Another aspect in which the ‘extreme’ level challenges scientific culture is that it requires scientists to become citizen scientists in the sense that Irwin (1995), Wilsdon, Wynne and Stilgoe (2005) and Stilgoe (2009) advocated (Notice Stilgoe’s title: Citizen Scientists). In this interpretation of the phrase, the emphasis is not on the citizen as a scientist, but on the scientist as a citizen. It requires the scientists to engage with the social and ethical aspects of their work at a very deep level. Stilgoe (2009, p.7) suggested that, in some cases, it will not be possible to draw the line between the professional scientific activities, the responsibilities towards society and a fuller consideration of how a scientific project integrates with wider ethical and societal concerns. However, as all these authors noted, this way of conceptualising and practising science is not widely accepted in the current culture of science.
Therefore, we can conclude that this form of participatory and collaborative science will be challenging in many areas of science. This will not be because of technical or intellectual difficulties, but mostly because of the cultural aspects. This might end up being the most important outcome of citizen science as a whole, as it might eventually catalyse the education of scientists to engage more fully with society.
27 November, 2011
This post continues the theme of the previous one, and is also based on the chapter that will appear next year in the book:
The post focuses on the participatory aspect of different Citizen Science modes:
Against the technical, social and cultural aspects of citizen science, we offer a framework that classifies the level of participation and engagement of participants in citizen science activity. While there is some similarity between Arnstein’s (1969) ‘ladder of participation’ and this framework, there is also a significant difference. The main thrust in creating a spectrum of participation is to highlight the power relationships that exist within social processes such as urban planning or in participatory GIS use in decision making (Sieber 2006). In citizen science, the relationship exists in the form of the gap between professional scientists and the wider public. This is especially true in environmental decision making where there are major gaps between the public’s and the scientists’ perceptions of each other (Irwin 1995).
In the case of citizen science, the relationships are more complex, as many of the participants respect and appreciate the knowledge of the professional scientists who are leading the project and can explain how a specific piece of work fits within the wider scientific body of work. At the same time, as volunteers build their own knowledge through engagement in the project, using the resources that are available on the Web and through the specific project to improve their own understanding, they are more likely to suggest questions and move up the ladder of participation. In some cases, the participants would want to volunteer in a passive way, as is the case with volunteered computing, without full understanding of the project as a way to engage and contribute to a scientific study. An example of this is the many thousands of people who volunteered to the Climateprediction.net project, where their computers were used to run global climate models. Many would like to feel that they are engaged in one of the major scientific issues of the day, but would not necessarily want to fully understand the science behind it.
Therefore, unlike Arnstein’s ladder, there shouldn’t be a strong value judgement on the position that a specific project takes. At the same time, there are likely benefits in terms of participants’ engagement and involvement in the project to try to move to the highest level that is suitable for the specific project. Thus, we should see this framework as a typology that focuses on the level of participation.
At the most basic level, participation is limited to the provision of resources, and the cognitive engagement is minimal. Volunteered computing relies on many participants that are engaged at this level and, following Howe (2006), this can be termed ‘crowdsourcing’. In participatory sensing, the implementation of a similar level of engagement will have participants asked to carry sensors around and bring them back to the experiment organiser. The advantage of this approach, from the perspective of scientific framing, is that, as long as the characteristics of the instrumentation are known (e.g. the accuracy of a GPS receiver), the experiment is controlled to some extent, and some assumptions about the quality of the information can be used. At the same time, running projects at the crowdsourcing level means that, despite the willingness of the participants to engage with a scientific project, their most valuable input – their cognitive ability – is wasted.
The second level is ‘distributed intelligence’ in which the cognitive ability of the participants is the resource that is being used. Galaxy Zoo and many of the ‘classic’ citizen science projects are working at this level. The participants are asked to take some basic training, and then collect data or carry out a simple interpretation activity. Usually, the training activity includes a test that provides the scientists with an indication of the quality of the work that the participant can carry out. With this type of engagement, there is a need to be aware of questions that volunteers will raise while working on the project and how to support their learning beyond the initial training.
The next level, which is especially relevant in ‘community science’ is a level of participation in which the problem definition is set by the participants and, in consultation with scientists and experts, a data collection method is devised. The participants are then engaged in data collection, but require the assistance of the experts in analysing and interpreting the results. This method is common in environmental justice cases, and goes towards Irwin’s (1995) call to have science that matches the needs of citizens. However, participatory science can occur in other types of projects and activities – especially when considering the volunteers who become experts in the data collection and analysis through their engagement. In such cases, the participants can suggest new research questions that can be explored with the data they have collected. The participants are not involved in detailed analysis of the results of their effort – perhaps because of the level of knowledge that is required to infer scientific conclusions from the data.
Finally, collaborative science is a completely integrated activity, as it is in parts of astronomy where professional and non-professional scientists are involved in deciding on which scientific problems to work and the nature of the data collection so it is valid and answers the needs of scientific protocols while matching the motivations and interests of the participants. The participants can choose their level of engagement and can be potentially involved in the analysis and publication or utilisation of results. This form of citizen science can be termed ‘extreme citizen science’ and requires the scientists to act as facilitators, in addition to their role as experts. This mode of science also opens the possibility of citizen science without professional scientists, in which the whole process is carried out by the participants to achieve a specific goal.
This typology of participation can be used across the range of citizen science activities, and one project should not be classified only in one category. For example, in volunteer computing projects most of the participants will be at the bottom level, while participants that become committed to the project might move to the second level and assist other volunteers when they encounter technical problems. Highly committed participants might move to a higher level and communicate with the scientist who coordinates the project to discuss the results of the analysis and suggest new research directions.
20 July, 2011
As part of the Volunteered Geographic Information (VGI) workshop that was held in Seattle in April 2011, Daniel Sui, Sarah Elwood and Mike Goodchild announced that they would be editing a volume dedicated to the topic, published as ‘Crowdsourcing Geographic Knowledge’ (here is a link to the chapter in Crowdsourcing Geographic Knowledge).
My contribution to this volume focuses on citizen science, and shows the links between it and VGI. The chapter is currently under review, but the following excerpt discusses different types of citizen science activities, and I would welcome comments:
“While the aim here is not to provide a precise definition of citizen science, a definition and clarification of its core characteristics is unavoidable. Therefore, it is defined as scientific activities in which non-professional scientists volunteer to participate in data collection, analysis and dissemination of a scientific project (Cohn 2008; Silvertown 2009). People who participate in a scientific study without playing some part in the study itself – for example, volunteering in a medical trial or participating in a social science survey – are not included in this definition.
While it is easy to identify a citizen science project when the aim of the project is the collection of scientific information, as in the recording of the distribution of plant species, there are cases where the definition is less clear-cut. For example, the process of data collection in OpenStreetMap or Google Map Maker is mostly focused on recording verifiable facts about the world that can be observed on the ground. The tools that OpenStreetMap mappers use – such as remotely sensed images, GPS receivers and map editing software – can all be considered scientific tools. With their attempt to locate observed objects and record them on a map accurately, they follow in the footsteps of surveyors such as Robert Hooke, who also carried out an extensive survey of London using scientific methods – although, unlike OpenStreetMap volunteers, he was paid for his effort. Finally, cases where facts are collected in a participatory mapping activity, such as the one that Ghose (2001) describes, should probably be considered citizen science only if the participants decided to frame it as such. For the purpose of the discussion here, such a broad definition is more useful than a limiting one that tries to reject certain activities.
Notice also that, by definition, citizen science can only exist in a world in which science is socially constructed as the preserve of professional scientists in academic institutions and industry, because, otherwise, any person who is involved in a scientific project would simply be considered a contributor and potentially a scientist. As Silvertown (2009) noted, until the late 19th century, science was mainly developed by people who had additional sources of employment that allowed them to spend time on data collection and analysis. Famously, Charles Darwin joined the Beagle voyage, not as a professional naturalist but as a companion to Captain FitzRoy. Thus, in that era, almost all science was citizen science albeit mostly by affluent gentlemen scientists and gentlewomen. While the first professional scientist is likely to be Robert Hooke, who was paid to work on scientific studies in the 17th century, the major growth in the professionalisation of scientists was mostly in the latter part of the 19th and throughout the 20th centuries.
Even with the rise of the professional scientist, the role of volunteers has not disappeared, especially in areas such as archaeology, where it is common for enthusiasts to join excavations, or in natural science and ecology, where they collect and send samples and observations to national repositories. These activities include the Christmas Bird Count that has been ongoing since 1900 and the British Trust for Ornithology Survey, which has collected over 31 million records since its establishment in 1932 (Silvertown 2009). Astronomy is another area where amateurs and volunteers have been on par with professionals when observation of the night sky and the identification of galaxies, comets and asteroids are considered (BBC 2006). Finally, meteorological observations have also relied on volunteers since the early start of systematic measurements of temperature, precipitation or extreme weather events (WMO 2001).
These activities form the first, ‘classic’ type of citizen science – the ‘persistence’ parts of science where the resources, geographical spread and nature of the problem mean that volunteer involvement sometimes predates the professionalisation and mechanisation of science. These research areas usually require a large but sparse network of observers who carry out their work as part of a hobby or leisure activity. This type of citizen science has flourished in specific enclaves of scientific practice, and the progressive development of modern communication tools has made the process of collating the results from the participants easier and cheaper, while inherently keeping many of the characteristics of data collection processes close to their origins.
A second set of citizen science activities arises in environmental management and, more specifically, within the context of environmental justice campaigns. Modern environmental management includes strong technocratic and science-oriented management practices (Bryant & Wilson 1998; Scott & Barnett 2009) and environmental decision making is heavily based on scientific environmental information. As a result, when an environmental conflict emerges – such as a community protest over a local noisy factory or planned expansion of an airport – valid evidence needs to be based on scientific data collection. This aspect of environmental justice struggles encourages communities to carry out ‘community science’ in which scientific measurements and analysis are carried out by members of local communities so they can develop an evidence base and set out action plans to deal with problems in their area. A successful example of such an approach is the ‘Global Community Monitor’ method to allow communities to deal with air pollution issues (Scott & Barnett 2009). This is performed through a simple method of sampling air using plastic buckets followed by analysis in an air pollution laboratory, and, finally, the community being provided with instructions on how to understand the results. This activity is termed ‘Bucket Brigade’ and was used across the world in environmental justice campaigns. In London, community science was used to collect noise readings in two communities that are impacted by airport and industrial activities. The outputs were effective in bringing environmental problems to the policy arena (Haklay, Francis & Whitaker 2008). As in ‘classic’ citizen science, the growth in electronic communication has enabled communities to identify potential methods – e.g. through the ‘Global Community Monitor’ website – as well as find international standards, regulations and scientific papers that can be used together with the local evidence.
However, the emergence of the Internet and the Web as a global infrastructure has enabled a new incarnation of citizen science. The realisation by scientists that the public can provide free labour, skills, computing power and even funding, together with the growing demands from research funders for public engagement, has contributed to the motivation of scientists to develop and launch new and innovative projects (Silvertown 2009; Cohn 2008). These projects utilise the abilities of personal computers, GPS receivers and mobile phones to double as scientific instruments.
This third type of citizen science has been termed ‘citizen cyberscience’ by Francois Grey (2009). Within it, it is possible to identify three sub-categories: volunteered computing, volunteered thinking and participatory sensing.
Volunteered computing was first developed in 1999, with the foundation of SETI@home (Anderson et al. 2002), which was designed to distribute the analysis of data that was collected from a radio telescope in the search for extra-terrestrial intelligence. The project utilises the unused processing capacity that exists in personal computers, and uses the Internet to send and receive ‘work packages’ that are analysed automatically and sent back to the main server. Over 3.83 million downloads were registered on the project’s website by July 2002. The system on which SETI@home is based, the Berkeley Open Infrastructure for Network Computing (BOINC), is now used for over 100 projects, covering Physics, processing data from the Large Hadron Collider through LHC@home; Climate Science with the running of climate models in Climateprediction.net; and Biology in which the shape of proteins is calculated in Rosetta@home.
While volunteered computing requires very little from the participants, apart from installing software on their computers, in volunteered thinking the volunteers are engaged at a more active and cognitive level (Grey 2009). In these projects, the participants are asked to use a website in which information or an image is presented to them. When they register onto the system, they are trained in the task of classifying the information. After the training, they are exposed to information that has not been analysed, and are asked to carry out classification work. Stardust@home (Westphal et al. 2006), in which volunteers were asked to use a virtual microscope to try to identify traces of interstellar dust, was one of the first projects in this area, together with the NASA ClickWorkers that focused on the classification of craters on Mars. Galaxy Zoo (Lintott et al. 2008), a project in which volunteers classify galaxies, is now one of the most developed ones, with over 100,000 participants and with a range of applications that are included in the wider Zooniverse set of projects (see http://www.zooniverse.org/).
Participatory sensing is the final and most recent type of citizen science activity. Here, the capabilities of mobile phones are used to sense the environment. Some mobile phones have up to nine sensors integrated into them, including different transceivers (mobile network, WiFi, Bluetooth), FM and GPS receivers, camera, accelerometer, digital compass and microphone. In addition, they can link to external sensors. These capabilities are increasingly used in citizen science projects, such as Mappiness in which participants are asked to provide behavioural information (feeling of happiness) while the phone records their location to allow the linkage of different locations to wellbeing (MacKerron 2011). Other activities include the sensing of air-quality (Cuff 2007) or noise levels (Maisonneuve et al. 2010) by using the mobile phone’s location and the readings from the microphone.”