Spatial Data Infrastructures, Crowdsourcing and VGI
20 October, 2012
The Spatial Data Infrastructure Magazine (SDIMag.com) is a relatively new e-zine dedicated to the development of spatial data infrastructures around the world. Roger Longhorn, the editor of the magazine, conducted an email interview with me, which is now published.
In the interview, we cover the problematic terminology used to describe a wide range of activities; the need to consider social and technical aspects as well as the goals of the participants; and, of course, the role within spatial data infrastructures of the information that is produced through crowdsourcing, citizen science and VGI.
The full interview can be found here.
The previous post focused on citizen science as participatory science. This post discusses the meaning of this differentiation. It is the final part of the chapter that will appear next year in the book:
The typology of participation can be used across the range of citizen science activities, and one project should not be classified only in one category. For example, in volunteer computing projects most of the participants will be at the bottom level, while participants that become committed to the project might move to the second level and assist other volunteers when they encounter technical problems. Highly committed participants might move to a higher level and communicate with the scientist who coordinates the project to discuss the results of the analysis and suggest new research directions.
This typology exposes how citizen science integrates and challenges the way in which science discovers and produces knowledge. Questions about the way in which knowledge is produced and truths are discovered are part of the epistemology of science. As noted above, throughout the 20th century, as science became more specialised, it also became professionalised. While certain people were employed as scientists in government, industry and research institutes, the rest of the population – even if they graduated from a top university with top marks in a scientific discipline – were not regarded as scientists or as participants in the scientific endeavour unless they were employed professionally to do so. In rare cases, and following the tradition of ‘gentlemen/women scientists’, wealthy individuals could participate in this work by becoming an ‘honorary fellow’ or affiliated to a research institute that, inherently, brought them into the fold. This separation of ‘scientists’ and ‘public’ was justified by the need to access specialist equipment, knowledge and other privileges such as a well-stocked library. It might be the case that the need to maintain this separation is a third reason that practising scientists shy away from explicitly mentioning the contribution of citizen scientists to their work in addition to those identified by Silvertown (2009).
However, similarly to other knowledge professionals who operate in the public sphere, such as medical experts or journalists, scientists need to adjust to a new environment that is fostered by the Web. Recent changes in communication technologies, combined with the increased availability of open access information and the factors that were noted above, mean that processes of knowledge production and dissemination are opening up in many areas of social and cultural activities (Shirky 2008). Therefore, some of the elitist aspects of scientific practice are being challenged by citizen science, such as the notion that only dedicated, full-time researchers can produce scientific knowledge. For example, surely it should be professional scientists who can solve complex scientific problems such as long-standing protein-structure prediction of viruses. Yet, this exact problem was recently solved through a collaboration of scientists working with amateurs who were playing the computer game Foldit (Khatib et al. 2011). Another aspect of the elitist view of science can be witnessed in interaction between scientists and the public, where the assumption is of unidirectional ‘transfer of knowledge’ from the expert to lay people. Of course, as in the other areas mentioned above, it is a grave mistake to argue that experts are unnecessary and can be replaced by amateurs, as Keen (2007) eloquently argued. Nor is it suggested that, because of citizen science, the need for professionalised science will diminish, as, in citizen science projects, the participants accept the difference in knowledge and expertise of the scientists who are involved in these projects. At the same time, the scientists need to develop respect towards those who help them beyond the realisation that they provide free labour, which was noted above.
Given this tension, the participation hierarchy can be seen to be moving from a ‘business as usual’ scientific epistemology at the bottom, to a more egalitarian approach to scientific knowledge production at the top. The bottom level, where the participants are contributing resources without cognitive engagement, keeps the hierarchical division of scientists and the public. The public is volunteering its time or resources to help scientists while the scientists explain the work that is to be done but without expectation that any participant will contribute intellectually to the project. Arguably, even at this level, the scientists will be challenged by questions and suggestions from the participants and, if they do not respond to them in a sensitive manner, they will risk alienating participants. Intermediaries such as the IBM World Community Grid, where a dedicated team is in touch with scientists who want to run projects and a community of volunteered computing providers, are cases of ‘outsourcing’ the community management and thus allowing, to an extent, the maintenance of the separation of scientists and the public.
As we move up the ladder to a higher level of participation, the need for direct engagement between the scientist and the public increases. At the highest level, the participants are assumed to be on equal footing with the scientists in terms of scientific knowledge production. This requires a different epistemological understanding of the process, in which it is accepted that the production of scientific insights is open to any participant while maintaining scientific standards and practices such as systematic observations or rigorous statistical analysis to verify that the results are significant. The belief that, given suitable tools, many lay people are capable of such endeavours is challenging to some scientists who view their skills as unique. As the case of the computer game that helped in the discovery of new protein formations (Khatib et al. 2011) demonstrated, such collaboration can be fruitful even in cutting-edge areas of science. However, it can be expected that the more mundane and applied areas of science will lend themselves more easily to the fuller sense of collaborative science in which participants and scientists identify problems and develop solutions together. This is because the level of knowledge required in cutting-edge areas of science is so demanding.
Another aspect in which the ‘extreme’ level challenges scientific culture is that it requires scientists to become citizen scientists in the sense that Irwin (1995), Wilsdon, Wynne and Stilgoe (2005) and Stilgoe (2009) advocated (Notice Stilgoe’s title: Citizen Scientists). In this interpretation of the phrase, the emphasis is not on the citizen as a scientist, but on the scientist as a citizen. It requires the scientists to engage with the social and ethical aspects of their work at a very deep level. Stilgoe (2009, p.7) suggested that, in some cases, it will not be possible to draw the line between the professional scientific activities, the responsibilities towards society and a fuller consideration of how a scientific project integrates with wider ethical and societal concerns. However, as all these authors noted, this way of conceptualising and practising science is not widely accepted in the current culture of science.
Therefore, we can conclude that this form of participatory and collaborative science will be challenging in many areas of science. This will not be because of technical or intellectual difficulties, but mostly because of the cultural aspects. This might end up being the most important outcome of citizen science as a whole, as it might eventually catalyse the education of scientists to engage more fully with society.
Citizen Science as Participatory Science
27 November, 2011
This post continues the theme of the previous one, and is also based on the chapter that will appear next year in the book:
The post focuses on the participatory aspect of different Citizen Science modes:
Against the technical, social and cultural aspects of citizen science, we offer a framework that classifies the level of participation and engagement of participants in citizen science activity. While there is some similarity between Arnstein’s (1969) ‘ladder of participation’ and this framework, there is also a significant difference. The main thrust in creating a spectrum of participation is to highlight the power relationships that exist within social processes such as urban planning or in participatory GIS use in decision making (Sieber 2006). In citizen science, the relationship exists in the form of the gap between professional scientists and the wider public. This is especially true in environmental decision making where there are major gaps between the public’s and the scientists’ perceptions of each other (Irwin 1995).
In the case of citizen science, the relationships are more complex, as many of the participants respect and appreciate the knowledge of the professional scientists who are leading the project and can explain how a specific piece of work fits within the wider scientific body of work. At the same time, as volunteers build their own knowledge through engagement in the project, using the resources that are available on the Web and through the specific project to improve their own understanding, they are more likely to suggest questions and move up the ladder of participation. In some cases, the participants would want to volunteer in a passive way, as is the case with volunteered computing, without full understanding of the project as a way to engage and contribute to a scientific study. An example of this is the many thousands of people who volunteered to the Climateprediction.net project, where their computers were used to run global climate models. Many would like to feel that they are engaged in one of the major scientific issues of the day, but would not necessarily want to fully understand the science behind it.
Therefore, unlike Arnstein’s ladder, there shouldn’t be a strong value judgement on the position that a specific project takes. At the same time, there are likely benefits in terms of participants’ engagement and involvement in the project to try to move to the highest level that is suitable for the specific project. Thus, we should see this framework as a typology that focuses on the level of participation.
At the most basic level, participation is limited to the provision of resources, and the cognitive engagement is minimal. Volunteered computing relies on many participants that are engaged at this level and, following Howe (2006), this can be termed ‘crowdsourcing’. In participatory sensing, the implementation of a similar level of engagement will have participants asked to carry sensors around and bring them back to the experiment organiser. The advantage of this approach, from the perspective of scientific framing, is that, as long as the characteristics of the instrumentation are known (e.g. the accuracy of a GPS receiver), the experiment is controlled to some extent, and some assumptions about the quality of the information can be used. At the same time, running projects at the crowdsourcing level means that, despite the willingness of the participants to engage with a scientific project, their most valuable input – their cognitive ability – is wasted.
The second level is ‘distributed intelligence’ in which the cognitive ability of the participants is the resource that is being used. Galaxy Zoo and many of the ‘classic’ citizen science projects are working at this level. The participants are asked to take some basic training, and then collect data or carry out a simple interpretation activity. Usually, the training activity includes a test that provides the scientists with an indication of the quality of the work that the participant can carry out. With this type of engagement, there is a need to be aware of questions that volunteers will raise while working on the project and how to support their learning beyond the initial training.
The next level, which is especially relevant in ‘community science’, is a level of participation in which the problem definition is set by the participants and, in consultation with scientists and experts, a data collection method is devised. The participants are then engaged in data collection, but require the assistance of the experts in analysing and interpreting the results. This method is common in environmental justice cases, and goes towards Irwin’s (1995) call to have science that matches the needs of citizens. However, participatory science can occur in other types of projects and activities – especially when considering the volunteers who become experts in the data collection and analysis through their engagement. In such cases, the participants can suggest new research questions that can be explored with the data they have collected. The participants are not involved in detailed analysis of the results of their effort – perhaps because of the level of knowledge that is required to infer scientific conclusions from the data.
Finally, collaborative science is a completely integrated activity, as it is in parts of astronomy where professional and non-professional scientists are involved in deciding on which scientific problems to work and the nature of the data collection so it is valid and answers the needs of scientific protocols while matching the motivations and interests of the participants. The participants can choose their level of engagement and can be potentially involved in the analysis and publication or utilisation of results. This form of citizen science can be termed ‘extreme citizen science’ and requires the scientists to act as facilitators, in addition to their role as experts. This mode of science also opens the possibility of citizen science without professional scientists, in which the whole process is carried out by the participants to achieve a specific goal.
This typology of participation can be used across the range of citizen science activities, and one project should not be classified only in one category. For example, in volunteer computing projects most of the participants will be at the bottom level, while participants that become committed to the project might move to the second level and assist other volunteers when they encounter technical problems. Highly committed participants might move to a higher level and communicate with the scientist who coordinates the project to discuss the results of the analysis and suggest new research directions.
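As a rough illustration of this point, the four levels can be modelled as an ordered scale, with each project holding a mix of participants rather than a single label. This is only a sketch: the class and function names are hypothetical, and the participant counts are invented for the example.

```python
from collections import Counter
from enum import IntEnum


class ParticipationLevel(IntEnum):
    """The four levels of the typology, ordered from least to most engaged."""
    CROWDSOURCING = 1             # participants provide resources only
    DISTRIBUTED_INTELLIGENCE = 2  # participants' cognitive ability is used
    PARTICIPATORY_SCIENCE = 3     # participants define the problem and collect data
    EXTREME_CITIZEN_SCIENCE = 4   # collaborative science, including analysis


def level_profile(participant_levels):
    """Count how many participants of one project sit at each level."""
    counts = Counter(participant_levels)
    return {level: counts.get(level, 0) for level in ParticipationLevel}


def highest_level(participant_levels):
    """Return the highest level reached by any participant in the project."""
    return max(participant_levels)


# Invented numbers for a volunteer computing project: most participants
# only donate computing time, a few help others, a handful talk to the scientist.
volunteers = (
    [ParticipationLevel.CROWDSOURCING] * 950
    + [ParticipationLevel.DISTRIBUTED_INTELLIGENCE] * 45
    + [ParticipationLevel.PARTICIPATORY_SCIENCE] * 5
)
profile = level_profile(volunteers)
```

Describing a project by its profile, rather than by one category, keeps the typology descriptive and avoids the value judgement that a single position on a ladder would imply.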
Classification of Citizen Science activities
20 July, 2011
As part of the Volunteered Geographic Information (VGI) workshop that was held in Seattle in April 2011, Daniel Sui, Sarah Elwood and Mike Goodchild announced that they will be editing a volume dedicated to the topic, published as ‘Crowdsourcing Geographic Knowledge’.
My contribution to this volume focuses on citizen science, and shows the links between it and VGI. The chapter is currently under review, but the following excerpt discusses different types of citizen science activities, and I would welcome comments:
“While the aim here is not to provide a precise definition of citizen science, a definition and clarification of its core characteristics is unavoidable. Therefore, it is defined as scientific activities in which non-professional scientists volunteer to participate in data collection, analysis and dissemination of a scientific project (Cohn 2008; Silvertown 2009). People who participate in a scientific study without playing some part in the study itself – for example, volunteering in a medical trial or participating in a social science survey – are not included in this definition.
While it is easy to identify a citizen science project when the aim of the project is the collection of scientific information, as in the recording of the distribution of plant species, there are cases where the definition is less clear-cut. For example, the process of data collection in OpenStreetMap or Google Map Maker is mostly focused on recording verifiable facts about the world that can be observed on the ground. The tools that OpenStreetMap mappers use – such as remotely sensed images, GPS receivers and map editing software – can all be considered scientific tools. With their attempt to locate observed objects and record them on a map accurately, they follow the footsteps of surveyors such as Robert Hooke, who also carried out an extensive survey of London using scientific methods – although, unlike OpenStreetMap volunteers, he was paid for his effort. Finally, cases where facts are collected in a participatory mapping activity, such as the one that Ghose (2001) describes, should probably be considered a citizen science only if the participants decided to frame it as such. For the purpose of the discussion here, such a broad definition is more useful than a limiting one that tries to reject certain activities.
Notice also that, by definition, citizen science can only exist in a world in which science is socially constructed as the preserve of professional scientists in academic institutions and industry, because, otherwise, any person who is involved in a scientific project would simply be considered a contributor and potentially a scientist. As Silvertown (2009) noted, until the late 19th century, science was mainly developed by people who had additional sources of employment that allowed them to spend time on data collection and analysis. Famously, Charles Darwin joined the Beagle voyage, not as a professional naturalist but as a companion to Captain FitzRoy. Thus, in that era, almost all science was citizen science albeit mostly by affluent gentlemen scientists and gentlewomen. While the first professional scientist is likely to be Robert Hooke, who was paid to work on scientific studies in the 17th century, the major growth in the professionalisation of scientists was mostly in the latter part of the 19th and throughout the 20th centuries.
Even with the rise of the professional scientist, the role of volunteers has not disappeared, especially in areas such as archaeology, where it is common for enthusiasts to join excavations, or in natural science and ecology, where they collect and send samples and observations to national repositories. These activities include the Christmas Bird Count that has been ongoing since 1900 and the British Trust for Ornithology Survey, which has collected over 31 million records since its establishment in 1932 (Silvertown 2009). Astronomy is another area where amateurs and volunteers have been on par with professionals when observation of the night sky and the identification of galaxies, comets and asteroids are considered (BBC 2006). Finally, meteorological observations have also relied on volunteers since the early start of systematic measurements of temperature, precipitation or extreme weather events (WMO 2001).
This type of citizen science provides the first type of ‘classic’ citizen science – the ‘persistence’ parts of science where the resources, geographical spread and the nature of the problem mean that volunteers sometimes predate the professionalisation and mechanisation of science. These research areas usually require a large but sparse network of observers who carry out their work as part of a hobby or leisure activity. This type of citizen science has flourished in specific enclaves of scientific practice, and the progressive development of modern communication tools has made the process of collating the results from the participants easier and cheaper, while inherently keeping many of the characteristics of data collection processes close to their origins.
A second set of citizen science activities is environmental management and, even more specifically, within the context of environmental justice campaigns. Modern environmental management includes strong technocratic and science-oriented management practices (Bryant & Wilson 1998; Scott & Barnett 2009) and environmental decision making is heavily based on scientific environmental information. As a result, when an environmental conflict emerges – such as a community protest over a local noisy factory or planned expansion of an airport – the valid evidence needs to be based on scientific data collection. This aspect of environmental justice struggle is encouraging communities to carry out ‘community science’ in which scientific measurements and analysis are carried out by members of local communities so they can develop an evidence base and set out action plans to deal with problems in their area. A successful example of such an approach is the ‘Global Community Monitor’ method to allow communities to deal with air pollution issues (Scott & Barnett 2009). This is performed through a simple method of sampling air using plastic buckets followed by analysis in an air pollution laboratory, and, finally, the community being provided with instructions on how to understand the results. This activity is termed ‘Bucket Brigade’ and was used across the world in environmental justice campaigns. In London, community science was used to collect noise readings in two communities that are impacted by airport and industrial activities. The outputs were effective in bringing environmental problems to the policy arena (Haklay, Francis & Whitaker 2008). As in ‘classic’ citizen science, the growth in electronic communication has enabled communities to identify potential methods – e.g. through the ‘Global Community Monitor’ website – as well as find international standards, regulations and scientific papers that can be used together with the local evidence.
However, the emergence of the Internet and the Web as a global infrastructure has enabled a new incarnation of citizen science. The realisation by scientists that the public can provide free labour, skills, computing power and even funding, together with the growing demands from research funders for public engagement, has contributed to the motivation of scientists to develop and launch new and innovative projects (Silvertown 2009; Cohn 2008). These projects utilise the abilities of personal computers, GPS receivers and mobile phones to double as scientific instruments.
This third type of citizen science has been termed ‘citizen cyberscience’ by Francois Grey (2009). Within it, it is possible to identify three sub-categories: volunteered computing, volunteered thinking and participatory sensing.
Volunteered computing was first developed in 1999, with the foundation of SETI@home (Anderson et al. 2002), which was designed to distribute the analysis of data that was collected from a radio telescope in the search for extra-terrestrial intelligence. The project utilises the unused processing capacity that exists in personal computers, and uses the Internet to send and receive ‘work packages’ that are analysed automatically and sent back to the main server. Over 3.83 million downloads were registered on the project’s website by July 2002. The system on which SETI@home is based, the Berkeley Open Infrastructure for Network Computing (BOINC), is now used for over 100 projects, covering Physics, processing data from the Large Hadron Collider through LHC@home; Climate Science with the running of climate models in Climateprediction.net; and Biology in which the shape of proteins is calculated in Rosetta@home.
While volunteered computing requires very little from the participants, apart from installing software on their computers, in volunteered thinking the volunteers are engaged at a more active and cognitive level (Grey 2009). In these projects, the participants are asked to use a website in which information or an image is presented to them. When they register onto the system, they are trained in the task of classifying the information. After the training, they are exposed to information that has not been analysed, and are asked to carry out classification work. Stardust@home (Westphal et al. 2006), in which volunteers were asked to use a virtual microscope to try to identify traces of interstellar dust, was one of the first projects in this area, together with the NASA ClickWorkers that focused on the classification of craters on Mars. Galaxy Zoo (Lintott et al. 2008), a project in which volunteers classify galaxies, is now one of the most developed ones, with over 100,000 participants and with a range of applications that are included in the wider Zooniverse set of projects (see http://www.zooniverse.org/).
Participatory sensing is the final and most recent type of citizen science activity. Here, the capabilities of mobile phones are used to sense the environment. Some mobile phones have up to nine sensors integrated into them, including different transceivers (mobile network, WiFi, Bluetooth), FM and GPS receivers, camera, accelerometer, digital compass and microphone. In addition, they can link to external sensors. These capabilities are increasingly used in citizen science projects, such as Mappiness in which participants are asked to provide behavioural information (feeling of happiness) while the phone records their location to allow the linkage of different locations to wellbeing (MacKerron 2011). Other activities include the sensing of air-quality (Cuff 2007) or noise levels (Maisonneuve et al. 2010) by using the mobile phone’s location and the readings from the microphone.”
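The ‘work package’ cycle that the excerpt describes for volunteered computing can be sketched in a few lines. This is a toy, single-process simulation and not the BOINC API: the names, the round-robin assignment and the ‘strongest sample’ analysis are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorkPackage:
    """A slice of the full dataset that one volunteer's computer analyses."""
    package_id: int
    samples: list


def split_into_packages(samples, package_size):
    """Server side: cut the dataset into fixed-size work packages."""
    return [
        WorkPackage(start // package_size, samples[start:start + package_size])
        for start in range(0, len(samples), package_size)
    ]


def analyse(package):
    """Volunteer side: stand-in for the automatic analysis (strongest sample)."""
    return package.package_id, max(package.samples)


def run_project(samples, package_size, volunteer_count):
    """Send packages out round-robin and collate the returned results."""
    results = {}
    for index, package in enumerate(split_into_packages(samples, package_size)):
        volunteer = index % volunteer_count  # hypothetical assignment rule
        package_id, peak = analyse(package)
        results[package_id] = {"volunteer": volunteer, "peak": peak}
    return results


# Invented signal data, split across three simulated volunteers.
signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
results = run_project(signal, package_size=4, volunteer_count=3)
```

The participant contributes nothing beyond the idle processor that runs `analyse`, which is why this mode sits at the lowest level of cognitive engagement.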
At the State of the Map (EU) 2011 conference that was held in Vienna from 15-17 July, I gave a keynote talk on the relationships between the OpenStreetMap (OSM) community and the GIScience research community. Of course, the relationships are especially important for those researchers who are working on Volunteered Geographic Information (VGI), due to the major role of OSM in this area of research.
The talk included an overview of what researchers have discovered about OpenStreetMap over the 5 years since we started to pay attention to OSM. One striking result is that the issue of positional accuracy does not require much more work by researchers. Another important outcome of the research is to understand that quality is impacted by the number of mappers, or that the data can be used with confidence for mainstream geographical applications when some conditions are met. These results are both useful, and of interest to a wide range of groups, but there remain key areas that require further research – for example, specific facets of quality, community characteristics and how the OSM data is used.
Reflecting on the body of research, we can start to form a ‘code of engagement’ for both academics and mappers who are engaged in researching or using OpenStreetMap. One such guideline would be that it is both prudent and productive for any researcher to do some mapping herself, and understand the process of creating OSM data, if the research is to be relevant and accurate. Other aspects of the proposed ‘code’ are covered in the presentation.
The talk is also available as a video from the TU Wien Matterhorn server.
GISRUK 2011 talk – Participatory GIS, Volunteered Geographic Information and Citizen Science
12 May, 2011
GIS Research UK (GISRUK) is a long running conference series, and the 2011 instalment was hosted by the University of Portsmouth at the end of April.
During the conference, I was asked to give a keynote talk about Participatory GIS. I decided to cover the background of Participatory GIS in the mid-1990s, and the transition to more advanced Web Mapping applications from the mid-2000s. Of special importance are the systems that allow user-generated content, and the geographical types of systems that are now leading to the generation of Volunteered Geographic Information (VGI).
The next part of the talk focused on Citizen Science, culminating with the ideas that are the basis for Extreme Citizen Science.
Interestingly, as in previous presentations, one of the common questions about Citizen Science came up. Professional scientists seem to have a problem with the suggestion that citizens are as capable as scientists in data collection and analysis. While the general concept is accepted, the idea that participants can suggest problems, collect data rigorously and analyse it seems to be too radical – or worrying.
What is important to understand is that the ideas of Extreme Citizen Science are not about replacing the role of scientists, but are a call to rethink the role of the participants and the scientists in cases where Citizen Science is used. It is a way to consider science as a collaborative process of learning and exploration of issues. My own experience is that participants have a lot of respect for the knowledge of the scientists, as long as the scientists have a lot of respect for the knowledge and ability of the participants. The participants would like to learn more about the topic that they are exploring and are keen to know: ‘what does the data that I collected mean?’ At the same time, some of the participants can become very serious in terms of data collection, reading about the specific issues and using the resources that are available online today to learn more. At some point, they become knowledgeable participants and it is worth seeing them as such.
The slides below were used for this talk, and include links to the relevant literature.
Following successful funding for the European Union FP7 EveryAware and the EPSRC Extreme Citizen Science activities, the department of Civil, Environmental and Geomatic Engineering at UCL is inviting applications for a postdoctoral position and 3 PhD studentships. Please note that these positions are open to students from any EU country.
These positions are in the ‘Extreme Citizen Science’ (ExCiteS) research group. The group’s activities focus on the theory, methodologies, techniques and tools that are needed to allow any community to start its own bottom-up citizen science activity, regardless of the level of literacy of the users. Importantly, Citizen Science is understood in the widest sense, including perceptions and views – so participatory mapping and participatory geographic information are integral parts of the activities.
The research themes that the group explores include Citizen Science and Citizen Cyberscience; Community and participatory mapping/GIS; Volunteered Geographic Information (OpenStreetMap, Green Mapping, Participatory GeoWeb); Usability of geographic information and geographic information technology, especially with non-expert users; GeoWeb and mobile GeoWeb technologies that facilitate Extreme Citizen Science; and identifying scientific models and visualisations that are suitable for Citizen Science.
The positions that are opening now are part of an effort to extend Dr Jerome Lewis’ research with forest communities (see BBC Report and report on software development):
Research Associate in Extreme Citizen Science – a 2-year, postdoctoral research associate position commencing 1 May 2011.
The research associate will lead the development of an ‘Intelligent Map’ that allows non-literate users to upload data securely and to visualise their information alongside data from other users. Permissions need to be developed in accordance with cultural sensitivities. As the volume of uploaded data from multiple users sharing the same system increases over time, repeating patterns will begin to emerge that indicate particular environmental trends.
The role will also include some general project-management duties, guiding the PhD students who are working on the project. Travel to Cameroon to the forest communities that we are working with is necessary.
Complete details about this post and the application procedure are available on the UCL jobs website.
PhD Studentship – understanding citizen scientists’ motivations, incentives and group organisation – a 3.5-year fully funded studentship. We are looking for applicants with a good honours degree (1st Class or 2:1 minimum), and an MA or MSc in anthropology, geography, sociology, psychology or a related discipline. The applicant needs to be familiar with quantitative and qualitative research methods, and be able to work with a team that will include programmers and human-computer interaction experts who will design systems to be used in citizen science projects. Travel will be required as part of the project. A willingness to live for short periods in remote forest locations in simple lodgings, eating local food, will be necessary. French language skills are desirable.
The research itself will focus on the motivations and incentives of participants in citizen science projects, and on understanding their needs and wishes. We will specifically focus on the engagement of non-literate people in such projects and need to understand how the process – from data collection to analysis – can be made meaningful and useful for their everyday lives. The research will involve using quantitative methods to analyse large-scale patterns of engagement in existing projects, as well as ethnographic and qualitative study of participants. The project will include working with non-literate forest communities in Cameroon as well as marginalised communities in London.
Complete details about this post and the application procedure are available on the UCL jobs website.
PhD Studentship in geographic visualisation for non-literate citizen scientists – a 3.5-year fully funded studentship. The applicant should possess a good honours degree (1st Class or 2:1 minimum), and an MSc in computer science, human-computer interaction, electronic engineering or a related discipline. In addition, they need to be familiar with geographic information and software development, and be able to work with a team that will include anthropologists and human-computer interaction experts who will design systems to be used in citizen science projects. Travel will be required as part of the project. A willingness to live for short periods in remote forest locations in simple lodgings, eating local food, will be necessary. French language skills are desirable.
Complete details about this post and the application procedure are available on the UCL jobs website.
In addition, we offer a PhD Studentship on How interaction design and mobile mapping influences participation in Citizen Science, which is part of the EveryAware project and is also open to any EU citizen.
How Many Volunteers Does It Take To Map An Area Well? The validity of Linus’ law to Volunteered Geographic Information
10 January, 2011
The paper “How Many Volunteers Does It Take To Map An Area Well? The validity of Linus’ law to Volunteered Geographic Information” has appeared in The Cartographic Journal. The proper citation for the paper is:
Haklay, M., Basiouka, S., Antoniou, V. and Ather, A. (2010) How Many Volunteers Does It Take To Map An Area Well? The validity of Linus’ law to Volunteered Geographic Information. The Cartographic Journal, 47(4), 315–322.
The abstract of the paper is as follows:
In the area of volunteered geographical information (VGI), the issue of spatial data quality is a clear challenge. The data that are contributed to VGI projects do not comply with standard spatial data quality assurance procedures, and the contributors operate without central coordination and strict data collection frameworks. However, similar to the area of open source software development, it is suggested that the data hold an intrinsic quality assurance measure through the analysis of the number of contributors who have worked on a given spatial unit. The assumption that as the number of contributors increases so does the quality is known as ‘Linus’ Law’ within the open source community. This paper describes three studies that were carried out to evaluate this hypothesis for VGI using the OpenStreetMap dataset, showing that this rule indeed applies in the case of positional accuracy.
To access the paper on the journal’s website, you can follow the link: 10.1179/000870410X12911304958827. However, if you don’t hold a subscription to the journal, a postprint of the paper is available at the UCL Discovery repository. If you would like to get hold of the printed version, email me.


