Following the last post, which focused on an assertion about crowdsourced geographic information and citizen science, I continue with another observation. As was noted in the previous post, these can be treated as ‘laws’ as they seem to emerge as common patterns from multiple projects in different areas of activity – from citizen science to crowdsourced geographic information. The first assertion was about the relationship between the number of volunteers who can participate in an activity and the amount of time and effort that they are expected to contribute.
This time, I look at one aspect of data quality, which is about consistency and coverage. Here the following assertion applies:
‘All information sources are heterogeneous, but some are more honest about it than others’
What I mean by that is the on-going argument about authoritative and crowdsourced information sources (Flanagin and Metzger 2008 frequently come up in this context), which was also at the root of the Wikipedia vs. Britannica debate, and the mistrust in citizen science observations and the constant questioning if they can do ‘real research’.
There are many aspects to these concerns, so the assertion deals with the aspects of comprehensiveness and consistency, which are used as a reason to dismiss crowdsourced information when comparing it to authoritative data. However, at a closer look we can see that all these information sources are fundamentally heterogeneous. Despite all the effort to define precise standards for data collection in authoritative data, heterogeneity creeps in because of budget and time limitations, decisions about what is worth collecting and how, and the clash between reality and the specifications. Here are two examples:
Take one of the Ordnance Survey Open Data sources – the maps present themselves as consistent and covering the whole country in an orderly way. However, dig into the details of the mapping, and you discover that the Ordnance Survey uses different standards for mapping urban, rural and remote areas. Yet, the derived products that are generalised and manipulated in various ways, such as Meridian or Vector Map District, do not provide a clear indication of which parts originated from which scale – so the heterogeneity of the source disappears in the final product.
The census is also heterogeneous, and it is a good case of specifications vs. reality. Not everyone fills in the forms, and even with the best effort of enumerators it is impossible to collect all the data. Therefore, statistical analysis and manipulation of the results are required to produce a well-reasoned assessment of the population. This is expected, even though it is not always understood.
Therefore, even the best information sources that we accept as authoritative are heterogeneous, but as I’ve stated, they are just not completely honest about it. The ONS doesn’t release the full original set of data before all the manipulations, nor does it completely disclose all the assumptions that went into reaching the final value. The Ordnance Survey doesn’t tag every line with metadata about the date of collection and scale.
Somewhat counter-intuitively, exactly because crowdsourced information is expected to be inconsistent, we approach it as such and ask questions about its fitness for use. So in that way it is more honest about the inherent heterogeneity.
Importantly, the assertion should not be taken to be dismissive of authoritative sources, or to ignore that the heterogeneity within crowdsourced information sources is likely to be much higher than in authoritative ones. Of course all the investment in making things consistent and the effort to get universal coverage is indeed worth it, and it would be foolish and counterproductive to consider that such sources of information can be replaced, as has been suggested for the census, or that it’s not worth investing in the Ordnance Survey to update the authoritative data sets.
Moreover, when commercial interests meet crowdsourced geographic information or citizen science, the ‘honesty’ disappears. For example, even though we know that Google Map Maker is now used in many parts of the world (see the figure), even in cases when access to vector data is provided by Google, you cannot find out who contributed, when and where. It is also presented as an authoritative source of information.
Despite the risk of misinterpretation, the assertion can be useful as a reminder that the differences between authoritative and crowdsourced information are not as big as they may seem.
Looking across the range of crowdsourced geographic information activities, some regular patterns are emerging and it might be useful to start noticing them as a way to think about what is possible or not possible to do in this area. Since I don’t like the concept of ‘laws’ – as in Tobler’s first law of geography which is stated as ‘Everything is related to everything else, but near things are more related than distant things.’ – I would call them assertions. There is also something nice about using the word ‘assertion’ in the context of crowdsourced geographic information, as it echoes Mike Goodchild’s differentiation between asserted and authoritative information. So not laws, just assertions or even observations.
The first one rephrases a famous quote:
‘you can be supported by a huge crowd for a very short time, or by few for a long time, but you can’t have a huge crowd all of the time (unless data collection is passive)’
So the Christmas Bird Count can have tens of thousands of participants for a short time, while the number of people who operate weather observation stations will be much smaller. The same is true for OpenStreetMap – for crisis mapping, which is a short-term task, you can get many contributors, but for the regular updating of an area under usual conditions, there will be only a few.
The exception to the assertion is passive data collection, where information is collected automatically through the logging of information from a sensor – for example, the recording of GPS tracks to improve navigation information.
8 July, 2013
The term ‘Citizen Science’ is clearly gaining more recognition and use. It now gets mentioned in radio and television broadcasts and social media channels, as well as at conferences and workshops. Some of the clearer signs of the growing attention include discussion of citizen science in policy-oriented conferences such as UNESCO’s World Summit on Information Society (WSIS+10) review meeting discussion papers (see page ), or the Eye on Earth users conference (see the talks here) or the launch of the European Citizen Science Association in the recent EU Green Week conference.
Another aspect of the expanding world of citizen science is the emerging questions about the efficacy of the term from those who are involved in such projects or study them. As is very common with general terms, some reflections on the accuracy of the term are coming to the fore – so Rick Bonney and colleagues suggest using ‘Public Participation in Scientific Research’ (significantly, Bonney was the first to use ‘Citizen Science’ in 1995); Francois Grey coined Citizen Cyberscience to describe projects that are dependent on the Internet; recently Chris Lintott discussed some doubts about the term in the context of Zooniverse; and Katherine Mathieson asks if Citizen Science is just a passing fad. In our own group, there are also questions about the correct terminology, with Cindy Regalado’s suggestion to focus on ‘Publicly Initiated Scientific Research (PIScR)’, and discussion on the meaning of ‘Extreme Citizen Science’.
One way to explore what is going on is to consider the evolution of the ‘hype’ around citizen science through ‘Gartner’s Hype Cycle’, which can be seen as a way to consider how technologies are being adopted in a world of rapid communication and inflated expectations from technologies. Leaving aside Gartner’s own hype, the story that the model is trying to tell is that once a new approach (technology) emerges, because it is possible or because someone reconfigured existing elements and claims that it’s a new thing (e.g. Web 2.0), it will go through a rapid growth in terms of attention and publicity. This will go on until it reaches the ‘peak of inflated expectations’ where the expectations from the technology are unrealistic (e.g. that it will revolutionize the way we use our fridges). This must be followed by a slump, as more and more failures come to light and the promises are not fulfilled. At this stage, the disillusionment is so deep that even the useful aspects of the technology are forgotten. However, if it passes this stage, then after the realisation of what is possible, the technology is integrated into everyday life and practices and is used productively.
So does the hype cycle apply to citizen science?
If we look at Gartner’s cycle from last September, Crowdsourcing is near the ‘peak of inflated expectations’ and some descriptions of citizen science as scientific crowdsourcing clearly match the same mindset.
There is growing evidence of academic researchers entering citizen science out of opportunism, without paying attention to the commitment and work that is required to carry out such projects. With some, it seems that they decided that they can also join in because someone around knows how to make an app for smartphones or a website that will work like Galaxy Zoo (failing to notice the need for all the social aspects that Arfon Smith highlights in his talks). When you look around at the emerging projects, you can start guessing which projects will succeed or fail by looking at the expertise and approach that the people behind them take.
Another cause of concern is the expectations that I noticed in the more policy-oriented events about the ability of citizen science to solve all sorts of issues – from raising awareness to behaviour change with limited professional involvement, or that it will reduce the resources that are needed for activities such as environmental monitoring, but without an understanding that significant sustained investment is required – community coordination, technical support and other aspects are needed here just as much. This concern is heightened by statements that promote citizen science as a mechanism to reduce the costs of research, creating a source of free labour etc.
On the other hand, it can be argued that the hype cycle doesn’t apply to citizen science because of history. Citizen science has existed for many years, as Caren Cooper describes in her blog posts. Therefore, conceptualising it as a new technology is wrong as there are already mechanisms, practices and institutions to support it.
In addition, and unlike the technologies that are on Gartner’s chart, academic projects within which citizen science happens benefit from access to what is sometimes termed patient capital, without expectations for quick returns on investment. Even with the increasing expectations of research funding bodies for explanations on how the research will lead to an impact on wider society, they have no expectations that the impact will be immediate (5-10 years is usually fine) and funding comes in chunks that cover 3-5 years, which provides the breathing space to overcome the ‘trough of disillusionment’ that is likely to happen within the technology sector regarding crowdsourcing.
And yet, I would guess that citizen science will suffer some examples of disillusionment from badly designed and executed projects. To get these projects right you need a combination of domain knowledge in the specific scientific discipline, science communication to tell the story in an accessible way, technical ability to build mobile and web infrastructure, understanding of user interaction and user experience to build engaging interfaces, and community management ability to nurture and develop your communities, and we can add further skills to the list (e.g. if you want gamification elements, you need experts in games and not to do it amateurishly). In short, it needs to be taken seriously, with careful consideration and design. This is not a call for gatekeepers, more a realisation that the successful projects and groups are stating similar things.
Which brings us back to the issue of the definition of citizen science and terminology. I have been following terminology arguments in my own discipline for over 20 years. I have seen people arguing about a data storage format for GIS and whether it should be raster or vector (answer: it doesn’t matter). Or arguing if GIS is a tool or a science. Or unhappy with Geographic Information Science and resolutely calling it geoinformation, geoinformatics etc. Even in the minute sub-discipline that deals with participation and computerised maps there are arguments about Public Participation GIS (PPGIS) or Participatory GIS (PGIS). Most recently, we are debating the right term for mass-contribution of geographic information: volunteered geographic information (VGI), crowdsourced geographic information or user-generated geographic information.
It’s not that terminology and precision in definition are not useful; on the contrary. However, I’ve noticed that in most cases the more inclusive and, importantly, vague and broad-church definition won the day. Broad terminologies, especially when they are evocative (such as citizen science), are especially powerful. They convey a good message and are therefore useful. As long as we don’t try to force a canonical definition, and allow people to decide what they include in the term and to express clearly why what they are doing falls within citizen science, it should be fine. Some broad principles are useful and will help all those who are committed to working in this area to sail through the hype cycle safely.
17 May, 2013
The UCL Urban Laboratory is a cross-disciplinary initiative that links various research interests in urban issues, from infrastructure to the way they are expressed in art, films and photography. The Urban Laboratory has just published its first Urban Pamphleteer, which aims to ‘confront key contemporary urban questions from diverse perspectives. Written in a direct and accessible tone, the intention of these pamphlets is to draw on the history of radical pamphleteering to stimulate debate and instigate change.’
My contribution to the first pamphleteer, which focused on ‘Future & Smart Cities’, deals with the balance between technology companies, engineers and scientists and the values, needs and wishes of the wider society. In particular, I suggest the potential of citizen science in opening up some of the black boxes of smart cities to wider societal control. Here are the opening and the closing paragraphs of my text, titled Beyond quantification: we need a meaningful smart city:
‘When approaching the issue of Smart Cities, there is a need to discuss the underlying assumptions at the basis of Smart Cities and challenge the prevailing thought that only efficiency and productivity are the most important values. We need to ensure that human and environmental values are taken into account in the design and implementation of systems that will influence the way cities operate…
…Although these Citizen Science approaches can potentially develop new avenues for discussing alternatives to the efficiency and productivity logic of Smart Cities, we cannot absolve those with most resources and knowledge from responsibility. There is an urgent need to ensure that the development and use of the Smart Cities technologies that are created is open to democratic and societal control, and that they are not being developed only because the technologists and scientists think that they are possible.’
The pamphleteer is not too long – 32 pages – and includes many thought-provoking pieces from researchers in Geography, Environmental Engineering, Architecture, Computer Science and Art. It can be downloaded here.
As I’ve noted in the previous post, I have just attended the CHI (Computer-Human Interaction) conference for the first time. It’s a fairly big conference, with over 3000 participants and multiple tracks that evolved over the 30 years that CHI has been going, including the familiar paper presentations, panels, posters and courses, but also the less familiar ‘interactivity areas’, various student competitions, alt.CHI or Special Interest Groups meetings. It’s all fairly daunting even with all my existing experience in academic conferences. During the GeoHCI workshop I discovered the MyCHI application, which helps in identifying interesting papers and sessions (including social recommendations) and setting up a conference schedule from these papers. It is a useful and effective app that I used throughout the conference (and I wish that something similar could be made available in other large conferences, such as the AAG annual meeting).
With MyCHI in hand, while the fog started to lift and I could see a way through the programme, the trepidation about the relevance of CHI to my interests remained and even somewhat increased after a quick search for the words ‘geog’, ‘marginal’ and ‘disadvantage’ returned nothing. The conference video preview (below) also made me somewhat uncomfortable. I have a generally cautious approach to the understanding and development of digital technologies, and a strong dislike of the breathless excitement about new innovations that are not necessarily making the world a better place.
Luckily, after a few more attempts I found papers about ‘environment’, ‘development’ and ‘sustainability’. Moreover, I discovered the special interest groups (SIGs) that are dedicated to HCI for Development (HCI4D) and HCI for Sustainability, and the programme started to build up. The sessions of these two SIGs were an excellent occasion to meet other people who are active in similar topics, and even to learn about the fascinating concept of ‘Collapse Informatics’, which is clearly inspired by Jared Diamond’s book and explores “the study, design, and development of sociotechnical systems in the abundant present for use in a future of scarcity”.
Beyond the discussions, meeting people with shared interests and seeing that there is scope within CHI for technology analysis and development that matches my approach, several papers and sessions were especially memorable. The studies by Elaine Massung and colleagues about community activism in encouraging shops to close their doors (and therefore waste less heating energy) and by Kate Starbird on the use of social media in passing information between first responders during the Haiti earthquake explored how volunteered, ‘crowd’ information can be used in crisis and environmental activism.
Other valuable papers in the area of HCI for development and sustainability include the excellent longitudinal study by Susan Wyche and Laura Murphy on the way mobile charging technology is used in Kenya, a study by Adrian Clear and colleagues about energy use and cooking practices of university students in Lancaster, a longitudinal study of responses to indoor air pollution monitoring by Sunyoung Kim and colleagues, and an interesting study of 8-bit, $10 computers that are common in many countries across the world by Derek Lomas and colleagues.
The ‘CHI at the Barricades – an activist agenda?’ panel was one of the high points of the conference, with a showcase of the ways in which researchers in HCI can take a more active role in their research and lead to social or environmental change, and considering how the role of interactions in enabling or promoting such changes can be used to achieve positive outcomes. The discussions that followed the short interventions from the panel covered issues from accessibility to ethics to ways of acting and leading changes. Interestingly, while some presenters were comfortable with their activist role, the term ‘action-research’ was not mentioned. It was also illuminating to hear Ben Shneiderman emphasising his view that HCI is about representing and empowering the people who use the technologies that are being developed. His call for ‘activist HCI’ provides a way to interpret ‘universal usability’ as an ethical and moral imperative.
So despite the early concerns, CHI was a conference worth attending and the specific jargon of CHI now seems more understandable. I wish that there was a big sign on the conference website: ‘new to CHI? Start here…’
The talk, which is titled ‘Science for everyone by everyone – the re-emergence of citizen science’, covered the area of citizen science and explained what we are trying to achieve within the Extreme Citizen Science research group.
Because the lunch hour lectures are open to all, I preferred not to assume any prior knowledge of citizen science (or public participation in scientific research) and started by highlighting that public participation in scientific research is not new. After a short introduction to the history and to the fact that many people are involved in scientific activities in their free time, from bird watching to weather or astronomical observations, and that this never stopped, I pointed out the notable difference in the attention that is paid to citizen science in recent years.
Therefore, I covered the trends in education and technology that are ushering in a new era of citizen science – access to information through the internet, use of location-aware mobile devices, growth in web-based systems for social knowledge creation, increased education and the ability to deal with abstract ideas (the Flynn effect is an indicator of this last point). The talk explored the current trends and types of citizen science, and demonstrated a model for extreme citizen science, in which any community, regardless of their literacy, can utilise scientific methods and tools to understand and control their environment. I used examples of citizen science activities from other groups at UCL to demonstrate the range of topics, domains and activities that are now included in this area.
The talk was recorded, and is available on YouTube and below
Since early 2010, I have had the privilege of being a member of the editorial board of the journal Transactions of the Institute of British Geographers. It is a fascinating position, as the journal covers a wide range of topics in geography, and is also recognised as one of the top journals in the field, so the submissions are usually of high quality. Over the past 3 years, I have followed a range of papers that deal with various aspects of Geographic Information Science (GIScience) from submission to publication, either as a reviewer or as associate editor.
In early 2011, I agreed to coordinate a virtual issue on GIScience. The virtual issue is a collection of papers from the archives of the journal, demonstrating the breadth of coverage and the development of GIScience within the discipline of geography over the years. The virtual issues provide free access to a group of papers for a period of a year, so they can be used for teaching and research.
Editing the virtual issue was a very interesting task – I was exploring the archives of the journal, going back to papers that appeared in the 1950s and 1960s. When looking for papers that are relevant to GIScience, I came across various papers that relate to geography’s ‘Quantitative Revolution’. The evolution of the use of computers in geography, and later on the applications of GIS, is covered in many papers, so the selection was a challenge. Luckily, another member of the editorial board, Brian Lees, is also well versed in GIScience as the editor of the International Journal of GIScience. Together, we made the selection of the papers that are included in the issue. Other papers are not part of the virtual issue but are valuable further reading.
To accompany the virtual issue, I have written a short piece, focusing on the nature of GIScience in geography. The piece is titled “Geographic Information Science: tribe, badge and sub-discipline” and explores how the latest developments in technology and practice are integrated and resisted by the core group of people who are active GIScience researchers in geography.
You can access the virtual issue on the Wiley-Blackwell online library and you will find papers from 1965 to today, with links to further papers that are relevant but not free to access. The list of authors is impressive, including many names that are associated with the development of GIScience over the years, from Torsten Hägerstrand and David Rhind to current researchers such as Sarah Elwood, Agnieszka Leszczynski and Matt Zook.
The virtual issue will be officially launched at, and was timed to coincide with, the GIScience 2012 conference.
As I cannot attend the conference, and as my paper mentioned the Twitter-based GeoWebChat (see http://mappingmashups.net/geowebchat/), which is coordinated by Alan McConchie, I am planning to use this medium for running a #geowebchat that is dedicated to the virtual issue on 18th September 2012, at 4pm EDT, 9pm BST, so those who attend the conference can join at the end of the workshops day.