19 February, 2014
The Citizen Cyberscience Summit that will be running in London this week sparked the interest of the producers of the BBC World Service ‘Click’ programme, and it was my first experience of visiting BBC Broadcasting House – about a 15-minute walk from UCL.
Here is the clip from the programme that covers the discussion about the summit and Extreme Citizen Science
More information is provided in the Citizens of Science podcast, where the other organisers and I discuss and preview the summit. This is also an opportunity to recommend the other podcasts that can be found in the series.
11 February, 2014
A special delight during my PhD research was to discover, at the UCL library, the proceedings of the first ever symposium on GIS. Dr Tomlinson studied towards a PhD at UCL, and that is probably how the copy found its way to the library. It was fairly symbolic for me that the symposium was titled ‘environmental information systems’. See my earlier comment about the terminology: Geographic information or Environmental Information.
The Guardian’s Political Science blog post by Alice Bell about the Memorandum of Understanding between the UK Natural Environment Research Council and Shell reminded me of a nagging issue that has concerned me for a while: to what degree has GIS contributed to anthropogenic climate change? And, more importantly, what should GIS professionals do?
I’ll say from the start that the reason it concerns me is that I don’t have easy answers to these questions, especially not to the second one. While I personally would like to live in a society that moves very rapidly to renewable energy resources, I also take flights, drive to the supermarket and benefit from the use of fossil fuels – so I’m in the Hypocrites in The Air position, as Kevin Anderson defined it. At the same time, I feel that I do have responsibility as someone who teaches future generations of GIS professionals how they should use the tools and methods of GIScience responsibly. The easy way would be to tell myself that since, for the past 20 years, I’ve been working on ‘environmental applications’ of GIS, I’m on the ‘good’ side as far as sustainability is concerned. After all, the origins of the biggest player in our industry are environmental (environmental systems research, even!), we talk regularly about ‘Design With Nature’ as a core text that led to the overlays concept in GIS, and we praise the foresight of the designers of the UNEP Global Resource Information Database in the early 1980s. Even better, Google Earth brings Climate Change information and education to anyone who wants to download the information from the Met Office.
But technologies are not value-free, and do encapsulate certain values in them. That’s what critical cartography and critical GIS have highlighted since the late 1990s. Nadine Schuurman’s review is still a great starting point to this literature, but most of it analysed the link of the history of cartography and GIS to military applications, or, in the case of the volume ‘Ground Truth’, the use of GIS in marketing and classification of people. To the best of my knowledge, Critical GIScience has not focused its sights on oil exploration and extraction. Of course, issues such as pollution, environmental justice or environmental impacts of oil pipelines are explored, but do we need to take a closer look at the way that GIS technology was shaped by the needs of the oil industry? For example, we use, without a second thought, the EPSG (European Petroleum Survey Group) definitions of coordinate reference systems in many tools. There are histories of products that are used widely, such as Oracle Spatial, where some features were developed specifically for the oil & gas industry. There are secretive and proprietary projections and datums, and GIS products that are unique to this industry. One of the most common spatial analysis methods, Kriging, was developed for the extractive industry. I’m sure that there is much more to explore.
So, what is the problem with that, you would say?
Fossil fuels – oil, coal, gas – are at the centre of the process that leads to climate change. Another important thing about them is that once they’ve been extracted, they are likely to be used. That’s why there are calls to leave them in the ground. When you look at the way exploration and production work, such as the image here from ‘Well Architect‘, you realise that geographical technologies are critical to the ability to find and extract oil and gas. They must have played a role in the ability of the industry to identify, drill and extract in places that were not feasible a few decades ago. I remember my own amazement the first time that I saw the complexity of the information that is being used and the routes that wells take underground, such as what is shown in the image (I’ll add that this was during an MSc project sponsored by Shell). In another project (sponsored by BP), it was just as fascinating to see how paleogeography is used for oil exploration. Therefore, within the complex process of finding and extracting fossil fuels, which involves many engineering aspects, geographical technologies do have an important role, but how important? Should Critical GIScientists or the emerging Critical Physical Geographers explore it?
This brings up the thornier issue of the role of GIS professionals today, and even more so of the people who are entering the field, such as the students who are studying for an MSc in GIS and similar programmes. If we accept that most of the fossil fuels should stay underground and not be extracted, then what should we say to students? If a person who is involved in helping to increase oil production does not accept the science of climate change, or doesn’t accept that there is an imperative to leave fossil fuels in the ground, I may accept and respect their personal view. After all, as Mike Hulme noted, the political discussion is more important now than the science and we can disagree about it. On the other hand, we can take the point of view that we should deal with climate change urgently and go on the path towards reducing extraction rapidly. In terms of action, we see students joining campaigns for fossil free universities, with which I do have sympathy. However, we’re hitting another difficult point. We need to consider the personal cost of higher education and the opportunity for well paid jobs, which include tackling interesting and challenging problems. With the closure of many other jobs in GIS, what is the right thing to do?
I don’t have an easy answer, nor can I say categorically that I will never work with the extractive sector. But when I was asked recently to provide a reference letter by a student in the oil and gas industry, I felt obliged to state that ‘I can completely understand why you have chosen this career, I just hope that you won’t regret it when you talk with your grandchildren one day in the future’.
29 January, 2014
Once upon a time, Streetmap.co.uk was one of the most popular Web Mapping sites in the UK, competing successfully with the biggest rival at the time, Multimap. Moreover, it was ranked second in The Daily Telegraph list of leading mapping sites in October 2000 and described as ‘Must be one of the most useful services on the web – and it’s completely free. Zoom in on any UK area by entering a place name, postcode, Ordnance Survey grid reference or telephone code.’ It’s still running and, because of its legacy, it’s around the 1,250th most popular website in the UK (though 4 years ago it was among the top 350).
So far, nothing is especially noteworthy – a website that was popular a decade ago has been replaced by a newer website, Google Maps, which provides better search results and more information, and is the de facto standard for web mapping. Moreover, already in 2006 Artemis Skarlatidou demonstrated that of the UK Web Mapping crop, Streetmap scored lowest on usability, with only MapQuest, which largely ignored the UK, being worse.
However, recently, while running a practical session introducing User-Centred Design principles to our MSc in GIS students, I have noticed an interesting implication of the changes in the environment of Web Mapping – Streetmap has stopped being usable just because it didn’t bother to update its interaction. By doing nothing, while the environment around it changed, it became unusable, with users failing to perform even the most basic of tasks.
The students explored the mapping offerings from Google, Bing, Here and Streetmap. It was fairly obvious that across this cohort (early to mid 20s), Google Maps was the default against which other systems were compared. It was not surprising to find impressions that Streetmap is ‘very old fashioned‘ or ‘archaic‘. However, it was more interesting to notice people getting frustrated that the ‘natural’ interaction of zooming in and out using the mouse wheel just didn’t work, or failing to find the zoom in and out buttons. At some point in the past 10 years, people internalised the interaction mode of using the mouse and stopped using the zoom in and out buttons on the application, which explains the design decision in the new Google Maps interface to eliminate the dominant zoom slider from the left side of the map. Of course, the Streetmap interface is also not responsive to touch screen interactions, which are also learned across applications.
I experienced a similar, and somewhat amusing, incident during the registration process of SXSW Eco, when I handed over my obviously old laptop at the registration desk to provide some details, and the woman tried to ‘pinch’ the screen in an attempt to zoom in. Considering that she was likely to be interacting with tablets most of the day (it was, after all, SXSW), this was not surprising. Interactions are learned and internalised, and we expect to experience them across devices and systems.
So what’s to learn? While this is another example of ‘Jakob’s Law of Internet User Experience‘, which states that ‘Users spend most of their time on other sites’, it is very relevant to many websites that use Web Mapping APIs to present information – from our own communitymaps.org.uk to the Environment Agency What’s in Your Backyard. In all these cases, it is critical to notice the basic map exploration interactions (pan, zoom, search) and make sure that they match common practices across the web. Otherwise, you might end up like Streetmap.
Following the two previous assertions, namely that:
‘you can be supported by a huge crowd for a very short time, or by few for a long time, but you can’t have a huge crowd all of the time (unless data collection is passive)’ (original post here)
‘All information sources are heterogeneous, but some are more honest about it than others’ (original post here)
The third assertion is about patterns of participation. It is one that I’ve mentioned before, and in some ways it is a corollary of the two assertions above.
‘When looking at crowdsourced information, always keep participation inequality in mind’
Because crowdsourced information, either Volunteered Geographic Information or Citizen Science, is created through a socio-technical process, all too often it is easy to forget the social side – especially when you are looking at the information without the metadata of who collected it and when. So when working with OpenStreetMap data, or viewing the distribution of bird species in eBird (below), even though the data source is expected to be heterogeneous, each observation is treated as similar to other observations and assumed to be produced in a similar way.
Yet, data is not only heterogeneous in terms of consistency and coverage, it is also highly heterogeneous in terms of contribution. One of the most persistent findings from studies of various systems – for example in Wikipedia, OpenStreetMap and even in volunteer computing – is that there is a very distinctive heterogeneity in contribution. The phenomenon was termed ‘Participation Inequality‘ by Jakob Nielsen in 2006 and it is summarised succinctly in the diagram below (from Visual Liberation blog) – a very small number of contributors add most of the content, while most of the people that are involved in using the information will not contribute at all. Even when examining only those that actually contribute, in some projects over 70% contribute only once, with a tiny minority contributing most of the information.
Therefore, when looking at sources of information that were created through such a process, it is critical to remember the nature of contribution. This has far-reaching implications for quality, as it is dependent on the expertise of the heavy contributors, on their spatial and temporal engagement, and even on their social interaction and practices (e.g. abrasive behaviour towards other participants).
Because of these factors, it is critical to remember the impact and implications of participation inequality on the analysis of the information. There will be some analyses on which it will have less impact and some where it will have a major one. In either case, it needs to be taken into account.
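As a rough illustration of how participation inequality can be checked before any analysis, here is a minimal Python sketch. It assumes a list with one contributor ID per contribution (for example, one entry per OpenStreetMap edit or eBird observation). The function name, the 10% threshold and the sample log are all made up for illustration and do not come from any of the studies mentioned above.

```python
from collections import Counter


def participation_summary(contributor_ids, top_fraction=0.1):
    """Summarise how unevenly contributions are spread across contributors.

    contributor_ids holds one entry per contribution, giving the ID of
    whoever made it. top_fraction is the share of contributors treated
    as 'heavy' contributors (an arbitrary, illustrative threshold).
    """
    counts = Counter(contributor_ids)
    if not counts:
        raise ValueError("no contributions to summarise")

    # Contributions per person, heaviest contributors first.
    per_contributor = sorted(counts.values(), reverse=True)
    total = sum(per_contributor)
    n_top = max(1, round(len(per_contributor) * top_fraction))
    one_timers = sum(1 for c in per_contributor if c == 1)

    return {
        "contributors": len(per_contributor),
        "contributions": total,
        # Share of all contributions made by the heaviest contributors.
        "top_share": sum(per_contributor[:n_top]) / total,
        # Share of contributors who contributed exactly once.
        "one_time_share": one_timers / len(per_contributor),
    }


# Made-up log: one heavy contributor, two occasional ones, seven one-off ones.
log = ["a"] * 80 + ["b"] * 8 + ["c"] * 5 + list("defghij")
summary = participation_summary(log)
print(summary)
```

If the top share is high, any conclusion drawn from the data is, in effect, a conclusion about the habits and expertise of a handful of people, and the analysis should say so.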
Following the last post, which focused on an assertion about crowdsourced geographic information and citizen science, I continue with another observation. As was noted in the previous post, these can be treated as ‘laws’ as they seem to emerge as common patterns from multiple projects in different areas of activity – from citizen science to crowdsourced geographic information. The first assertion was about the relationship between the number of volunteers who can participate in an activity and the amount of time and effort that they are expected to contribute.
This time, I look at one aspect of data quality, which is about consistency and coverage. Here the following assertion applies:
‘All information sources are heterogeneous, but some are more honest about it than others’
What I mean by that is the on-going argument about authoritative and crowdsourced information sources (Flanagin and Metzger 2008 frequently comes up in this context), which was also at the root of the Wikipedia vs. Britannica debate, and of the mistrust in citizen science observations and the constant questioning of whether citizen scientists can do ‘real research’.
There are many aspects to these concerns, so the assertion deals with the aspects of comprehensiveness and consistency, which are used as a reason to dismiss crowdsourced information when comparing it to authoritative data. However, at a closer look we can see that all these information sources are fundamentally heterogeneous. Despite all the effort to define precise standards for data collection in authoritative data, heterogeneity creeps in because of budget and time limitations, decisions about what is worth collecting and how, and the clash between reality and the specifications. Here are two examples:
Take one of the Ordnance Survey Open Data sources – the maps present themselves as consistent and covering the whole country in an orderly way. However, dig into the details of the mapping, and you discover that the Ordnance Survey uses different standards for mapping urban, rural and remote areas. Yet, the derived products that are generalised and manipulated in various ways, such as Meridian or Vector Map District, do not provide a clear indication of which parts originated from which scale – so the heterogeneity of the source disappears in the final product.
The census is also heterogeneous, and it is a good case of specifications vs. reality. Not everyone fills in the forms, and even with the best effort of enumerators it is impossible to collect all the data. Therefore, statistical analysis and manipulation of the results are required to produce a well reasoned assessment of the population. This is expected, even though it is not always understood.
Therefore, even the best information sources that we accept as authoritative are heterogeneous, but as I’ve stated, they are just not completely honest about it. The ONS doesn’t release the full original set of data before all the manipulations, nor completely disclose all the assumptions that went into reaching the final value. The Ordnance Survey doesn’t tag every line with metadata about the date of collection and scale.
Somewhat counter-intuitively, exactly because crowdsourced information is expected to be inconsistent, we approach it as such and ask questions about its fitness for use. So in that way it is more honest about the inherent heterogeneity.
Importantly, the assertion should not be taken to be dismissive of authoritative sources, or to ignore that the heterogeneity within crowdsourced information sources is likely to be much higher than in authoritative ones. Of course, all the investment in making things consistent and the effort to get universal coverage is indeed worth it, and it would be foolish and counterproductive to consider that such sources of information can be replaced, as is suggested for the census, or that it’s not worth investing in the Ordnance Survey to update the authoritative data sets.
Moreover, when commercial interests meet crowdsourced geographic information or citizen science, the ‘honesty’ disappears. For example, even though we know that Google Map Maker is now used in many parts of the world (see the figure), even in cases when access to vector data is provided by Google, you cannot find out who contributed, when and where. It is also presented as an authoritative source of information.
Despite the risk of misinterpretation, the assertion can be useful as a reminder that the differences between authoritative and crowdsourced information are not as big as they may seem.
Looking across the range of crowdsourced geographic information activities, some regular patterns are emerging and it might be useful to start noticing them as a way to think about what is possible or not possible to do in this area. Since I don’t like the concept of ‘laws’ – as in Tobler’s first law of geography which is stated as ‘Everything is related to everything else, but near things are more related than distant things.’ – I would call them assertions. There is also something nice about using the word ‘assertion’ in the context of crowdsourced geographic information, as it echoes Mike Goodchild’s differentiation between asserted and authoritative information. So not laws, just assertions or even observations.
The first one rephrases a famous quote:
‘you can be supported by a huge crowd for a very short time, or by few for a long time, but you can’t have a huge crowd all of the time (unless data collection is passive)’
So the Christmas Bird Count can have tens of thousands of participants for a short time, while the number of people who operate weather observation stations will be much smaller. The same is true for OpenStreetMap – for crisis mapping, which is a short term task, you can get many contributors, but for the regular updating of an area under usual conditions, there will be only a few.
The exception to the assertion is the case of passive data collection, where information is collected automatically through the logging of information from a sensor – for example, the recording of GPS tracks to improve navigation information.