Completeness in volunteered geographical information – the evolution of OpenStreetMap coverage (2008-2009)
13 August, 2010
The Journal of Spatial Information Science (JOSIS) is a new open access journal in GIScience, edited by Matt Duckham, Jörg-Rüdiger Sack, and Michael Worboys. In addition, the journal has adopted an open peer review process, so readers are invited to comment on a paper while it goes through the formal peer review process. So this seems to be the most natural outlet for a new paper that analyses the completeness of OpenStreetMap over 18 months – March 2008 to October 2009. The paper was written in collaboration with Claire Ellul. The abstract of the paper is provided below, and you are very welcome to comment on the paper on the JOSIS forum that is dedicated to it, where you can also download it.
Abstract: The ability of lay people to collect and share geographical information has increased markedly over the past 5 years as results of the maturation of web and location technologies. This ability has led to a rapid growth in Volunteered Geographical Information (VGI) applications. One of the leading examples of this phenomenon is the OpenStreetMap project, which started in the summer of 2004 in London, England. This paper reports on the development of the project over the period March 2008 to October 2009 by focusing on the completeness of coverage in England. The methodology that is used to evaluate the completeness is comparison of the OpenStreetMap dataset to the Ordnance Survey dataset Meridian 2. The analysis evaluates the coverage in terms of physical coverage (how much area is covered), followed by estimation of the percentage of England population which is covered by completed OpenStreetMap data and finally by using the Index of Deprivation 2007 to gauge socio-economic aspects of OpenStreetMap activity. The analysis shows that within 5 years of project initiation, OpenStreetMap already covers 65% of the area of England, although when details such as street names are taken into consideration, the coverage is closer to 25%. Significantly, this 25% of England’s area covers 45% of its population. There is also a clear bias in data collection practices – more affluent areas and urban locations are better covered than deprived or rural locations. The implications of these outcomes to studies of volunteered geographical information are discussed towards the end of the paper.
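As a rough illustration of how coverage by area and coverage by population can diverge, here is a minimal Python sketch. It assumes that both datasets have been summarised as total road length per grid cell and that a cell counts as covered when the OpenStreetMap length reaches the reference length; that rule, the `Cell` structure, the function names and all the numbers are hypothetical, invented for the example, and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One grid cell; all field names are hypothetical."""
    osm_length_m: float  # total OpenStreetMap road length in the cell
    ref_length_m: float  # total reference (e.g. Meridian 2) road length
    area_km2: float
    population: int


def is_covered(cell, threshold=1.0):
    """Assumed rule: covered when OSM length reaches threshold x reference."""
    if cell.ref_length_m == 0:
        return True  # assumption: nothing to map according to the reference
    return cell.osm_length_m >= threshold * cell.ref_length_m


def coverage(cells, threshold=1.0):
    """Return (share of area covered, share of population covered)."""
    total_area = sum(c.area_km2 for c in cells)
    total_pop = sum(c.population for c in cells)
    covered = [c for c in cells if is_covered(c, threshold)]
    area_share = sum(c.area_km2 for c in covered) / total_area
    pop_share = sum(c.population for c in covered) / total_pop
    return area_share, pop_share


# Invented figures: one dense urban cell and three sparsely populated ones.
cells = [
    Cell(5200.0, 5000.0, 1.0, 9000),
    Cell(300.0, 1200.0, 1.0, 400),
    Cell(0.0, 800.0, 1.0, 150),
    Cell(900.0, 2500.0, 1.0, 450),
]
area_share, pop_share = coverage(cells)
print(f"area covered: {area_share:.0%}, population covered: {pop_share:.0%}")
```

Because population is concentrated in a few cells, a small share of covered area can correspond to a much larger share of covered population, which is the pattern the abstract reports for England.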
This is a call for papers for a workshop on methods and research techniques that are suitable for geospatial technologies. The workshop is planned for the day before GISRUK 2010, and we are aware of the clashes with the AAG 2010 annual meeting, CHI 2010 and the Ergonomics Society Annual Conference. However, if you would like to contribute to the book that the commission is developing but can’t attend the workshop, please send an abstract and inform us that you can’t attend.
In the near future I’ll publish information about another workshop in March 2010 about the usability and Human-Computer Interaction aspects of geographical information itself – see the report from the Ordnance Survey workshop earlier in 2009.
So here is the full call:
Workshop on Methods and Techniques of Use, User and Usability Research in Geo-information Processing and Dissemination
Tuesday 13 April 2010 at University College London
The Commission on Use and User Issues of the International Cartographic Association (ICA) is currently working on a new handbook specifically addressing the application of user research methods and techniques in the geodomain.
In order to share experiences and interesting case studies a workshop is organized by the Commission, in collaboration with UCL, on the day preceding GISRUK 2010.
CALL FOR PAPERS
While there is growing awareness within the research community on the need to develop usability engineering and use and user research methods that are suitable for geographical and spatial information and systems, to date there is a lack of organized and documented experience in this area.
We therefore invite researchers with recent experience with use, user and usability research in the broad geodomain (cartography, GIS, geovisualization, Location Based Services, geographical information, GeoWeb etc.) to present a paper specifically focusing on the research methods and techniques applied, with an aim to develop the body of knowledge for the domain.
To participate, please send an abstract of 1 page A4 at maximum containing:
- A description of the research method(s) and technique(s) applied
- A short description of the case in which they have been applied
- The overall research framework
- Contact details and affiliation of the author(s)
We are also encouraging PhD researchers to submit paper proposals and share experiences from their research. At the workshop there will be ample time for discussing the application of user research methods and techniques. Good papers may be the basis for contributions to the handbook that is planned for publication in 2011.
Abstracts should be submitted on or before 1 December 2009 to the Chairman of the Commission Corné van Elzakker ( firstname.lastname@example.org )
Further information is available on the website of the ICA Commission on Use and User Issues and the GISRUK2010 website.
21 January, 2009
Trying to track down the source of a term is one of the more interesting academic tasks. For example, finding out when people started researching Human-Computer Interaction and GIS is a bit like following a thread. First of all, the term Human-Computer Interaction is sometimes presented as Computer-Human Interaction, especially in the early 1980s, when it emerged – the ACM Special Interest Group still uses CHI and not HCI. Before that, the common term was Man-Machine Interaction, which came out of studies in the 1940s. The way to uncover this terminology chain is to find papers that mention both terms and follow them through. Quite quickly you develop an understanding of the chain…
Then there is the issue of GIS – after all, the term was invented only around the mid 1960s: surely many people outside the small circle of researchers that became familiar with the term used other terminology. So you need to look for other terms, such as geographic information (as well as geographical information), maps, etc.
Following this approach, I have found a paper from 1963 by Malcolm Pivar, Ed Fredkin and Henry Stommel about ‘Computer-Compiled Oceanographic Atlas: an Experiment in Man-Machine Interaction’. The paper is as interesting as its writers – with Pivar and Fredkin among the Artificial Intelligence group at MIT, and Stommel a leading oceanographer. The data came from surveys that were part of the International Geophysical Year (1957/8) – and the paper shows that information overload is nothing new.
For me, the most interesting passage in the paper is:
‘[I]n preparing a printed atlas certain irrevocable choices of scale, of map projections, of contour interval, and of type of map (shall we plot temperature at standard depths, or on density surfaces, etc.?) must be made from the vast infinitude of all possible mappings. An atlas-like representation, generated by digital computer and displayed upon a cathode-ray screen, enables the oceanographer to modify these choices at will. Only a high-speed computer has the capacity and speed to follow the quickly shifting demands and questions of a human mind exploring a large field of numbers. The ideal computer-compiled oceanographic atlas will be immediately responsive to any demand of the user, and will provide the precise detailed information requested without any extraneous information. The user will be able to interrogate the display to evoke further information; it will help him track down errors and will offer alternative forms of presentation. Thus, the display on the screen is not a static one; instead, it embodies animation as varying presentations are scanned. In a very real sense, the user “converses” with the machine about the stored data.’ (Pivar et al., 1963, p. 396)
What an amazing vision in 1963 – it would take another 30 years or more before what they describe became a reality!
21 October, 2008
These are the slides from the Worldwide Universities Network Global GIS Academy Seminar from the 22nd October. The seminar’s title is ‘What’s So New in Neogeography?’ and it is aimed largely at an academic audience with background in GIScience.
The aim of the talk is to critically review Neogeography: explain its origins, discuss the positive lessons from it – mainly in improved usability of geographic technologies, as well as highlighting aspects that I see as problematic.
The presentation starts with some definitions and with the observation that mapping/location is central to Web 2.0, and thus we shouldn’t be surprised that we’ve seen a step change in the use of GI over the past 3 years.
By understanding what changed around 2005, it is possible to explain the development of Neogeography. These changes are not just technical but also societal.
The core of the discussion is on the new issues that are important to Neogeography’s success, but it also raises some theoretical and practical aspects that must be included in a comprehensive analysis of the changes and what they mean to Geography and geographers.
The presentation is available below from slideshare, and the (very rough and without proofing) notes are available here.
19 August, 2008
A comparison of my analysis of OpenStreetMap (OSM) quality evaluation to other examples of quality evaluation brings up some core issues about the nature of the new GeoWeb and the use of traditional sources. The examples that I’m referring to are from Etienne Cherdlu’s SOTM 2007 ‘OSM and the art of bicycle maintenance’, Dair Grant’s comparison of OSM to Google Maps and reality, Ed Johnson’s analysis this summer and Steven Feldman’s brief evaluation in Highgate.
The first observation is of the importance and abundance of well georeferenced, vector-derived public mapping sites, which make several of these comparisons possible (Cherdlu, Dair and Feldman). The previous generation of stylised street maps is not readily available for a comparison. In addition to their availability, the ease with which these sites can be mashed up is also a significant enabling factor. Without this comparable geographical information, the evaluation would be much more difficult.
Secondly, when a public mapping website was used, it was Google Maps. If Microsoft’s Virtual Earth had also been used, it would arguably allow a three-way comparison, as the Microsoft site uses Navteq information while Google uses TeleAtlas information. Ordnance Survey (OS) OpenSpace is also a natural candidate for comparison. Was it familiarity that led to the selection of Google Maps? Or is it because the method of comparison is visual inspection, so adding a third source makes it more difficult? Notice that Google has the cachet of being a correct depiction of reality, which Etienne, Dair and Bob Barr demonstrated not to be the case!
Thirdly, and most significantly, only when vector data was used – in our comparison and in parts of what Ed Johnson has done – did a comprehensive analysis of large areas become possible. This shows the important role of formats in the GeoWeb – raster is fabulous for the delivery of cartographic representations, but it is vector that is suitable for analytical and computational work. Only OSM allows the user easy download of vector data – no other mass provider of public mapping does.
Finally, there is the issue of access to information, tools and knowledge. As a team that works at a leading research university (UCL), I and the people who worked with me got easy access to detailed vector datasets and the OS 1:10,000 raster. We also have at our disposal multiple GIS packages, so we can use whichever one performs the task with the least effort. The other comparisons had to rely on publicly available datasets and software. In such unequal conditions, it is not surprising that I will argue that the comparison that we carried out is more robust and consistent. The issue that is coming up here is the balance between amateurs and experts, which is quite central to Web 2.0 in general. Should my analysis be more trusted than Dair’s or Etienne’s, both of whom are very active in OSM? Does Steven’s familiarity with Highgate, which is greater than mine, make him more of an expert in that area than my consistent application of analysis?
I think that the answer is not clear cut; academic knowledge entails the consistent scrutiny of the data, and I do have the access and the training to conduct a very detailed geographical information quality assessment. In addition, my first job in 1988 was in geographical data collection and GIS development, so I also have professional knowledge in this area. Yet, local knowledge is just as valuable in a specific area and is much better than a mechanical, automatic evaluation. So what is happening is an exchange of knowledge, methods and experiences between the two sides in which both, I hope, can benefit.