OpenStreetMap in Athens – as accurate as London
25 November, 2009
Most of the work that we carried out at UCL in evaluating the quality of OpenStreetMap is focused on England, and particularly on London. This is mainly due to the availability of comparative datasets: the Ordnance Survey research unit kindly provided me with the full Meridian 2 dataset for comparison. More detailed comparison, for which we used MasterMap, came from the wonderful Digimap service, though because of the time that it takes to process it we were limited in the size of the area that was used for comparison.
One of the open questions that remained was the accuracy of data collection in other parts of the world. Luckily, Ourania (Rania) Kounadi, who studied our MSc in GIS at UCL, had access to detailed maps of Athens. She used a 1:10,000 map from the Hellenic Military Geographical Service (HMGS) and focused on an area of 25 square kilometres at the centre of the city. The roads were digitised from the HMGS map, and then the Goodchild-Hunter procedure was used to evaluate the positional accuracy.
The results show that for most of the roads in the evaluation area there was an overlap of 69% to 100% between OSM and HMGS datasets. The average overlap was very close to 90%. Her analysis also included attribute and completeness evaluation, showing that the quality is high on these aspects too.
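To make the overlap figures more concrete, here is a minimal sketch of the idea behind the Goodchild-Hunter procedure: a buffer is placed around the reference line, and the share of the tested line that falls inside it is reported. This is not the code used in the dissertation. It approximates the overlap by sampling points along the tested line rather than intersecting geometries, and the function names, the planar coordinates in metres and the 1 metre sampling step are all assumptions made for illustration.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_polyline(p, line):
    """Shortest distance from point p to any segment of the polyline."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))

def sample_polyline(line, step):
    """Points spaced at most `step` apart along the polyline (assumed helper)."""
    points = []
    for (ax, ay), (bx, by) in zip(line, line[1:]):
        n = max(1, math.ceil(math.hypot(bx - ax, by - ay) / step))
        for i in range(n):
            t = i / n
            points.append((ax + t * (bx - ax), ay + t * (by - ay)))
    points.append(line[-1])
    return points

def buffer_overlap(reference, tested, buffer_width, step=1.0):
    """Approximate share (0..1) of `tested` within `buffer_width` of `reference`."""
    samples = sample_polyline(tested, step)
    inside = sum(
        1 for p in samples if distance_to_polyline(p, reference) <= buffer_width
    )
    return inside / len(samples)

# Hypothetical example: a reference road and an OSM trace that drifts away from it.
reference_road = [(0.0, 0.0), (100.0, 0.0)]
osm_road = [(0.0, 5.0), (50.0, 5.0), (100.0, 30.0)]
overlap = buffer_overlap(reference_road, osm_road, buffer_width=10.0)
print(f"Overlap: {overlap:.0%}")
```

In a real evaluation the buffer width would be chosen according to the stated accuracy of the reference dataset, and a GIS package would compute the exact intersection instead of sampling.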
So a pattern is starting to emerge showing that the quality of OSM data is indeed good in terms of positional accuracy. This is surprising at first glance – how come people who are not necessarily trained in geographical data collection and do not use rigorous quality assurance processes produce data that is as good as the authoritative data?
My explanation for this, as I’ve written in my paper about OSM quality, is that it ‘demonstrates the importance of the infrastructure, which is funded by the private and public sector and which allows the volunteers to do their work without significant personal investment. The GPS system and the receivers allow untrained users to automatically acquire their position accurately, and thus simplify the process of gathering geographical information. This is, in a way, the culmination of the process in which highly trained surveyors were replaced by technicians, with the introduction of high-accuracy GPS receivers in the construction and mapping industries over the last decade. The imagery also provides such an infrastructure function – the images were processed, rectified and georeferenced by experts and thus, an OSM volunteer who uses this imagery for digitising benefits from the good positional accuracy which is inherent in the image. So the issue here is not to compare the work of professionals and amateurs, but to understand that the amateurs are actually supported by the codified professionalised infrastructure and develop their skills through engagement with the project.’
Rania’s dissertation is available to download from here.
OpenStreetMap and Ordnance Survey Meridian 2 – Progress maps
14 November, 2009
As part of an update of the work that I published in August 2008, I re-ran the comparison between the OpenStreetMap and Ordnance Survey Meridian 2 datasets. In a future post, I will provide a full report of this assessment. As I have now completed the evaluation for October 2009 and a re-evaluation of the data from March 2008, I decided to publish some outputs. The map below shows the completeness of OpenStreetMap across England for the two periods.
The second set of maps shows the estimation of completeness when attributes are considered. For this purpose, the calculation takes into account only line objects that are comparable to those in Meridian 2, thus not including features such as footpaths. The following types of roads were used: motorway, motorway_link, primary, primary_link, secondary, secondary_link, trunk, trunk_link, tertiary, tertiary_link, minor, unclassified and residential.
In addition, a test verified that the ‘name’ field is not empty. This is an indication that a street name or road number is included in the attributes of the objects, and thus it can be considered to be complete with basic attributes. In order to make the comparison appropriate, only objects that contain a road name or number in Meridian 2 were included.
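The filtering described above can be sketched roughly as follows. The list of road types and the non-empty ‘name’ test come from the text; everything else is an assumption for illustration, including the dictionary layout of a way, the `length_m` field, and the choice to express completeness as the ratio of OSM road length to Meridian 2 road length for the same area.

```python
# Road types treated as comparable to Meridian 2 (taken from the text above).
COMPARABLE_TYPES = {
    "motorway", "motorway_link", "primary", "primary_link",
    "secondary", "secondary_link", "trunk", "trunk_link",
    "tertiary", "tertiary_link", "minor", "unclassified", "residential",
}

def has_name(tags):
    """True when the 'name' tag exists and is not empty or whitespace only."""
    name = tags.get("name")
    return name is not None and name.strip() != ""

def comparable_length(osm_ways, require_name=False):
    """Total length in metres of the ways that pass the filters."""
    total = 0.0
    for way in osm_ways:
        tags = way["tags"]
        if tags.get("highway") not in COMPARABLE_TYPES:
            continue  # e.g. footpaths are excluded
        if require_name and not has_name(tags):
            continue  # attribute test: the name must be filled in
        total += way["length_m"]
    return total

def completeness(osm_ways, reference_length_m, require_name=False):
    """OSM length as a share of the reference length (assumed measure)."""
    if reference_length_m <= 0:
        raise ValueError("reference length must be positive")
    return comparable_length(osm_ways, require_name) / reference_length_m

# Hypothetical ways for one grid cell.
ways = [
    {"tags": {"highway": "residential", "name": "Gower Street"}, "length_m": 100.0},
    {"tags": {"highway": "footway", "name": "Park Path"}, "length_m": 50.0},
    {"tags": {"highway": "primary"}, "length_m": 200.0},
    {"tags": {"highway": "tertiary", "name": "  "}, "length_m": 30.0},
]
print(completeness(ways, reference_length_m=660.0))
print(completeness(ways, reference_length_m=400.0, require_name=True))
```

For the attribute comparison, the reference length passed in would cover only the Meridian 2 objects that carry a road name or number, as described above.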
The growth within just over a year and a half is very impressive – rising from 27% in March 2008 to 65% in October 2009. When attributes are considered, it has risen from 7% to 25%. Notice that the criteria that I have set for this comparison are more stringent than those in the previous study, so the numbers – especially for the attribute completeness – are lower than those published in August 2008.
Linus’ Law and OpenStreetMap
7 November, 2009
One of the interesting questions that emerged from the work on the quality of OpenStreetMap (OSM) in particular, and Volunteered Geographical Information (VGI) in general, is the validity of the ‘Linus’ Law’ for this type of information.
The law came from Open Source software development and states that ‘Given enough eyeballs, all bugs are shallow’ (Raymond, 2001, p.19). For mapping, I suggest that this can be translated into the number of contributors that have worked on a given area. The rationale behind it is that if there is only one contributor in an area he or she might inadvertently introduce some errors. For example, they might forget to survey a street or might position a feature in the wrong location. If there are several contributors, they might notice inaccuracies or ‘bugs’ and therefore the more users, the fewer ‘bugs’.
In my original analysis, I looked only at the number of contributors per square kilometre as a proxy for accuracy, and provided a visualisation of the difference across England.
During the past year, Aamer Ather and Sofia Basiouka looked at this issue, by comparing the positional accuracy of OSM in 125 sq km of London. Aamer carried out a detailed comparison of OSM and the Ordnance Survey MasterMap Integrated Transport Network (ITN) layer. Sofia took the results from his study and divided them for each grid square, so it was possible to calculate an overall value for every cell. The value is the average of the overlap between OSM and OS objects, weighted by the length of the ITN object. The next step was to compare the results to the number of users at each grid square, as calculated from the nodes in the area.
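The two per-cell values described above can be sketched as follows: a length-weighted average of the overlap, and a count of distinct users derived from the nodes. The tuple layouts, the cell identifiers and the function names are assumptions for illustration, not the format used in the actual study.

```python
from collections import defaultdict

def weighted_overlap_by_cell(records):
    """Average overlap per grid cell, weighted by the length of each ITN object.

    `records` is an iterable of (cell_id, overlap_percent, itn_length_m).
    """
    weighted_sum = defaultdict(float)
    total_length = defaultdict(float)
    for cell, overlap, length in records:
        weighted_sum[cell] += overlap * length
        total_length[cell] += length
    return {
        cell: weighted_sum[cell] / total_length[cell]
        for cell in total_length
        if total_length[cell] > 0
    }

def users_by_cell(nodes):
    """Number of distinct users per grid cell, from (cell_id, user) pairs."""
    users = defaultdict(set)
    for cell, user in nodes:
        users[cell].add(user)
    return {cell: len(names) for cell, names in users.items()}

# Hypothetical input: two road objects in cell A, one in cell B.
records = [("A", 100.0, 300.0), ("A", 50.0, 100.0), ("B", 80.0, 10.0)]
nodes = [("A", "alice"), ("A", "bob"), ("A", "alice"), ("B", "carol")]

overlap_per_cell = weighted_overlap_by_cell(records)
user_count_per_cell = users_by_cell(nodes)
for cell in sorted(overlap_per_cell):
    print(cell, user_count_per_cell.get(cell, 0), overlap_per_cell[cell])
```

Weighting by length means that a long road with a good overlap counts for more than a short stub with a poor one, so a single short object cannot dominate the value of a cell.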
The results show that, above 5 users, there is no clear pattern of improved quality. The graph below provides the details – but the pattern is that the quality, while generally very high, is not dependent on the number of users – so Linus’ Law does not apply to OSM (and probably not to VGI in general).
From looking at OSM data, my hypothesis is that, due to the participation inequality in OSM contribution (some users contribute a lot while others don’t contribute very much), the quality is actually linked to a specific user, and not to the number of users.
Yet, I will qualify the conclusion with the statement that further research is necessary. Firstly, the analysis was carried out in London, so checking what is happening in other parts of the country where different users collected the data is necessary. Secondly, the analysis did not include the interesting range of 1 to 5 users, so it might be the case that there is rapid improvement in quality from 1 to 5 and then it doesn’t matter. Maybe the big change is from 1 to 3? Finally, the analysis focused on positional accuracy, and it is worth exploring the impact of the number of users on completeness.
GISRUK 2010 at UCL – Call for papers
15 October, 2009
Geographical Information Science Research UK (GISRUK) is a research conference that has been taking place in different university campuses around the UK (and once in Ireland) since 1993. Despite the name, it is open not just to researchers from the UK, but also to international participants, who are very welcome.
For me, GISRUK was the first international conference in which I presented a paper eleven years ago, so I have a soft spot for it. It was very friendly and welcoming for a starting research student (which I was at the time). It was especially useful to discover that all the famous academics who attended it were friendly and open to questions.
The conference will be held at UCL in April 2010, and the call for papers is now out, so consider submitting a paper.
The papers are rather short, about 1500 words, so there is plenty of time to write one in time for the deadline of the end of November.
Volunteered Geographical Information Research Network
17 July, 2009
Chris Parker, a PhD student at Loughborough University, organised a dedicated Volunteered Geographical Information research group site on ResearchGate. While I dislike the term – I usually interpret it as the version of ‘volunteered’ as in ‘mum volunteered me to help the old lady cross the street’ – there is no point in trying to change it. When Mike Goodchild coins an acronym, it will stick; it’s sort of a GIScience law!
If you are interested in user-generated geographical content, crowdsourced geographical information, commons-based peer-produced geographical information, or any other name for this phenomenon (for example VGI) – join the group. It will be good to keep in touch, share information and discuss research aspects.
If you are researching in this area you are also welcome to submit a paper to GISRUK 2010, which will be hosted at UCL – we are keen to have a VGI element in the programme, considering that UCL is the host of OpenStreetMap.
In June, Aamer Ather, an M.Eng. student at the department, completed his research comparing OpenStreetMap (OSM) to the Ordnance Survey MasterMap Integrated Transport Network (ITN) layer. This was based on the previous piece of research in which another M.Eng. student, Naureen Zulfiqar, compared OSM to Meridian 2.
There are really surprising results. The analysis shows that when A-roads, B-roads and a motorway from ITN are compared to OSM data, the overlap can reach values that are over 95%. When the comparison with MasterMap was completed, it became clear that OSM is of better quality than Meridian 2. It is also interesting to note that the results of higher overlap with ITN were achieved under stricter criteria for the buffering procedure that is used for comparison.
As noted, in the original analysis, Meridian 2 was used as the reference dataset, the ground truth. However, comparing Meridian 2 and OSM is not like with like, because OSM is not generalised and Meridian 2 is. The justification for treating Meridian 2 as the reference dataset was that the nodes are derived from high-accuracy datasets and it was expected that the 20 metres filter would not change positions significantly. It turns out that the generalisation impacts the quality of Meridian more than I anticipated. Yet, the advantage of Meridian 2 is that it allows comparisons for the whole of England, since the file size is still manageable, while the complexity of ITN would make an extensive comparison difficult and time-consuming.
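To illustrate how a generalisation filter can move a line, here is a minimal sketch using the Douglas-Peucker algorithm with a 20 metre tolerance. The choice of Douglas-Peucker is an assumption made purely for illustration, as a familiar line-simplification technique; it is not a description of how Meridian 2 is actually produced. The coordinates are hypothetical and in metres.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(line, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(line) < 3:
        return list(line)
    start, end = line[0], line[-1]
    # Find the interior vertex furthest from the straight line start-end.
    index, max_dist = 0, 0.0
    for i in range(1, len(line) - 1):
        d = point_segment_distance(line[i], start, end)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [start, end]
    # Keep the furthest vertex and simplify both halves recursively.
    left = douglas_peucker(line[: index + 1], tolerance)
    right = douglas_peucker(line[index:], tolerance)
    return left[:-1] + right

# Hypothetical road with a bend that sits 15 metres off the straight line.
road = [(0, 0), (50, 15), (100, 0)]
generalised = douglas_peucker(road, tolerance=20)
print(generalised)
```

With a 20 metre tolerance the bend disappears, so the generalised line runs up to 15 metres away from where the road really is. A displacement of that size can push an accurately surveyed OSM road outside a narrow buffer around the generalised line.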
For the 4 Ordnance Survey London tiles that we’ve compared, the results put OSM only 10-30% from the ITN centre line. Rather impressive when you consider the knowledge, skills and backgrounds of the participants. My presentation from the State of the Map conference, below, provides more details of this analysis – and the excellent dissertation by Aamer Ather, which is the basis for this analysis, is available to download here.
The one caveat that will need to be explored in future projects is that the comparison in London means that OSM mappers had access to very high-resolution imagery from Yahoo! which has been georeferenced and rectified. Therefore, the high precision might be a result of tracing these images, and the question is what happens in places where high-resolution images are not available. Thus, we need to test more tiles and in other places to validate the results in other areas of the UK.
Another student is currently comparing OSM to a 1:10,000 map of Athens, so by the end of the summer I hope that it will be possible to estimate quality in other countries. The comparison to ITN in other areas of the UK will wait for a future student who will be interested in this topic!
Terra Future 2009 – OpenStreetMap and Ordnance Survey
28 April, 2009
I have checked on Twitter to see how the follow-up meeting to Terra Future 2009, last Friday, went. It was a very pleasant surprise to see that the idea that I put forward in February, that the Ordnance Survey should consider hosting OpenStreetMap and donate some data to it, was voted the best idea that came out of Terra Future 2009. With this sort of peer-review of the idea, and with the added benefit of 2 months for rethinking, I still think that it is quite a good idea.
The most important aspect of this idea is to understand that OpenStreetMap and Ordnance Survey can both thrive in the GeoWeb era. Despite the imaginary competition, each has a clear value to certain parts of the marketplace. There are very clear benefits that the OpenStreetMap community can gain from working closely with the Ordnance Survey – such as some aspects of mapping that the Ordnance Survey are highly knowledgeable about, and vice versa, such as how to innovate in delivery of geographical information. A collaborative model might work after all…
I wonder how this idea will evolve now?




