After the publication of the comparison of OpenStreetMap and Google Map Maker coverage of Haiti, Nicolas Chavent from the Humanitarian OpenStreetMap Team contacted me and turned my attention to the geographical dataset of the UN Stabilization Mission in Haiti (known as MINUSTAH), which is seen as the core set for the post-earthquake humanitarian effort. A comparison with this dataset might therefore be helpful, too. The comparison of the two Volunteered Geographical Information (VGI) datasets of OpenStreetMap and Google Map Maker with this core dataset also exposed an aspect of the usability of geographical information in emergency situations that is worth commenting on.
For the purpose of the comparison, I downloaded two datasets from GeoCommons – the detailed maps of Port-au-Prince and the Haiti road network. Both are reported on GeoCommons as originating from MINUSTAH. I combined them together, and then carried out the comparison. As in the previous case, the comparison focused only on the length of the roads, with the hypothesis that, if there is a significant difference in the length of the road at a given grid square, it is likely that the longer dataset is more complete. The other comparisons between established and VGI datasets give ground to this hypothesis, although caution must be applied when the differences are small. The following maps show the differences between the MINUSTAH dataset and OpenStreetMap and MINUSTAH and Google Map Maker datasets. I have also reproduced the original map that compares OpenStreetMap and Map Maker for the purpose of comparison and consistency, as well as for cartographic quality.
The maps show that MINUSTAH does provide fairly comprehensive coverage across Haiti (as expected) and that the volunteered efforts of OpenStreetMap and Map Maker provide further details in urban areas. There are areas that are only covered by one of the datasets, so they all have value.
The final comparison uses the 3 datasets together, with the same criteria as in the previous map – the dataset with the longest length of roads is the one that is considered the most complete.
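The rule described here can be sketched in a few lines of Python. This is only an illustration of the "longest road length wins" criterion, not the GIS workflow actually used for the maps. It assumes the road lengths have already been summed per grid cell, and the cell ids, the example figures and the `min_margin_m` threshold (added to reflect the caution about small differences) are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical input: for each dataset, total road length in metres per grid cell id.
LengthsByCell = Dict[str, float]


def most_complete_per_cell(
    datasets: Dict[str, LengthsByCell], min_margin_m: float = 0.0
) -> Dict[str, Optional[str]]:
    """Return, per grid cell, the name of the dataset with the longest roads.

    A cell maps to None when the lead over the runner-up is not larger than
    min_margin_m, because small differences say little about completeness.
    """
    cells = set()
    for lengths in datasets.values():
        cells.update(lengths)

    result: Dict[str, Optional[str]] = {}
    for cell in sorted(cells):
        # Rank datasets by road length in this cell; missing cells count as 0 m.
        ranked = sorted(
            ((lengths.get(cell, 0.0), name) for name, lengths in datasets.items()),
            reverse=True,
        )
        best_len, best_name = ranked[0]
        runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
        result[cell] = best_name if best_len - runner_up > min_margin_m else None
    return result


# Made-up figures for three grid cells, purely for illustration.
example = {
    "MINUSTAH": {"A": 1200.0, "B": 300.0},
    "OpenStreetMap": {"A": 900.0, "B": 2500.0, "C": 400.0},
    "Map Maker": {"B": 2450.0},
}
winners = most_complete_per_cell(example, min_margin_m=100.0)
```

With these made-up figures, cell A goes to MINUSTAH, cell C to OpenStreetMap, and cell B is left undecided because the two volunteered datasets are within 100 metres of each other.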
It is interesting to note the south/north divide between OpenStreetMap and Google Map Maker, with Google Map Maker providing more details in the north, and OpenStreetMap in the south (closer to the earthquake epicentre). When compared over the areas in which there is at least 100 metres of coverage of MINUSTAH, OpenStreetMap is, overall, 64.4% complete, while Map Maker is 41.2% complete. Map Maker covers a further 354 square kilometres that are not covered by MINUSTAH or OpenStreetMap, and OpenStreetMap covers a further 1044 square kilometres that are missing from the other datasets, so clearly there is a benefit in integrating them. The grid that includes the analysis of the integrated datasets in shapefile format is available here, in case it is of any use or if you would like to carry out further analysis or visualise it.
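For readers who want to reproduce this kind of figure from the grid, here is one possible reading of the two measures as a Python sketch. The exact formulas behind the percentages are not spelled out above, so the definitions below are assumptions: completeness is taken as candidate road length (capped per cell at the reference length) divided by reference road length, over cells where the reference has at least 100 metres. The per-cell area `cell_area_km2` and all example values are hypothetical.

```python
from typing import Dict, Optional

LengthsByCell = Dict[str, float]  # grid cell id -> road length in metres


def completeness_vs_reference(
    candidate: LengthsByCell,
    reference: LengthsByCell,
    min_reference_m: float = 100.0,
) -> Optional[float]:
    """Percentage of reference road length matched by the candidate.

    Only cells where the reference has at least min_reference_m are used.
    ASSUMPTION: candidate length is capped per cell at the reference length,
    so the result never exceeds 100.
    """
    cells = [c for c, length in reference.items() if length >= min_reference_m]
    ref_total = sum(reference[c] for c in cells)
    if ref_total == 0:
        return None  # no qualifying reference coverage to compare against
    matched = sum(min(candidate.get(c, 0.0), reference[c]) for c in cells)
    return 100.0 * matched / ref_total


def exclusive_area_km2(
    name: str, datasets: Dict[str, LengthsByCell], cell_area_km2: float = 1.0
) -> float:
    """Area of cells where only the named dataset has any road length."""
    others = [lengths for n, lengths in datasets.items() if n != name]
    count = sum(
        1
        for cell, length in datasets[name].items()
        if length > 0 and all(o.get(cell, 0.0) == 0 for o in others)
    )
    return count * cell_area_km2


# Made-up figures, purely for illustration.
reference = {"A": 1000.0, "B": 50.0, "C": 500.0}
candidate = {"A": 600.0, "B": 900.0, "C": 700.0, "D": 300.0}
pct = completeness_vs_reference(candidate, reference)
only_candidate = exclusive_area_km2(
    "candidate", {"reference": reference, "candidate": candidate}
)
```

Cell B is ignored in the completeness figure because the reference has less than 100 metres there, and cell D counts as exclusive coverage because only the candidate has roads in it.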
While working on this comparison, it was interesting to explore the data fields in the MINUSTAH dataset, with some of them included to provide operational information, such as road condition, the length of time that it takes to travel along a road, etc. These are the hallmarks of practical and operational geographical information, with details that are relevant directly to the end-users in their daily tasks. The other two datasets have been standardised for universal coverage and delivery, and this is apparent in their internal data structure. The Google Map Maker schema is closer to traditional geographical information products in field names and semantics, exposing the internal engineering of the system – for example, including a country code, which is clearly meaningless in a case where you are downloading one country! OpenStreetMap (as provided by either CloudMade or GeoFabrik) keeps with the simplicity mantra and is fairly basic. Yet, the schema is the same in Haiti as in England or any other place. So just like Google, it takes a system view of the data and its delivery.
This means that, from an end-user perspective, while these VGI data sources were produced in a radically different way to traditional GI products, their delivery is similar to the way in which traditional products were delivered, burdening the user with the need to understand the semantics of the different fields before using the data.
In emergency situations, this is likely to present an additional hurdle for the use of any data, as it is not enough to provide the data for download through GeoCommons, GeoFabrik or Google – it is how it is going to be used that matters. Notice that the maps tell a story in which an end-user who wants to have full coverage of Haiti has to combine three datasets, so the semantic interpretation can be an issue for such a user.
So what should a user-centred design of GI for an emergency situation look like? The general answer is ‘find the core dataset that is used by the first responders, and adapt your data to this standard’. In the case of Haiti, I would suggest that the MINUSTAH dataset is a template for such a thing. Users of GI on the ground are more likely to have been exposed to the core dataset already and to be familiar with it. The fields are relevant and operational and show that this is more ‘user-centred’ than the other two. Therefore, it would be beneficial for VGI providers who want to help in an emergency situation to ensure that their data comply to the local de facto standard, which is the dataset being used on the ground, and bring their schema to fit it.
Of course, this is what GI ontologies are for, to allow for semantic interoperability. The issue with them is that they add at least two steps – define the ontology and figure out the process to translate the dataset that you have acquired to the required format. Therefore, this is something that should be done by data providers, not by end-users when they are dealing with the real situation on the ground. They have more important things to do than to find a knowledge engineer that can understand semantic interoperability…
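To make the translation step concrete, here is a minimal Python sketch of a provider-side mapping from OpenStreetMap tags to a responder-style record. The `highway` and `surface` keys and the values used are ordinary OSM tags. The target field names `KIND` and `PAVED` are borrowed from a reader comment further down, but the meaning assigned to the codes, the `Y`/`N` values and the `NAME` field are invented for illustration and are not the real MINUSTAH schema.

```python
from typing import Dict, Optional

# ASSUMPTION: hypothetical correspondence between OSM highway values and
# target KIND codes; the real meaning of the codes is not documented here.
HIGHWAY_TO_KIND = {"primary": "P", "secondary": "S", "tertiary": "T"}

# Common OSM surface values, grouped for a hypothetical yes/no PAVED field.
PAVED_SURFACES = {"paved", "asphalt", "concrete"}
UNPAVED_SURFACES = {"unpaved", "gravel", "dirt", "ground"}


def osm_to_target(tags: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Translate one OSM way's tags into a record in the target schema.

    Anything that cannot be mapped is left as None rather than guessed,
    so the end-user can see which attributes are genuinely unknown.
    """
    surface = tags.get("surface")
    if surface in PAVED_SURFACES:
        paved: Optional[str] = "Y"
    elif surface in UNPAVED_SURFACES:
        paved = "N"
    else:
        paved = None
    return {
        "NAME": tags.get("name"),  # hypothetical target field
        "KIND": HIGHWAY_TO_KIND.get(tags.get("highway", "")),
        "PAVED": paved,
    }


# A road traced from imagery often carries very little information.
traced = osm_to_target({"highway": "unclassified"})
surveyed = osm_to_target(
    {"highway": "primary", "surface": "asphalt", "name": "Route Nationale 1"}
)
```

The point of doing this once, on the provider side, is that the person on the ground receives records whose fields they already know, with unknown values left explicitly empty.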
12 thoughts on “Haiti – further comparisons and the usability of geographic information in emergency situations”
Very interesting analysis. If the value of the MINUSTAH data set is to “provide operational information, such as road condition, length of time that it takes to travel through it, etc”, then that is something we can never replicate in OSM (or Map Maker) by tracing satellite imagery. In fact, we often cannot even classify roads according to OSM’s own schema – the majority seem to be “unclassified”, due to the difficulty in discerning detailed information from imagery. This kind of detail requires ground surveys.
However, providing even a minimal basemap for areas with poor MINUSTAH coverage is surely better than nothing. If our basemaps are useful enough, perhaps aid agencies would consider feeding back operational information (e.g., via timestamped GPS traces) in future crises in order quickly to share information with other agencies.
Very good point – although I was focusing not on what volunteers can collect, but on the schema in which the information is provided to end-users. The point that you are raising is valid, and very interesting – maybe it is possible through interviews and discussions with delivery agencies to understand their requirements, and make the volunteering effort more targeted towards these needs.
“end-user who wants to have full coverage of Haiti has to combine three datasets” and violate the tos from google.
This is a red herring and beside the point. The terms and conditions of OpenStreetMap or MINUSTAH are just as much an issue as Google’s. In any case, for the discussion here imagine a user working in an organisation that has agreements with Google, MINUSTAH and OSMF to use their data.
“Therefore, it would be beneficial for VGI providers who want to help in an emergency situation to ensure that their data comply to the local de facto standard, which is the dataset being used on the ground, and bring their schema to fit it.”
Please forgive me, but I think you have missed the entire idea of “VGI”, as you call it.
Please don’t ever expect the people volunteering the geographic information in the first place to conform to your schema. It’s rather a bit like being offered a gift and telling the gift giver that it’s rather nice but could they please take it back and return it, and get you one in a different color.
What this situation calls for is an intermediate layer of volunteers who are knowledgeable about both the “VGI” and the ontology desired by first responders, and have the capacity and the willingness to make the two interoperate. As it happens, OSM community members have been providing these services – and not necessarily those same people who are actually creating the database.
I agree with your final point – which is what I was aiming at, and disagree about the first.
There are already intermediaries, such as Cloudmade and GeoFabrik as they provide the data in a format that is more useful to many end users. I was not assuming that they are the same people as those who create the data.
Regarding the free gift metaphor – unfortunately, there are too many cases where someone is volunteering things that the other side doesn’t need. There were reports about sending Christmas toys to the victims of the tsunami in 2004 and many other examples throughout the history of disaster relief of doing things that are not helpful. This is not about a gift of a different colour, but giving a gift that is useless. As our understanding of crisis needs develops, there will be aspects that are more valuable to the first responders and I hope that the VGI communities (OSM, Map Maker and others) will start responding to those, instead of making up their own minds with naive assumptions about what is needed. It is good to see that such change is already happening.
What sense would it make to gather all that geodata if it is not usable by end users? And why shouldn’t data usability be a concern of OSM? IMO one big problem of OSM is that the data is tremendously difficult to access and to integrate with other geodata or GIS. I assume there is a big difference compared to Wikipedia: users of OSM have much more interest in getting the basic geodata to produce their own map products.
Yes, providing Shapefiles is a good idea – unfortunately they are pretty erroneous and insufficient (at least currently).
What I’ve read from the past GISCorps reports: in such disaster situations there might be a lot of data sources for creating maps (just as http://wiki.openstreetmap.org/wiki/WikiProject_Haiti/Imagery_and_data_sources) – but the problem is 1) lack of metadata and 2) data usability.
I’m not sure what you mean by data usability here … is it that you can’t find the resource?
On metadata, it’s something we’re addressing right now in H.O.T, so that OSM tags are linked to schemas used by responding agencies.
In short, we’re pretty concerned about getting our data used. The simplicity of the data format is one reason why you see so many examples of OSM data out in the wilds. I understand that responders may not have the same capacity, which is why we’ve identified outreach as so important.
@Mikel: The OSM data is simple to read – yes. But by usability I don’t mean simplicity. I meant all the things starting from “getting the data I want” up to “creating the desired end product” (like a map) or “finding an answer to a question” (for example by using GIS methods).
Thus as a user of OSM data I’ll be faced with various problems like: How to integrate the data into my GIS or map construction infrastructure? Are there any inconsistencies (logical, topological ..fun with multipolygon relations :))? And metadata: What is the projection/reference system of the data? How accurate and up to date is it? etc.
Muki, thanks for this analysis. There’s immediate use, OSM editors are using MINUSTAH data as another source in building the base map.
Your comments about semantic interoperability and direct contact with responders very nicely reinforce our own thoughts in Humanitarian OpenStreetMap Team. Nicolas is working hard on the semantic problem, by defining supersets of useful schemas from all agencies, and mapping those to tags in OSM. One of the strongest features of OSM is the open ended tagging system in order to incorporate new kinds of properties. We’re also working to get direct representation on the ground in Haiti, so we can most clearly establish a channel to find out what is working and what is needed.
One point I take issue with is that MINUSTAH was the core data set for first responders. Due to its immediate availability, in many simple formats, OSM more or less served this role in the immediate response. Search and Rescue teams were using OSM Garmin maps, Ushahidi is backed by OSM, and all the UN agency map projects incorporated the OSM road network. MINUSTAH data was released in the subsequent and startling openness among agencies and data holders. Hopefully this culture will continue into the recovery and reconstruction phases. And in future emergencies, hopefully this kind of response will become the default, so OSM/HOT can most quickly tailor the response to real needs.
I appreciate these analyses of OSM coverage, both of Haiti and the UK. They always demand serious introspection of both how, and what, I map.
While I feel many of your points are undoubtedly valid for various data associated with the Haiti ‘quake, roads seem to me to be a poor example.
It is true that the MINUSTAH data set has some 15 or so attributes which allow a much more nuanced description of roads than the ‘one-size-fits-all’ of OSM. However, many features in the MINUSTAH dataset on geocommons (22252) are poorly attributed. The best populated is KIND, with three values ‘P’,’S’ and ‘T’. And even then ‘T’ represents 79% of values, whilst no values are present for PAVED in 2563 of the 2702 features.
These attributes could, IF populated, offer more value than OSM’s free-form tags: in practice they weren’t. In this respect the MINUSTAH data is mainly distinguished from the VGI providers by its extent and thoroughness of coverage. Semantic interoperability was not an issue simply because there was hardly any information associated with the road data, whatever its source.
In other areas, such as damage assessment and spontaneous camps, the initial ad hoc tagging became more formal and closer to existing ontologies. So I think the VGI efforts may be more adaptable than you suggest.
Three other points. Firstly, the nature of OSM data is that it is difficult to maintain semantic interoperability between two users mapping in the same area (tagging wars), although, in my experience, this rarely creates real problems in interpreting the data. Secondly, a number of OSM outputs, street names, files for garmin GPS and paper maps, fall into categories where many end-users have clear models for semantic interpretation. Lastly, I am sceptical that we will ever avoid the issue where “the end-user who wants to have full coverage of … has to combine n datasets”, particularly in the context of a disaster where response speed is so important. There is an interesting trade-off between the speed with which a VGI such as OSM can respond to this sort of situation, and maximising interoperability. For OSM I anticipate that this will mean new components in the toolchain for pushing data out to end users.