Data and the City workshop (day 2)

The second day of the Data and the City workshop (here are the notes from day 1) started with the session Data Models and the City.

Pouria Amirian started with Service Oriented Design and Polyglot Binding for Efficient Sharing and Analysing of Data in Cities. The starting point is that the management of the city needs data, and therefore technologies to handle data are necessary. In the traditional pipeline, we start from sources, then use tools to move the data to a data warehouse, and then do the analytics. The problems with the traditional approach are the size of the data (the management of the data warehouse is very difficult), the need to deal with real-time data that must be answered very fast, and finally new data types – from sensors, social media and cloud-born data that is created outside the organisation. Therefore, it is imperative to stop moving data around and analyse it where it is. Big Data technologies aim to resolve these issues – e.g. from the development of the Google distributed file system that led to Hadoop to similar technologies. Big Data relates to the technologies that are being used to manage and analyse it. The stack for managing big data now includes over 40 projects to support different aspects of governance, data management, analysis etc. Data Science includes many areas: statistics, machine learning, visualisation and so on – and no one expert can know all these areas (such experts exist as much as unicorns exist). Interaction between data science researchers and domain experts is necessary for ensuring reasonable analysis. In the city context, these technologies can be used for different purposes – for example deciding on the allocation of bikes in the city using real-time information that includes social media (Barcelona). We can think of data scientists as active actors, but there are also opportunities for citizen data scientists using tools and technologies to perform the analysis. Citizen data scientists need data and tools – such as a visual analysis language (AzureML) that allows them to create models graphically and set a process in motion.
Access to data requires making it easy to find and retrieve – interoperability is important. Service oriented architecture (which uses web services) is an enabling technology for this, and the current Open Geospatial Consortium (OGC) standards require some further development and changes to make them relevant to this environment. Different services can be provided to different users with different needs [comment: but that increases maintenance and complexity]. No single stack provides for all the needs.

Next Mike Batty talked about Data about Cities: Redefining Big, Recasting Small (his paper is available here) – exploring how Big Data was always there: locations can be seen as bundles of interactions – flows in systems. However, visualisation of flows is very difficult, which makes it challenging to understand the results and check them. The core issue is that for N locations there are N^2 interactions, and this quadratic growth as N grows is a continuing challenge in understanding and managing cities. In 1964, Brian Berry suggested a system of location, attributes and time – but the temporal dimension was suppressed for a long time. With Big Data, the temporal dimension is becoming very important. An example of how difficult it is to understand data is travel flows – the more regions are included, the bigger the interaction matrix, but it is then difficult to show and make sense of all these interactions. Even trying to create scatter plots is complex and does not help to reveal much.
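To make the N^2 point concrete, here is a minimal sketch (not taken from the talk) of how a list of trips turns into an origin-destination matrix, and how quickly the number of cells grows with the number of zones. The zone names, the trips and both function names are made up for illustration.

```python
def interaction_matrix_size(n_locations: int) -> int:
    """Number of origin-destination cells for N locations (self-flows included)."""
    return n_locations * n_locations


def build_flow_matrix(locations, trips):
    """Aggregate (origin, destination) trips into a dense N x N matrix of counts."""
    index = {name: i for i, name in enumerate(locations)}
    n = len(locations)
    matrix = [[0] * n for _ in range(n)]
    for origin, destination in trips:
        matrix[index[origin]][index[destination]] += 1
    return matrix


# Hypothetical zones and trips, only to show the shape of the data.
locations = ["A", "B", "C"]
trips = [("A", "B"), ("A", "B"), ("B", "C"), ("C", "A")]
flows = build_flow_matrix(locations, trips)

# Cells to inspect for 10, 100 and 1000 zones.
sizes = {n: interaction_matrix_size(n) for n in (10, 100, 1000)}
```

With three zones the matrix is easy to read; with a thousand zones there are a million cells, which is where the visualisation problem described above begins.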

The final talk was from Jo Walsh, titled Putting Out Data Fires; life with the OpenStreetMap Data Working Group (DWG). Jo noted that she’s talking from the position of a volunteer in OSM, and recalled that 10 years ago she gave a talk about technological determinism, but not a completely utopian picture of cities, in which OpenStreetMap (OSM) was considered as part of the picture. Now, in order to review the current state of OSM activities relevant to her talk, she asked on the OSM mailing list for examples. She also highlighted that OSM is big, but it’s not Big Data: it can still fit into one PostgreSQL installation. There is no anonymity in the system – you can find out quite a lot about people from their activity, and that is built into the system. There are all sorts of projects that demonstrate how OSM data is relevant to cities – such as OSM Buildings, which creates 3D buildings from the database, or using OSM with 3D modelling data such as a DTM. OSM provides support for editing in the browser or with an offline editor (JOSM). Importantly, OSM is not only a map but also a database (like the new OSi database) – as can be shown by running searches on the database from a web interface. There are unexpected projects, such as custom clothing from maps, or Dressmap. More serious surprises are projects like the Humanitarian OSM Team and the Missing Maps project – there are issues with the quality of the data, but also with the fact that mapping is imposed from the outside on an area that is not mapped, with some elements of colonial thinking in it (see Gwilym Eddes critique). The InaSAFE project is an example of disaster modelling with OSM. In Poland, they extend the model to mark details of road areas and other details. All these demonstrate that OSM is getting close to the next level of using geographic information, and there are current experiments with it. Projects such as UTC of Mappa Marcia are linking OSM to transport simulations.
Another activity is the use of historical maps – townland.ie.
One of the roles that Jo plays in OSM is as part of the Data Working Group, which she joined following a discussion about diversity within the OSM community. The DWG needs some help, and its role is geodata thought police/janitorial judicial service/social work arm of the volunteer fire force. The DWG cleans up messy imports and deals with vandalism, but also with dispute resolution. It is similar to a volunteer fire service: when something happens, you can see the sys admins spark into action to deal with an emerging issue. For example, someone from Uzbekistan says that they found corruption in some new information, so you need to find the changeset and ask people to annotate more, to say what they are changing and why. OSM is self-policing and self-regulating – but different people have different ideas about what they are doing, and different groups have their own view of what they want to do. There are also clashes between armchair mappers and surveying mappers – a discussion between someone who is editing remotely and a local person who knows the road and asks to change the classification. The DWG doesn’t have a legal basis, and some issues come up because of the global cases – for example, translated names that do not reflect local practices. There are tensions between commercial actors that work on OSM and ordinary volunteer mappers. OSM doesn’t have privileges over other users – so the DWG is recognised by the community and gathers authority through consensus.

The discussion that followed this session explored examples from OSM: there are conflicted areas such as Crimea and other contested territories. Pouria explained that in the current models of distributed computing there are data nodes, which keep the data static, and the code is transferred instead of the data. There is a growing bottleneck in network latency due to the amount of data. There is a hierarchy of packaging systems that you need to use in order to work with a distributed web system, so tightening up code is an issue.
Rob – there are limits to Big Data, such as hardware and software, as well as the analytics of information. There are also limits to how far you can foster community when the size is very large and the organisation is managed by volunteers. Mike – the quality problems of big data are rather different from those of traditional data, so while things are automated, making sense of it is difficult – e.g. tap in but without tap out in the Oyster data. The bigger the dataset, the bigger the issues with it might be. The level of knowledge that we get is heterogeneous in time and transfers the focus to the routine. But evidence is important to policy making and making cases. Martijn – how to move the technical systems to allow the move to focal community practice? Mike – transport modelling is based on the promotion of digital technology use by the funders, and it can be done for a specific place, and the question is who are the users? There is no clear view of who they are and there is wide variety, with different users playing different roles. ‘Policy analysts’ are the first users of models – they are domain experts who advise policy people. There is less thinking of informed citizens. On how people react to big infrastructure projects – the articulation of the policy is different from what is coming out of the models. There are projects that got an open mandate and ones that got a closed mandate. Jo – OSM has a tradition of mapping parties that bring people together, but that needs a critical mass already there – so the question is how to bootstrap this process, such as how to support a single mapper in Houston, Texas. There are cases of companies using the data while local people used historical information, which created conflict in the way that people use them. There are cases in which the tension goes very high, but it does need negotiation. Rob – issues about data citizens and digital citizenship concepts.
Jo – in terms of community governance, the OSM Foundation is very hands off, and there isn’t a detailed process for dealing with corporate employees who are mapping as part of their job. Evelyn – the conventions are matters of dispute and negotiation between participants. The conventions are being challenged all the time. One of the challenges of dealing with citizenship is to challenge the boundaries and protocols that go beyond the state. Retain the term to separate it from the subject.
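Pouria's point in the discussion, that the data stays on the data nodes while the code travels to it, can be illustrated with a toy sketch. This is only an in-process simulation under my own assumptions: `DataNode`, `count_and_sum` and `combine` are hypothetical names, and a real system such as Hadoop does this across machines with far more machinery.

```python
from typing import Callable, Dict, List


class DataNode:
    """Hypothetical node holding one partition of the records; they never leave it."""

    def __init__(self, name: str, records: List[float]) -> None:
        self.name = name
        self._records = records

    def run(self, task: Callable[[List[float]], Dict[str, float]]) -> Dict[str, float]:
        # The task (code) is sent to the node; only its small result comes back.
        return task(self._records)


def count_and_sum(records: List[float]) -> Dict[str, float]:
    """Work done locally on each node: reduce the partition to two numbers."""
    return {"count": len(records), "total": sum(records)}


def combine(partials: List[Dict[str, float]]) -> Dict[str, float]:
    """Merge the per-node summaries into a global count, total and mean."""
    count = sum(p["count"] for p in partials)
    total = sum(p["total"] for p in partials)
    mean = total / count if count else 0.0
    return {"count": count, "total": total, "mean": mean}


# Two made-up partitions, e.g. sensor readings stored where they were collected.
nodes = [DataNode("node-1", [12.0, 15.0, 9.0]), DataNode("node-2", [20.0, 4.0])]
result = combine([node.run(count_and_sum) for node in nodes])
```

However large each partition is, only two numbers per node cross the network, which is the reason for moving the code rather than the data.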

The last session in the workshop focused on Data Issues: surveillance and crime 

David Wood talked about Smart City, Surveillance City: human flourishing in a data-driven urban world. The talk considers the smart city as an archetype of the surveillance society. Because it is part of the surveillance society, one way to deal with it is to consider resistance and abolishing it to allow human flourishing. His interest is in rights – beyond privacy. What is it that we really want for human beings in this data-driven environment? We want all to flourish, and that means starting from the most marginalised, at the bottom of the social order. The idea of flourishing comes from Spinoza and also from Luciano Floridi – his anti-entropic information principle. Starting with smart cities – business and government are dependent on large quantities of data, and this increases surveillance. Social science ignores that these technologies provide the ground for social life. The smart city concept includes multiple visions, for example, a European vision that is about government first – how to make good government in cities, with technology as part of a wider whole. The US approach asks how we can use information management for complex urban systems. This relies on other technologies – pervasive computing, IoT and things that are woven into the fabric of life. The third vision is the Smart Security vision – technology used in order to control urban terrain, with military techniques used in cities (and also in war zones), for example biometric systems for refugees in Afghanistan, which are for both control and provision of services. The history goes back to cybernetics and policing initiatives from the colonial era. The visions overlap – security is not overtly about it (apart from military actors). Smart cities are inevitably surveillance cities – a collection of data for purposeful control of the population.
Specific concerns of researchers are the targeting of people that fit a certain profile, and the aggregation of private data for profit at the expense of those that are involved. The critique of surveillance is about the issue of sorting, unfair treatment of people etc. Beyond that – as discussed in the special issue on surveillance and empowerment – there are positive potentials. Many of these systems have a role for the common good. We need to think about the city within neoliberal capitalism, which separates people in space along specific lines and areas, from borders to buildings. There is an attempt to make the city into a tamed zone – but the dangerous parts of city life are also a source of opportunities and creativity. The smart city fits well with this aspect – stopping the city from being disorderly. A paper from 1995 critiques pervasive computing as surveillance: the more it reduces the distance between us and things, the more the world becomes a surveillance device and stops us from acting on it politically. In many of the visions of pervasive computing, the human is actually marginalised. This is still the case. There are opportunities for social empowerment, say to allow the elderly to move to areas that they stopped exploring, or to overcome disability. Participation, however, is flawed – who can participate in what, where and how? An additional question is that participation in highly technical areas is limited to a very small group, and participation can also become instrumental – ‘sensors on legs’. The smart city could enable us to discover the beach under the pavement (a concept from the Situationists) – and some are being hardened. The problem is corporate ‘walled garden’ systems, and we need to remember that we might need to bring them down.

Next Francisco Klauser talked about Michel Foucault and the smart city: power dynamics inherent in contemporary governing through code. He is interested in the power dynamics of governing through data, taking from Foucault the concept of understanding how we can explain power put into action. He is also thinking about different modes of power: Referentiality – how does security relate to governing? Normativity – what is the norm and where does it come from? Spatiality – how are discipline and security spread across space? Discipline is how to impose a model of behaviour on others (panopticon). Security works in another way – it frees things up within limits. So the two modes work together. Power starts from the study of a given reality. Data is about the management of flows. The specific relevance to data in cities is shown by looking at refrigerated warehouses that are used within the framework of the smart grid to balance energy consumption – storing and releasing energy that is preserved in them. The whole warehouse has been objectified and quantified – down to specific products and the opening and closing of doors. He sees the core of the control in connections, processes and flows. Think of liquid surveillance – beyond the human.

Finally, Teresa Scassa explored Crime Data and Analytics: Accounting for Crime in the City. Crime data is used in planning, allocation of resources, public policy making – a broad range of uses. It is part of oppositional social justice narratives, and it is an artefact of the interaction of citizen and state, as understood and recorded by the agents of the state operating within particular institutional cultures. She looked at crime statistics that are provided to the public as open data – derived from police files under some guidelines – and also emergency call data, which is made from calls to the police and used to provide crime maps. The data that is used in visualisations about the city is not the same data that is used for official crime statistics. There are limits to the data – institutional factors: it measures the performance of the police, not crime. It’s how police are doing their job – and there are lots of acts of ‘massaging’ the data by those that are observed. The stats are manipulated to produce the results that are requested. The police are the sensors, and there is under-reporting of crime according to the opinion of the police officer – e.g. sexual assault – and also the privatisation of policing, with private actors who don’t report. Crime maps are offered by private sector companies that sell analytics and then provide a public-facing option – the narrative is controlled – what will be shared and how. Crime maps are declared as ‘public awareness or civic engagement’ but not transparency or accountability. The focus is on property offences and not white-collar ones. There are ‘alternalytics’ – using other sources, such as victimisation surveys, legislation, data from hospitals, sexual assault crisis centres, and crowdsourcing. An example of bottom-up reporting is HarassMap, for reporting cases, which started in Egypt. Legal questions include how the relationship between private and public sector data affects ownership, access and control. Another one is how the state structure affects data comparability and interoperability.
Also there is a question about how law prescribes and limits what data points can be collected or reported.

The session closed with a discussion that explored some examples of solutionism, like crowdsourcing that asks the most vulnerable people in society to contribute data about assaults against them, which is highly problematic. Crime data is popular in portals such as the London one, but it is mixed with multiple concerns such as property prices. David – the utopian concept of platform independence, and the assumption that platforms are without values, is inherently wrong.

The workshop closed with a discussion of the main ideas that emerged from it and lessons. How are all these things playing out? Some questions that started emerging are about how crowdsourcing can be bottom-up (OSM) and sometimes top-down, with issues about data cultures in Citizen Science, for example. There are questions about the degree to which the political aspects of citizenship and subjectivity are playing out in citizen science. Re-engineering information in new ways and the rural/urban divide are issues that bodies such as Ordnance Survey need to face; the conflicts within data are an interesting piece, as is ensuring that the data is useful. ‘Sensors on legs’ is a concept that can be relevant to bodies such as Ordnance Survey. The concept of the stack is also relevant to where we position our research and what different researchers do: from the technical aspects to how people engage, and the workshop gave a slice through these layers. An issue that was left outside is the business aspect – who will use it, how it is paid for. We need public libraries with the information, but also the skills to do things with these data. The data economy is important and some data will only be produced by the state, but there are issues with the data practices within the data agencies of the state – and it is not ready to get out. If data is garbage, you can’t do much with it – there is no economy that can be based on it. Open questions are: when does data produce software? When does it fail? Can we produce data with and without connection to software? There are also the physical presence and the environmental impacts. Citizen engagement about infrastructure is lacking, and we need to tease out how things can open up for people to get involved. There was also a need to be nuanced about the city in the same way that we focused on data.
Try to think about the way the city is framed: as a site for activities, subjectivity, practices; city as a source of data – mined; city as political jurisdiction; city as aspiration – the city of tomorrow; city as concentration of flows; city as a social-cultural system; city as a scale for analysis/laboratory. The title is Data and the City – is it for a city? Back to environmental issues – data is not ephemeral and does have tangible impacts (e.g. energy use in blockchain, inefficient algorithms, electronic waste (WEEE) that is left in the city). There are also issues of access and control – huge volumes of data. These issues are covered in papers such as Device Democracy. There are wider issues that make the link between technology and wider systems of thought and considerations.

Esri survey123 tool – rapid prototyping geographical citizen science tool

There are several applications that allow creating forms rapidly – such as Open Data Kit (ODK) or EpiCollect. Now, there is another offering from Esri, in the form of Survey123 app – which is explained in the video below.

Survey123 is integrated into ArcGIS Online, so you need an ArcGIS account to use it (you can have a short experiment if you register for a trial account, but for a longer project you’ll have to pay). The forms are configured in XForms, like ODK. The forms can be designed in Excel fairly quickly, and the desktop connection package makes it easy to link to the Survey123 site, as well as to test forms. I tried creating a form for local data collection, including recording a location and taking an image with the phone. It was fairly easy to create forms with textual, numerical, image and location information, and the software also supports attaching images to items in the form, so they can be illustrated visually. The desktop connector application also allows you to render the forms, so they can be tested before they are uploaded to ArcGIS Online. Then it is possible to distribute the forms to mobile devices and use them to collect the information.
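As a rough illustration of what such a spreadsheet form looks like, the sketch below writes the rows of a survey worksheet in the XLSForm convention used by ODK-style tools (one row per question, with `type`, `name` and `label` columns). The question names and labels are made up, not the form I built, and I am assuming that Survey123 accepts the same basic question types (`text`, `integer`, `image`, `geopoint`); check the product documentation before relying on it.

```python
import csv
import io

# Hypothetical questions covering the four kinds of information mentioned above:
# textual, numerical, image and location.
SURVEY_ROWS = [
    {"type": "text", "name": "observer", "label": "Your name"},
    {"type": "integer", "name": "bench_count", "label": "Number of benches"},
    {"type": "image", "name": "photo", "label": "Take a photo"},
    {"type": "geopoint", "name": "location", "label": "Where are you?"},
]


def survey_sheet_csv(rows):
    """Render the question rows as CSV text that can be pasted into the survey worksheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["type", "name", "label"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sheet = survey_sheet_csv(SURVEY_ROWS)
print(sheet)
```

The point is only that a form is a short table: adding a question means adding a row, which is why prototyping is quick.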

The app works well offline, and it is possible to collect multiple forms and then upload them all together. While the application still shows rough edges in terms of interaction design, meaningful messages and bug clearing, it can be useful for developing prototypes and forms when the geographic aspect of the data collection is central. For example, during data collection the application supports both capturing the location from GPS and pointing on a map to the location where the data was collected. When you are offline you can only use GPS, as for now it doesn’t let you cache a map of a study area.

As might be expected, the advantage of Survey123 comes once you’ve got the information and want to analyse it, since ArcGIS Online provides the tools for detailed GIS analysis, or you can link to it from a desktop GIS and analyse and visualise the information.

Luckily for us, Esri is a partner of the Extreme Citizen Science group and UCL also holds an institutional licence for ArcGIS Online, so we have access to these tools. However, through the Esri conservation programme, others can also apply for access to ArcGIS Online and use this tool.

Call for papers – special issue of the Cartographic Journal on Participatory GIS

Call for papers for a special issue of The Cartographic Journal on past, present and future of
Participatory GIS and Public Participation GIS.

In the 1990s, participatory GIS (PGIS) and Public Participation GIS (PPGIS) emerged as an approach and tool to make geospatial technologies more relevant and accessible to marginalized groups. The goal has been to integrate the qualitative and experiential knowledge of local communities and individuals, thereby empowering local peoples and non-profit organizations to participate in political decision-making. By enabling the participation of local people from different walks of life, P/PGIS has provided a platform where these people can share their viewpoints and create maps depicting alternative views of the same problem, but from a local perspective.

Over the years, numerous applications integrating GIS and social and spatial knowledge of local groups have been developed. P/PGIS appears well articulated as a technique. With the growth of Information and Communication Technologies (ICT), from an epistemological viewpoint the relationship of P/PGIS constructs (society, technology and institutions) and the use of components (access, power relations, diverse knowledge) in P/PGIS necessitates an exploration of what P/PGIS means in the 21st century.

A related field, Citizen Science, a.k.a. public participation in scientific research, is a research technique that allows participation of the public in the discovery of new scientific knowledge through data collection, analysis, or reporting. This approach can be viewed as somewhat similar in its implementation to P/PGIS, which broadens the scope of data collection and enables information sharing among stakeholders in specific policies to solve a problem. The success of all three concepts, citizen science, PGIS and PPGIS, is influenced by the Geoweb – an integration of Information and Communication Technologies (ICT) (e.g., social networking sites) and geospatial technologies (e.g., virtual globes like Google Earth, free and open source GIS like QGIS and location enabled devices like the iPhone) – that provides a platform for non-experts to participate in the creation and sharing of geospatial information without the aid of geospatial professionals.

Following a successful session in the AAG 2015 Annual Meeting, this call is for papers that will appear in a special issue of ‘The Cartographic Journal’ (http://www.maneyonline.com/loi/caj). We are calling for reflections on PPGIS/PGIS and citizen science that address some of the questions that are listed below.

  1. What social theories form the basis for the current implementation of P/PGIS? Have these theories changed? What remains persistent and intractable?
  2. What role do spatial theories, such as Tobler’s law of spatial relations or issues of spatial data accuracy, have in P/PGIS, Citizen Science or crowdsourcing?
  3. Since Schlossberg and Shuford, have we gotten better at understanding who the public is in PPGIS and what their role is in a successful deployment of PGIS?
  4. Which new knowledge should be included in data collection, mapping and decision-making and knowledge production? To what extent are rural, developing country, or marginalized communities really involved in the counter-mapping process? Are they represented when this action is undertaken by volunteers?
  5. What role do new ICTs and the emergence of crowdsourcing play in the inclusion of indigenous and local knowledge? Do new tech and concepts hinder the participatory process or enable empowerment of local communities? Do we have new insights on what could be considered technological determinism?
  6. Do we need to revisit P/PGIS in light of any of these shifts? How often do P/PGIS projects need to be revisited to address the dynamic nature of society and political factors and to allow future growth?
  7. How effective have P/PGIS and Citizen Science been in addressing issues of environmental and social justice and resource allocation, especially, from a policy-making perspective?
  8. Are we any better at measuring the success of P/PGIS and/or Citizen Science? Should there be policies to monitor citizen scientists’ participation in Geoweb? If so, for what purpose?
  9. What should be the role of privacy in P/PGIS, for example, when it influences the accuracy of the data and subsequent usability of final products? How have our notions of needed literacy (e.g., GIS) and skills shifted with the emergence of new technologies?
  10. How has the concept of the digital divide been impacted by the emergence of the Geoweb, crowdsourcing and/or neogeography?
  11. What is the range of participatory practices in Citizen Science and what are the values and theories that they encapsulate?
  12. What are the different applications of Citizen Science from policy and scientific research perspective?
  13. To what extent does the spatial distribution of citizens influence their participation in decision-making processes and in resolving scientific problems?
  14. How have our notions of needed literacy (e.g., GIS) and skills shifted with the emergence of new technologies?

Editors: Muki Haklay (m.haklay@ucl.ac.uk), University College London, UK; Renee Sieber (renee.sieber@mcgill.ca), McGill University; Rina Ghose (rghose@uwm.edu), University of Wisconsin – Milwaukee; Bandana Kar (bandana.kar@usm.edu), University of Southern Mississippi – Hattiesburg. Please use this link to send queries about the special issues, or contact one of the editors.

Submission Deadlines
Abstract – a 250 word abstract along with the title of the paper, name(s) of authors and their affiliations must be submitted by 15th August 2015 to Muki Haklay (use the links above). The editorial team will decide whether the paper is suitable for the special issue by 1st September.
Paper – The final paper, prepared following the guidelines of The Cartographic Journal, must be submitted by 30th October 2015.
Our aim is that the final issue will be published in early 2016.

COST Energic Summer School on VGI and Citizen Science in Malta

Vyron Antoniou covering VGI foundations

COST Energic organised a second summer school dedicated to Volunteered Geographic Information (VGI) and citizen science. This time, the school was run by the Institute for Climate Change & Sustainable Development of the University of Malta, with almost 40 participants from across Europe and beyond (Brazil, New Zealand), and, of course, participants from Malta. Most of the students are at an early stage of their academic career (Masters and Ph.D. students and several postdoctoral fellows), but the school was also attended by practitioners – for example in urban planning or in cultural heritage. Their backgrounds included engineering, geography, environmental studies, sociology, architecture, biology and ecology, and computer science. The areas from which the participants came demonstrate the range of disciplines and practices that are now involved in crowdsourced data collection and use. Also interesting is the opening of governmental and non-governmental bodies to the potential of crowdsourcing, as evident from the practitioners group.

The teachers on the programme, Maria Attard, Claire Ellul, Rob Lemmens, Vyron Antoniou, Nuno Charneca, Cristina Capineri (and myself), are all part of the COST Energic network. Each brings a different insight and interest in VGI to their work – from transport, to spatial data infrastructure or participatory mapping. The aim of the training school was to provide a ‘hands-on’ experience with VGI and citizen science data sources, assuming that some of the students might be new to the topics, the technologies or both. Understanding how to get the data and how to use it is an important issue that can be confusing to someone who is new to this field – where the data is, how you consume it, which software you use for it etc.

Collecting information in the University of Malta

After covering some of the principles of VGI, and examples from different areas of data collection, the students started to learn how to use various OpenStreetMap data collection tools. This set the scene for the second day, which was dedicated to going around the university campus and collecting data that is missing from OpenStreetMap, then uploading the GPS tracks and sharing the information. Of particular importance was the reflection part, as the students were asked to consider how other people who are also new to OpenStreetMap will find the process.

Using meteorological sensors in Gozo

The next level of data collection involved using sensors, with an introduction to the potential of DIY electronics such as Arduino or Raspberry Pi as a basis for sensing devices. A field trip to Gozo on the next day provided the opportunity to explore these tools and gain more experience in participatory sensing. Following a lecture on participatory GIS application in Gozo, groups of students explored a local park in the centre of Rabat (the capital of Gozo) and gained experience in participatory sensing and citizen science.

Learning together

The training school also included a public lecture by Cristina Capineri on ‘the fortune of VGI’.

The students will continue to develop their understanding of VGI and citizen science, culminating in group presentations on the last day. The most important aspect of any training school, as always, is the development of new connections and links between the people on the programme, and in the conversations you could notice how these areas of research are still full of questions and research challenges.

COST ENERGIC meeting – Tallinn 21-22 May

Tallinn

The COST Energic network is progressing in its 3rd year. The previous post showed one output from the action – a video that describes the links between volunteered geographic information and indigenous knowledge.

The people who came to the meeting represent the variety of interests in crowdsourced geographic information: people with backgrounds in geography and urban planning, and many people with an interest in computing – from semantic representation of information to cloud computing, data mining and similar issues where VGI represents an ‘interesting’ dataset.

Part of the meeting focused on the next output of the network, an Open Access book titled ‘European Handbook of Crowdsourced Geographic Information’. The book will be made up of about 25 short chapters that are going through peer review by people within the network. The chapters will cover topics such as theoretical and social aspects, quality (criteria and methodologies), data analysis, and finally applied research and case studies. We are also creating a combined reference list that will be useful for researchers in the field. Different authors gave a quick overview of their topics, with plenty to explore – from Smart Cities to concepts on the nature of information.

COST ‘actions’ (as these projects are called) operate through working groups. In COST Energic, there are 3 working groups, focusing on human and societal issues, spatial data quality and infrastructures, and data mining, semantics and VGI.

Working Group 1 looked at an example of big data from Alg@line – 22 years of ferry data from the Baltic Sea, with 17 million observations a year. The data can be used for visualisation and for exploring its properties. Another case study that the working group is considering is the engagement of schoolchildren with VGI – with activities in Portugal, Western Finland, and Italy. These activities integrate citizen science and VGI, and use free and open source software and data. In the coming year, they are planning specific activities in big data and urban planning, and a crowd atlas on urban biodiversity.

Working Group 2 has been progressing in its activities linking VGI quality with citizen science, and how to produce reliable information from it. The working group collaborates with another COST action (TD1202), which is called ‘Mapping and the Citizen Sensor’. They carried out work on topics of information quality, especially with vernacular gazetteers. In their forthcoming activities, they will contribute to ISSDQ 2015 (International Symposium on Spatial Data Quality) with a set of special sessions. Future work will focus on quality tools and quality visualisation.

Prof. Cristina Capineri opening the meeting

Working Group 3 also highlighted ISSDQ 2015 and will have a good presence at the conference. The group aims to plan a hackathon in which people will work on VGI, as a distributed event that lets people work with data over time. Another plan is to focus on research around the working group’s data repository, which contains ways of getting data, and code. It is mostly about how to get at the data.

There is also a growing repository of bibliography on VGI in CiteULike. The repository is open to other researchers in the area of VGI, and WG3 aims to manage it as a curated resource.

VGI and indigenous knowledge – COST Energic Video

The COST Energic network has been running now for 3 years, and one of the outputs from the network is the video below, which explores a very valuable form of Volunteered Geographic Information (VGI). This is information that comes from participatory projects between researchers and indigenous communities, and this short film provides examples from Bolivia, British Columbia, and the Congo Basin, where researchers in the network are working with local communities to collect information about their areas and issues that concern them.

The video was produced by Lou del Bello, and includes some stock photos and footage. The images that are marked with titles are from COST Energic activities. Lou has also created a short video on the work of the Extreme Citizen Science group in her report on Mapping the Congo on SciDev.

The video is released just before a meeting of the COST Network, held in Tallinn, and hosted by the Interaction Design Lab of Tallinn University.

AAG 2015 notes – day 4 – Citizen Science & OpenStreetMap Studies

The last day of AAG 2015 is about citizen science and OpenStreetMap studies.

The session Beyond motivation? Understanding enthusiasm in citizen science and volunteered geographic information was organised together with Hilary Geoghegan. We were interested to ‘explore and debate current research and practice moving beyond motivation, to consider the associated enthusiasm, materials and meanings of participating in citizen science and VGI.’

As Hilary couldn’t attend the conference, we started the session with a discussion about experiences of enthusiasm – for example, my own experience with IBM World Community Grid. Jeroen Verplanke raised the addictive side of volunteer thinking projects, such as logging in to a Zooniverse or Tomnod project and finding that time flies by. Mairead de Roiste described mapping wood-pigeons in New Zealand – the public got involved because they wanted to help, but when they hear that the data wasn’t used, they might lose interest. Urgency can also be a factor influencing participation.

Britta Ricker – University of Washington Tacoma – Look what I can do! Harnessing drone enthusiasm for increased motivation to participate. This is on-going research. Looking at the Geoweb – it allows people to access information and has made imagery available to the public, but the data is at the whim of whoever gives it to us. With drones, we can send them up when we want or need to. Citizen science is deeply related to the Geoweb – the challenge is to get people involved and keep them involved. We can harness drone enthusiasm – drones evoke negative connotations, but they can also be thought of for good, as in humanitarian applications. Evidence for the enthusiasm is provided by YouTube, where there are plenty of drone videos – 3.44M – with lots of action photography: the surfing community and GoPro development. People are attached to their drones – jumping into the water to save them. So how can the enthusiasm for drones be harnessed to help participatory mapping? We need to design a workflow around stages: pre-flight, flight, post-processing. She partnered with water scientists to explore local issues. There are considerations of cost and popularity, and a quadcopter was selected on that basis – the DJI Phantom Vision 2+. With drones you need to read the manual and plan the flight. There are legal issues of where it is OK to fly, and Esri & MapBox provide information on where you can fly them. You need to think about the camera angle, correct the fisheye distortion, and then process the images. Stitching imagery can be done manually (MapKnitter/QGIS/ArcGIS). It is possible to do it in automated software, but open source (e.g. OpenDroneMap) is not yet good enough in terms of ease of use. Software such as Pix4D is useful but expensive. Working with raster data is difficult, drones require practice, and the software/hardware is expensive – not yet ready for everyone. NGOs can start using it. Idea: sharing photos and having volunteers classify images together.

Brittany Davis – Allegheny College – Motivated to Kill: Lionfish Derbies, Scuba Divers, and Citizen Science. Lionfish are stunning under water – it is challenging to differentiate between the two sub-species, but that doesn’t matter if you’re trying to catch them. They are an invasive species without predators, and their numbers exploded – especially from 2010. There are many information campaigns encouraging people to hunt them, especially in dive centres – telling people that it is a way to save Caribbean reefs. People transform themselves from a ‘benign environmental activity’ to ‘you tell me that I can hunt? cool!’. Lionfish is tasty, so having the meat for dinner is a motivation. Then there are ‘lionfish derbies’ – how many can you kill in a day. She has seen a lot of enthusiasm for lionfish derbies. There are attempts to get people to sign up where they go, but they are not recording where they hunt the lionfish. People go to another site for the competition as they want to capture more. REEF is trying to encourage a protocol for capturing them, and there are cash prizes for the hunting. They use the catch to encourage people to hunt lionfish. Derbies are increasing in size – 14832 lionfish were removed from 2009 to 2014, and there is some evidence for the success of the methodology. There was pressure to ‘safely and humanely capture and euthanase these fish’ – a challenge for PADI, who run special scuba courses that are linked to conservation. People hear about the hunting and that motivates them to go diving. There is a very specific process for a REEF-sanctioned lionfish derby, so they are trying to include recording and public information. But there are challenges below the depth of recreational divers. She also explored whether it is possible to improve data collection for scientists.

Cheryl Gilge – University of Washington – The rhetorical flourish of citizen participation (or, the formation of cultural fascism?) offered a theoretical analysis of citizen science and Web 2.0 as part of a wider project to understand labour relationships and power. She argues that the average citizen has agency to link to their environment. The ability to contribute and to receive information is part of Web 2.0. As a technology layer, it changes both the individual and society. Collaboration and participation in Web 2.0 are framed around entrepreneurialism, efficiencies, and innovation. The web offers many opportunities to help wider projects, where amateur and expert knowledge are both valued. However, there is a risk of reducing the politics of participation to a semblance of agency. There is democratic potential, but co-opting of the spirit is also in evidence. There are plenty of examples of inducing individuals to contribute data and information, and researchers are eager to understand motivation over a long period. A rational system to explain what is going on can’t explain the competing goals and values that are in action. The desire to participate is spread across many reasons – fun, boredom etc. – from understanding people as ‘snowflakes’ to unashamed exploitation. Why do people contribute to the wider agenda? As a provocation: crowd potential is being harnessed to the neoliberalisation agenda of universities. We give freedom to the efficiency and promise of digital tools. Governments promise ‘open government’ or ‘smart cities’ that put efficiency as the top value. A deep libertarian desire for small government is expressed through technology. The government has sensors that reduce the cost of monitoring what is happening. In the academic environment – reduced funding, hiring freezes, increased pressure to publish – there is an assumption that it is possible to mechanically produce top research. Trading in ideas is less valued.
There is a desire for information processing capacity, or for dealing with humanitarian efforts – projects like Galaxy Zoo require more people to analyse the masses of data that research produces, or mapathons to deal with emergencies. Participants are induced to do more through commitment to the project and the harnessing of enthusiasm, adding inducement to the participants. She introduces the concept of micro-fascism from Guattari – taking over freedoms in the hope of future promises. It enables large group formation to happen – e.g. identities such as ‘I’m a Mac/PC’ – and it is harder to disconnect. Fascism can be defined as an ideology that relies on the masses believing in the larger goals – here, the unquestioned authority of data in Web 2.0. Belief in technology induces researchers to get data and participation regardless of the costs. Open source is presented as democracy, but there are also similarities with fascism: participation in the movement, and participants must continue to perform. It brings uncomfortable participation – putting hope in these activities – and it happens top-down and bottom-up, and in Web 2.0. What is the ethical role of researchers who are involved in these projects? How do we value this labour? We need to admit that it is political.

In a final comment, Teresa Scassa pointed out that we need to consider the implications of legitimising drones, killing fish or employing unpaid labour – underlying all of them is a moral discomfort.

In the afternoon came the two sessions on OpenStreetMap that Alan McConchie and I organised. Taking the 10th birthday of OSM as a starting point, the sessions surveyed the state of geographical research on OpenStreetMap, recognising that OSM studies are different from VGI. The sessions were supported by the European COST Energic (COST Action IC1203) network: European Network Exploring Research into Geospatial Information Crowdsourcing.

OpenStreetMap Studies 1 

Jennings Anderson, Robert Soden, Mikel Maron, Marina Kogan & Ken Anderson – University of Colorado, Boulder – The Social Life of OpenStreetMap: What Can We Know from the Data? New Tools and Approaches. OSM provides a platform to understand human-centred computing. There is very valuable information in the OSM history file, and they built a framework (EPIC OSM) that can run spatial and temporal queries and produce JSON output that can then be analysed. They use existing tools and software frameworks to deliver it. The framework was demonstrated: it can ask questions by day or by month, and even bin them by week and in other ways. The questions are evaluated by Ruby, so it is easy to add more questions and change them. They have already used the framework in a paper at CHI about the Haiti earthquake (see video below). Once they had created the underlying framework, they also developed an interface – OSM Markdown – in which you can embed code and see changesets, cumulative nodes collected and classification by type of user. They also provide information about tags. When analysing the Haiti response, they see a spike in nodes added, and in buildings they see the tag collapse=yes.
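To make the idea of temporal binning more concrete, here is a minimal sketch in Python. It is not EPIC OSM (which, per the talk, evaluates its questions in Ruby), and the record layout, the `CHANGESETS` sample and the function names `bin_key` and `count_by_period` are all hypothetical, made up for illustration. It only shows how edits could be counted per day, week or month, optionally filtered by a tag such as collapse=yes, and written out as JSON.

```python
import json
from collections import Counter
from datetime import datetime

# Hypothetical, hand-made records. Real OSM history data has many more fields.
CHANGESETS = [
    {"user": "alice", "created_at": "2010-01-12T22:10:00", "tags": {"highway": "residential"}},
    {"user": "bob", "created_at": "2010-01-13T08:30:00", "tags": {"building": "yes", "collapse": "yes"}},
    {"user": "alice", "created_at": "2010-01-13T17:45:00", "tags": {"building": "yes", "collapse": "yes"}},
    {"user": "carol", "created_at": "2010-01-20T11:00:00", "tags": {"building": "yes"}},
    {"user": "bob", "created_at": "2010-02-02T14:20:00", "tags": {"building": "yes", "collapse": "yes"}},
]


def bin_key(timestamp, period):
    """Return the label of the day, ISO week or month that a timestamp falls in."""
    moment = datetime.fromisoformat(timestamp)
    if period == "day":
        return moment.strftime("%Y-%m-%d")
    if period == "week":
        iso = moment.isocalendar()  # (ISO year, ISO week number, weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if period == "month":
        return moment.strftime("%Y-%m")
    raise ValueError(f"unknown period: {period}")


def count_by_period(changesets, period, tag=None):
    """Count records per time bin; tag is an optional (key, value) filter."""
    counts = Counter()
    for changeset in changesets:
        if tag is not None and changeset["tags"].get(tag[0]) != tag[1]:
            continue
        counts[bin_key(changeset["created_at"], period)] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # All edits per week, then only edits tagged collapse=yes per month.
    print(json.dumps(count_by_period(CHANGESETS, "week"), indent=2))
    print(json.dumps(count_by_period(CHANGESETS, "month", tag=("collapse", "yes")), indent=2))
```

With the sample records above, the monthly count of collapse=yes edits is 2 for January 2010 and 1 for February 2010. Keeping the bin label as a plain string makes it easy to switch between day, week and month without changing the counting logic.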

Christian Bittner – Diverse crowds, diverse VGI? Comparing OSM and Wikimapia in Jerusalem. Christian looked at differences between Wikimapia and OSM as sources of VGI. He is especially interested in the social implications, such as the way exclusion plays out in VGI. In Palestine/Israel there are two contradicting stories that play out in a contested space, with conflicts and fights over the narratives that the two sides enact in different areas. With new tools, there is a ‘promise’ of democratisation – a narrative of collaboration and participation. In crowdsourced geographic information we can ask: who is the crowd, and who is not? Social bias in OSM is a topic that is being discussed in the literature. The process was to look at the OSM database, analysing the data and metadata within the municipal boundaries of Jerusalem. In a simplified representation of the city, regions are classified by majority – Arab or Jewish. He then used cartograms according to the size of population and the amount of information collected. In OSM, Jewish areas are over-represented, while Arab areas are under-represented. Among participants there is a bias toward males from a privileged socio-economic background. In Wikimapia, the process is tagging places, and it uses visual information from Google. Wikimapia is about qualitative information, so objects are messy and overlap, with no definition of what constitutes a place. In Wikimapia, there are many more descriptions of the Arab areas, which are over-represented. The amount of information in Wikimapia is smaller – 2679 objects, compared to 33,411 ways in OSM. In OSM there is little Arabic and more Hebrew, though Latin is the most used script. Wikimapia is the other way around, with Hebrew in the minority. The crowd is different between projects. There are wider implications – diverse crowd, so diverse VGI? VGI is a diverse form of data, produced in different ways by different knowledge cultures.
He calls for very specific studies of each community before claiming that VGI is a general form of information.

Tim Elrick & Georg Glasze – University of Erlangen-Nuremberg, Germany – A changing mapping practices? Representation of Places of Worship in OpenStreetMap and other sources. The starting point was noticing that churches are presented on official maps but mosques are not, and noticing how maps are used to produce specific narratives. What happens in new forms of mapping? In Google Maps, the mosque is presented but not the church; in OSM both are mapped. What is happening? For the old topographic maps, the official NMAs argue that they provide a precise representation – but they fail to do so in terms of religious differences. Some states do not include non-Christian places of worship – the federal mapping agency came up with symbols for such places (mosques, synagogues), but the preference of the state NMAs was for a generic mark for all non-Christian places that does not differentiate between religions. USGS has just a single mark for a house of worship – with a cross. The USGS suggested carrying out crowdsourcing to identify places of worship, so they are willing to change. In OSM there is free tagging and there are marks for religion, but the rendering shows only some tags. In 2007 there was a suggestion to change the rendering of non-Christian places, and Steve Chilton then created cartographic symbols for the change. OSM do-ocracy can lead to change, but in other places that use OSM this was not accepted – there are different symbols in OpenCycleMaps. In Germany, there are conflicts about non-visible places of worship (e.g. a mosque in a social club). There is an adaptive approach to dealing with location in OSM. In Google there is a whole set of data sources that are used, but also crowdsourcing which goes to moderators at Google – with no accountability or local knowledge. The handling of places of worship is not transparent. Categorisation and presentation change with new actors – corporate and open data. Google uses an economy of attention.

Alan McConchie – University of British Columbia – Map Gardening in Practice: Tracing Patterns of Growth and Maintenance in OpenStreetMap. Looking at the history of OSM, editing existing features is as important as adding new ones – it means collaborating and dealing with other people’s data. In the US, OSM is a mix of volunteer and imported data – it’s an ongoing aspect of the project. Questions: do the ‘explorers’ (the people who like empty spaces) stick around? Do imports hinder the growth of the community? And does activity shift to ‘gardening’? The TIGER import in 2007 has been significant to the growth of the project. There are also many other imports – addresses in Denmark, French land cover, incomplete land cover imports in Canada. There was a community backlash from people who were concerned about the impact of imports (e.g. Crowe 2011; Fredrik Ramm, 2012, Tobias Knerr, 2015). The debate is also between different regional factions. There is an assumption that only empty areas are exciting. That is problematic for someone joining now in Germany. New best practices are evolving. Imports in Seattle were used to encourage the community and build it. Zielstra et al. 2013 explored imports and showed different growth patterns, but it is not so simple as to pin it all on imports. Alan takes the ‘Wiki Gardening’ concept – people who like to keep things tidy and well maintained. He analyses small areas, identifying blank spots, but trying to normalise across cities in the world – e.g. using population from the Gridded Population of the World. Exploring edits per month, we see many imports happening all the time. For an individual city, he explores the behaviour of explorers and of those who never mapped the unknown. In London, new mappers are coming in, while in Vancouver the original mappers are the ones that continue to maintain the map. There are power law effects that trump anything else, and the shift to new contributors is not clear cut.

Monica G. Stephens – University at Buffalo – Discussant: she started looking at OSM only a few years ago, because of a statement from Mike Goodchild that women are not included, so she carried out a survey of internet users on Google Maps and OSM. She found that geotagging is much more male – more than just sharing images. In her survey she noticed gender bias in OSM. Maps are biased by the norms, traditions, assumptions and politics of the map maker (Harley 1989). The biases are those of the map maker – bikes in Denver (what interests them), the uneven representation of Hebrew in Jerusalem, or religious attributes. There is also the question of how the community makes decisions – how to display information? What to import? There are issues of ethos – there are fundamental differences between the UK and German communities and the US mapping communities. This leads to interesting conversations between these communities. There are also comparisons – Wikimapia, Google Maps, topo maps – that tell us what OSM is doing. OSM democracy is more efficient and responsive to communities’ ideas. The proposal on tagging childcare was rejected, but there were discussions that led to remapping of tags in response to the critique. Compare that to Google Maps: who was creating local knowledge? In Google Maps 96% of reviewers are male (in Google Map Maker 2012), so the question is who is the authority that governs Wikimapia.

OpenStreetMap Studies 2 included the following:

Martin Loidl – Department of Geoinformatics, University of Salzburg – An intrinsic approach for the detection and correction of attributive inconsistencies and semantic heterogeneity in OSM data. Martin comes from a data modelling perspective, accepting that OSM is based on a bottom-up approach, with flat data modelling and attributes, and no restriction on tag usage. There are attributive inconsistencies. Semantic heterogeneity influences visualisation, statistics and spatial analysis. He suggests improving results through harmonisation and correction through estimation. There have been many comparisons of OSM quality over the years, but there is little work on attribute information. Martin suggested an intrinsic approach that relies on the data in OSM itself – expecting major roads to be connected and consistent – and showed how attribute completeness can be assessed. Most of the roads in OSM are local roads, and there is high heterogeneity, but we need them and we should care about them. There are issues with keeping the freedom to tag – it exposes the complexity of OSM.
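As a rough illustration of what an intrinsic check can look like, here is a small Python sketch. It is not Martin’s method. It is a toy rule made up for this post: a road segment is flagged when all of its connected neighbours agree on a class that differs from its own, and the neighbours’ value is offered as the estimate. The `SEGMENTS` data, the node names and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical road segments: each has two end nodes and a highway class.
SEGMENTS = {
    1: {"nodes": ("a", "b"), "highway": "primary"},
    2: {"nodes": ("b", "c"), "highway": "residential"},  # breaks an otherwise primary route
    3: {"nodes": ("c", "d"), "highway": "primary"},
    4: {"nodes": ("d", "e"), "highway": "primary"},
    5: {"nodes": ("x", "y"), "highway": "residential"},  # isolated, nothing to compare with
}


def neighbours(segments):
    """Map each segment id to the ids of the segments sharing an end node with it."""
    by_node = defaultdict(set)
    for seg_id, segment in segments.items():
        for node in segment["nodes"]:
            by_node[node].add(seg_id)
    result = {seg_id: set() for seg_id in segments}
    for ids in by_node.values():
        for seg_id in ids:
            result[seg_id] |= ids - {seg_id}
    return result


def suggest_corrections(segments, key="highway"):
    """Return {segment id: suggested value} using only the data itself.

    Toy rule (an assumption for illustration): a segment with at least two
    neighbours is suspicious when every neighbour carries the same value and
    that value differs from the segment's own.
    """
    suggestions = {}
    for seg_id, adjacent in neighbours(segments).items():
        if len(adjacent) < 2:
            continue  # too little context to estimate anything
        values = {segments[other][key] for other in adjacent}
        if len(values) == 1:
            value = next(iter(values))
            if value != segments[seg_id][key]:
                suggestions[seg_id] = value
    return suggestions


if __name__ == "__main__":
    print(suggest_corrections(SEGMENTS))
```

On the sample data only segment 2 is flagged, with ‘primary’ as the suggested class. A real analysis would need geometry, road names and a much more careful rule, since a genuine change of road class at a junction looks the same to this check.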

Peter A. Johnson – University of Waterloo – Challenges and Constraints to Municipal Government Adoption of OpenStreetMap. The collaboration of MapBox with NYC – an agreement on data sharing – was his starting point and motivation to explore how we can connect government and citizens to share data. Potentially, the OSM community will help with official data, improve it and send it back. Just delivering municipal data over an OSM base map is not much – maybe we need to look at mirroring. Questions about currency, improvement of services, and whether data is cheaper or easier to get are the core questions. He evaluated official data and OSM data, and interviewed governments in Canada across a range of sizes – it is easy in large cities, there are basic steps in medium ones and little progress in rural places. There is no official use of OSM, but they do make data available to the OSM community, and there is anecdotal evidence of OSM being used unofficially for different jobs. They do not see benefits in mirroring data; they are the authoritative source of information, and no other data is relevant. Constraints: they are not sure that OSM is more accurate, and there is a risk-averse culture. They question the fit with organisational needs, OSM lacks required attributes, and they see costs in altering existing data. OSM might be relevant to rural areas and small cities where data is not being updated.

Muki Haklay – University College London – COST Energic – A European Network for research of VGI: the role of OSM/VGI/Citizen Science definitions. I’ve used some of the concepts that I first presented in SOTM 2011 in Vienna, and extended them to the general area of citizen science and VGI, arguing that academics need to be ‘critical friends’, in a nice way, to OSM and other communities. The different talks, and Monica’s points about changes in tagging, demonstrate that this approach is effective and helpful.

Discussant: Alan McConchie – University of British Columbia. The later session looked at intrinsic or extrinsic analysis of OSM – such as Martin’s work on internal consistency. There are issues of knowing the specific person in each part of the process who can lead to a change. There is a very tiny group of people that make the decisions, but there is a slow opening towards accountability (e.g. the OSM rendering style on Github). There is translation of knowledge and representation that happens in different groups, and a need to identify how to represent the information correctly. There is a sense that ‘no one has got the right answer’. Industry and NGOs also need to act as critical friends – it will make it a better project. There are also critical GIS conversations – is there a ‘fork’ within OSM studies? We can have conversations about these issues.

Follow-up questions explored the privacy of the participants and the possible need to mention the research to participants and the community, and also the researcher’s position as a participant, or as someone who alters the data – the implications of participatory observations.