Eye on Earth (Day 2 – Morning) – moving to data supply

The second day of Eye on Earth moved from data demand to data supply. You can find my posts from day one, covering the morning and the afternoon sessions. I have only partial notes on the plenary Data Revolution – data supply side, although I've posted the slides from my talk separately. The description of the session stated: The purpose of the session is to set the tone and direction for the "data supply" theme of the 2nd day of the Summit. The speakers focused on the revolution in data – the logarithmic explosion both in terms of data volume and of data sources. Most importantly, the keynote addresses will highlight the undiscovered potential of these new resources and providers to contribute to informed decision-making about environmental, social and economic challenges faced by politicians, businesses, governments, scientists and ordinary citizens.


The session was moderated by Barbara J. Ryan, with talks from Philemon Mjwara, Mary Glackin, Muki Haklay, Christopher Tucker, and Mae Jemison. [I'll revise the blog with notes later]

After the plenary, the session Data for Sustainable Development built on the themes from the plenary. Some of the talks in the session were:

Louis Liebenberg presented CyberTracker – showing how it evolved from its early stages in the mid-1990s to use across the world. The business model of CyberTracker is that people can download it for free, but it is mostly used offline in many places, with the majority of users treating it as a local tool. This raises issues of data sharing – data doesn't go beyond the people who manage each project. CyberTracker addresses the need to extend citizen science activities to a whole range of participants beyond the affluent population that usually takes part in nature observations.

Gary Lawrence discussed how, with Big Data, we can engage the public – not only the technical or scientific community – in deciding which problems need to be resolved. Patterns that emerge within Big Data might be coincidence or causality, and many cases are coincidental. The framing should be: who are we today? What are we trying to become? What has to be different two, five, ten years from now if we're going to achieve it? Most organisations don't even know where they are today. There is also the question of whether Big Data is driven by a future that people want. There are good examples of using Big Data in a city context that take into account the needs of all groups – government, business, and citizens – in Helsinki and other places.

B – the Big Data in ESPA experience (www.espa.ac.uk) – data doesn't have value until it is used. ESPA is an international, interdisciplinary science programme on ecosystem services for poverty alleviation. Looking at opportunities first, then challenges. Opportunities: the SDGs are an articulation of a demand to deliver benefits to societal needs through new data-led solutions for sustainable development, with new technologies – remote sensing/UAVs, existing datasets, citizen science, and mobile telephony – combined with open access to data and web-based applications. Citizen science is also about empowering communities with access to data. We need to make commitments to take data and use it to transform lives.

Discussion: lots of people are sitting on a lot of valuable data that is considered private and is not shared. A commitment to open data should help in solving the problems of making data accessible and ensuring that it is shared. We need to make projects aware that their data will be archived and have procedures in place, and we also need staff and repositories. One issue is how to engage private-sector actors in data sharing. From work with indigenous communities, Louis noted that the most valuable thing is that the data can be used to transfer information to future generations and explain how things are done.

Eye on Earth (Day 1 – afternoon) – policy making demand for data and knowledge for healthy living

The afternoon of the first day of Eye on Earth (see the previous post for the opening ceremony and the morning sessions) had multiple tracks. I chose to attend 'Addressing policy making demand for data: dialogue between decision makers and providers'.

The speakers were asked to address four points concerning data quality control and assurance, and to identify the major challenges facing data quality for decision-making in the context of crowdsourcing and citizen science. Felix Dodds, who chaired the session, noted that the process of deciding on indicators for the SDGs is managed through the UN Inter-agency group, and these indicators and standards of measurement need to last for 15 years. There is now also a 'World Forum on Sustainable Development Data', and a review of the World Summit on the Information Society (WSIS) is also coming. The speakers were asked to think about coordination mechanisms and QA to ensure good-quality data. How accessible is the data? Finally, what is the role of citizen science within this government information? We need to address the requirements for the data at international, regional, and national levels.

Nawal Alhosany (Masdar Institute): data is a very important ingredient in policy making when you try to base policy on facts and hard evidence. Masdar is active throughout the sustainability chain, with a focus on energy. On the question of how to ensure that data is of good quality, Masdar recognised a gap in the availability of data 10 years ago. For example, some prediction tools for solar power were not taking local conditions into account, nor providing quality assurance suited to local needs. Therefore, they developed local measurement and modelling tools (ReCREMA). In terms of capacity building, they see gaps in human capacity across the region and try to address them (e.g. the lack of an open-source culture). Masdar sees a role for citizen science, and they are taking steps towards it through STEM initiatives such as Young Future Energy Leaders and other activities.

David Rhind (Nuffield Foundation): many of the data sets that we want cross national boundaries – e.g. the radioactive plume from Chernobyl. When we want to mix population and environmental data, we need to deal with mismatched boundaries and complex problems of data integrity. There are also serious problems with validity – 21 sub-Saharan countries haven't done a household survey since 2006, so how can we know about levels of poverty today? There is a fundamental question of what quality is, and how we can define it in any meaningful sense. Mixing data from different sources creates a problem of what quality means. Some cases can rely on international agreements – e.g. UN principles, or the UK regulatory authority that checks statistics. Maybe we should think of international standards, as in accountancy. In terms of gaps in capacity, there is rapid change due to the need for analysis, and data scientists are becoming available in the UK, but there is an issue with policy makers who do not have the skills to understand the information. Accessible data is becoming common with the open data approach, but many countries make official data less open for security reasons. However, data needs certain characteristics – it needs to be reusable, easy to distribute, public, and openly licensed. On citizen science – there are reasons to see it as an opportunity (e.g. OpenStreetMap), but there are many factors that make its integration challenging. There is a need for proper communication – e.g. the miscommunication in L'Aquila.

Kathrine Brekke (ICLEI) – a perspective from local government. Local governments need data for decision-making. Data also makes a city attractive for investment and insurance, and improves transparency and accountability. There are issues of capacity in collecting and sharing the data, even down to language skills (if data is not available in English, international comparison is difficult). There are initiatives such as open.dataforcities.org to allow sharing of city data. There are 100 sustainability indicators that are common across cities and can be shared. In terms of data quality, we can also include crowdsourcing – but then we need to ensure that the data will be systematic and comparable. Standards and consistency are key – e.g. the greenhouse gas registry is important, and therefore there is a global protocol for collecting the data.

Ingrid Dillo (DANS, Netherlands): there is a data deluge with a lot of potential, but there are challenges around data quality and trust. Quality is about fitness for use. DANS's aim is to ensure the archiving of data from research projects in the Netherlands. Data quality in science is made up of scientific data quality but also technical quality. Scientific integrity is about the values of science – standards of conduct within science. There are issues with fraud in science that require better conduct. Data management in small projects lacks checks and balances, with peer pressure as the major driver of quality – so open science is one way to deal with that. There are also technical issues, such as metadata and data management, so data can be used and stored in a certified, trustworthy digital repository.

Robert Gurney (University of Reading): in environmental science there is the Belmont Forum e-Infrastructures & Data Management initiative. The Belmont Forum is an association of environmental science funders from across the world, and the initiative deals with the huge increase in data. Scientists are early adopters of technology, and some of the lessons from what scientists are doing can be used by other people in the environmental sector. The aim is to deliver the knowledge needed for action; infrastructure is needed to meet global environmental challenges. This requires working with many supercomputers – the problems are volume, variety, veracity, and velocity (Big Data) – we are getting many petabytes, and could reach 100 petabytes by 2020. The problem is that data sits in deep silos – even between Earth Observation archives. There is a need to make data open and shareable. There will be 10% of funding going towards e-infrastructure. They created data principles and want to adopt the principle of open by default.

Marcos Silva (CITES): CITES is about trade in endangered species. CITES (since the mid-1970s) regulates trade in a multi-billion-dollar business with 850,000 permits a year. Each permit says that it is OK to export a specimen without harming the population. It is data driven. CITES data can help in understanding outliers and noticing trends. There are issues of ontologies, schemas, quality, etc. between the signatories – similar to environmental information. They would like to track what happens to species across the world. They are thinking about a standard for all transactions with specimens, which would create a huge amount of data. Even in dealing with illegal poaching and the protection of animals, there is a need for interoperable data.

Discussion: the Data Shift initiative concerns citizen-generated data for the SDG goals. Is there data that is already being used? How are we going to integrate such data with other types of data? We risk filtering citizen science data out because it follows a different framework. Rhind – statisticians are concerned about citizen science data, and will take a traditional view and not use it. There is a need for quality assurance throughout, not just at the end. The management of indicators and their standards will require the inclusion of suitable data. Marcos asked what is considered citizen science data – e.g. reporting of data by citizens is used in CITES, and there are things to learn about how the quality of the data can be integrated with the traditional processes that enforcement agencies use. Science is not just data collection and analysis – projects such as climateprediction.net show that multiple people can analyse information. Kathrine talked about crowdsourcing – e.g. reporting of trees in certain cities, so there is also a dialogue about deciding which trees to plant. Ingrid disagreed that data collection on its own is not science. Nawal – doing projects with schools about energy, which opens up participation in science. Rhind raised the issue of the need for huge data repositories and the question of whether governments are ready to invest. Gurney – there is a need to coordinate the multiple groups and organisations that are dealing with data. There is a huge shortage of people in environmental science with advanced computing skills.

The second session that I attended explored 'Building knowledge for healthy lives', opened by Jacqueline McGlade – in the context of data we need to focus on the SDGs, and health underpins more goals than environmental issues do. UNEP Live is aimed at allowing access to UN data – from country data to citizen science data – so it can be found. The panel explored many relations to health: climate change and its impact on people's lives and health, and heatwaves and issues of vulnerability to extreme events. Over 40 countries want to use the new air quality monitoring that UNEP developed, including the community in Kibera.

Hayat Sindi is the CEO of the i2Institute, exploring social innovation. Our ignorance about the world is profound. We teach children foundational theories without questioning science heroes and theories, as if things were static. We elevate ideas from the past and don't question them. We ignore the evidence. The fuel of science is observation. We need to continue creating technology to improve life. Social innovation is important – she learned this from Diagnostics For All (DFA) from MIT. The DFA diagnostic is low cost, portable, easy to use, and safely disposable. The full potential of social innovation is not being fulfilled. True scientists need to talk with people, understand their needs, and work with them.

Maria Neira (WHO) – all the SDGs are linked to health. A core question is: what are the environmental determinants of health? Climate change, air quality – all of these are part of addressing health and wellbeing. There is a need to provide evidence-based guidelines, and the WHO also promotes health impact assessments for major development projects. Different sectors are involved – housing, access to water, electricity – some healthcare facilities lack access to a reliable source of energy. Air pollution is a major issue that the WHO recognises as a challenge – killing 7 million people a year. With air quality we don't have a choice, unlike tobacco, where a warning is possible. The WHO is proposing indicators around access to energy that require measuring exposure to air pollution. There is a call for strong collaboration with other organisations. A global platform on air quality and health is being developed, aiming to enhance estimation of the impacts of air quality.

Joni Seager (GGEO coordinating lead author) talked about the Global Gender and Environment Outlook. She looks at how gender is not included in health and environmental data. The first pattern: gender data is collected and then hidden, even though gender analysis can provide better information to help in decision making and policy formation. The second: treating households as the unit of analysis, which hides individual agency in education, access to a car, or food security. In reality there is no evidence that food security is a household-level attribute – men and women have significantly different experiences and coping strategies. Household data reflects the view of the men, not the real picture, and makes women especially invisible. There are also cases where data is not collected at all – in some areas, e.g. sanitation, information is simply missing. If we are building knowledge for healthy lives, we should ask: whose knowledge, and whose life?

Parrys Raines (Climate Girl) grew up in Australia and wants to protect the environment – she heard about climate change when she was 6 years old and then sought to research and learn about the data – information is not accessible to young girls. She built a close relationship with UNEP. There are different impacts on young people. She also shares information about air quality and pollution so that youth can be included in the discussion and solutions. Youth need to be seen as a resource across different levels – a sharing generation with global thinking. Intergenerational communication is critical, and knowledge of data is critical for the 21st century. Organisations need to go out and support youth – from mentoring to monetary support.

Iman Nuwayhid talked about health and ecological sustainability in the Arab world. Some of the Millennium Development Goals (MDGs) have been achieved, but most of the countries fell short of achieving them. In ecological sustainability, the picture in the Arab world is gloomy – many countries don't have access to water. Demand for food is beyond the capacity of the region to produce. The population is expected to double in the next 30 years. Poorer countries have high fertility, and there is a lot of displacement: war, economic, and environmental. In development, there are striking inequities in the region – some of the wealthiest and some of the poorest countries in the world. Distribution of water needs to consider which sector should use it. Comparing health and military expenditure, the Arab world spends much more on the military than on health. There is interaction between environment, population, and development. The region's ecological footprint is among the highest and is increasing. There are also issues of political instability that can be caused by environmental stresses. Displacement of people between countries creates new stresses and questions the value of state-based analysis. Uncertainty is a major context for the region and for science in general.

Discussion: on the air quality issue – monitoring is not enough without understanding toxicity and dispersion. Air pollution is also affected by activities such as stone quarries. There is a need to balance monitoring efforts against accuracy and the costs of acting, and to develop models and methods to think about their use. In some urban areas, light and noise also have impacts – not just on mortality but on quality of life and mental health.

Two side events of interest ran in parallel:

The European Environmental Bureau presented a side event on collaborative research and activist knowledge on environmental justice. Pressure on resources means extractive industries operate in the South, with the outcomes used in the North, and there is an increased level of conflict in the South. The EJOLT project is a network of 23 partners in 23 countries. It is collaborative research between scientists, grassroots organisations, NGOs, and legal organisations. They have produced a whole set of results; a visible one is the Atlas of Environmental Justice. There is plenty to say about citizen science and how important it is that information comes from people who are close to the ground. They work with a team in ecological economics that created a moderated process for collecting and sharing information. The atlas allows information to be explored by different categories, and this is linked to stories about each conflict and its history, as well as further details about it. The atlas is a tool to map conflicts but also to try to resolve them. The EEB sees the atlas as ongoing work, and they want to continue developing sources of information and reporting. Updating and maintaining the tool is a challenge that the organisation faces.

At the same time, the best practice guidelines 'Putting Principle 10 into Action' were launched, building on the experience of the Aarhus guide. There are plenty of case studies and information, and it will be available on the UNEP website.

The gala dinner included an award to the Senseable City Lab project in Singapore, demonstrating the development of personalised travel plans that can help avoid pollution, based on 30-40 participants who collected data using cheap sensors.

New paper: The epistemology(s) of volunteered geographic information: a critique

Considering how long Renée Sieber (McGill University) and I have known each other, and that we work in similar areas (participatory GIS, participatory geoweb, open data, socio-technical aspects of GIS, environmental information), I'm very pleased that a collaborative paper we developed together is finally published.

The paper 'The epistemology(s) of volunteered geographic information: a critique' took some time to evolve. We started jotting down ideas in late 2011 and slowly developed the paper until it was ready, after several rounds of peer review, for publication in early 2014, but various delays led to its publication only now. What is pleasing is that the long development time did not reduce the paper's relevance – we hope! (We kept updating it as we went along.) Because the paper looks at philosophical aspects of GIScience, we needed periods of reflection and re-reading to make sure that the whole paper came together, and I'm pleased with the way ideas are presented and discussed in it. Now that it's out, we will have to wait and see how it is received.

The abstract of the paper is:

Numerous exegeses have been written about the epistemologies of volunteered geographic information (VGI). We contend that VGI is itself a socially constructed epistemology crafted in the discipline of geography, which when re-examined, does not sit comfortably with either GIScience or critical GIS scholarship. Using insights from Albert Borgmann’s philosophy of technology we offer a critique that, rather than appreciating the contours of this new form of data, truth appears to derive from traditional analytic views of information found within GIScience. This is assisted by structures that enable VGI to be treated as independent of the process that led to its creation. Allusions to individual emancipation further hamper VGI and problematise participatory practices in mapping/geospatial technologies (e.g. public participation geographic information systems). The paper concludes with implications of this epistemological turn and prescriptions for designing systems and advancing the field to ensure nuanced views of participation within the core conceptualisation of VGI.

The paper is open access (so anyone can download it) and is available on the Geo website.

Building Centre – from Mapping to Making

The London-based Building Centre organised an evening event – From Mapping to Making – which looked at how the "radical evolution in the making and meaning of maps is influencing creative output. New approaches to data capture and integration – from drones to crowd-sourcing – suggest maps are changing their impact on our working life, particularly in design." The event included five speakers (including me, on behalf of Mapping for Change) and a short discussion.

Lewis Blackwell of the Building Centre opened the evening by noting that, in a dedicated exhibition on visualisation and the city, the Building Centre is looking at new visualisation techniques. He realised that a lot of the visualisations are connected to mapping – it's circular: mapping can ask and answer questions about the design process of the built environment, and changes in the built environment create new data. The set of talks in the evening explored the role of mapping.

Rollo Home, Geospatial Product Development Manager at Ordnance Survey (OS), started by describing the OS as the 'oldest data company in the world'. The OS thinks of itself as a data company – the traditional mapping products that are so familiar represent only 5% of turnover. The history of the OS goes back to 1746 and William Roy's work on accurately mapping Britain. The first maps were produced in Kent, for the purpose of positioning ordnance. Today's maps, when visualised, look much the same as maps from 1800, but current maps are in machine-readable formats, which means the underlying information is very different. Demands for mapping have changed over the years: originally for ordnance, then for land information and taxation, and later to help the development of the railways. During WWI and WWII the OS led many technological innovations – from the national grid in the 1930s to photogrammetry. In 1973 the first digital maps were produced, and the process was completed in the 1980s. In terms of data structures, this was still structured as a map. Only in 2000 did MasterMap appear, with a more machine-readable format that is updated 10,000 times a day, based on an Oracle database (the biggest spatial database in the world) – but it's not a map. Real-world information is modelled to allow for structure and meaning. The ability to answer questions from the database is critical to decision-making. Much of the information in the data can be made explicit – from the area of rear gardens to the height of a building. They see developments in the areas of oblique image capture, 3D data, details under the roof, and facades, and they do a lot of research to shape their future directions – e.g. the challenges of capturing data in point clouds. They see data coming from many different sources, including social media, satellites, UAVs, and official sources. Most Smart Cities, transport, and similar application areas need geospatial information, and the OS is moving from mapping to data, enabling better decisions.
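To illustrate the 'data, not map' point, here is a minimal sketch of answering questions directly from structured feature records rather than reading them off a drawn map. The schema and values are entirely hypothetical, for illustration only – not the real MasterMap model:

```python
# Hypothetical feature records, loosely in the spirit of a topographic
# database: each real-world object carries typed attributes, not just geometry.
features = [
    {"id": 1, "type": "building", "height_m": 12.5},
    {"id": 2, "type": "building", "height_m": 31.0},
    {"id": 3, "type": "garden", "area_m2": 85.0},
    {"id": 4, "type": "garden", "area_m2": 40.5},
]

def buildings_taller_than(features, threshold_m):
    """Answer a question from the data: which buildings exceed a height?"""
    return [f["id"] for f in features
            if f["type"] == "building" and f["height_m"] > threshold_m]

def total_garden_area(features):
    """Aggregate an attribute that a drawn map cannot answer directly."""
    return sum(f["area_m2"] for f in features if f["type"] == "garden")

print(buildings_taller_than(features, 20))  # [2]
print(total_garden_area(features))          # 125.5
```

The point of the sketch is that once features are modelled with structure and meaning, questions become queries over attributes, which is what makes the database useful for decision-making.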

Rita Lambert, Development Planning Unit, UCL, covered the ReMap Lima project – running since 2012 and looking at marginalised neighbourhoods in the city. The project focused on the questions of what we are mapping and what we are making through representations. Maps contain the potential of what might become – we make maps and models that are about ideas and possibilities for more just cities. The project is a collaboration between the DPU and CASA at UCL, with 3 NGOs in Lima and 40 participants from the city. They wanted to explore the political agency of mapping, open up spaces to negotiate outcomes, and expand the possibilities of spatial analysis in marginalised areas through a participatory action-learning approach. The use of technology sits within very specific theoretical aims; the use of UAVs is deliberate, to explore their progressive potential. They mapped the historic centre, which is overmapped and marginalised through over-representation (e.g. using maps to show that it needs regeneration), while the periphery – a large part of the city (50% of the area) – is undermapped and marginalised through omission. Maps can act through undermapping or overmapping. The issues are very different: evictions, lack of services, and loss of cultural heritage (people and buildings) at the centre, while in the informal settlements there are risks, land trafficking, destruction of ecological infrastructure, and lack of coordination of spatial planning between places. The process they followed included mapping from the sky (with a drone) and mapping from the ground (through participatory mapping using aerial images). The drones provided imagery of an area that changes rapidly, and the outputs were used in participatory mapping, with the people on the ground deciding what to map and where. The results made it possible to identify evictions through changes to buildings that can be observed from above.
The mapping process itself was also a means to strengthen community organisations. The use of 3D visualisation at the centre and at the periphery helped in understanding the risks that are emerging and the changes to the area. Data collection uses both maps and tools such as EpiCollect+ and community mapping, as well as printed 3D models that can be used in discussions and conversations. The work carries on as local residents continue it. The conclusion: careful consideration of the use of technology in context, with mapping from the sky and from the ground going hand in hand. Creating these new representations is significant, as is asking what it is that we are producing. More information at remaplima.blogspot.co.uk and learninglima.net.

Simon Mabey, Digital Services Lead for City Modelling, Arup. Simon discussed city modelling at Arup, and the move from visualisation to more sophisticated models. He has led on modelling cities in 3D since 1988, when visualisation of future designs was done by stitching together pieces of paper and photos. The rebuilding of Manchester in the mid-1990s led to the development of 3D urban modelling, with animations, and they created an interactive CD-ROM. They continued to develop the data about Manchester and then shared it with others. The models were used in different ways – from gaming software to online – trying to find ways to allow people to use them in a real-world context. Many models are used in interactive displays, e.g. for attracting inward investment. They went on to model many cities across the UK, with different levels of detail and coverage. They are also starting to identify features underground – utilities and the like. Models are kept up to date through collaboration, with clients providing information about things they are designing and integrating BIM data. In Sheffield, they also enhance the model through the planning of new projects and activities. Models are used to communicate information to other stakeholders – e.g. traffic model outputs, and also pedestrian movement. They use different information to colour-code the model (e.g. energy), or for acoustic or flood modelling. More recently, they have moved to city analytics, understanding the structure within models – for example, understanding solar energy potential in relation to a building's use and consumption. They find themselves needing information about what utility data exists, which then needs to be mapped and integrated into their analysis. They also get mobile phone data to predict the trips that people make.

I was the next speaker, on behalf of Mapping for Change. I provided the background of Mapping for Change and the approach that we use for mapping. In the context of the other talks, which focused on technology, I emphasised that just as we try to reach out to people in the places they use daily and fit the participatory process into their life rhythms, we need to do the same in the online environment. That means that conversations need to go where people are – so linking to Facebook, Twitter, or WhatsApp. We should also recognise that people use different ways to access information – some will use just their phone, others laptops, and for others we need to think of a laptop/desktop environment. In a way, this complicates participatory mapping much more than earlier participatory web mapping systems, when participants were more used to the idea of using multiple websites for different purposes. I also mentioned the need to listen to the people we work with, and to decide whether information should be shown online or not, taking into account what they would like to do with the data. I mentioned work involving citizen science (e.g. air quality monitoring), but more generally the ability to collect facts and evidence to deal with a specific issue. Finally, I showed some examples of our new community mapping system, which is based on GeoKey.

The final talk was from Neil Clark, founder of EYELEVEL, an architectural visualisation company that works in the North East and operates in the built environment area. They use architectural modelling together with Ordnance Survey data to position designs so they can be rendered accurately. Many of the processes involved are very expensive and complex. They have developed a tool called EYEVIEW for accurate augmented reality – working on an iPad to allow viewing models in real time, which can cut the costs of producing these models. They use a tripod to make it easier to control. The tool is the outcome of 4 years of development and allows the architectural model to be navigated so that it overlays the image. They are aiming at Accurate Visual Representation, and they follow the detailed framework that is used in London for this purpose: www.eyeviewportal.com.

The discussion that followed explored the political nature of information, and who is represented and how. A question to the Ordnance Survey was how open it will be with the detailed data; Rollo explained that access to the data is a complicated issue and that it needs to be funded. I found myself defending the justification for charging for highly detailed models, by suggesting to imagine a situation where universal provision of high-quality data at the national level wasn't there, and you had to deal with each city's data model separately.

The last discussion point was about truth in mapping, and the positions that were raised: is it about the way that people understand their own truth, or is there an absolute truth that is captured in models and maps – or represented in 3D visualisations? Interestingly, three of the talks assumed that there is a way to capture specific aspects of reality (structures, roads, pollution) and model them with numbers, while Rita and I took a more interpretive, culturally led approach to representation.

Data and the City workshop (day 2)

The second day of the Data and the City workshop (here are the notes from day 1) started with the session Data Models and the City.

Pouria Amirian started with Service Oriented Design and Polyglot Binding for Efficient Sharing and Analysing of Data in Cities. The starting point is that managing a city needs data, and therefore technologies to handle data are necessary. In the traditional pipeline, we start from sources, then use tools to move the data into a data warehouse, and then do the analytics. The problems with the traditional approach are the size of the data – managing the data warehouse becomes very difficult; real-time data, where queries need to be answered very fast; and finally new data types – from sensors, social media and cloud-born data that originates outside the organisation. Therefore, it is imperative to stop moving data around and instead analyse it where it is. Big Data technologies aim to resolve these issues – e.g. from the development of the Google distributed file system, which led to Hadoop, to similar technologies. 'Big Data' relates to the technologies that are used to manage and analyse it. The stack for managing big data now includes over 40 projects supporting different aspects of governance, data management, analysis etc. Data science covers many areas – statistics, machine learning, visualisation and so on – and no one expert can know all of them (such experts are about as common as unicorns). There is interaction between data science researchers and domain experts, and that is necessary for ensuring reasonable analysis. In the city context, these technologies can be used for different purposes – for example, deciding on the allocation of bikes in the city using real-time information that includes social media (Barcelona). We can think of data scientists as the active actors, but there are also opportunities for 'citizen data scientists' using tools and technologies to perform the analysis. Citizen data scientists need data and tools – such as visual analysis environments (e.g. AzureML) that allow them to create models graphically and set a process in motion.
Access to data is required – facilitating finding the data and accessing it – so interoperability is important. Service oriented architecture (which uses web services) is an enabling technology for this, and the current Open Geospatial Consortium (OGC) standards require some further development and changes to make them relevant to this environment. Different services can be provided to different users with different needs [comment: but that increases maintenance and complexity]. No single stack provides all the needs.
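The "analyse data where it lives" idea from the talk can be sketched with a toy map-reduce aggregation in pure Python – a minimal stand-in for what Hadoop-style frameworks do across many data nodes. The data and function names here are illustrative, not from the talk:

```python
from collections import Counter
from functools import reduce

# Toy "data nodes": in a real cluster the data stays put and the
# mapper function is shipped to each node, not the other way round.
node_a = ["bike", "bus", "bike", "walk"]
node_b = ["bus", "bike", "walk", "walk"]

def map_node(records):
    # Runs locally on each node: summarise trips by mode.
    return Counter(records)

def reduce_counts(c1, c2):
    # Only the small per-node summaries travel over the network.
    return c1 + c2

totals = reduce(reduce_counts, [map_node(node_a), map_node(node_b)])
print(totals["bike"])  # 3
```

The point of the pattern is in the comments: the heavy raw data never moves, only the code and the small intermediate summaries do.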

Next, Mike Batty talked about Data about Cities: Redefining Big, Recasting Small (his paper is available here) – exploring how Big Data was always there: locations can be seen as bundles of interactions – flows in systems. However, visualisation of flows is very difficult, which makes it challenging to understand the results and check them. The core issue is that for N locations there are N^2 interactions, and this quadratic growth as N grows is a continuing challenge in understanding and managing cities. In 1964, Brian Berry suggested a framework of location, attributes and time – but the temporal dimension was suppressed for a long time. With Big Data, the temporal dimension is becoming very important. An example of how understanding the data is difficult is travel flows – the more regions are included, the bigger the interaction matrix, and it then becomes difficult to show and make sense of all these interactions. Even trying to create scatter plots is complex and does not reveal much.
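The scale problem Batty describes is easy to see with a quick sketch (the zone counts below are illustrative): an origin–destination matrix grows quadratically with the number of zones, so refining the zoning quickly outruns what can be visualised or checked by eye.

```python
# For N zones there are N^2 origin-destination interactions,
# so doubling the zoning detail quadruples the matrix.
for n in [10, 100, 1000]:
    print(n, n * n)

# Even a modest regional breakdown produces a matrix that is
# hard to display, let alone interpret:
n_zones = 1000
cells = n_zones ** 2
print(cells)  # -> 1000000
```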

The final talk was from Jo Walsh, titled Putting Out Data Fires: life with the OpenStreetMap Data Working Group (DWG). Jo noted that she is talking from her position as a volunteer in OSM, and recalled that 10 years ago she gave a talk offering a technologically determinist, though not completely utopian, picture of cities, in which OpenStreetMap (OSM) was considered part of the picture. Now, in order to review the current state of OSM activities relevant to her talk, she asked the OSM mailing list for examples. She also highlighted that OSM is big, but it's not Big Data – it can still fit into a single PostgreSQL installation. There is no anonymity in the system – you can find out quite a lot about people from their activity, and that is built into the system. There are all sorts of projects that demonstrate how OSM data is relevant to cities – such as OSM Buildings, which creates 3D buildings from the database, or using OSM with 3D modelling data such as DTMs. OSM supports editing in the browser or with an offline editor (JOSM). Importantly, OSM is not only a map but also a database (like the new OSi database) – as can be shown by running searches on the database from a web interface. There are unexpected projects, such as custom clothing from maps, or Dressmap. More serious surprises are projects like the Humanitarian OSM Team and the Missing Maps project – there are issues with the quality of the data, but also in the fact that mapping is imposed from the outside on an area that is not mapped, with some elements of colonial thinking in it (see Gwilym Eades' critique). The InaSAFE project is an example of disaster modelling with OSM. In Poland, mappers extended the data model to mark details of road areas and other features. All these demonstrate that OSM is getting close to the next level of using geographic information, and there are current experimentations with it. Projects such as the UTC of Mappa Mercia are linking OSM to transport simulations.
Another activity is the use of historical maps – townland.ie .
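Jo's point that OSM is a database, not just a map, is easy to illustrate with a query. The sketch below builds a hypothetical Overpass QL query (Overpass is one public query interface to the OSM database); the area name is illustrative, and the request itself is not sent here:

```python
import urllib.parse

# A hypothetical Overpass QL query fetching building footprints in a
# named area -- treating OSM as a database rather than a map image.
query = """
[out:json][timeout:25];
area["name"="Maynooth"]->.searchArea;
way["building"](area.searchArea);
out geom;
"""

# The query would be POSTed as form data to a public endpoint such as:
endpoint = "https://overpass-api.de/api/interpreter"
payload = urllib.parse.urlencode({"data": query})
print(len(payload))
```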
One of the roles that Jo plays in OSM is as part of the Data Working Group, which she joined following a discussion about diversity in OSM within the community. The DWG needs some help, and its role is part geodata thought police, part janitorial/judicial service, part social-work arm of the volunteer fire force. The DWG cleans up messy imports and deals with vandalism, but also with dispute resolution. They are similar to a volunteer fire service: when something happens, you can see the sysadmins spring into action to deal with an emerging issue. For example, someone from Uzbekistan reported that some new information had been corrupted, so you need to find the changeset and ask people to annotate more – to say what they are changing and why. OSM is self-policing and self-regulating – but different people have different ideas about what they are doing, and different groups have different views of what they want to do. There are also clashes between armchair mappers and surveying mappers – say, a discussion between someone who is editing remotely and a local person who knows the road and asks for the classification to be changed back. The DWG doesn't have a legal basis, and some issues come up because of global cases – for example, translated names that do not reflect local practices. There are tensions between commercial actors that work on OSM and ordinary volunteer mappers. The DWG doesn't have privileges over other users – it is recognised by the community and gathers authority through consensus.

The discussion that followed this session explored examples in OSM – there are conflicted areas, such as Crimea and other contested territories. Pouria explained that in current distributed computing models there are data nodes; the data is kept static, and the code is transferred to it instead of moving the data. There is a growing bottleneck in network latency due to the amount of data. There is a hierarchy of packaging systems that you need to use in order to work with a distributed web system, so tightening up code is an issue.
Rob – there are limits to Big Data, such as hardware and software, as well as the analytics of the information, and limits to how far you can foster community when the size is very large and the organisation is managed by volunteers. Mike – the quality of big data poses rather different problems from traditional data, so while things are automated, making sense of the data is difficult – e.g. a tap-in without a tap-out in the Oyster data. The bigger the dataset, the bigger its issues may be. The level of knowledge that we get is heterogeneous in time, and transfers the focus to the routine – but evidence is important for policy making and making cases. Martijn – how do we move the technical systems to allow the move to focal community practice? Mike – transport modelling is based on the funders promoting digital technology use, and it can be done for a specific place; the question is who are the users? There is no clear view of who they are, and there is wide variety, with different users playing different roles – 'policy analysts' are the first users of models; they are domain experts who advise policy people, with less thought given to informed citizens. How people react to big infrastructure projects shows that the articulation of the policy is different from what comes out of the models; some projects have an open mandate and some a closed one. Jo – OSM has a tradition of mapping parties that bring people together, but they need a critical mass that is already there – so how do you bootstrap this process, for example how do you support a single mapper in Houston, Texas? There are cases of companies using the data while local people used historical information, creating conflict in the way the data is used; sometimes the tension runs very high, and it does need negotiation. Rob – there are issues around the concepts of data citizens and digital citizenship.
Jo – in terms of community governance, the OSM Foundation is very hands-off, and there isn't a detailed process for dealing with corporate employees who map as part of their job. Evelyn – the conventions are matters of dispute and negotiation between participants, and they are being challenged all the time. One of the challenges of dealing with citizenship is to challenge the boundaries and protocols that go beyond the state – retaining the term to separate it from the subject.

The last session in the workshop focused on Data Issues: surveillance and crime 

David Wood talked about Smart City, Surveillance City: human flourishing in a data-driven urban world. The consideration is of the smart city as an archetype of the surveillance society – and because it is part of the surveillance society, one way to deal with it is to consider resisting or abolishing it to allow human flourishing. His interest is in rights – beyond privacy. What is it that we really want for human beings in this data-driven environment? We want all to flourish, and that means starting from the most marginalised, at the bottom of the social order. The idea of flourishing comes from Spinoza and also Luciano Floridi – his anti-entropic information principle. Starting with smart cities – business and government are dependent on large quantities of data, which increases surveillance. Social science ignores the fact that these technologies provide the ground for social life. The smart city concept includes multiple visions: for example, a European vision that puts government first – how to make good government in cities, with technology as part of a wider whole. The US approach asks how we can use information management for complex urban systems; this relies on other technologies – pervasive computing, IoT and things that are woven into the fabric of life. The third vision is the smart security vision – technology used to control urban terrain, with military techniques applied in cities (as well as war zones), for example biometric systems for refugees in Afghanistan, which are about control as well as the provision of services. The history goes back to cybernetics and to policing initiatives from the colonial era. The visions overlap – security is not overtly discussed (apart from by military actors). Smart cities are inevitably surveillance cities – a collection of data for the purposeful control of a population.
Specific concerns of researchers are the targeting of people who fit a certain profile, and the aggregation of private data for profit at the expense of those involved. The critique of surveillance covers social sorting, unfair treatment of people, etc. Beyond that – as discussed in the special issue on surveillance and empowerment – there are positive potentials; many of these systems have a role for the common good. We need to think about the city within neoliberal capitalism, which separates people in space along specific lines and areas, from borders to buildings – trying to make the city into a tamed zone. But the dangerous parts of city life are also a source of opportunities and creativity, and the smart city fits well into this aspect – stopping the city from being disorderly. There is a paper from 1995 critiquing pervasive computing as surveillance: the more it reduces the distance between us and things, the more the world becomes a surveillance device and stops us from acting on it politically. In many visions of pervasive computing, the human is actually marginalised – and this is still the case. There are opportunities for social empowerment, say to allow the elderly to return to areas that they had stopped exploring, or to overcome disability. Participation, however, is flawed – who can participate, in what, where and how? Participation in highly technical projects is limited to a very small group, and participation can also become instrumental – 'sensors on legs'. The smart city could enable us to discover the beach under the pavement (a concept from the Situationists) – though some beaches are being hardened over. The problem is corporate 'walled garden' systems, and we need to remember that we might need to bring them down.

Next, Francisco Klauser talked about Michel Foucault and the smart city: power dynamics inherent in contemporary governing through code. He is interested in the power dynamics of governing through data, taking from Foucault the aim of explaining how power is put into action. He also considers different modes of power: referentiality – how does security relate to governing? Normativity – what is the norm and where did it come from? Spatiality – how are discipline and security spread across space? Discipline is about imposing a model of behaviour on others (the panopticon); security works in another way – it frees things up within limits. The two modes work together. Power starts from the study of a given reality, and data is about the management of flows. The specific relevance to data in cities is shown by looking at refrigerated warehouses that are used within the framework of the smart grid to balance energy consumption – storing and releasing the energy that is preserved in them. The whole warehouse has been objectified and quantified – down to specific products and the opening and closing of doors. He sees the core of this control in connections, processes and flows – think of liquid surveillance, beyond the human.

Finally, Teresa Scassa explored Crime Data and Analytics: Accounting for Crime in the City. Crime data is used in planning, allocation of resources and public policy making – a broad range of uses. It is part of oppositional social justice narratives, and it is an artefact of the interaction of citizen and state, as understood and recorded by the agents of the state operating within particular institutional cultures. She looked at crime statistics that are provided to the public as open data – derived from police files under some guidelines – and also at emergency call data, made from calls to the police, which is used to produce crime maps. The data used in visualisations about the city is not the same data that is used for official crime statistics. There are limits to the data – institutional factors: it measures the performance of the police, not crime. It is about how the police are doing their job – and there are lots of acts of 'massaging' the data by those who are being measured; the stats are manipulated to produce the results that are requested. The police are the sensors, and there is under-reporting of crime depending on the judgement of the police officer – e.g. sexual assault – and also because of the privatisation of policing, as private actors don't report. Crime maps are offered by private-sector companies that sell analytics and then provide a public-facing option – so the narrative is controlled: what will be shared and how. Crime maps are declared to be about 'public awareness or civic engagement', but not transparency or accountability, and they focus on property offences rather than white-collar ones. There are 'alternalytics' – using other sources, such as victimisation surveys, legislation, data from hospitals, sexual assault crisis centres, and crowdsourcing. An example of bottom-up reporting is HarassMap, which started in Egypt. Legal questions include how the relationship between private and public sector data affects ownership, access and control, and how state structures affect data comparability and interoperability.
There is also a question about how law prescribes and limits what data points can be collected or reported.

The session closed with a discussion that explored some examples of solutionism, like crowdsourcing that asks the most vulnerable people in society to contribute data about assaults against them, which is highly problematic. Crime data is popular in portals such as the London one, but there it is mixed with multiple other concerns, such as property prices. David – the utopian concept of platform independence, and the assumption that platforms are without values, is inherently wrong.

The workshop closed with a discussion of the main ideas and lessons that emerged from it, and how all these things play out. Some questions that started emerging: how can crowdsourcing be bottom-up (OSM) and sometimes top-down, with issues about data cultures in citizen science, for example? To what degree do the political aspects of citizenship and subjectivity play out in citizen science? Re-engineering information in new ways, and the rural/urban divide, are issues that bodies such as the Ordnance Survey need to face; there are conflicts within data, which is an interesting piece, and a need to ensure that the data is useful. 'Sensors on legs' is a concept that can be relevant to bodies such as the Ordnance Survey. The concept of the stack is also relevant to where we position our research and what different researchers do – starting from the technical aspects through to how people engage – and the workshop gave a slicing through these layers. An issue that was left out is the business aspect – who will use the data, and how is it paid for? We need public libraries with the information, but also the skills to do things with these data. The data economy is important, and some data will only be produced by the state; but there are issues with the data practices within the state's data agencies – the data is not ready to get out, and if the data is garbage, you can't do much with it: no economy can be based on it. Open questions: when does data produce software? When does it fail? Can we produce data with and without a connection to software? There is also the physical presence of data and its environmental impacts. Citizen engagement with infrastructure is lacking, and we need to tease out how things are opened up for people to get involved. There is also a need to be as nuanced about the city as we were about data.
Try to think about the ways the city is framed: as a site of activities, subjectivity and practices; the city as a source of data to be mined; the city as a political jurisdiction; the city as aspiration – the city of tomorrow; the city as a concentration of flows; the city as a socio-cultural system; the city as a scale for analysis or a laboratory. The workshop title, Data and the City – is the data for the city? Back to environmental issues – data is not ephemeral and does have tangible impacts (e.g. energy use in blockchains, inefficient algorithms, electronic waste (WEEE) that is left in the city). There are also issues of access and control over huge volumes of data – issues covered in papers such as 'Device Democracy' – and wider issues linking technology to broader systems of thought and consideration.

Beyond quantification: a role for citizen science and community science in a smart city

[Image: Arduino sensing in Malta]

The Data and the City workshop will run on the 31st August and 1st September 2015, in Maynooth University, Ireland. It is part of the Programmable City project, led by Prof Rob Kitchin. My contribution to the workshop is titled Beyond quantification: a role for citizen science and community science in a smart city and extends a short article from 2013 that was published by UCL's Urban Lab, as well as integrating concepts from philosophy of technology that I have used in a talk at the University of Leicester. The abstract of the paper is:

“When approaching the issue of data in Smart Cities, there is a need to question the underlying assumptions at the basis of Smart Cities discourse and, especially, to challenge the prevailing thought that efficiency, costs and productivity are the most important values. We need to ensure that human and environmental values are taken into account in the design and implementation of systems that will influence the way cities operate and are governed. While we can accept science as the least worst method of accumulating human knowledge about the natural world, and appreciate its power to explain and act in the world, we need to consider how it is applied within the city in a way that does leave space for cultural, environmental and religious values. This paper argues that a specific form of collaborative science – citizen science and community science – is especially suitable for making Smart Cities meaningful and democratic. The paper uses concepts from Albert Borgmann’s philosophy of technology – especially those of the Device Paradigm and Focal Practices – to identify the areas where sensing the city can gain meaning for the participants.”

The paper itself can be accessed here.

Other papers from the same workshop that are already available include:

Rob Kitchin: Data-Driven, Networked Urbanism

Gavin McArdle & Rob Kitchin: Improving the Veracity of Open and Real-Time Urban Data

Michael Batty: Data About Cities: Redefining Big, Recasting Small

More details on the workshop will appear on the project website

Citizen Science and Policy – possible frameworks

Back in February, my report ‘Citizen Science & Policy: a European Perspective‘ was published by the Wilson Center in the US. As I was trying to make sense of the relevance of citizen science to policy making, I used a framework that included the level of geography, the area of policy making, and the type of citizen science activity. This helped in noticing that citizen science works well at the neighbourhood, city and national scales, but not so well at the regional and international levels. The reasons for this are mostly jurisdiction, funding, organisational structure and scale of operation.

Later on, at a workshop that was organised by Prof Aletta Bonn on Citizen Science and Policy impact, the idea of paying attention to the role of citizen science within the policy cycle was offered as another dimension of analysis.

Last week, at a workshop that was organised by the European Environment Agency (EEA) as part of their work on coordinating the European Protection Agencies (EPA) Network, I was asked to provide an introduction to these frameworks.

The presentation below starts by noting that citizen science in an EPA is a specific case of using crowdsourced geographic information in government, and that some of the common issues we identified in the report on how governments use crowdsourced information are relevant to citizen science too. Of particular interest are the information flows between the public and government, and the multiple flows of environmental information that the third era of environmental information has brought.

After noting the individual, organisational, business and conceptual issues that influence use in general, I turn to the potential framings that are available – geography, stage in policy formation, and mode of engagement – and after covering those I provide a few examples of cases to illustrate how they fit into this analysis.

It was quite appropriate to present this framework at the EEA, considering that the image used to illustrate the report’s page on the Wilson Center site is of the NoiseWatch app, which was developed by the EEA…