A review of volunteered geographic information quality assessment methods

One of the joys of academic life is the opportunity to participate in summer schools – you bring a group of researchers, from PhD students to experienced professors, to a nice place in the Italian countryside, and for a week the group focuses on a topic – discussing, demonstrating and trying it out. The Vespucci Institute in 2014, which was dedicated to citizen science and Volunteered Geographic Information (VGI), is an example of that. Such activities are more than a summer retreat – there are tangible academic outputs that emerge from such workshops – demonstrating that valuable work is done!

During the summer school in 2014, Hansi Senaratne suggested writing a review of VGI data quality approaches, and together with Amin Mobasheri and Ahmed Loai Ali (all PhD students) started to develop it. Cristina Capineri and I, as summer school organisers and the chair and vice-chair of the COST ENERGIC network (respectively), gave advice to the group and helped them develop a paper aimed at one of the leading journals of Geographic Information Science (GIScience) – the International Journal of GIScience (IJGIS).

Hansi presenting at the Vespucci summer school

The paper went through the usual peer review process, and with a huge effort from Hansi, Amin and Ahmed, it went all the way to publication. It is now out. The paper is titled ‘A review of volunteered geographic information quality assessment methods’ and is accessible through the journal’s website. The abstract is provided below, and if you want the pre-print version, you can download it from here.

With the ubiquity of advanced web technologies and location-sensing hand held devices, citizens regardless of their knowledge or expertise, are able to produce spatial information. This phenomenon is known as volunteered geographic information (VGI). During the past decade VGI has been used as a data source supporting a wide range of services, such as environmental monitoring, events reporting, human movement analysis, disaster management, etc. However, these volunteer-contributed data also come with varying quality. Reasons for this are: data is produced by heterogeneous contributors, using various technologies and tools, having different level of details and precision, serving heterogeneous purposes, and a lack of gatekeepers. Crowd-sourcing, social, and geographic approaches have been proposed and later followed to develop appropriate methods to assess the quality measures and indicators of VGI. In this article, we review various quality measures and indicators for selected types of VGI and existing quality assessment methods. As an outcome, the article presents a classification of VGI with current methods utilized to assess the quality of selected types of VGI. Through these findings, we introduce data mining as an additional approach for quality handling in VGI


Eye on Earth (Day 1 – afternoon) – policy making demand for data and knowledge for healthy living

The afternoon of the first day of Eye on Earth (see previous post for the opening ceremony and the morning sessions) had multiple tracks. I chose to attend ‘Addressing policy making demand for data: dialogue between decision makers and providers’.

The speakers were asked to address four points covering issues of data quality control and assurance, and to identify the major challenges facing data quality for decision-making in the context of crowd-sourcing and citizen science. Felix Dodds, who chaired the session, noted that the process of deciding on indicators for the SDGs is managed through the UN Inter-agency group, and these indicators and standards of measurement need to last for 15 years. There is now also a ‘World Forum on Sustainable Development Data’, and a review of the World Summit on Information Society (WSIS) is also coming. The speakers were asked to think about coordination mechanisms and QA to ensure good quality data, and how accessible the data is. Finally, what is the role of citizen science within this government information? We need to address the requirements of the data – at international, regional, and national levels.

Nawal Alhosany (MASDAR Institute): data is a very important ingredient in making policy when you try to base policy on facts and hard evidence. Masdar is active throughout the sustainability chain, with a focus on energy. The question is how to ensure that data is of good quality, and Masdar recognised a gap in the availability of data 10 years ago. For example, some prediction tools for solar power were not taking into account local conditions, nor was quality assurance suitable for local needs. Therefore, they developed local measurement and modelling tools (ReCREMA). In terms of capacity building, they see issues in human capacity across the region, and try to address them (e.g. lack of an open source culture). In Masdar, they see a role for citizen science – and they make steps towards it through STEM initiatives such as Young Future Energy Leaders and other activities.

David Rhind (Nuffield Foundation): many of the data sets that we want cross national boundaries – e.g. the radioactive plume from Chernobyl. When we want to mix population and environment, we need to deal with mixing boundaries and complex problems with data integrity. There are also serious problems with validity – there are 21 sub-Saharan countries that haven’t done a household survey since 2006, so how can we know about levels of poverty today? There is a fundamental question of what quality is, and how we can define it in any meaningful sense. Mixing data from different sources creates a problem of what quality means. Some cases can rely on international agreements – e.g. UN principles, or the UK regulatory authority that checks statistics. Maybe we should think of international standards like in accountancy. In terms of gaps in capacity, there is quick change due to the need for analysis, and data scientists are becoming available in the UK, but there is an issue with policy makers who do not have the skills to understand the information. Accessible data is becoming common with the open data approach, but many countries make official data less open for security reasons. However, data needs certain characteristics – it needs to be reusable, easy to distribute, public and with open licensing. On citizen science – there are reasons to see it as an opportunity – e.g. OpenStreetMap – but there are many factors that make its integration challenging. There is a need for proper communication – e.g. the miscommunication in L’Aquila.

Kathrine Brekke (ICLEI) – a perspective from local government. Local governments need data for decision-making. Data also makes a city suitable for investment and insurance, and improves transparency and accountability. There are issues of capacity in terms of collecting the data and sharing it, and it comes down even to language skills (if the data is not available in English, international comparison is difficult). There are initiatives such as open.dataforcities.org to allow sharing of city data. There are 100 sustainability indicators that are common across cities and can be shared. In terms of data quality we can also include crowdsourcing – but then we need to ensure that the data will be systematic and comparable. Standards and consistency are key – e.g. a greenhouse gas registry is important and therefore there is a global protocol for collecting the data.

Ingrid Dillo (DANS, Netherlands): there is a data deluge with a lot of potential, but there are challenges about the quality of the data and trust. Quality is about fitness for use. DANS’s aim is to ensure the archiving of data from research projects in the Netherlands. Data quality in science is made up of scientific data quality but also technical quality. Scientific integrity is about the values of science – standards of conduct within science. There are issues with fraud in science that require better conduct. Data management in small projects lacks checks and balances, with peer pressure as the major driver to ensure quality – so open science is one way to deal with that. There are also technical issues such as metadata and data management, so data can be used and stored in a certified, trustworthy digital repository.

Robert Gurney (University of Reading) – in environmental science there is the Belmont Forum e-Infrastructures & Data Management initiative. The Belmont Forum is an association of environmental science funders from across the world. The initiative is meant to deal with the huge increase in data. Scientists are early adopters of technology, and some of the lessons from what scientists are doing can be used by other people in the environmental sector. The aim is to deliver the knowledge that is needed for action. The infrastructure is needed to meet global environmental challenges. This requires working with many supercomputers – the problems are volume, variety, veracity and velocity (Big Data) – we are getting many petabytes and could reach 100 petabytes by 2020. The problem is that data sits in deep silos – even between Earth Observation archives. There is a need to make data open and shareable. There will be 10% of funding going towards e-infrastructure. They created data principles and want to have the principle of open by default.

Marcos Silva (CITES): CITES is about the trade in endangered species. CITES (since the mid 1970s) regulates trade in a multi-billion dollar business with 850,000 permits a year. Each permit says that it is OK to export a specimen without harming the population. It is data driven. CITES data can help in understanding outliers and noticing trends. There are issues of ontologies, schemas, quality etc. between the signatories – similar to environmental information. They would like to track what happens to specimens across the world. They are thinking about a standard for all transactions with specimens, which will create a huge amount of data. Even in dealing with illegal poaching and the protection of animals, there is a need for interoperable data.

Discussion: the Data Shift initiative for citizen-generated data for the SDG goals – is there data that is already used? How are we going to integrate this data with other types of data? We risk filtering citizen science data out because it follows a different framework. Rhind – statisticians are concerned about citizen science data, and will take the traditional view and not use the data. There is a need to have quality assurance not just at the end. The management of indicators and their standards will require the inclusion of suitable data. Marcos asked what is considered citizen science data – e.g. reporting of data by citizens is used in CITES, and there are things to learn, such as how the quality of the data can be integrated with the traditional processes that enforcement agencies use. Science is not just data collection and analysis – in projects such as climateprediction.net multiple people can analyse information. Kathrine talked about crowdsourcing – e.g. reporting of trees in certain cities, so there is also a dialogue about deciding which trees to plant. Ingrid disagreed that data collection on its own is not science. Nawal – doing projects with schools about energy, which open participation in science. Rhind raised the issue of the need for huge data repositories and the question of whether governments are ready to invest. Gurney – there is a need to coordinate the multiple groups and organisations that are dealing with data. There is a huge shortage of people in environmental science with advanced computing skills.

The second session that I attended explored Building knowledge for healthy lives, opened by Jacqueline McGlade – the context of data needs to focus on the SDGs, and health underpins more goals than environmental issues do. UNEP Live is aimed at allowing access to UN data – from country data to citizen science data – so it can be found. The panel explored many relations to health: climate change and its impact on people’s lives and health, heatwaves, and issues of vulnerability to extreme events. Over 40 countries want to use the new air quality monitoring that UNEP developed, including the community in Kibera.

Hayat Sindi is the CEO of the i2Institute, exploring social innovations. Our ignorance about the world is profound. We are teaching children about foundational theories without questioning science heroes and theories, as if things are static. We are elevating ideas from the past and don’t question them. We ignore the evidence. The fuel for science is observation. We need to continue to create technology to improve life. Social innovation is important – and she learned it from Diagnostics For All (DFA) from MIT. DFA is low cost, portable, easy to use and safely disposable. The full potential of social innovation is not fulfilled. True scientists need to talk with people, understand their needs, and work with them.

Maria Neira (WHO) – all the SDGs are linked to health. A core question is what the environmental determinants of health are. Climate change, air quality – all these are part of addressing health and wellbeing. There is a need to provide evidence-based guidelines, and the WHO also promotes health impact assessment for major development projects. There are different sectors – housing, access to water, electricity – some healthcare facilities lack access to a reliable source of energy. Air pollution is a major issue that the WHO recognises as a challenge – killing 7 million people a year. With air quality we don’t have a choice or a warning label like we do with tobacco. The WHO is offering indicators on access to energy, which require measuring exposure to air pollution. There is a call for strong collaboration with other organisations. A global platform on air quality and health is being developed, aiming to enhance the estimation of the impacts of air quality.

Joni Seager (GGEO coordinating lead author) talked about gender and the global environment outlook. She looks at how gender is not included in health and environmental data. The first example – collecting gender data and then hiding it. Gender analysis can provide better information that can help in decision making and policy formation. The second method is dealing with households as the unit of analysis – as if education, access to a car or food security are household-level attributes – but in reality there is no evidence that food security is a household-level attribute: men and women have different experiences and coping strategies, with significant differences between them. Household data is the view of the men and not the real information. Household data makes women especially invisible. There are also cases where data is simply not collected – in some areas, e.g. sanitation, information is not collected. If we are building knowledge for healthy lives, we should ask whose knowledge and whose life?

Parrys Raines (Climate Girl) grew up in Australia and wants to protect the environment – she heard about climate change at age six and then sought to research and learn about the data, but the information is not accessible to young girls. She built close relationships with UNEP. There are different impacts on young people. She is also sharing information about air quality and pollution so that youth can be included in the discussion and in solutions. Youth need to be seen as a resource across different levels – a sharing generation with global thinking. Intergenerational communication is critical, and knowledge of data is critical for the 21st century. Organisations need to go out and support youth – from mentoring to monetary support.

Iman Nuwayhid talked about health and ecological sustainability in the Arab world. Many Millennium Development Goals (MDGs) have been achieved, but most of the countries in the region fell short of achieving them. In ecological sustainability, the picture is gloomy in the Arab world – many countries don’t have access to water. Demand for food is beyond the capacity of the region to produce. The population is expected to double in the next 30 years. Poorer countries have high fertility, and there is a lot of displacement: war, economic and environmental. In development, there are striking inequities in the region – it contains some of the wealthiest and some of the poorest countries in the world. Distribution of water needs to consider which sector should use it. Comparing health and military expenditure, the Arab world spends much more on the military than on health. There is interaction between environment, population and development. The region’s ecological footprint is the highest and increasing. There are also issues of political instability that can be caused by environmental stresses. Displacement of people between countries creates new stresses and questions the value of state-based analysis. Uncertainty is a major context for the region and for science in general.

Discussion: on the air quality issue – monitoring is not enough without understanding toxicity and dispersion. Air pollution is also affected by activities such as stone quarries. There is a need to balance monitoring efforts with accuracy and the costs of acting, and to develop models and methods to think about their use. Light and noise in some urban areas also have impacts, not just on mortality but on quality of life and mental health.

Two side events of interest ran in parallel:

The European Environmental Bureau presented a side event on collaborative research and activist knowledge on environmental justice. Pressure on resources means extractive industries operate in the South with the outcomes used in the North. There is an increased level of conflict in the South. The EJOLT project is a network of 23 partners in 23 countries. It is collaborative research between scientists, grassroots organisations, NGOs and legal organisations. They had a whole set of results. A visible result is the Atlas of Environmental Justice. There is plenty to say about citizen science and how important it is that information comes from people who are close to the ground. They work with a team in ecological economics that created a moderated process for collecting and sharing information. The atlas allows looking at information according to different categories, and this is linked to stories about each conflict and its history, as well as further details about it. The atlas is a tool to map conflicts but also to try and resolve them. The EEB sees the atlas as ongoing work and wants to continue to develop sources of information and reporting. Updating and maintaining the tool is a challenge that the organisation faces.

At the same time, the best practice guidelines ‘Putting Principle 10 into Action’ were launched, building on the experience from the Aarhus guide. There are plenty of case studies and information, and it will be available on the UNEP website.

The gala dinner included an award to the Senseable City Lab project in Singapore, demonstrating the development of personalised travel plans that can help avoid pollution, based on 30-40 participants who collected data using cheap sensors.

Notes from ICCB/ECCB 2015 (Day 2) – Citizen Science data quality

Poster session at ICCB/ECCB 2015

The second day of ICCB/ECCB 2015 started with a session that focused on the use and interpretation of citizen science data. The symposium ‘Citizen Science in Conservation Science: the new paths, from data collection to data interpretation’ was organised by Karine Princé and included the following talks:

Bias, information, signal and noise in citizen science data – Nick Isaac – the information content of a dataset depends on the question, on what was captured and how, and on survey effort. Data comes in different ways from a range of people who collect it for different purposes. Biological records are unstructured – they don’t address a specific question, and we need to know how they came about – information about the data collection protocols is important to make sense of the data. If you are collecting data through citizen science, remember that the data will outlive the project, so you need good metadata and data standards to ensure that it can be used by others. There are powerful statistical tools, and we should use them to model the bias rather than try to avoid it; a little bit of metadata goes a long way, so it is worth recording.

Conservation management prioritization with citizen science data and species abundance models – Alison Johnston (BTO/Cornell Lab of Ornithology) – distributions of species are dynamic and change by season. This is especially important for migratory birds – conservation at specific times (wintering, breeding or migrating). The BirdReturns programme in California floods rice fields to provide waterbird habitat, and is effective and not hugely costly. However, dynamic conservation needs precise information. Citizen science data can help in occurrence models, and they want to identify abundance, as this will help to prioritise the activities. They used eBird data. In California there are 230,000 checklists, but there are biases in the data: variable effort and expertise, and bias in sites, seasons and time. There are also different relationships with habitat, and it is difficult to identify extreme abundance. They used Spatio-Temporal Exploratory Models (STEM), which allow modelling with random grids – averaging across cells that have different origins (Fink et al. 2010, Ecological Applications). Using the model, they identified areas of high activity – especially with the abundance model. Of the two models, the abundance model seems more suitable for using citizen science data for dynamic conservation. The results were used with a reverse auction to maximise the use of the available funds and provide large areas of temporary wetland.
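
To make the grid-averaging idea concrete, here is a minimal sketch of STEM-style ensemble averaging. It is not the eBird implementation: the data are simulated, and the ‘local model’ fitted in each cell is reduced to a simple mean count, whereas the real method fits a full statistical model per cell (Fink et al. 2010).

```python
# Minimal sketch of STEM-style averaging over grids with random origins
# (simulated data; the per-cell "model" is just a mean count for brevity).
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical observations: x/y locations and counts of a species
xy = rng.uniform(0, 100, size=(5000, 2))
counts = rng.poisson(lam=1 + 0.05 * xy[:, 0])  # abundance increases eastwards

def fit_local_mean(cell_counts):
    # Stand-in for a local statistical model: the mean count in the cell
    return cell_counts.mean() if len(cell_counts) else np.nan

cell_size = 20.0
pred_point = np.array([55.0, 40.0])   # location where we want a prediction
ensemble = []

for _ in range(100):                  # 100 grids with randomly shifted origins
    origin = rng.uniform(0, cell_size, size=2)
    cell_idx = np.floor((xy - origin) / cell_size).astype(int)
    target_cell = np.floor((pred_point - origin) / cell_size).astype(int)
    in_cell = np.all(cell_idx == target_cell, axis=1)
    ensemble.append(fit_local_mean(counts[in_cell]))

print("Ensemble-averaged abundance estimate:", np.nanmean(ensemble))
```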

Citizen sciences for monitoring biodiversity in habitat structured spaces – Camille Coron (Paris Sud) described a model to estimate the abundances of several species, using several datasets collected under different types of protocols from citizen science projects – some with strong protocols and some without. They assume that space is covered with different types of habitat, but the habitat itself is not known. They look at bird species in Aquitaine – 34 species. Two datasets come from precise protocols and the third dataset is opportunistic. They developed a statistical model to estimate abundance, using a detection probability and the intensity of the observation activity; in the opportunistic dataset the effort is not known. The model gives important gains when species are rare, when the considered species is hardly detected in the data, and when there are many species. By combining with the robust-protocol projects, the estimation of species distribution is improved.

Can opportunistic occurrence records improve the large-scale estimation of abundance trends? – Joern Pagel – there is a lack of comprehensive data on large-scale variation in abundance, and he described a model that deals with it. The model is based on the assumption that population density is a main driver of variation in species detectability. Using UK butterfly data they tested the model, combining the very detailed local transects (140 with weekly monitoring) with opportunistic presence records (over 500K records) on a 10×10 km grid. The transects were used to estimate abundance (described in a paper in Methods in Ecology and Evolution). They found that opportunistic occurrence records can carry a signal of population density, but one needs to be careful about the assumptions, and there are high uncertainties associated with it.

When do occupancy models produce reliable inferences from opportunistic data? – Arco van Strien (Statistics Netherlands) – Statistics Netherlands are involved in butterfly and dragonfly monitoring, from transects and also from opportunistic data. Opportunistic data is unstandardised, and artificial trends can appear if effort varies over time – so the idea was to account for changes in recorder effort using occupancy models. They coupled two logistic regression models – modelling the ecological process and the observation process. They wanted to explore the usefulness of opportunistic data and occupancy models, and used a Bayesian model, evaluating the results against standardised data. They looked at several inferences: phenology (trying to find the peak date in detection), national trends in distribution, species richness per site, and local trends in distribution. For the peak date they found a 0.9 correlation between opportunistic data and standardised data. For national trends there is also strong correlation – 0.8/0.9. For species richness the correlation is also over 0.9, but for local trends the correlation drops to 0.4-0.5 for both butterflies and dragonflies. The conclusion: opportunistic data is valuable, but one needs to be careful about the inferences drawn from it.
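
For readers unfamiliar with the approach, here is a minimal sketch of the idea behind occupancy models: one process governs whether a site is occupied, and a separate observation process governs whether the species is detected on a visit. It is a deliberately simplified, simulated example with constant probabilities fitted by maximum likelihood – not the coupled logistic regressions or the Bayesian machinery described in the talk.

```python
# Minimal single-season occupancy model: simulate data, then recover
# occupancy (psi) and detection (p) probabilities by maximum likelihood.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n_sites, n_visits = 200, 5
psi_true, p_true = 0.4, 0.3           # true occupancy and detection probabilities

z = rng.binomial(1, psi_true, n_sites)                          # ecological process
y = rng.binomial(1, p_true * z[:, None], (n_sites, n_visits))   # observation process

def neg_log_lik(params):
    psi, p = 1 / (1 + np.exp(-np.asarray(params)))   # logit -> probability
    det = y.sum(axis=1)
    # Detected sites are certainly occupied; never-detected sites may be
    # occupied-but-missed or genuinely unoccupied.
    ll_detected = np.log(psi) + det * np.log(p) + (n_visits - det) * np.log(1 - p)
    ll_never = np.log(psi * (1 - p) ** n_visits + (1 - psi))
    return -np.sum(np.where(det > 0, ll_detected, ll_never))

fit = minimize(neg_log_lik, x0=[0.0, 0.0])
print("Estimated psi, p:", 1 / (1 + np.exp(-fit.x)))
```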

Making sense of citizen science data: a review of methods – Olivier Gimenez (CNRS) – his interest is in large terrestrial and marine mammals, which are difficult to monitor in the field, so he is thinking about how citizen science data can be used for that. He looked at papers involving citizen science, specifically those that deal with the data, and wanted to build a taxonomy of the methods used to handle citizen science data. He identified five approaches. First, the filtering and correction approach – you know, or assume you know, the bias and try to correct it, e.g. list length analysis; these are highly sensitive to specific biases. The second category is the simulation approach – simulate the bias and check how your favourite method behaves given this bias. The third is a regression approach – use relevant variables to account for biases, e.g. ecological variables used to build predictive models, together with observer bias variables such as distance from cities. The fourth is the combination approach – combine citizen science data with data from a standard protocol to understand and correct the data. The last is the occupancy approach – correcting for false negatives and temporal/spatial variation in detection, which can be extended to deal with false positives and with multiple species. Conclusion: we should focus more on the citizens when describing the models – we need to understand more about them (e.g. record the data and the people that collected it), and social science has a major role to play.

 

In the session Paths for the future: building conservation leadership capacity, Krithi Karanth (Wildlife Conservation Society) looked at ‘Citizen scientists as agents for conservation’. In the 1980s WCS started monitoring tigers, and some people who were not trained scientists wanted to join in. What drew people in was an interest in tigers, and that was the start of their citizen science journey – 5,000 km walked in 229 transects in the forest. It started with ecological surveys across entire regions, from charismatic species to rare species. Current projects have 40-50 volunteers in amphibian and bird surveys outside protected areas. The volunteers identify rare species. As the projects grew, so did the challenges – e.g. around human-wildlife conflicts – and that helped in having over 5,000 villages and 7,000 households surveyed in their area. Through the fieldwork, people understand conservation better. Another project recruited 75 volunteers to document tourism impacts, and the results were used in a Supreme Court decision on how to regulate tourism. They have over 5,000 citizen scientists, with an active group of 1,000 at any moment. The impact over 30 years: over 10,00 surveys in 15 states in India, with over 250 academic publications and 300 popular articles. A lot of the people who volunteered evolved into educators, film-makers, conservationists, activists and academics, and they also share information through blogs, articles and films. Recognition is also increasing in graduate programmes – with professional master’s programmes. Some of the volunteers – about 10% – become fully committed to conservation, but the other 90% are critical to wider engagement in society.

 

Third day of INSPIRE 2014 – any space for civil society and citizens?

On the last day of the INSPIRE conference, I attended a session about apps and applications and the final plenary, which focused on the knowledge-based economy and the role of INSPIRE within it. Some notes from the talks, including my interpretations and comments.

Debbie Wilson from the Ordnance Survey highlighted the issues that the OS is facing in designing next generation products, from an information architect’s point of view. She noted that the core large-scale product, MasterMap, has been around for 14 years and has been provided in GML all the way through. The client base in the UK is now used to it and happy with it (when it was introduced there was a short period of adjustment, as I recall, but I assume that by now everything is routine). Lots of small-scale products are becoming open and are also provided as linked data. The user community is more savvy – they want the Ordnance Survey to push data to them, and to access the data through existing or new services rather than just be given the datasets without further interaction. They want to see ease of access and use across multiple platforms. The OS is considering moving away from provision of data towards online services as the main way for people to get access to the data. The OS is investing heavily in mobile apps for leisure, but also in helping the commercial sector develop apps that are based on OS data and tools. For example, the OS Locate app works worldwide, so it is not only for the UK. They also put effort into creating APIs and SDKs – such as OS OnDemand – and into allowing local authorities to update their address data. There is also a focus on cloud-based applications – such as applications to support government activities during emergencies. On the information architecture side, they are moving from product to content. The OS will continue to maintain content that is product-agnostic and to run the internal systems for a long period of 10 to 20 years, so they need to decouple outward-facing services from the internal representation. The OS needs to be flexible to respond to different needs – e.g. in file formats it will be GML, RDF and ontologies, but also CSV and GeoJSON. Managing the rules between the various formats is a challenging task. Different representations of the same thing – for example 3D and 2D representations – are another challenge.
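
To illustrate what ‘different representations of the same thing’ means in practice, here is a small, purely illustrative sketch of one hypothetical feature serialised as both CSV and GeoJSON – the feature ID, name and coordinates are made up and do not reflect the OS data model; the point is only that the mapping rules between formats must be kept consistent.

```python
# Illustrative only: the same hypothetical feature serialised as CSV and GeoJSON.
import csv, json, io

feature = {"id": "feature-0001", "name": "Example Building", "x": 530000.0, "y": 180000.0}

# CSV serialisation
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=feature.keys())
writer.writeheader()
writer.writerow(feature)
print(buf.getvalue())

# GeoJSON serialisation of the same record
geojson = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [feature["x"], feature["y"]]},
    "properties": {"id": feature["id"], "name": feature["name"]},
}
print(json.dumps(geojson, indent=2))
```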

Didier Leibovici presented work based on the COBWEB project, discussing quality assurance for crowdsourced data. In crowdsourcing there are issues with the quality of both the authoritative and the crowdsourced data. The COBWEB project is part of a set of five citizen observatories exploring air quality, noise, water quality, water management, flooding and land cover, and odour perception and nuisance; they can be seen at http://www.citizen-obs.eu. COBWEB is focusing on the infrastructure and management of the data. The pilot studies in COBWEB look at land use/land cover, species and habitat observations, and flooding. They mix sensors in the environment and then get the data in different formats, and the way to manage it is to validate the data, approve its quality and make sure that it is compliant with needs. The project involves designing an app and then encouraging people to collect the data, and there can be a lack of connection to other sources of data. The issues they highlight are quality/uncertainty, accuracy, trust and relevance. One of the core questions is ‘does crowdsourced data need different QA/QC from any other data?’ (my view: yes, but depending on the trade-offs in terms of engagement and process). They see a role for crowdsourcing in NSDIs, with real-time data capture QA and post-collection dataset QA (they do both), and there is also re-use and conflation of data sources. QA is aimed at knowing what is collected – there are multiple ways to define the participants, which mean different ways of involving people, and this has implications for QA. They suggest a stakeholder quality model with principles such as vagueness, ambiguity, judgement, reliability, validity, and trust. There is a paper in AGILE 2014 about their framework. The framework suggests that the people who build the application need to develop the QA/QC process, and do that with a workflow authoring tool, which is supported by an ontology, and then run it as a web processing service. The temporality of data needs to be considered in the metadata, as well as how to update the metadata on data quality.

Patrick Bell considered the use of smartphone apps – in a project of the BGS and the EU JRC they reviewed existing applications. The purpose of the survey was to explore what national geological organisations can learn from the shared experience of developing smartphone apps – especially in the geological sector. Who is doing the development work and which partnerships are created? What barriers are perceived and what is the role of the INSPIRE directive in the development of these apps? They also try to understand who the users are. There are 33 geological survey organisations in the EU and they received responses from 16 of them. They found 23 different apps. From BGS there is iGeology (http://www.bgs.ac.uk/igeology/home.html), which provides access to geological maps and gives access to subsidence and radon risk with in-app payment. They have soil information in the MySoil app, which allows people to get some data for free, with the ability to add information and do citizen science. iGeology 3D adds AR to display a view of the geological map locally. aFieldWork is a way to capture information in the harsh environment of Greenland. GeoTreat provides information on sites with special value that are relevant to tourists or geology enthusiasts. BRGM’s i-infoTerre provides geological information to a range of users with an emphasis on professionals, while i-infoNappe tells you about the ground water level. The Italian organisation developed Maps4You with hiking routes, combining geology with this information in the Emilia-Romagna region. The Czech Geological Survey provides data in ArcGIS Online.

The apps deal with a wide range of topics, among them geohazards, coastlines, fossils, shipwrecks… The apps mostly provide map data and 3D, data collection, and tourism information. Many organisations that are not developing anything stated no interest or priority to do so, and also a lack of skills. They see Android as the most important platform; all apps are free, with some then offering in-app purchases. The apps are updated on a yearly basis. About 50% develop the app in house, and most work in partnerships when developing apps. Some focus on web apps that work on mobile platforms or on cross-platform frameworks, but these are not as good as native apps, though the latter are more difficult to develop and maintain. Many use the ESRI SDK and open licences. Mostly there is a lack of promotion of reusing the tools – most organisations simply serve data. Barriers: supporting multiple platforms, software development skills, lack of reusable software, and limited support for reuse across communities – there is a heavy focus on data delivery, and OGC and REST services are used to deliver data to an app. Most respondents suggested there is no direct link to INSPIRE, but the principles of INSPIRE are at the basis of these applications.

Timo Aarmio presented the Oskari platform for releasing open data to end users (http://www.oskari.org/). It offers role-based security layers with authenticated users and four levels of permissions – viewing, viewing on embedded maps, publishing and downloading. The development of Oskari started in 2011; it is used by 16 member organisations and the core team is run from the National Land Survey of Finland. It is used in the Arctic SDI, ELF and the Finnish Geoportal – and in lots of embedded maps. The end-user features allow searching metadata and searching map layers by data provider or INSPIRE theme. It has drag-and-drop layers and customisation of features in WFS. Sharing is also possible, with users uploading shapefiles. It also has printing functionality that produces PNG or PDF, and provides embedded maps so you can create a map and then embed it in your web page. The data sources that they support are OGC web services – WMS, WMTS, WFS, CSW – as well as ArcGIS REST, data import for shapefiles and KML, and JSON for thematic maps. Spatial analysis is provided with the OGC Web Processing Service, offering six basic methods – buffer, aggregate, union, intersect, union of analysed layers, and area and sector. They are planning to add thematic maps and more advanced spatial analysis methods, and to improve mobile device support. 20-30 people work on Oskari, with 6 people at the core of it.
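
As a concrete illustration of the kind of OGC services a platform like Oskari consumes, here is a minimal sketch of a WFS 2.0 GetFeature request made directly over HTTP. The endpoint URL and layer name are hypothetical placeholders, not part of Oskari or any real service.

```python
# Minimal sketch: fetch features from an OGC WFS 2.0 endpoint as GeoJSON.
import requests

params = {
    "service": "WFS",
    "version": "2.0.0",
    "request": "GetFeature",
    "typeNames": "example:municipalities",   # hypothetical layer name
    "outputFormat": "application/json",
    "count": 10,                              # limit the number of features returned
}
resp = requests.get("https://example.org/geoserver/wfs", params=params, timeout=30)
resp.raise_for_status()
features = resp.json()["features"]
print(f"Fetched {len(features)} features")
```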

The final session focused on the knowledge-based economy and the link to INSPIRE.

Andrew Trigg provided the perspective of HMLR on fuelling the knowledge-based economy with open data. The Land Registry deals with 24 million titles, with 5 million property transactions a year. They have provided open access to individual titles since 1990, and INSPIRE and the open data agenda have been important to the transition that they went through in the last 10 years. Their mission now includes an explicit reference to the management and reuse of land and property data, and this is important in terms of how the organisation defines itself. In the UK context there is a shift to open data through initiatives such as INSPIRE, the Open Government Partnership, the G8 Open Data Charter (open by default) and national implementation plans. For HMLR, there is the need to be INSPIRE compliant, but in addition they have to deal with the Public Data Group, the outcomes of the Shakespeare review and a commitment to a national information infrastructure. As a result, HMLR now lists 150 datasets, but some are not open due to the need to protect against fraud and other factors. INSPIRE was the first catalyst to indicate that HMLR needed to change practices, and it allowed the people in the organisation to drive changes, secure resources and invest in infrastructure for it. It was also important for highlighting to the board of the organisation that data will become important, and it was a driver for improving quality before releasing data. The parcel data is available for use without registration. They have had 30,000 downloads of the index polygons by people that can potentially use them. They aim to release everything that they can by 2018.

The challenges that HMLR experienced include data identification, infrastructure, governance, data formats and others. But the most important for the knowledge-based economy are awareness, customer insight, benefit measurement and sustainable finance. HMLR invested effort in promoting the reuse of their data; however, because there is no registration, there is no customer insight and no relationships are being developed with end users – a voluntary registration process might be an opportunity to develop such relations. Evidence is growing that few people are using the data because they have low confidence in the commitment to providing the data and guaranteeing stability of format, which they need in order to build applications on top of it, and addressing that will require building trust. Knowing who got the data is critical here, too. Finally, sustainable finance is a major thing – HMLR is not allowed to cross-finance from other areas of activity, so they have to charge for some of their data.

Henning Sten Hansen from Aalborg University talked about the role of education. The talk was somewhat critical of the corporatisation of higher education, but also accepting of some of its aspects, so what follows might misrepresent his views, though I think he mostly tried to raise questions. Henning started by noting that knowledge workers are defined by the OECD as people who work autonomously and reflectively, use tools effectively and interactively, and work well in heterogeneous groups (so they are capable of communicating and sharing knowledge). The Danish government’s current paradigm is to move from the ‘welfare society’ to the ‘competitive society’, so the economic aspects of education are seen as important, as well as the contribution to the enterprise sector, with expectations that students will learn to be creative and entrepreneurial. The government requires more efficiency and performance from higher education, and as a result reduces the autonomy of individual academics. There is also an expectation of certain impacts from academic research, with an emphasis on STEM for economic growth, governance support from the social sciences, and the humanities expected to contribute to creativity and social relationships. Commercialisation is highlighted, pushing patenting, research parks and commercial spin-offs. There is also a lot of corporate-style behaviour in the university sector – universities are sometimes managed as firms and education thought of as a consumer product. He sees a problem in today’s strange focus and the opinion that you can measure everything with numbers alone. The ‘Google dream’ is also invoked – assuming that anyone from any country can create global companies. However, researchers who need time to develop their ideas more deeply – such as Niels Bohr, who didn’t publish much or constantly secure funding – wouldn’t survive in the current system. But is there a link between education and success? The LEGO founder didn’t have any formal education [though with this example, as with Bill Gates and Steve Jobs, strangely their businesses employ lots of PhDs – so there is a confusion between the person who starts a business and its realisation]. He then moved from this general context to INSPIRE: geoinformation plays a strong role in e-governance and in the private sector, with the increasing importance of location-based services. In this context, projects such as GI-N2K (Geographic Information Need to Know) are important. This is a pan-European project to develop the body of knowledge that was formed in the US and adapt it to current needs. They have already identified major gaps between the supply side (what people are being taught) and the demand side – there are four areas that are covered on the supply side, but the demand side wants wider areas to be covered. They aim to develop a new BoK for Europe and to facilitate knowledge exchange between institutions. He concluded that higher education is without doubt a prerequisite for the knowledge economy, but the link to innovation is unclear. Challenges: highly educated people crowd out the job market and do routine work that does not match their skills; the relationships to entrepreneurship, innovation and the knowledge needed to implement ideas are unclear; and what is the impact of controlling universities on innovation and education – and how can universities respond quickly to market demands for skills when the time scales are so different?

Giacomo Martirano provided the perspective of a micro-enterprise (http://www.epsilon-italia.it/IT/) in southern Italy. They are involved in INSPIRE across different projects – GeoSmartCities, Smart-Islands and SmeSpire – so lots of R&D funding from the EU. They are also involved in providing GIS services in their very local environment. From the perspective of an SME, he sees barriers that are organisational, technical and financial. They have seen many cases of misalignment of the technical competencies of different organisations, which means that they can’t participate fully in projects – also misalignment of the technical abilities of clients and suppliers, and heterogeneity in client organisational culture, which adds challenges. The financial management of projects and payments to organisations creates problems for SMEs joining in because of sensitivity to cash flow. They have experienced cases where contracts were awarded at a price that was sometimes 40% below the reference one. There is a need to invest more and more time with less aware partners and clients. When moving to the next generation of INSPIRE, there is a need to engage with micro-SMEs in the discussion – ‘don’t leave us alone’ – as the market is unfair. There is also a risk that member states, once the push for implementation is reduced and without the EU driver, will not continue to invest. His suggestion is to progress and think of INSPIRE as a service – SDI as a Service could allow SMEs to join in. There is a need for cooperation between small and big players in the market.

Andrea Halmos (public services unit, DG CONNECT) – covering e-government, she noted her realisation that INSPIRE is more than ‘just environmental information’. From DG CONNECT’s view, ICT enables open government, and the aim of the Digital Agenda for Europe is to empower citizens and businesses, strengthen the internal market, highlight efficiency and effectiveness, and recognise pre-conditions. One focus is the effort to put public services in digital format and provide them in a cross-border way. The principles are to try to be user-centred, with transparency and cross-border support – they have used life events for the design. There are specific activities in sharing identity details, procurement, patient prescriptions, business, and justice. They see these projects as the building blocks for new services that work in different areas. They are seeing challenges such as the financial crisis, but also the challenge of new technologies and social media, as well as more opening of data. So what is next for public administration? They need to deal with the customer – open data, open process and open services – with importance given to transparency, collaboration and participation (http://www.govloop.com/profiles/blogs/three-dimensions-of-open-government). The services are open for others to join in, allowing third parties to create different public services. There are analogies in opening decision-making processes and supporting collaboration with people – it might increase the trust in and accountability of government. Public services need to collaborate with third parties to create better or new services. ICT is only an enabler – you need to deal with human capital, organisational issues, cultural issues, processes and business models – it even questions the role of government and what it needs to do in the future. There is a governance issue – what is the public value that is created at the end? Will government become a platform that others use to create value? They are focusing on societal challenges. Comments on their framework proposals are welcomed – they are available at http://ec.europa.eu/digital-agenda/en/news/vision-public-services

After these presentations, and when Alessandro Annoni (who was chairing the panel) completed the first round of questions, I was bothered that in all these talks about the knowledge-based economy only the government and the private sector were mentioned as actors, and even when discussing the development of new services on top of the open data and services, the expectation is only for the private sector to act. I therefore asked about the role of the third sector and civil society within INSPIRE and the visions that the different speakers presented. I even provided the example of mySociety – mainly to demonstrate that third-sector organisations have a role to play.

To my astonishment, Henning, Giacomo, Andrea and Alessandro answered this question by, first, not treating civil society as organisations but mostly as individual citizens – a framing that allows commercial bodies, large and small, to act, but in which citizens do not have a clear role in coming together and acting. Secondly, the four of them saw the role of citizens only as providers of data and information – such as the reporting in FixMyStreet. Moreover, each one repeated that despite the fact that this is low quality data, it is useful in some ways. For example, Alessandro highlighted OSM mapping in Africa as a case where you accept it because there is nothing else (really?!?), but in other places it should be used only when it is needed because of the quality issue – for example, in emergency situations when it is timely.

Apart from yet another repetition of dismissing citizen-generated environmental information on the false argument of data quality (see Caren Cooper’s post on this issue), the views presented in the talks helped me in crystallising some of my thoughts about the conference.

As one would expect, because the participants are civil servants, on stage and in presentations they follow the main line of the decision makers for whom they work, and therefore you could hear the official line about efficiency, managing to do more with reduced budgets and investment, and emphasising economic growth and a very narrow definition of the economy that matters. Different views were expressed during breaks.

The degree to which citizens are not included in the picture was unsurprising given the mode of thinking expressed at the conference about the aims of information as ‘economic fuel’. While the tokenism of improving transparency, or even empowering citizens, appeared on some slides and in discussions, citizens are not explicitly included in a meaningful and significant way in the consideration of the services or in the visions of ‘government as platform’. They are perceived as customers or service users. The lessons that were learned in environmental policy areas in the 1980s and 1990s, which are to provide an explicit role for civil society, NGOs and social enterprises within the process of governance and decision making, are missing. Maybe this is because a thriving civil society needs active government investment (community centres need to be built, someone needs to be employed to run them), so it doesn’t match the goals of those who are using austerity as a political tool.

Connected to that is the fact that although, again at the tokenism level, INSPIRE is about environmental applications, the implementation is now driven entirely by a narrow economic argument. As with citizenship issues, environmental aspects are marginalised at best, or ignored.

The comments about data quality, and some of the responses to my talk, reminded me of Ed Parsons’ commentary from 2008 about the UK GIS community’s reaction to Web Mapping 2.0/Neogeography/GeoWeb. Six years on from that, the people who are carrying out the most important geographic information infrastructure project currently going – and it is progressing well by the look of it – seem somewhat resistant to trends that are happening around them. Within the core area that INSPIRE is supposed to handle (environmental applications), citizen science has the longest history and is already used extensively. VGI is no longer new, and crowdsourcing as a source of actionable information now has a decade of history and more behind it. Yet, at least in the presentations and the talks, citizens and civil society organisations have very little role unless they are controlled and marshalled.

Despite all this critique, I have to end on a positive note. It has been a while since I attended a GIS conference that includes the people who work in government and other large organisations, so I found the conference very interesting as a way to reconnect and learn about the nature of geographic information management at this scale. It was also good to see how individuals champion the use of GeoWeb tools, and the degree to which people are doing user-centred design.