GSF-NESTI Open Science & Scientific Excellence workshop – researcher, participant, and institutional aspects

The Global Science Forum – National Experts on Science and Technology Indicators (GSF-NESTI) Workshop on “Reconciling Scientific Excellence and Open Science” (for which you can see the full report here) asked the question “What do we want out of science and how can we incentivise and monitor these outputs?”. In particular, the objective of the workshop was “to explore what we want out of public investment in science in the new era of Open Science and what might be done from a policy perspective to incentivise the production of desired outputs”, with the aim of exploring the overarching questions of:
1. What are the desirable (shorter-term) outputs and (longer-term) impacts that we expect from Open Science and what are potential downsides?
2. How can scientists and institutions be incentivised to produce these desirable outcomes and manage the downsides?
3. What are the implications for science monitoring and assessment mechanisms?

The session that I was asked to contribute to focused on societal engagement: “The third pillar of Open Science is societal engagement. Ensuring open access to scientific information and data, as considered in the previous sessions, is one way of enabling societal engagement in science. Greater access to the outputs of public research for firms is expected to promote innovation. However, engaging with civil society more broadly to co-design and co-produce research, which is seen as essential to addressing many societal challenges, will almost certainly require more pro-active approaches.
Incentivising and measuring science’s engagement with society is a complex area that ranges across the different stages of the scientific process, from co-design of science agendas and citizen science through to education and outreach. There are many different ways in which scientists and scientific institutions engage with different societal actors to inform decision-making and policy development at multiple scales. Assessing the impact of such engagement is difficult and is highly context- and time-dependent”.

For this session, the key questions were:

  • “What do we desire in terms of short and long-term outputs and impacts from societal engagement?
  • How can various aspects of scientific engagement be incentivised and monitored?
  • What are the necessary skills and competencies for ‘citizen scientists’ and how can they be developed and rewarded?
  • How does open science contribute to accountability and trust?
  • Can altmetrics help in assessing societal engagement?”

In my talk, I decided to address the first three questions by reflecting on my personal experience (the story of a researcher trying to balance the concepts of “excellence” and “societal engagement”), then considering the experience of participants in citizen science projects, and finally the institutional perspective.


I started my presentation [Slide 3] with my early experiences in public engagement with environmental information (and participants’ interest in creating environmental information) during my PhD research, 20 years ago. This was a piece of research that set me on the path of societal engagement and open science – for example, the data that we were showing was not accessible to the general public at the time, and I was investigating how the processes that follow the Aarhus Convention, and the use of digital mapping information in GIS, could increase public engagement in decision making. This research received a small amount of funding from UCL, and later from the ESRC, but nothing significant.

I then secured an academic position in 2001, and it took until 2006 [Slide 4] to develop new systems – for example, this London Green Map was developed shortly after the Google Maps API became available, and while it is one of the first participatory GIS applications on top of this novel API, it was essentially unfunded (and was done as an MSc project). Most of my funded work at this early stage of my career had no link to participatory mapping and citizen science. This was also true for the research into OpenStreetMap [Slide 5], which started around 2005 and, apart from a small grant from the Royal Geographical Society, was not part of the main funding that I secured during the period.

The first significant funding specifically for my work came in 2007-8, about 6 years into my academic career [Slide 6]. Importantly, it came because the people who organised a bid for the Higher Education Innovation Fund (HEIF) realised that they were weak in the area of community engagement and that the work I was doing in participatory mapping fitted into their plans. This became a pattern, where people approach me with a “community engagement problem” – a signal that awareness of societal engagement was starting to grow, but in terms of budget and place within the projects, it remained at the edge of the planning process. By 2009, the investment led to the development of a community mapping system [Slide 7] and the creation of Mapping for Change, a social enterprise that is dedicated to this area.

Fast forward to today [Slide 8-10], and I’m involved in creating software for participatory mapping with non-literate participants, which supports the concept of extreme citizen science. In terms of “scientific excellence”, this development towards a mapping system that anyone, regardless of literacy, can use [Slide 11] is funded as “challenging engineering” by EPSRC and as “frontier research” by the ERC, showing that it is possible to fully integrate scientific excellence and societal engagement – answering the “reconciling” issue in the workshop. A prototype is being used with ZSL to monitor illegal poaching in Cameroon [Slide 12], demonstrating the potential impact of such research.

It is important to demonstrate the challenges of developing societal impact by looking at the development of Mapping for Change [Slide 13]. Because it was one of the first knowledge-based social enterprises that UCL established, setting it up was not simple – despite sympathy from senior management, it didn’t easily fit within the spin-off mechanisms of the university. However, by engaging in efforts to secure further funding – for example through a cross-university social enterprise initiative – it was possible to support the cultural transformation at UCL.

There are also issues with reporting the impact of societal engagement [Slide 14]: Mapping for Change was reported as one of the REF 2014 impact case studies. From the university’s perspective, using these cases is attractive; however, recalling that this research is mostly done with limited funding and resources, the reporting is an additional burden that does not come with appropriate resources. This lack of resources is demonstrated by Horizon 2020, which, for all the declarations on the importance of citizen science and societal engagement, dedicated only 0.60% of its budget to Science with and for Society [Slide 15].

Participant experience

We now move to look at the experience of participants in citizen science projects, noting that we need to be careful about indicators and measurements.

We start by pointing to the wide range of activities that involve public engagement in science [Slide 17-18] and the need to provide people with the ability to move into deeper or lighter engagement at different life stages and according to their interests. We also see that, as engagement deepens, the number of people who participate drops (this is part of participation inequality).

For specific participants, we need to remember that citizen science projects are trying to achieve multiple goals – from increasing awareness, to having fun, to getting good scientific data [Slide 19] – and this complicates what we are assessing in each project and the ability to have generic indicators that hold for all projects. There are also multiple forms of learning that participants can gain from citizen science [Slide 20], including personal development, as well as attraction and rejection factors that influence engagement and enquiry [Slide 21]. This can also be demonstrated through a personal journey – in this example, Alice Sheppard’s journey from someone with an interest in science to a citizen science researcher [Slide 22].

However, we should not look only at the individual participant but also at the communal level. An example of this is provided by the noise monitoring app in the EveryAware project [Slide 23] (importantly, EveryAware was part of Future and Emerging Technologies – part of the top excellence programme of EU funding). The application was used by communities around Heathrow to signal their experience and to influence future developments [Slide 24]. Another example of communal-level impact is in Putney, where the work with Mapping for Change led to a change in the type of buses in the area [Slide 25].

In summary [Slide 26], we need to pay attention to the multiplicity of goals, objectives, and outcomes of citizen science activities. We also need to be realistic – not everyone will become an expert, and we shouldn’t expect mass transformation. At the same time, we shouldn’t assume it cannot happen and give up. It won’t happen without funding (including funding for participants and for people who dedicate significant time).

Institutional aspects

The linkage of citizen science to other aspects of open science comes through participants’ right to see the outcomes of work that they have volunteered to contribute to [Slide 28]. Participants are often highly educated and can also access open data and analyse it. They are motivated by contributing to science, so a commitment to open access publication is necessary. This and other aspects of open science and citizen science are covered in the DITOs policy brief [Slide 29]. A very important recommendation from the brief is the recognition that “Targeted actions are required. Existing systems (funding, rewards, impact assessment and evaluation) need to be assessed and adapted to become fit for Citizen Science and Open Science.”

We should also pay attention to recommendations such as those from the League of European Research Universities (LERU) report from 2016 [Slide 30]. In particular, there are recommendations to universities (such as setting up a single contact point) and to funders (such as setting criteria to evaluate citizen science properly). There are various mechanisms that allow universities to provide an entry point for communities that need support. One such mechanism is the “science shop”, which provides a place where people can approach the university with an issue that concerns them and identify researchers who can work with them. Science shops require coordination and funding for the students who do their internships with community groups. Science shops and centres for citizen science are a critical part of opening up universities and making them more accessible [Slide 31].

Universities can also contribute to open science, open access, and citizen science through learning – for example, through the MOOC that we run at UCL, which is designed to train researchers in citizen science and crowdsourcing [Slide 32].

In summary, we can see that citizen science is an area that is expanding rapidly. It has multifaceted aspects for researchers, participants, and institutions, and care should be taken when considering how to evaluate these activities and how to provide indicators about them – mixed methods are needed to evaluate and monitor them.

There are significant challenges of recognition: recognition as valid, excellent research; sustainable institutional support; and the most critical indicator – funding. The current models, in which these activities are hardly funded (<1% in NERC, for example), show that funders still have a journey to make between what they state and what they do.


Reflection on the discussion: from attending the workshop and hearing about open access, open data, and citizen science, I left realising that “societal engagement” is a very challenging aspect of the open science agenda – and citizen science practitioners should be aware of that. My impression is that with open access, as long as the payment is covered (by the funder or the institution), and as long as the outlet is perceived as high quality, scientists will be happy to comply. The same can be said about open data – as long as funders are willing to cover the costs and to provide mechanisms and support for skills, for example through libraries, then we can potentially see progress there, too (although over-protectiveness over data by individual scientists and groups is an issue).

However, citizen science opens up challenges and fears about expertise, and perceptions that it risks current practices, societal status, and so on – especially when considering the very hierarchical nature of scientific work, from the very local level of academic job rankings to the big names who set the agenda within a specific field. These cultural aspects are more challenging.

In addition, there seems to be a misunderstanding of what citizen science is, mixing it up with more traditional public engagement, plus some views that it can do fine by being integrated into existing research programmes. I would not expect to see major change without a clear signal, through significant funding over a period of time, indicating to scientists that the only way to unlock such funding is through societal engagement. This is not exactly “moonshot” type funding – rather, pursue any science that you want, but open it up. This might lead to the necessary cultural change.


OECD Open Science and Scientific Excellence Workshop – Paris

The OECD organised and hosted a Global Science Forum (GSF) and National Experts on Science and Technology Indicators (NESTI) Workshop on “Reconciling Scientific Excellence and Open Science: What do we want out of science and how can we incentivise and monitor these outputs?” (9 April 2018, OECD). In agreement with the OECD Secretariat, the information here is not attributed to anyone specific (here is the blog post about my own presentation).

The workshop opened with the point that speaking about reconciling open science and scientific excellence seems contradictory. Scientific excellence has been based on the value of publications, but the digital transformation and the web have changed things – from elite access to a library where outputs are held to outputs being available to everyone over the web, and we can see citizens accessing data. We also need to look at the future – opening up even more, which is positive, but there are challenges in measuring this, and in the impact of different bibliometrics and other indicators.

The opening up is happening quickly, and we need to understand the transformation and then think about the statistical aspects of this information. There is an effort to develop a roadmap for the integration of open science across science policy initiatives.

The area is fairly complex: excellence, how science is changing, incentivising and measuring science – all these are tightly related to each other. Some of the fundamental questions: what do we want from science? Only excellence, or other things as well? How can we incentivise the academic community to move in the direction of open science – and what does the science policy community need to do about it? National statistical communities and the Global Science Forum are two important groups that can influence this in terms of policy and of measuring the impacts and processes.

The meeting is looking at open science, publishing, open data, and engagement with society, as well as indicators and measurement.

The slides from all the talks are available here. 

Session 1. Scientific excellence through open science or vice versa? What is excellence and how can it be operationalised in the evidence and policy debate?

Paula Stephan (Georgia State University, USA) addressed the challenges of science – the lack of risk-taking, and the lack of career opportunities for Early Career Scientists in their research – the factors behind these, especially short-term bibliometrics, and how open science can help in dealing with the issues.

The original rationale for government support of science is the high risk associated with basic research. Competitive selection procedures reduce risk and lead to safer options for securing funding (including at NIH or the ERC). James Rothman, who won the Nobel Prize in Physiology or Medicine, pointed out that in the 1970s there was a much higher tolerance of risk, which allowed him to explore things for 5 years before he started being productive. Concerns about these aspects appeared in the AAAS ARISE report in 2008, and NASA and DARPA have become much more risk-averse.

In addition, there is a lack of career opportunities for ECRs – the number of PhDs is growing, but the number of research positions is declining, both in industry and in academia. Positions are scarce, and working in universities is an alternative career. Because the scarce jobs and research applications are assessed using short citation windows, a high-impact journal paper is critical for career development; postdocs are desperate to get a Nature or Science paper. Assessment of novel papers (papers that make combinations of references never made together before) showed that only 11% of papers are novel, and highly novel papers are associated with risk: they are disproportionately concentrated at the top and bottom of the citation distribution, and they also get cited outside their field. The more novel a paper is, the less likely it is to appear in a high-ranking journal. Bibliometrics therefore discourage researchers from taking these risks with novel papers.

Open science provides an opportunity here – citizen science offers new ways of addressing some of these issues, e.g. through crowdfunding to accommodate risky research. In addition, publication in open access can support these novel-paper strategies.

Richard Gold (McGill University, Montreal, Canada) looked at why institutions choose open science – the costs of research are increasing exponentially, but this is not enough and there are requests for more funding. Productivity is declining, measured by the number of papers per investment, and firms are narrowing their focus of research.

We can, therefore, consider Open Science partnerships – OA publications, open data, and no patents on co-created outputs – as a potential way to address these challenges. These can be centred around academic and not-for-profit research centres, generally addressing the basic understanding of scientific issues, with data at the centre. Institutions see this as a partial solution – decreasing duplication since there is no need to replicate, providing quality through many eyes, and providing synergies because there is a more diverse set of partners. It can increase productivity because data can be used in different fields, drawing on wider networks of ideas and the ability to search through a pool of ideas. Across fields we can see more researchers but fewer outputs. In patent applications, we see that the 1950s were the recent peak of novelty in terms of linking unrelated fields, and this has been dropping since.

An alternative to this is a system like the Structural Genomics Consortium – attracting philanthropic and industrial funding. There is also a citizen science aspect – the ability to shape the research agenda in addition to providing data. The second point is that the data can be used with their communities – patients and indigenous groups are more willing to be involved. Open science better engages and empowers patients in the process – it is easier to get consent.

Discussion: during the selection of projects, bibliometric indicators need to be removed from the application and from funding decisions. We need people to read the research ideas, and we need to move away from funding only a single person as first author – we need to incentivise teams and support them. We need to think about how to deal with the impact of research beyond the original work (someone might use a dataset that was produced in open science for a publication, rather than the person who did the work).

There is a sense that the “lack of risk-taking” is an issue, but there is a need to measure it and show whether it is happening. Many scientists censor their own work, and there is a need to document this happening. The global redistribution of people is about which areas people concentrate on – e.g. between physics and agriculture.

Session 2 – Open access publication and dissemination of scientific information

Rebecca Lawrence (Faculty of 1000) described how F1000 is aiming to develop a different model of publication – separating publication from evaluation. Publication exists because of funders, and researchers evaluate others based on where they publish. There are all sorts of manipulations: overselling, p-value fishing, creative outliers, plagiarism, non-publication by journals that don’t want low-impact papers, and more. There is a growing call for a move towards open access publication – e.g. the Open Science Policy Platform, the European Open Science Cloud, principles such as DORA and FAIR (Findable, Accessible, Interoperable, Reusable), and an increase in pre-print sources. There is also a new range of ways in which science is being organised – how to make it sustainable in areas that aren’t receiving much funding – the use of pre-print services, and also exploring the funding of peer review. F1000 is about the speed of sharing findings. The model was developed with Wellcome and the Gates Foundation, creating a platform that is controlled by funders, or institutions, and by researchers. In this model, publishers are service providers. F1000 supports a wide range of outputs: research articles, data, software, methods, and case studies. They check the paper technically: is the data behind it accessible, and has it been published before. Publication is followed by completely open peer review – you can see who is reviewing and what was done by the author. Within the article, you can see the stage in the research – even before peer review. This makes the paper a living document – there are usually 14 days between submission and publication, and usually a month including review. The peer review here is transparent and the reviewers are cited, which is good for ECRs gaining experience.

The indicators need to take into account career levels and culture (technical and reflective), not only fields, and to consider different structures – individual, group, institution. We need open metrics, badges that tell you what you are looking at, and also qualitative measures – traditional publications can curate articles.

Vincent Tunru (Flockademic, Netherlands) explored the issue of incentivising open science: making science more inclusive, so that more people are able to contribute to the scientific process. Open access can become the goal instead of the means to become more inclusive. If the information is free, people can read the results of publicly funded research, but there is a barrier to publishing research within the OA model – publication costs should be much lower: other areas (music, news) have seen costs go down because of the internet. In some disciplines there is a culture of sharing pre-prints and getting feedback before submission to journals – places like arXiv are doing this work. The primary value of submission to a journal is credentialing; high-level journals can create scarcity to justify the demand. Nature Scientific Reports is overtaking PLOS ONE because of that. We need to decouple credentialing from specific journals. Different measures of excellence are possible, but we need to consider how we do it today – it is reviewers and editors who decide what excellence means. We need to focus on inclusivity and affordability. [See Vincent’s blog post here]

Kim Holmberg (University of Turku, Finland) focused on altmetrics. Robert Merton pointed out already in the 1950s that the referencing system is about finding work that wasn’t known before, but also about recognising other researchers. That leads to how the journal impact factor and the h-index became part of research assessment. These have been used more and more in research evaluation, especially over the past 15 years. Earlier research has pointed out many flaws with them. In addition, they fail to take into account the complexity of scientific activities, nor do they tell you anything about the societal impact of research. One way to look at this complexity is the Open Science Career Assessment Matrix (OS-CAM).

We can think about the traces that people leave online as they go through the research process – discussing research ideas, collecting data, analysing, disseminating results. These traces can become altmetrics – another view of research activities. It is not just social media: the aim is to expand the view of what impact is about. With altmetrics we can analyse the networks that a researcher is involved in, and that can give insights into new ways in which the researcher interacts with society. Citations show that a paper has been used by another researcher, while altmetrics can indicate how it has been disseminated and discussed among a wider audience. But there are still lots of questions about the meaning and applicability of altmetrics.

There are reports from the Mutual Learning Exercise (europa.eu/!bj48Xg) looking at altmetrics, incentives, and rewards for open science activities. For instance, in the area of career and research evaluation, researchers need specific training and education about open science, and in the area of evolving authorship, ways of identifying and rewarding peer review and the publishing of negative results need to be developed. The implementation of open science needs to guarantee long-term sustainability and reward role models who can demonstrate this new approach to doing science. The roadmap from the MLE suggests a process for this implementation.

Discussion: there is the issue of finding a good researcher within a group of researchers, and publications are a way to see their ideas, but the link to open science and how it can help with that is unclear. Moreover, finding a good researcher does not happen through all these metrics – it is a human problem and not only a metrics one. Will originality be captured by these systems? Publication is only a small part of research activity – in every domain there is a need to change and reduce publication, and not to assume that someone will read the same paper again and again (after each revision). Attention is the scarce resource that needs to be managed and organised; we cannot assume that more output will come with better ways to filter the information.

The response to this pointed out that because research funding is public, we should encourage publishing as much as possible so that others can find the information, but we need good tools for searching and evaluating research so that it can be found.

Another point of confusion: people want to see the link between open access publication and open science. Open access can exist within the publish-or-perish structure, so what is it in OA that offers an alternative to the closed publishing structure? How can it lead us to different insights into researchers’ activities? In response, it was pointed out that it is important to understand the difference between Open Access and Open Science (OA = openly available research publications; OS = all activities and efforts that open up the whole research process, including the publishing of research results).

There is growing pressure on people to become media savvy, and that means taking time away from research.

Altmetrics were originally thought of as a tool that can help researchers find interesting and relevant research, not necessarily as a tool for evaluation (http://altmetrics.org/manifesto/).


Session 3. Open research data: good data management and data access

Simon Hodson (CODATA) spoke on Open Science and FAIR data. On the reconciling element: the case for open science is the light that it shines on the data, making it useful. It allows reuse, reproducibility, and replicability – these go hand in hand. CODATA is part of the International Council for Science, focusing on capacity building, policy, and coordination. The case for open science: good scientific practice depends on communicating the evidence. In the past, a table or a graph summarising some data was an easy way of sharing information, but as data and analysis have grown, we need to change the practice of sharing results. The publication of “Science as an Open Enterprise” (2012) pointed out that failure to report the data underlying the science is seen as malpractice. Secondly, open data practices have transformed certain areas of research – genomics, remote sensing in earth systems science. Can we replicate this in other research areas? Finally, can we foster innovation and the reuse of data and findings within and outside the academic system, making them available to the public at large?

Open science has multiple elements – it is not only open access and open data. We need data to be interoperable and reusable; it should be available for machine learning, and there should be an open discussion. There are perceptions about the reproducibility of research, but also changes in attitudes. We need to think about culture – how scientific communities have established their practices. In different research areas there are very different approaches – e.g. biomedical research is open, but in social science there is little experience of data sharing and reuse, and people don’t see the benefits. There is a need for a sociology-of-science analysis of these changes. Some of the major changes – the meetings about genome research in Bermuda and the Fort Lauderdale agreement – came about because of particular pressures. There is significant investment in creating data that is used only once – e.g. Hubble. Why is data from small experiments not open to reuse? We need to find ways of making this happen.

The FAIR principles allow data to be reusable. FAIR came from OECD work, the Royal Society report of 2012, and the G8 statement. What we need to address: skills, the limits of sharing, and the need to clarify guidelines for openness. We need standards and skills, and we need to reward data stewardship. We need to see data being cited. There is a need for new incentives – the cultural change happened when prominent people in the field set up the agreements.

Fiona Murphy (Fiona Murphy Mitchell Consulting, UK) works in the area of data publishing and provided the perspective of someone who is exploring how to practise open science. There are cultural issues: why share, with whom, for what rewards, and what is the risk? There are technical issues: how it is done, and what the workflows, tools, capacity, and time investment are. There are also issues of roles and responsibilities, and of whose problem it is to organise the data.

Examples of projects include SHARC, a Research Data Alliance group that is international and multi-stakeholder and aims to grow the capacity to share data. A specific group is working on a white paper of recommendations. The main issues are standards for metrics: they need to be transparent, address reputation, and consider impact on a wider area. Also, what would be the costs of not sharing? There are different standards in terms of policies; persistent identifiers and the ability to reproduce are also needed. Equality of access to services is needed – how to manage peer-to-peer sharing and how that is integrated into promotion and rewards. The way to explore this is by carrying out pilot projects to understand side effects. There is also a need to develop ethical standards.

The Belmont Forum Data Publishing Policy is looking at making data accessibility part of a digital publication, and at developing consistency of message so that researchers will know what they are facing. There are lots of issues – some standard wording is emerging, as is the capturing of multiple datasets, clarifying licensing, etc.

We can also think about what scholarly practice would look like if it were starting now – scholarlycommons.org suggests principles for how “born digital” scientific practice should evolve. As part of this approach to thinking about the commons, they have created decision trees to help with the project. Working as an open scientist is a challenge today – for example, developing decision-tree software and other things is proving challenging for anyone trying to act as a completely open scientist. It’s a busy space, and there is a gulf between high-level policy and principles and their delivery.

Jeff Spies (Center for Open Science, Virginia) [via video link] covered open research data, urgent problems, and incremental solutions, looking at the strategies that are the most impactful (which is a different emphasis from that of the Center for Open Science). We need to broaden the definition of data – we need context: more than just the data itself or the metadata – and this is critical for assessment and for metascience work. We can think of a knowledge graph – more than the semantic information of the published text: the relationships of people, places, data, methods, software. But the situation with incentives is that, from a psychological perspective, the reward for specific publications is so strong that it focuses attention on what is publishable. Retraction rates go up as impact factor goes up. There is urgency, and there is lock-in: publishers are trying to capture the life cycle of research. The problem is that culture change is very slow, and we need to protect the data – funders and policymakers are the ones who can make a difference. Researchers don’t have the capacity to curate data, but libraries have the people and the resources to do so and to focus on it. One potential lever: researchers could be asked to link to promotion policies, which would force universities to share them, and those policies could mention data sharing (as a way to force universities to change).

Discussion: there is concern about the ability of researchers to deal with data. There is a problem of basic data literacy.

The problem with making data FAIR is that it accounts for about 10% of project costs, and there are questions about where it is useful and where it is not enough or too much – just organising the data with librarians is not enough, as data requires a lot of domain knowledge. There are significant costs. However, in the same way that the total costs of science include the effort of peer review and of getting to publication (either subscription or publication fees), we should also pay for data curation. There is a need for appraisal and for decisions about how data and processes will be handled.

We need to think about the future use of data – just as with natural history specimens, we can never know what will be needed. Questions about the meaning of data are very important – it is not only specimens but also photographs, and not necessarily digital.

Libraries can adapt and can earn respect – they are experts in curation and archiving.

Session 4. Societal engagement 

Kazuhiro Hayashi (NISTEP, Tokyo, Japan) spoke about open science as societal engagement in Japan. His background is in science and technology – he has been involved in open access journals, is keen on altmetrics, and is now involved in open science policy. He plays multiple roles – top down and bottom up – from working in the G7 science expert group on open science to creating software and journals. He is involved in citizen science through NISTEP journals and lectures, as well as in altmetrics, multi-stakeholder workshops, and Future Earth. He showcased several studies:

Citizen science – the funding system for science in Japan comes mainly from the state, and researchers have a difficult time doing public engagement. There are spontaneous researchers – “wild researchers”. He suggests a more symmetrical system – also creating independent researchers who get their budget from business and publish in online journals. Wild researchers are based on crowdfunding and rely on the engagement of citizens. From his experience, he recognises a new relationship between citizens and scientists: new research styles, new career paths, and new funding. Negative aspects of citizen science include populism in crowdfunding – research needs to be popular, yet some topics are not suitable for the crowd. A new scheme for ECRs is also needed and should include this. There is also a potential for misuse and plagiarism because of a lack of data and science literacy.

Altmetrics – he contributed to the NISO Altmetrics Initiative working group. Altmetrics are difficult to define, and current altmetrics scores in the Japanese literature are closely related to Maslow’s hierarchy of needs. There are plenty of institutional repositories, and access to journal articles on repositories is more social – the readers are non-researchers who would not go to journal websites. There is a need to look at social impact – looking at mentions and network analysis – but it is difficult to analyse, and there is a need to look at the flow of data across the web.

Multi-stakeholder workshop – considering the future of open science and society, together with environmental sciences and informatics. One outcome is to think about erasing the influence of different socio-economic statuses on participants, and about the co-development of data infrastructure and actions for social transformation. Capacity building is important. We need to see how open science and transdisciplinary work co-evolve. Societal engagement is very time-consuming and needs to be funded, and it needs to be open to creative activities by citizens and scientists. We should think about new relationships between science and society and use tentative indicators to transform society and culture – creating a future of open science and society, moving from “publish or perish” to “share or perish”. Japan will have two citizen science sessions at the Japan Open Science Summit on 18-19 June 2018.

Muki Haklay (UCL, London, UK) [see my separate blog post]

Cecilia Cabello Valdes (Foundation for Science and Technology, Madrid, Spain) spoke about societal engagement in open science. The foundation aims to promote the link between science and society – originally with the goal of increasing the interest of Spanish citizens in science. They manage calls and fund different activities (about €3,250K across more than 200 projects). They run activities such as FameLab – events to promote science and technology in an open way. The science news agency SiNC addresses the lack of awareness of scientific research – its papers are taken up by the general media, with over 1,000 journalists using the information. They run summer science camps, with 1,920 funded students in 16 universities. They also manage the national museum of science and technology (Muncyt), where they share the history of science and technology in Spain. It is a unique type of science museum.

In citizen science, they have done a lot of work on public awareness of science and technology, and on keeping public support for science investment. More recently they created a council of foundations for science – there was little awareness among social foundations, which had invested only in cultural activities and not in science. Three foundations are involved with the council, and they have direct contact with the minister to develop this area of funding. The second initiative is crowdfunding for science – they help to carry out campaigns that support activities; it is also a tool of engagement.

Outreach is difficult – the council supports policymakers, and the general public is aware of the issues. So there are challenges – how does this need to transform, and how do we measure it? Part of the council’s role is to incentivise policymakers to understand what they want to achieve, and then to have indicators that help to show whether the goals are achieved. They participated in the process of policy recommendations about open science, and then translate these into action – for policymakers and society. FECYT also provides resources: access to WoS/Scopus, evaluation of journals, standardised CVs of researchers, and open science. Finally, they participate in studies that look at the measurement of science and its results.

Discussion: science shops – are there examples that link to maker spaces? Yes, there are examples of activities such as Public Lab, but also the Living Knowledge network.

Many societal engagement activities are not open science – they treat society as a separate entity: there is a struggle to turn citizen science into open science, as data often remains closed. What are the aspects that lend themselves to both open science and citizen science? There are many definitions and different ways to define the two, but, for example, the need to access publications, participation in the analysis of open data, or the production of open data are all examples of an overlap.

Part of the discussion was about sharing knowledge and the claim that a researcher is like anyone else – is there a big difference between the scientific community and everyone else? The effort is not recognised in society, and if you were to remove the prestige, would anyone still want to participate in science?

On public interest – why do citizens want to participate in research? Citizens want the results of public research to help people improve their quality of life. Science should address social problems.

On how many people participate: Precipita is a new project, funds are not matched, and they provide the technical help; the promotion is through campaigns run via different institutions.

Should citizen science democratise science? This is controversial – when information became more accessible, as with Gutenberg, we increased the ability to engage. We need to make citizen science a way to increase access to science.

How do we get science integrated rather than kept in separate pockets? We need to find a way to integrate these things together. There is a package that needs to be supported as a whole – access, data, and public engagement – and we need to focus on it.

Citizen science needs to be integrated into all of science, and it needs to deliver results.

Session 5. Scientific Excellence re-visited

David Carr (Wellcome Trust, London, UK): Wellcome is committed to making its research outputs available, seeing this as part of good research practice. As a funder, they have had a long-standing policy on open access publications (since 2005) and on other research outputs. The costs of carrying out public engagement and of open access publication should be part of the funding framework, and reviewers are asked to recognise and value a wide range of research outputs. There is still a need to think about reward and assessment structures, about sustaining the infrastructures that are needed, and about creating data specialists and managing the process of increasing their number. There are concerns among the research community about open access. Wellcome established an open research team – looking at funder-led and community-led activities, and also at policy leadership. They now have the WellcomeOpenResearch.org publishing platform, which uses the F1000 platform, and they also ran the Open Science Prize. On policy leadership, they support, for example, the San Francisco DORA (Declaration on Research Assessment). They are also looking at changes to application forms to encourage other forms of outputs, and then at providing guidance to staff, reviewers, and panel members. They celebrate with applicants when they do open research, and inform them about the criteria and options. They also carry out efforts to evaluate whether open science indeed delivers on its promises, through projects in different places – e.g. the McGill project.

Environmental information: between scarcity/abundance and emotions/rationality

The Eye on Earth Summit, which was held in Abu Dhabi last week, allowed me to immerse myself in the topics that I’ve been researching for a long time: geographic information, public access to environmental information, participation, citizen science, and the role of all these in policy making. My notes (day 1 morning, day 1 afternoon, day 2 morning, day 2 afternoon, day 3 morning & day 3 afternoon) provide the background for this post, as do the blog posts from Elisabeth Tyson (day 1, day 2) and the IISD reports and bulletins from the summit. The summit provided me with plenty to think about, so I thought it was worth reflecting on my ‘take home’ messages.

What follows are my personal reflections from the summit and the themes that I feel are emerging in the area of environmental information today. 

When considering the recent ratification of the Sustainable Development Goals (SDGs) by the UN General Assembly, it is not surprising that they loomed large over the summit – as drivers of demand for environmental information for the next 15 years, as focal points for the effort to coordinate information collection and dissemination, but also as an opportunity to make new links between environment and health, or to promote environmental democracy (access to information, participation in decision making, and access to justice). The SDGs seem to be very much at the front of the minds of the international organisations that are part of the Eye on Earth Alliance, although other organisations, companies, and researchers who come with a more technical focus (e.g. Big Data or remote sensing) are less aware of them – at least in terms of referring to them in their presentations during the summit.

Beyond the SDGs, two overarching tensions emerged throughout the presentations and discussions – and both are challenging: the tension between abundance and scarcity, and the tension between emotions and rationality. Let’s look at them in turn.

Abundance and scarcity came up again and again. On the data side, the themes of the ‘data revolution’, more satellite information, crowdsourcing from many thousands of weather observers, and the creation of more sources of information (e.g. the Environmental Democracy Index) are all examples of abundance in the amount of available data and information. At the same time, this was contrasted with scarcity in the real world (e.g. species extinction, the health of mangroves), scarcity of actionable knowledge, and scarcity of ecologists with computing skills. Some speakers oscillated between these two ends within a few slides, or even within the same one. There was no easy resolution of this tension, and both ends were presented as challenges.


With emotions and scientific rationality, the story was different. Here the conference was packed with examples that we are (finally!) moving away from a simplistic ‘information deficit model’ that emphasises scientific rationality as the main way to drive change in policy or in public understanding of environmental change. Throughout the summit, presenters emphasised the role of mass media communication, art (including live painting developed through the summit by the GRID-Arendal team), music, visualisation, and storytelling as vital ingredients that make information and knowledge relevant and actionable. Instead of a ‘Two Cultures’ position, Eye on Earth offered a much more harmonious and collaborative linkage between these two ways of thinking and feeling.

Next, and linked to the issue of abundance and scarcity, are costs and funding. Many talks demonstrated the value of open data and the need to provide open, free, and accessible information if we want to see environmental information used effectively. Moreover, providing the information together with the ability to analyse or visualise it over the web was offered as a way to make it more powerful. However, the systems are costly, and although the IUCN’s assessment demonstrated that the investment in environmental datasets is modest compared to other sources (and the same is true for citizen science), there are as yet no sustainable, consistent, and appropriate funding mechanisms. Funding infrastructure or networking activities is also challenging, as funders accept their value but are not willing to fund them in a sustainable way. More generally, there is an issue about the need to fund ecological and environmental studies – it seems that while ‘established science’ is busy with ‘Big Science’ – satellites, Big Data, complex computer modelling – the work of studying ecosystems in a holistic way is left to small groups of dedicated researchers and to volunteers. The urgency and speed of environmental change demand better funding for these areas and activities.

This leads us to the issue of citizen science, for which the good news is that it was mentioned throughout the summit, gaining more prominence than it had 4 years ago in the first summit (where it also received attention). In all plenary sessions, citizen science or crowdsourced geographic information was mentioned at least once, and frequently by several speakers. Examples include the Hermes project for recording ocean temperatures, Airscapes Singapore for urban air quality monitoring, Weather Underground for sharing weather information, the Humanitarian OpenStreetMap Team’s work in Malawi, Kathmandu Living Lab’s response to the earthquake in Nepal, the Arab Youth Climate Movement in Bahrain’s use of iNaturalist to record ecological observations, Jacky Judas’s work with volunteers to monitor dragonflies in Wadi Wurayah National Park – and many more. The summit outcomes document is also clear: “The Summit highlighted the role of citizen science groups in supporting governments to fill data gaps, particularly across the environmental and social dimensions of sustainable development. Citizen Science was a major focus area within the Summit agenda and there was general consensus that reporting against SDGs must include citizen science data. To this end, a global coalition of citizen science groups will be established by the relevant actors and the Eye on Earth Alliance will continue to engage citizen science groups so that new data can be generated in areas where gaps are evident. The importance of citizen engagement in decision-making processes was also highlighted.”

However, there was ambivalence about it – should it be seen as an instrument, a tool to produce environmental information, or as a means to achieve wider awareness and engagement by informed citizens? How best to achieve the multiple goals of citizen science: raising awareness, educating, providing skills well beyond the specific topic of the project, and democratising decision making and participation? It still seems to be the case that the integration of citizen science into day-to-day operations is challenging for many of the international organisations that are involved in the Eye on Earth alliance.

Another area of challenging interactions emerged from the need for wide partnerships between governments, international organisations, non-governmental organisations (NGOs), companies, start-ups, and even the ad-hoc crowds that respond to a specific event or issue, which are afforded by digital and social networks. There are very different speeds of implementation and delivery between these bodies, and in some cases there are chasms that need to be explored – for example, an undercurrent from some technology start-ups is that governments are irrelevant, and, in some forms of thinking, that ‘to move fast and break things’ – including existing social contracts and practices – is OK. It was somewhat surprising to hear speakers praising Uber or Airbnb, especially when they came from people who are familiar with the need for careful negotiations that take wider goals and objectives into account. I can see the wish to move things faster – but what risks do we create by breaking things?

With the discussions about Rio Principle 10 and the new developments in Latin America, the Environmental Democracy Index, and the rest, I became more convinced, as I noted in 2011, that we need to start thinking about adding another right to the three that are included in it (access to environmental information, participation in decision-making, and access to justice): a right to produce environmental information that will be taken seriously by the authorities – in other words, a right to citizen science. I was somewhat surprised by the responses when I raised this point during the discussion on Principle 10.

Final panel (source: IISD)

Finally, Eye on Earth was inclusive and collaborative, and it was a pleasure to see how open people were to discussing issues and exploring new connections, points of view, or new ways of thinking about issues. A special point that drew several positive responses was the gender representation at such a high-level international conference with a fairly technical focus (see the image of the closing panel). The composition of the speakers at the summit, and the fact that it was possible to achieve such a level of women’s representation, was fantastic to experience (making one of the male-only panels on the last day seem odd!). It is also an important lesson for many academic conferences – if Eye on Earth can do it, I cannot see a reason why it is not possible elsewhere.

Data and the City workshop (day 1)

The workshop, which is part of the Programmable City project (funded by the European Research Council), is being held in Maynooth today and tomorrow. The papers and discussions touched on multiple current aspects of technology and the city: Big Data, open data, crowdsourcing, and critical studies of data and software. The notes below focus on aspects that are relevant to Volunteered Geographic Information (VGI), citizen science, and participatory sensing – aspects of Big Data/open data are noted more briefly.

Rob Kitchin opened with a talk to frame the workshop, highlighting the history of city data (see his paper on which the talk is based). We are witnessing a transformation from data-informed cities to data-driven cities. Within these data streams we can include Big Data, official data, sensors, drones and other sources. The sources also include volunteered information such as social media, mapping, and citizen science. Cities are becoming instrumented and networked and the data is assembled through urban informatics (focusing on interaction and visualisation) and urban science (which focus on modelling and analysis( . There is a lot of critique – with relations to data, there are questions about the politics of urban data, corporatisation of governance, the use of buggy, brittle and hackable urban systems, and social and ethical aspects.  Examples to these issues include politics: accepting that data is not value free or objective and influenced by organisations with specific interest and goals. Another issue is the corporatisation of data, with questions about data ownership and data control. Further issues of data security and data integrity when systems are buggy and brittle – there have been cases of hacking into a city systems already. Social, Political, and ethical aspects include data protection and privacy, dataveillance/surveillance, social sorting through algorithms, control creep, dynamic pricing and anticipatory governance (expecting someone to be a criminal). There are also technical questions: coverage, integration between systems, data quality and governance (and the communication of information about quality), and skills and organisational capabilities to deal with the data.
The aim of the workshop is to think critically about this data, asking questions about how it is constructed and run.

The talk by Jim Thatcher & Craig Dalton explored provenance models of data. A core question is how to demonstrate that data is what it claims to be and where it came from; in particular, they consider how provenance applies to urban data. There is an epistemological leap from an individual (person) to data points – there can be up to 1,500 data attributes per person in a corporate database. City governance requires more provenance information than commercial imperatives do. They suggest that data users and producers need to be aware of the data and how it is used.

Evelyn Ruppert asked: where are the data citizens? She discussed the politics in data, thinking about people as subjects in data – seeing people as actors who are intentional and political in their acts of creating data. Being digital mediates between people and technology and what they do. There are myriad forms of subjectivation, and there are issues of rights and how people exercise those rights. Being a digital citizen means not just being a recipient of rights but also having the ability to take and assert rights. She used the concept of cyberspace, as it is useful for understanding the rights of the people who use it, while being careful about what it means. There is a conflation of cyberspace and the Internet, and a failure to see it as a distinct space; she sees cyberspace as the set of relations and engagements that happen over the Internet. She referred to her recent book ‘Being Digital Citizens‘. Cyberspace has relationships to real space – in relation to Lefebvre's concepts of space. She uses speech-act theory, which explores the ability to act through saying things and the theoretical possibility of performativity in speech. We are not in command of what will happen with speech and what the act will be. We can assert acts through the things we do, not only the things we say, and that is what is happening with how people use the Internet and construct cyberspace.

Jo Bates talked about data cultures and power in the city, starting from hierarchies in data and information. Data can be thought of as ‘alleged evidence’ (Buckland); data can also be thought of as material – specific things with dimensionality, weight and texture that exist in the world. Cox (1981) viewed the relationship between ideas, institutions and material capabilities – and the tensions between them – with institutions seen as a stabilising force compared to ideas and material capabilities, although institutions may be outdated. She noted that sites of data cultures are historically constituted but also dynamic and porous – we need to look at who participates and how data moves.

The session was followed by a discussion. Among the issues raised: I made the point about the impact of methodological individualism on Evelyn's and Jim's analyses – for Evelyn, digital citizenship is for collectives, and for Jim, the provenance and use of devices happens as part of collectives and data cultures. Jo explored the idea of a “progressive data culture” and suggested that we don't yet understand what the conditions for it are – the inclusive, participatory culture is not there. For Evelyn, data is only possible through the action of the people involved in its making, and the private ownership of this data does not necessarily make sense in the long run. Regarding the hybrid view of cyberspace/urban spaces – they are overlapping and it is not helpful to try to separate them. Progressive data cultures require organisational change in government and other organisations. Tracey asked about work on indigenous data and the way it is owned by the collective, and noted that there are examples in the Arctic with a whole setup for changing practices towards traditional and local knowledge. The provenance goes all the way to the community; in the Arctic Spatial Data Infrastructure there are lots of issues with integrating indigenous knowledge into the general data culture of the system. The discussion ended with an exploration of the special case of urban/rural – noting the code/space nature of agricultural spaces, such as the remote control of John Deere tractors, the use of precision agriculture, control over space (so people can't get into it), tagged livestock, as well as variable access to the Internet, broadband speed, etc.

The second session looked at data infrastructures and platforms, starting with Till Straube, who looked at situating data infrastructure. He highlighted that Git (GitHub) blurs the lines between code and data, as does functional programming – code is data and data is code. He also looked at software or conceptual technology stacks, with hardware at the bottom. He therefore uses the concept of topology from Science and Technology Studies and Actor-Network Theory to understand the interactions.
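
As a trivial (and entirely my own) illustration of the ‘code is data’ point, a short Python fragment can treat a function both as a value that is passed around like any other piece of data and as source text – which is exactly what a version control system like Git stores and diffs:

```python
import inspect

def buffer_distance(metres):
    """A tiny piece of 'code' that we will treat as data."""
    return metres * 2

# Code as data (1): the function object can be stored in a data structure
# and passed around like any other value.
operations = {"buffer": buffer_distance}
print(operations["buffer"](50))

# Code as data (2): the source text of the function is itself just data --
# this is what Git stores, versions and diffs.
print(inspect.getsource(buffer_distance))
```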

Tracey Lauriault talked about ontologizing the city – her research looked at the transition of Ordnance Survey Ireland (OSi) with their core GIS: the move towards an object-oriented, rules-based database. How is the city translated into data, and how does the code influence the city? She looked at OSi and the way it produces the data for the island, providing infrastructure for other bodies. OSi started as a colonial project, and has moved from cartographic maps and a digital data model to a fully object-oriented structure. The change is about understanding and conceptualising the mapping process. The ontology is about which things are important for OSi to record and encode, and the way in which the new model allows space to be reconceptualised. She had access to a lot of information about the engineering, tendering and implementation process, and also followed some specific places in Dublin. She explored her analysis methods and the problems of trying to understand how the process works even when you have access to the information.

The discussion that followed explored the concept of the ‘stack’, as well as the idea of considering the stack at a planetary scale. The stack pervades other ways of thinking – it is more than a metaphor: it's a way of thinking about IT development, though it can be flattened. It gets people to think about how different parts inter-relate. Tracey: it is difficult to separate the different parts of the system because there is so much interconnection. Evelyn suggested that we can think about the way maps were assembled and for what purpose, and understand how the new system is aiming to produce certain outcomes. Tracey responded that, as the system moved from a map to a database, Ian Hacking's approach to classification systems needs to be tweaked to make it relevant and effective for understanding systems like the one she is exploring. The discussion expanded to questions about how large systems are developed and what methodologies can be used to create systems that can deal with urban data, including discussion of software engineering approaches, organisational and people change over time, ‘war stories’ of building and implementing different systems, etc.

The third and last session was about data analytics and the city – although the content wasn’t exactly that!

Gavin McArdle covered his and Rob Kitchin's paper on the veracity of open and real-time urban data. He highlighted the value of open data – from claims of transparency and enlightened citizens to very large estimates of its business value. Yet, while data portals are opening in many cities, there are issues with the veracity of the data, and metadata is not provided alongside the data. He covered spatial data quality indicators from ISO, ICA and transport systems, but questioned whether the typical standards for data are relevant in the context of urban data, and whether we need to reconsider how quality is recorded. Looking at two case studies, he demonstrated that the data is problematic (e.g. records indicating travel of 6 km across the city in 30 seconds). Communicating changes in the data to other users is an issue, as is getting information from the data providers – one possibility is a metadata catalogue that adds information about a dataset and explains how to report veracity issues. There are such facilities in Paris and Washington DC, but they are not used extensively.
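
To make the veracity point concrete, here is a minimal sketch (my own illustration, not from the talk) of the kind of consistency check that would flag a record of travelling 6 km across a city in 30 seconds; the track structure and the 120 km/h threshold are assumptions for the example.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def flag_implausible(points, max_kmh=120):
    """Yield consecutive (timestamp_s, lat, lon) pairs whose implied speed is implausible."""
    for (t1, lat1, lon1), (t2, lat2, lon2) in zip(points, points[1:]):
        dt_h = (t2 - t1) / 3600.0
        if dt_h <= 0:
            continue
        speed = haversine_km(lat1, lon1, lat2, lon2) / dt_h
        if speed > max_kmh:
            yield (t1, t2, round(speed, 1))

# Two points roughly 6 km apart, 30 seconds apart -> ~720 km/h, clearly flagged.
track = [(0, 53.349, -6.260), (30, 53.403, -6.260)]
print(list(flag_implausible(track)))
```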

Next, Chris Speed talked about the blockchain city – spatial, social and cognitive ledgers – exploring the potential of distributed recording of information as a way to create all sorts of markets in information that can be controlled by different actors.

I closed the session with a talk based on my paper for the workshop; the slides are available below.

The discussion that followed explored aspects of representation and noise (produced by the people who are monitored, by instruments, or by ‘dirty’ open data), and some clarification of the link between the citizen science part and the philosophy of technology part of my talk – highlighting that Borgmann's use of ‘natural’, ‘cultural’ and ‘technological’ information should not be confused with the everyday use of these words.

Crowdsourced Geographic Information in Government

Today marks the publication of the report ‘Crowdsourced Geographic Information in Government‘. The report is the result of a collaboration that started in the autumn of last year, when the World Bank Global Facility for Disaster Reduction and Recovery (GFDRR) requested a study of the way crowdsourced geographic information is used by governments. The identification of barriers and success factors was especially needed, since GFDRR invests in projects across the world that use crowdsourced geographic information to help in disaster preparedness, through activities such as the Open Data for Resilience Initiative. By providing an overview of the factors that can help those who implement such projects, either in governments or in the World Bank, we can increase the chances of successful implementations. To develop the ideas of the project, Robert Soden (GFDRR) and I ran a short workshop during State of the Map 2013 in Birmingham, which helped in shaping the details of the project plan as well as some preliminary information gathering. The project team included myself, Vyron Antoniou, Sofia Basiouka, and Robert Soden (GFDRR). Later on, Peter Mooney (NUIM) and Jamal Jokar (Heidelberg) volunteered to help us – demonstrating the value of research networks such as COST ENERGIC, which linked us.

The general methodology that we decided to use was the identification of case studies from across the world, at different scales of government (national, regional, local) and domains (emergency, environmental monitoring, education). We expected that with a large group of case studies it would be possible to analyse common patterns and hopefully reach conclusions that can assist future projects, and also to identify common barriers and challenges.

We paid special attention to information flows between the public and the government, looking at cases where the government absorbed information provided by the public, and also cases where two-way communication happened.

Originally, we were aiming to ‘crowdsource’ the collection of the case studies. We identified the information needed for the analysis by using a few case studies that we knew about, and constructing the way in which they would be represented in the final report. After constructing these ‘seed’ case studies, we opened the questionnaire to other people to submit case studies. Unfortunately, developing a case study proved to be too much effort, and we received only a small number of submissions through the website. However, throughout the study we continued to look out for cases and gather the information needed to compile them. By the end of April 2014 we had identified about 35 cases, but found clear and useful information for only 29 (which are all described in the report). The cases range from basic mapping to citizen science. The analysis workshop was especially interesting, as it was carried out over a long Skype call, with members of the team in Germany, Greece, the UK, Ireland and the US (Colorado) working together using Google Docs' collaborative editing functionality. This approach proved successful and allowed us to complete the report.

You can download the full report from the UCL Discovery repository

Or download a high-resolution copy for printing, and find much more information about the project on the Crowdsourcing and government website

Second day of INSPIRE 2014 – open and linked data

Opening up geodata is an interesting issue for the INSPIRE directive. INSPIRE was set out before the hype around Government 2.0 grew and the pressure to open data became apparent, so it was not designed with these aspects explicitly in mind. Therefore the way in which the organisations implementing INSPIRE are dealing with the provision of open and linked data is bound to bring up interesting challenges.

Dealing with open and linked data was the topic that I followed on the second day of INSPIRE 2014 conference. The notes below are my interpretation of some of the talks.

Tina Svan Colding discussed the Danish attempt to estimate the (mostly economic) value of open geographic data. The study was done in collaboration with Deloitte, and they started with a theory of change – the expectation that they would see increased demand from existing customers and new ones. The next assumption was that there would be new products, companies and lower prices, which would then lead to efficiency and better decision making across the public and private sector, but also increased transparency to citizens. In short, trying to capture the monetary value with a bit on the side. They used statistics and interviews with key people in the public and private sector, and followed that with a wider survey – all with existing users of the data. The number of users of their data increased from 800 to over 10,000 within a year. The Danish system requires users to register to get the data, so these are reliable numbers, and they could also contact users to ask further questions. Among the new users, many are citizens (66%) and NGOs (3%). A further 6% are in the public sector, which in principle had access in the past, but the improved accessibility made the data usable to new people in this sector. In the private sector, construction, utilities and many other companies are using the data. The environmental bodies are aiming to use data in new ways to make environmental consultation more engaging to audiences (is this another Deficit Model assumption – that people don't engage because it's difficult to access data?). Issues that people experienced include accessibility for users who don't know that they need to use GIS and other datasets. They also identified requests for further data releases. In the public sector, 80% identified potential for savings with the data (though that is the type of expectation that they live within!).

Roope Tervo, from the Finnish Meteorological Institute, talked about the implementation of their open data portal. Their methodology was very much with users in mind and is a nice example of a user-centred data application. They hold a lot of data – from meteorological observations to air quality data (it all depends on the role of the institute). They chose to use WFS download services, with GML as the data format and coverage data in meteorological formats (e.g. GRIB). He showed that the selection of data models (which can all be compatible with the legislation) can have very different outcomes in file size and in the complexity of parsing the information. It was nice to see that they considered user needs, though not formally. They created an open source JavaScript library that makes it easy to use the data – going beyond just releasing the data to supporting how it is used. They issue API keys based on registration, and had to limit the number of requests per day, and the same for the view service. After a year, they have 5,000 users and 100,000 data downloads per day, and the numbers are slowly increasing. They are considering how to help clients with complex data models.
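
As a rough sketch of what this looks like from the user side, the request below fetches data from a WFS 2.0 download service using a registration-based API key and parses the GML response; the endpoint URL, the API key placeholder and the stored query name are illustrative assumptions, not the institute's actual interface.

```python
import requests
import xml.etree.ElementTree as ET

# Hypothetical WFS 2.0 download-service request; the base URL, API key and
# stored query name are placeholders for illustration only.
BASE_URL = "https://example.org/wfs"
params = {
    "service": "WFS",
    "version": "2.0.0",
    "request": "GetFeature",
    "storedquery_id": "observations::simple",  # illustrative stored query
    "apikey": "YOUR-API-KEY",                  # issued at registration
}

response = requests.get(BASE_URL, params=params, timeout=30)
response.raise_for_status()

# The payload is GML (an XML dialect), which is why a client-side library
# that hides the parsing makes the data much easier to reuse.
root = ET.fromstring(response.content)
print(f"Returned {len(list(root))} top-level GML members")
```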

Panagiotis Tziachris explored the clash between ‘heavy duty’ and complex INSPIRE standards and the lightweight approaches that are common in open data portals (I think he meant those in the commercial sector that allow some reuse of data). This is a project of 13 Mediterranean regions in Spain, Italy, Slovenia, Montenegro, Greece, Cyprus and Malta. The HOMER project (website http://homerproject.eu/) used different mechanisms, including hackathons, to share knowledge and experience between more experienced players and those that are new to the area, and found them a good way to share practical knowledge between partners. This is an interesting use of a purposeful hackathon among people who already know each other within a project, and I think it can be useful in other cases. Interestingly, on the legal side, they had to go beyond the usual documents that are provided in an EU consortium: in order to allow partners to share information they created a memorandum of understanding, as this was needed to deal with IP and similar issues. Open data practices were also used, such as the CKAN API, which is common for open data websites. They noticed a separation between central administration and local or regional administration – the competency of the more local organisations (municipality or region) is sometimes limited because knowledge sits elsewhere (in central government), or they are at different stages of implementation, and disagreements about releasing the data can arise. Another issue is that open data is sometimes provided through regional portals, while a different organisation at the national level (environment ministry or cadastre body) is responsible for INSPIRE. The lack of capabilities at different governmental levels adds to the challenges of setting up open data systems. Sometimes open data legislation covers only the final stage of the process and not how to get there, while INSPIRE is all about the preparation and not about the release of data – this also creates mismatches.
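
The CKAN action API is one of those lightweight conventions: any CKAN portal exposes the same endpoints, so a few lines are enough to list datasets and find download links. The portal URL below is a placeholder, not one of the HOMER portals.

```python
import requests

# Any CKAN portal exposes the same action API under /api/3/action/;
# the base URL here is a placeholder for a regional open data portal.
CKAN = "https://opendata.example.org"

# List the identifiers of all published datasets.
datasets = requests.get(f"{CKAN}/api/3/action/package_list", timeout=30).json()["result"]

# Fetch full metadata (including download URLs for each resource) for one dataset.
if datasets:
    meta = requests.get(
        f"{CKAN}/api/3/action/package_show",
        params={"id": datasets[0]},
        timeout=30,
    ).json()["result"]
    for resource in meta.get("resources", []):
        print(resource.get("format"), resource.get("url"))
```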

Adam Iwaniak discussed how “over-engineering” makes the INSPIRE directive inoperable or irrelevant to users, on the basis of his experience in Poland. He asked “what are the user needs?” and demonstrated the point by noting that after half a term of teaching students about the importance of metadata, when it came to actively searching for metadata in an assignment, the students didn't use any of the specialist portals – just Google. Based on this and similar experiences, he suggested the creation of a thesaurus that describes keywords and features in the products, so that searching can follow user needs. Of course, the implementation is more complex, and he therefore suggests an approach that works within the semantic web and uses RDF definitions, making the data searchable and indexable in search engines so it can be found. The core message was to adapt the delivery of information to the way the user is most likely to search for it – metadata is relevant when the producer makes sure that a Google search finds it.
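
One way to act on that message (my own sketch, not something presented in the talk) is to publish the dataset description in a vocabulary that general-purpose search engines understand, for example as schema.org Dataset markup embedded as JSON-LD in the dataset's landing page; all the values below are invented.

```python
import json

# Invented example values; the point is that the title, keywords and download
# link live in a vocabulary (schema.org) that general search engines understand.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Orthophoto mosaic of Lower Silesia",
    "description": "Aerial orthophoto coverage at 25 cm resolution.",
    "keywords": ["orthophoto", "aerial imagery", "Lower Silesia"],
    "spatialCoverage": "Lower Silesian Voivodeship, Poland",
    "distribution": {
        "@type": "DataDownload",
        "encodingFormat": "GeoTIFF",
        "contentUrl": "https://example.org/downloads/orthophoto.tif",
    },
}

# Embed this block in the landing page so a web search can find the dataset.
print('<script type="application/ld+json">')
print(json.dumps(dataset, indent=2))
print("</script>")
```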

Jesus Estrada Vilegas from the SmartOpenData project (http://www.smartopendata.eu/) discussed the implementation of some ideas that can work within the INSPIRE context while providing open data, in particular a Spanish and Portuguese data sharing pilot. Within the project, they provide access to the data by harmonising it and then making it linked data. Not all the data is open, and the focus of their pilot is agroforestry land management. They are testing delivery of the data in both INSPIRE-compliant formats and the internal organisational format to see which is more efficient and useful. INSPIRE is a good point from which to start developing linked data, but there is also a need to compare it to other ways of linking the data.

Massimo Zotti talked about linked open data from Earth observations in the context of business activities, since he works in a company that provides software for data portals. He explored the business model of open data, INSPIRE and the Copernicus programme. Data that comes from Earth observation can be turned into information – for example, identifying the parts of the soil that get sealed and don't allow water to be absorbed, or information about forest fires or floods. These are the bits of useful information that are needed for decision making. Once the information exists, it is possible to identify increases in land use or other aspects that can inform policy. However, we need to notice that dealing with open data means that a lot of work goes into bringing datasets together. The standardisation of data transfer, and the development of approaches that help with machine-to-machine analysis, are important for this aim. By fusing datasets, they become more useful and relevant to the knowledge production process. A dashboard approach to displaying the information and the processing can help end users access the linked data ‘cloud’. Standardisation of data is very important to facilitate such automatic analysis, and standard ontologies are also necessary. From my point of view, this is not so much a business model as a typical story for the Earth observation area, where a lot of energy is spent on justifying that the data can be useful and important to decision making – but quantification of the effort required to go through the process, and of the speed at which answers can be achieved (will the answer come in time for the decision?), is lacking. A member of the audience also raised the point that the assumption that machine-to-machine automatic models will produce valuable information all by themselves is questionable.

Maria Jose Vale talked about the Portuguese experience in delivering open data. The organisation that she works in deals with cadastre and land use information, and she also discussed activities of the SmartOpenData project. She described the principles of open data that they considered: data must be complete, primary, timely, accessible and processable; data formats must be well known; there should be permanence; and usage costs should be addressed properly. For good governance you need to know the quality of the data and the reliability of delivery over time, so having automatic ways for the data to propagate to users is within these principles. The benefits of open data that she identified are mostly technical, but also economic (the economic value is mentioned many times – but you need evidence similar to the Danish case to prove it!). The issues and challenges of open data include how to deal with fuzzy data when releasing it (my view: tell people that it needs cleaning); safety, as there are both national and personal issues; financial sustainability for the producers of the data; rates of updates; and addressing user and government needs properly. In a case study that she described, they looked at land use and land cover changes to assess changes in river use in a river watershed. They needed about 15 datasets for the analysis, and used information from CORINE land cover from different years. For example, they have seen forest change to woodland because of fire, which influences water quality too. Data interoperability and linking data allow integrated modelling of the evolution of the watershed.

Francisco Lopez-Pelicer covered the Spanish experience and the PlanetData project (http://www.planet-data.eu/), which looks at large-scale public data management – specifically a pilot on VGI and linked data, with a background in SDI and INSPIRE. There is big potential, but many GI producers don't publish linked data yet. The issue is legacy GIS approaches such as WMS and WFS, which are standards endorsed in INSPIRE but do not necessarily fit into a linked data framework. In the work that he was involved in, they try to address complex GI problems with linked data. To do that, they convert a WMS into a linked data server by adding URIs and POST/PUT/DELETE operations on resources. A semantic client sees this as a linked data server even though it remains compliant with other standards. To try it out, they use the open national map as the authoritative source and OpenStreetMap as the VGI source, and release them as linked data. They are exploring how to convert a large authoritative GI dataset into linked data and also link it to other sources. They are also using it as an experiment in crowdsourcing platform development – creating a tool that helps to assess the quality of each dataset. The aim is to run quality experiments and measure the data quality trade-offs associated with using authoritative or crowdsourced information. Their service can behave as both a WMS and a “Linked Map Server”. LinkedMap, which is the name of this service, provides the ability to edit the data and explore OpenStreetMap and the government data – they aim to run the experiment in the summer, and it can be found at http://linkedmap.unizar.es/. The reason for choosing WMS as the delivery standard is that a previous crawl of the web showed that WMS is the most widely available service, so it is assumed to be relevant to users, or at least one that most users can consume.
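
A rough sketch of what such a hybrid service looks like from the client side: the same server can answer a conventional WMS GetMap request and, through HTTP content negotiation, return an RDF description of an individual feature. The URIs, layer name, media type and bounding box below are illustrative assumptions, not LinkedMap's actual interface.

```python
import requests

# Hypothetical feature URI on a server that is both a WMS and a linked data endpoint.
feature_uri = "https://example.org/linkedmap/feature/road/1234"

# Ask for an RDF (Turtle) description of the feature via content negotiation...
rdf = requests.get(feature_uri, headers={"Accept": "text/turtle"}, timeout=30)

# ...or for a conventional WMS rendering of the surrounding area.
wms = requests.get(
    "https://example.org/linkedmap/wms",
    params={
        "SERVICE": "WMS", "VERSION": "1.3.0", "REQUEST": "GetMap",
        "LAYERS": "roads", "CRS": "EPSG:4326",
        "BBOX": "41.6,-0.95,41.7,-0.85", "WIDTH": 512, "HEIGHT": 512,
        "FORMAT": "image/png",
    },
    timeout=30,
)
print(rdf.headers.get("Content-Type"), wms.headers.get("Content-Type"))
```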

Paul van Genuchten talked about the GeoCat experience in a range of projects, which include support to Environment Canada and other activities. INSPIRE meeting open data can be a clash of cultures, and he used neogeography as the term to describe the open data culture (going back to the neogeo/paleogeo debate, which I thought was over and done – but clearly it is relevant in this context). INSPIRE recommends publishing data openly, and this is important to ensure that it gets a big potential audience, as well as the ‘innovation energy’ that exists among the ‘neogeo’/’open data’ people. The common things within this culture are expectations that APIs are easy to use, interfaces are clean, etc. – though under the hood there are similarities in the way things work. There is a perceived complexity among the community of open data users towards INSPIRE datasets. Many open data people are focused on and interested in OpenStreetMap, look to companies such as MapBox as role models, and use formats such as GeoJSON and TopoJSON. Data is versioned and managed in a Git-like process, and Web Mercator is the common projection. There are now not only raster tiles but also vector tiles. These characteristics of the audience can be used by data providers to help people use their data, and there are also intermediaries that deliver the data and convert it to more ‘digestible’ forms. He noted CitySDK by Waag.org, which grabs data from INSPIRE and then delivers it to users in ways that suit open data practices. He demonstrated the case of Environment Canada, where they created a set of files that are suitable for both human and machine use.
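
For contrast with an INSPIRE/GML payload, this is roughly what the ‘neogeo’ side expects a feature to look like – a few lines of GeoJSON in WGS84 longitude/latitude that can be read without any GIS tooling (the coordinates and properties here are invented):

```python
import json

# A single GeoJSON feature: plain JSON, WGS84 longitude/latitude, readable
# without specialist software (coordinates and properties are invented).
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [4.89, 52.37]},
    "properties": {"name": "Air quality sensor", "pm25": 12.4},
}
print(json.dumps(feature, indent=2))
```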

Ed Parsons finished the set of talks of the day (talk link goo.gl/9uOy5N) with a talk about a multi-channel approach to maximising the benefits of INSPIRE. He highlighted that it's not all about linked data, although linked data is part of the solution to making data accessible. Accessibility always wins online, and people make compromises (e.g. sound quality with CDs vs. Spotify). Google Earth can be seen as a new channel that makes things accessible; while the back-end technology is not new, the ease of access made a big difference. Denmark's use of Minecraft to release geographic information is an example of another channel. Notice the change over the past 10 years in video delivery, for example: in the early days, video delivery was complex and required many steps, expensive software and infrastructure, and this is somewhat comparable to current practice in geographic information. Making things accessible through channels like YouTube, and the whole ecosystem around it, changed the way video is used, uploaded and consumed, and of course changes in devices (e.g. recording on the phone) made it even easier. Focusing on maps themselves, people might want different things that are maps, and not only the latest searchable map that Google provides – e.g. an administrative map of medieval Denmark, maps of floods, or something else that is specific and not part of general web mapping. In some cases people are searching for something and you want to give them maps for some queries and images for others (as in searching for ‘Yosemite trails’ vs. ‘Yosemite’). There are plenty of maps that people find useful, and for that Google is now promoting Google Maps Gallery – with tools to upload, manage and display maps. It is also important to consider that mapping information needs to be accessible to people who are using mobile devices. The web infrastructure of Google (or ArcGIS Online) provides the scalability to deal with many users and the ability to deliver to different platforms such as mobile. The gallery allows people to brand their maps. Google wants to identify authoritative data that comes from official bodies, and then to have additional information that is displayed differently. But separating facts and authoritative information from commentary is difficult, and that is where semantics plays an important role. He also noted that Google Maps Engine is just maps – a visual representation without an aim to provide GIS analysis tools.