Non-traditional data approaches and the Sustainable Development Goals workshop

The workshop took place at IIASA, located in Laxenburg, Austria, and was hosted by IIASA's Earth Observation and Citizen Science group. It focused on the interface between citizen science, Earth observation, and traditional data collection methods in the context of monitoring and contributing to the Sustainable Development Goals (SDGs). A contextual/perspective academic paper is an expected output of the workshop, so this post is only a summary of the opening presentations. There is also an overlap with the aims of the WeObserve project and its communities of practice.

The Earth Observation community has already geared up to consider how it can contribute to the SDGs. A EuroGEOSS workshop identified several SDGs to which citizen science can contribute: No. 3 on health and wellbeing (e.g. greenspace in cities), No. 4 on quality education, No. 5 on gender equality, No. 6 on water quality and flood management, No. 11 on sustainable cities (air quality, noise, empty houses), No. 14 on plastics, No. 15 on species and disease monitoring, and finally No. 17 on global partnership.

The Australian Citizen Science Association's view: there is some awareness of the SDGs and a few projects that are linked to them explicitly, though there is an issue of detail. From the US Citizen Science Association, the view is that there are projects that can be linked – water monitoring, CoCoRaHS, phenology, and eBird. Examples also include grassroots environmental monitoring and the Humanitarian OpenStreetMap Team.

CitizenScience.Asia is a new network. In the context of China, people collect data to understand the environment, to collect evidence and protect rights, and out of pure curiosity. The Blue Map app is used to report water pollution; reports go to the government and are shown after being vetted, though some are not shown. There are contributory commercial DNA projects, but also "China Nature Watch" and the Bauhinia Genome project, which asks people in Hong Kong to share information. There are bottom-up projects, including the sale of water test kits whose results people shared on an online map – after 400-500 data points, the website was shut down by the government. There are also links to Public Lab – for example, creating an automatic water flow monitor.

Citizen Science Africa Association (CitSAF): in Kenya the SDGs are getting attention (following the MDGs), but NGO activities are not synced with the government, which pays attention to health, water, and education. CitSAF emerged from links to UNEP and is focused on Kenya – air quality and some research on malaria – and they can see interest in Nigeria, South Africa, and other countries. CitSAF wants to increase the involvement and responsibility of citizens in African countries towards their natural and socio-cultural environment, especially in monitoring the SDGs.

The SDG/CS Maximisation group, which works across the citizen science associations (and which Libby Hepburn coordinates), pointed out the challenge of linking bottom-up efforts from practitioners with top-down ones from the UN and different countries. There is work on the credibility aspects of citizen science, and a need for facilitation between the citizen science community and the SDG community to progress things. The Citizen Science Global Partnership, launched in December 2017 as a network of networks to support citizen science activities, has ideas and interest in working with the SDGs, but these are aspirational at the moment – they include a platform for coordinating citizen science under the banner of the SDGs.

The Stockholm Environment Institute (SEI) presented an analysis of citizen science and the SDGs: SEI has worked on environment/development issues for over 30 years with many participatory activities, and explicitly on citizen science for the past 10 years. In the analysis, they identified that citizen science can be used to refine and define goals; to monitor them; and even to achieve them – e.g. in education and gender. The Citizen Science Center Zurich focuses on a platform to host projects, knowledge in the area, and a community of citizens and scientists. The Open Seventeen challenge is a good example of challenge-based workshops that help people develop projects, and there is an aim of developing an SDG citizen science toolkit. The Joint Research Centre of the European Commission has created an inventory of citizen science activities and mapped them against the SDGs, with results being published soon. In addition, there is an effort to develop a standard for citizen science data and metadata, with links to the COST Action effort; there is potential for recording aspects of participants in the metadata, where that is appropriate. There is also a specific effort to develop guidelines for environmental reporting in a process that will make it consistent across the EU.

SDSN – the Sustainable Development Solutions Network – was set up by the UN for the implementation of sustainable development, with 800 member universities and other groups. Within it, the TRENDS group focuses on data governance and how people can integrate data from new sources; it has 20 expert members and focuses on strengthening the data ecosystem, improving learning and data sharing, developing policies, and informing investment. The work is framed around data governance and use. The POPGRID project is attempting to reconcile different sources of data to get good population estimates. Another UN effort is UN-GGIM, which has worked on identifying geospatial sources that can be used for the SDGs, with an analysis to understand the indicators at different tiers – see http://ggim.un.org/UNGGIM-wg6. There is an opportunity to understand which indicators' information is considered relevant, and where the data gaps are. The thinking about crowdsourced and citizen science data concerns how to find it, how to have metadata, and how to understand its comparability and usability for an SDG indicator. There is an issue about the global spatial data infrastructure for citizen science and crowdsourced data, and a need to budget for data management, metadata recording, and sharing of information from crowdsourced projects. There is a call for good practices and lessons learnt about the SDG indicators on the Sustainable Development Knowledge Platform.

UN Environment pointed out that the SDGs include 244 indicators, developed through the Inter-agency and Expert Group on SDG Indicators (IAEG-SDGs). Each indicator's custodian agency is responsible for developing a methodology, improving capacity, and getting and using the data. The three types of data are country submissions, data complemented with international estimates, and some global data products. There is an effort to consider a mapping exercise and then think about where citizen science can be used. A way forward is to identify one indicator and try to get a citizen science approach accepted – the opportunity lies in a Tier III indicator, i.e. one without an internationally established methodology or standard.

 


How many citizen scientists in the world?

Since the development of the proposal for the Doing It Together Science (DITOs) project, I have been using the "DITOs escalator" model to express the different levels of engagement in science, while also demonstrating that the higher levels have fewer participants. This means there is potential for people to move between levels of engagement – sometimes towards deeper engagement, and sometimes towards lighter engagement, according to life stages, family commitments, etc. This is what the escalator, after several revisions, looks like:

[Figure: the DITOs escalator model]

I have an ongoing interest in participation inequality (the observation that very few participants do most of the work) and the way it plays out in and influences citizen science projects. When you start attaching numbers to the different levels of public engagement in science, participation inequality appears in this area, too. Since writing the proposal in 2015, I have been looking for evidence to support estimates of the number of participants at each level. While working on a paper that uses the escalator, I did the research to identify sources of information to support these estimates. As the paper starts its peer-review journey, I am putting out the part that relates to these numbers so it can get open peer review here. I have decided to use 2017 as a recent year for which we can carry out the analysis. As for geographical scale, I'm using the United Kingdom, a country with a very active citizen science community, as my starting point.

At the bottom of the escalator, Level 1 considers the whole population, about 65 million people. Because of the impact of science across society, the vast majority, if not all, will have some exposure to science – even if this is only in the form of medical encounters.

However, the bare minimum of engagement is to passively consume information about science through newspapers, websites, and TV and radio programmes (Level 2). We can gauge the number of people at this level from the BBC programmes Blue Planet II and Planet Earth II, both focusing on natural history, with viewing figures of 14 million and about 10 million, respectively. Taking the top viewing figure as a proxy against a population of 65 million, we can estimate these "passive consumers" at about 25% of the population.

The next level is active consumption of science – such as visits to London's Science Museum (UK visitors in 2017 – about 1.3m) or the Natural History Museum (UK visitors in 2017 – about 2.1m), so an estimate of participation at 10% of the population seems justified.

Next, we can look at active engagement in citizen science, but to a limited degree. Here, the Royal Society for the Protection of Birds (RSPB) annual Big Garden Birdwatch requires participants to dedicate a single hour in the year. The project attracted about 500,000 participants in 2017, so we can estimate participation at this level at about 1% of the population. This should also include the roughly 170,000 people who carried out a single task on Zooniverse and other online projects.

At the fifth level, there are projects that require remote engagement, such as volunteer thinking on the Zooniverse platform, or volunteer computing on the IBM World Community Grid (WCG), in which participants download software onto their computers to allow processing that assists scientific research. The number of participants in WCG from the UK in 2017 was about 18,000. On Zooniverse, about 74,000 people carried out more than a single task in 2017, giving an estimate of participation at this level of about 0.1% of the population (thanks to Grant Miller, Zooniverse, and Caitlin Larkin, IBM, for these details).

The sixth level requires regular data collection: the British Trust for Ornithology (BTO) Garden BirdWatch had about 6,500 active participants in 2017 (BTO 2018), while about 5,000 contributed to the biodiversity recording system iRecord (thanks to Tom August, CEH), so it is reasonable to estimate participation at about 0.01% of the population.

The most engaged level includes those who are involved in DIY science, such as exploring DIY bio or developing their own sensors. We can estimate that it represents at most 0.001% of the UK population (thanks to Philippe Boeing & Ilia Levantis).

We can see that as the level of engagement increases, the demands on participants increase and the number of participants drops. This is not earth-shattering, though what is interesting is that the difference between levels is an order of magnitude. We also know that the UK enjoys all the conditions needed to foster citizen science: a long history of citizen science activities, established NGOs and academic institutions that support citizen science, good technological infrastructure (broadband, mobile phone use), a well-educated population (39.1% with tertiary education), etc. So we're talking about a best-case scenario.

It is also important, already at this point, to note that UNESCO estimates the percentage of the UK population who are active scientists (working in research jobs) at 0.4%, which is larger than the 0.111% total for levels 5, 6, and 7.
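To make the orders of magnitude concrete, here is a minimal sketch (in Python) that turns the per-level fractions above into headcounts for the UK; the fractions are the rounded estimates from the text, not measured values:

```python
# A minimal sketch of the UK escalator estimates described above.
# The per-level fractions are the rounded estimates from the text,
# not measured values.
UK_POPULATION = 65_000_000

level_fractions = {
    "L2 passive consumers": 0.25,    # Blue Planet II / Planet Earth II viewers
    "L3 active consumers": 0.10,     # science/natural history museum visitors
    "L4 once-a-year tasks": 0.01,    # e.g. Big Garden Birdwatch
    "L5 remote engagement": 0.001,   # Zooniverse, volunteer computing
    "L6 regular recording": 0.0001,  # BTO Garden BirdWatch, iRecord
    "L7 DIY science": 0.00001,       # DIY bio, sensor makers
}

for level, fraction in level_fractions.items():
    print(f"{level}: ~{int(UK_POPULATION * fraction):,}")

# Levels 5-7 together come to ~0.111% of the population, below UNESCO's
# 0.4% estimate of the UK population working in research jobs.
serious = sum(f for level, f in level_fractions.items()
              if level.startswith(("L5", "L6", "L7")))
print(f"Levels 5-7 combined: {serious:.3%} (~{int(UK_POPULATION * serious):,})")
```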

Let’s try to extrapolate from the UK to the world.

First, how many people can we estimate to have the potential of being citizen scientists? We want them to be connected and educated, with a middle-class lifestyle that gives them leisure time for hobbies and volunteering.

Connectivity gives us a large number – according to the ITU, 3.5 billion people are using the Internet. The estimated size of the middle class is a bit smaller, at 3.2 billion people. However, we know that participants in most citizen science projects – which practise passive inclusiveness, where everyone is welcome but there is no active outreach to under-represented groups – tend to be people with higher (tertiary) education. There is actually data about this – see the information from Wikipedia about tertiary educational attainment. According to UNESCO statistics, there were over 672 million people with some form of tertiary education in 2017. Let's allow that not everyone in citizen science has tertiary education (which is true), so our potential starting number is 1 billion.

I'll assume the same proportions as in the UK, ignoring that the UK represents the best case. So about 250 million of these are passive consumers of science (L2), and 100 million are active consumers (e.g. going to science museums) (L3). We can then expect 10 million people who participate in once-a-year events (L4); 1 million who are active in online citizen science (more than a one-off visit or trial) (L5); about 100,000 committed participants (mostly nature observers); and about 10,000 DIY bio, makers, and DIY science people (L6 and L7).

Do these numbers make sense? Looking at visits to science/natural history museums on Wikipedia, level 3 seems about right. Level 4 looks very optimistic – in addition to the Big Garden Birdwatch, there were about 17,000 people participating in the City Nature Challenge, 73,000 participants in the Christmas Bird Count, and about 888,000 who did a single task on Zooniverse – so a more realistic number is 3 or 4 million. Level 5 is an underestimate – IBM World Community Grid has 753,000 members, and other volunteer computing projects bring that to about 1 million; then there were about 163,000 global Zooniverse contributors (thanks to the information from Grant Miller), 130,000 Wikipedians, 50,000 active contributors in OpenStreetMap, and other online projects such as EyeWire. So let's say it's about 1.5 million. At level 6, the number is again about right – e.g. eBird reports 20,000 birders on their peak day; for the sake of the argument, let's say it's double my original estimate – 200,000. Level 7 also seems right, based on estimates of biohacker numbers in Europe.
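The same back-of-the-envelope calculation, extrapolated to the global base and then adjusted with the figures above, looks like this (a sketch; all values are the rough order-of-magnitude estimates from the text, not measurements):

```python
# Global extrapolation sketch: the UK proportions applied to a base of
# ~1 billion connected, educated potential participants, then adjusted
# with the project figures quoted above.
BASE = 1_000_000_000

# Naive extrapolation using the UK fractions.
naive = {"L4": BASE * 0.01, "L5": BASE * 0.001,
         "L6": BASE * 0.0001, "L7": BASE * 0.00001}

# Sanity-checked adjustments from known project numbers.
adjusted = {"L4": 3_500_000,   # Big Garden Birdwatch, City Nature Challenge...
            "L5": 1_500_000,   # volunteer computing, Zooniverse, OSM, Wikipedia
            "L6": 200_000,     # double the naive estimate
            "L7": 10_000}      # in line with European biohacker estimates

serious = sum(adjusted[k] for k in ("L5", "L6", "L7"))
print(f"'Serious' citizen scientists (L5-L7): ~{serious:,}")  # ~1,710,000
print("Researchers worldwide (2013 UNESCO figure): 7,300,000")
```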

Now let's look at the number of scientists globally: in 2013 there were 7.3 million researchers worldwide. With the estimate of "serious" citizen scientists (levels 5, 6, and 7) at about 1.7 million, we can see the issue with crowdsourcing here: the potential crowdsourcer community (the researchers) is, at the moment, much bigger than the volunteers.

Something important to highlight here is the amazing productivity of citizen scientists in terms of their ability to analyse, collect information, or invent tools – we know from participation inequality that this tiny group of participants does a huge amount of work: the 50,000 OSM volunteers are mapping the world, the 73,000 Christmas Bird Count participants provided 56,000,000 observations, and consider the attention gained by the Open Insulin Project. So numbers are not the only thing that we need to think about.

Moreover, this is not a reason to give up on increasing the number of citizen scientists. Look at the numbers of Google Local Guides – out of 1 billion users, a passive crowdsourcing approach reached 50 million one-time contributors, and 465,000 at the equivalent of levels 5 to 7. Therefore, citizen science has the potential of reaching much larger numbers. At the minimum, there is the large cohort of people with tertiary education, including at least 98 million people with Masters and PhDs in the world.

Therefore, to enable wider and deeper public engagement with science, apart from the obvious points of providing funding, institutional support, and frameworks to scale up citizen science, we can think of an "escalator"-like process which makes people aware of the various levels and assists them in moving up or down the engagement levels. For example, due to a change in care responsibilities or life stages, people can become less active for a period of time and then choose to become more active later. With appropriate funding, support, and attention, growing global citizen science should be possible.

European Citizen Science Association (ECSA) 2018 Conference – day 2: Beyond the deficit model, inclusiveness, libraries, and social innovation

The second and last day of the conference (day 1 is covered here) started early, with a keynote: "Science society continuum: From 'deficit model' to social demand on research – the reform of science in progress" by Lionel Larqué (FR) [a physicist who heads a collaboration of education, civil society organisations, and science, and has influenced partnerships between science and society based on a non-deficit model of science]. The organisation ALLISS was set up in 2012 to address the science & society continuum. There is a book on "sciences participatives", in French and aimed at the local community. Speaking from the French perspective: ALLISS has 1,800 member institutions and builds on 15-20 years of cooperation. Science-society concepts can be seen as good answers to the wrong questions, set against the background of public policy – what we can and can't do. The science/society framing came from institutions – a structural bias, since it originated with scientific and European institutions. It started with wrong and incomplete data: ideas from the 1970s and 1980s about citizens' mistrust of science. The reality of the current public view of science is unknown, and we don't know whether the survey questions were well written. The policy was based on scientific prejudice and assumptions about public mistrust of science – but from 1972 to today, 78%-85% of people in France have trusted the knowledge that comes from science (without linking it to technology or to how science is run). There is no strong data showing deep mistrust; criticism gets mixed up with mistrust. The French science academy is full of non-rational scientists who feed the discourse of public mistrust. There are a lot of bad reasons for creating an agnotological public debate – some scientists want to instrumentalise it: by claiming there is mistrust, you can rely on the deficit model and ignore the public, and that is useful. It also seems easy to claim it is obvious: all institutions face mistrust – politics, media, law and order – so it is assumed that science gets it too. The pressure on scientists is growing and the scientific community is suffering from it – from political power, social actors, and finance. Scientific institutions are among the last trusted institutions and are asked to answer all the questions; scientists feel pressured by these demands, see them as a problem, and want to be left to their own activities. There is a vicious cycle of returning to the deficit model. ALLISS puts forward the idea that we need to ask new questions. We need to face institutional walls – institutions don't want to accept that society at large is far more educated than it was, and that scientific institutions therefore need to change. ALLISS tries to work out this institutional challenge.

The French situation: a high level of public trust towards science, but criticism towards the institutions. There is large-scale cooperation between civil society organisations and scientific organisations (CNRS, INRS...). The number of collaborations is very high, but the institutions do not reflect this in their strategic plans – cooperation has developed despite institutional policies. In 2001-2009, out of 8,500 workshops at the World Social Forum, only 70 talked about science and technology. For a lot of social actors, science is outside the frame, so in 2007 the "Science and Democracy World Forum" was launched – can we share a common view about it? The workshops showed that dialogue was not the issue, but rather what we can change in the context – what you can do to change partnerships. Something needs to change: policy, concepts, etc. A mass of initiatives won't be enough to change policy. The barrier posed by science institutions is a big one, and it hasn't changed from the 1970s to today. The main tradition of science is a problem for citizen science – citizen science is put in a box in a specific space so that it won't change the bigger institutions. Citizen science dynamics are ones that allow us to change things, but we need to understand where we came from: the design of research and science policies was aimed at making Europe stronger, rebuilding it, and linking science and industry. Now there are local actors and local groups, and science policy doesn't have tools to accommodate them – a non-industrial research policy focused on society is needed. Scientific institutions need a wider policy alliance, including the people who work in museums and institutions. Things won't change the way we want through a sequential process – consider feminism's impact on scientific study and what helped: the bicycle, the war, and images of women in the mass media in the 1960s. Changes are not rational, and even when the forces are strong we need both cumulative experience and politics. Open science initiatives might help us, perhaps alongside the SDG initiatives, and we can explore them through research. We observe that the sociology of citizen science is such that a lot of citizen science comes from institutions that propagate the deficit model; we need to work with these institutions, and the costs are very high. We need to be clear that we need change, and to understand what we can change and what can't be changed. The Shock Doctrine is something we need to be aware of – think outside ourselves. ALLISS and ECSA need to be ready.

Workshop “Empowerment, inclusiveness & equity in community-based research and CS”

Claudia Göbel and Michael Jørgensen, ECSA (DE). Notes are at https://etherpad.wikimedia.org/p/ECSA2018-EIE. Michael: in community-based research (CBR), civil society groups have issues that need to be addressed by authorities, but these need to be documented; there is also a need for the development of new knowledge or new proposals (e.g. urban agriculture). Empowerment – knowledge might empower, but it is not enough: translations and alliances are also needed to make it effective. Sometimes new methods need to be worked out, in the institution and in society – working deliberately with empowerment. Claudia looked at Soleri et al. 2016: empowerment is the capacity to make a change. The terminology can also be about equity and inclusiveness – it's about who is participating. The session builds on conversations that evolved from the CSA conference, the ECSA conference in 2016, workshops at the Living Knowledge conference, and policy roundtables. From the Living Knowledge conference, there are different ideas about research, especially different epistemologies of science: "distant" vs "engaged" research. There is the idea of a working group on empowerment, and some activities that such a group can do.

 

Barbara Kieslinger – ideas of citizen social science, building on participatory action research, data activism, and action research, but now combined with other sources. She has done a classification of citizen science projects. Different projects engage citizens in different ways: for example, a project in Barcelona that wants to understand ozone pollution uses an existing network of environmental activists and political street actions. Careables is a project that involves people with physical limitations and maker communities, sharing the co-designs openly.

Balint Balazs – pointing out the silence around citizen science in central Europe (the same issue arose at the UCL workshop on geographic citizen science) and making invisible projects visible. Thinking of citizen social science and aspects of empowerment: autonomy, competence, belonging, impact, meaning, resilience – we need to think about how these work.

Thomas Hervé Mboa Nkoudou – questioned the notion of inclusiveness: e.g. a transgender friend asked about being included as part of a bigger group to colour a project – adding a symbolic inclusiveness. To put the power of community in evidence, a summit in Ghana, AfricaOSH, held a big conversation about making/hacking/bio-hacking and brought people together as a community to ask what open science means to us.

Muki Haklay – I focused on passive and assertive inclusiveness, the need for a more nuanced view of participation given that we gain societal benefits from highly educated people participating, and the problem of methodological individualism in the analysis of empowerment and inclusion. I also called for a realistic understanding of resources – the more inclusive you want to be, the more expensive the process of inclusion becomes – e.g. the need to morally justify the Intelligent Maps effort, where each engagement is very expensive.

Libby Hepburn covered the global initiative for citizen science, which provides an opportunity for different organisations and programmes to collaborate, and the potential of leveraging the SDGs to address societal challenges and demonstrate the need for citizen science applications and their use.

The session’s discussion turned to different aspects of inclusiveness and the creation of an ECSA working group.

Speed Talks “Citizen Engagement”

Nina James, University of South Australia (AUS): Strangers, Stewards and Newcomers in CS – the identities of those who participate. She looked at 9 contributory projects, 900 participants, and 1,400 non-participants, across very diverse fields, with participants motivated by different things. In conservation she found participants were mostly female (70%), aged 49-69, and different from non-participants: highly educated, with a sense of connection to the environment. The first identity is environmental stewards – connected to nature, with strong awareness, actively politically engaged, and participating in more than one project. Science enthusiasts participate in other citizen science projects, are interested in science and technology and confident about it, and are less politically active; this group includes both introverts and extroverts (one project ran in a museum and also online). The men are topic-oriented, motivated by science and technology – Fireballs in the Sky, set in the outback, includes 77% men. Newcomers are motivated by the topic; millennials are a small percentage. The strangers are those who haven't participated in citizen science – less politically engaged, with lower education and too many conflicting interests. People participate in different projects. The 70% female participation figure may be an artefact of the online survey.

Cat Stylinski, University of Maryland Center for Environmental Science (US): Embedded Assessment of Skills in CS – an introduction to embedded assessment in citizen science. Volunteers need to develop skills to participate in citizen science, and this is important for upholding scientific standards. There is a need to identify the skills, train and support volunteers, then assess the skills, and then think about how this works. Assessment includes formal tests, informal observations, and data validation. Embedded assessment is done as people take part in the project – giving an activity and then developing a rubric to compare what people did. Embedded assessment tries to streamline the process – data validation usually focuses on the science variables, whereas this approach looks at the volunteers and how they learn. They are figuring out a new way to integrate assessment into a project's processes.

Kate Lewthwaite, Woodland Trust (UK): Engaging older citizen scientists in the digital era – a painful case study of moving people to a new website. The Woodland Trust works on woods and with many volunteers in Nature's Calendar; many recorders are over 60 or even 80, and they are important contributors to phenology. They wanted to move the website to a new system because of technological change, but some people had used the old website for 10 years. They consulted the scientific users of the data on improvements – better location information, asking for the number of visits, and improving data about participants – and used personas in the design process. Overall, the participants struggled much more than expected: registration through verification links in email (where they needed assistance with copy and paste), the requirement for an alphanumeric password, not reading the website and not understanding why security information needed to be added, and confusing map manipulation (Survey123-style map panning). Lessons: don't make changes lightly – there was one chance in a decade to make a change and plan support, and they should have expected more staff resources to make it happen, as the volunteers needed the support. They did 20 interviews, and the development team explored the issues with infrequent users – that is why they thought everything was ready. They continue to run a paper-based system. They've lost some of the people in the transition, and they don't yet have the ability to provide an app – it's planned.

Karsten Elmose Vad, The Natural History Museum of Denmark, University of Copenhagen (DK): What motivates families to do CS? An evaluation of the Ant Hunt (mentioned in the previous post) – an experiment on the food preferences of ants, taking several hours, capturing ants and sending them in. They focused on families with children aged 6-13, since Denmark doesn't have after-school science. They put the researcher on video, and she wrote back to participants. They got 356 experiments, 260 users, 24 species, and 6,000 ants. The evaluation shows that for the majority of people, having scientists connected to the project mattered, and it was valuable to get a response from the scientist who coordinated the project – it felt like participating in something big and an opportunity to work with a scientist. It was a valuable cross-generational activity, with an open-ended experiment and the scientific method. They didn't care about the competition.

Gaia Agnello, ECSA (DE): Motivations and perceived benefits predict citizen scientists' level of engagement. She used the Volunteer Functions Inventory (Clary & Snyder, 1998) as the analytical framework for voluntarism, looking at how these factors influence the programme through an online questionnaire. Of 174 responses, most were motivated by nature issues. It is important to understand motivation in relation to engagement: the initial motivation does not drive the level of engagement.

Talks – "Social Innovation"

Tiberius Ignat et al., Scientific Knowledge Services (DE): Working Together: CS and Research Libraries – presented with Paul Ayris of UCL Library Services. The talk addressed the role of libraries, especially research libraries, in supporting citizen science activities. Research libraries are well-supported organisations, highly standardised, well connected in networks, and they work well. They build collections of resources, data, and material; they manage the incoming and outgoing scientific communication with researchers; and they are world leaders in open science and experienced advocates of it, pushing open access. They are also open to innovation and work through transformation in all their roles. They are fun people, centrally located, with a culture of politeness in their answers. They have ten major skills: collaboration between libraries, communication skills, the FAIR principles integrated into their practices, strength in infrastructure and governing it, experience in maintaining and curating collections, experience in open access, and connecting people. They have demonstrated advocacy as a network – the open access and fees campaigns, for example. The confluences are areas of opportunity – skills development, support, collections, FAIR data, infrastructure, evaluation, communication – general skills, but also in the recruitment of volunteers, in marketing, and in advocacy. In 2017 they delivered a set of 12 presentations on the "Roles for libraries in the Open Science landscape", and in 2018 they are presenting a series on "Focus on Open Science". There is demand for citizen science at these events. Looking at the EU's Open Science Policy Platform (OSPP), citizen science is one of the eight pillars of open science. There is a consistent line of supporting open science: in Amsterdam in 2016, then in the OSPP, which has just produced recommendations on citizen science, and in the LERU advice paper on open science of May 2018. Examples of library engagement in citizen science: at UCL East, UCL Library Services is thinking about local oral history in the borough of Newham; Transcribe Bentham is crowdsourcing with 624 volunteers and is very cost-effective – an example of contribution through the special collections; another example is the establishment of a university press dedicated to open access. Open questions remain – why do citizens collaborate? What motivates volunteers? And so on. Libraries have a very important role, and there is an open survey at knowledge.services/citizenscience.

Susanne Hecker et al., Helmholtz Centre for Environmental Research – UFZ / German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig (DE): Innovation in and with CS. The journey from ECSA 2016 to the new open access book from the conference – bringing together the experiences of the conference and 120 researchers. What can we expect from the book? 29 chapters in 5 sections. Part I is about innovation in citizen science and sets the scene: it includes a description of the Ten Principles of citizen science, standards for citizen science, then contributions on scientific impact, my chapter on participation in citizen science, then technology, infrastructure, and evaluation. Part II focuses on questions about society – understanding the social theory, empowerment and scientific literacy, inclusiveness, support (technical and social), and integration with the higher education system. There are 40 case studies in the book, in particular on China, Europe, the Global Mosquito Alert, and water quality. Part III focuses on the science-policy interface, including policy formulation with input from people at the European Commission and from environmental protection agencies, as well as Responsible Research and Innovation. Part IV covers innovation in technology and environmental monitoring, looking at technologies, light pollution, data protocols, and national monitoring programmes. The last part, Part V, looks at science communication and education – education, addressing science capital through citizen science, children, school education, and stories that change the world. Key recommendations complete the book. The discussion included questions about producing the book as open access and the need to promote it to policymakers and wider audiences.

Closing session

Claudia Appenzeller-Winterberger – citizen science is the engagement of scientists and of citizens, and you need to think about why we are doing it. If we summarise the dialogue, it is about questioning scientists and letting the public ask questions; thinking globally and acting locally. We will have to think about these new questions: a lot of it is testing and doing citizen science.

Lessons learned from Volunteers Interactions with Geographic Citizen Science – Morning session

On the 27th April, UCL hosted a workshop on "Lessons learned from Volunteers Interactions with Geographic Citizen Science". The workshop description was as follows:

"A decade ago, in 2007, Michael Goodchild defined volunteered geographic information (VGI) as 'the widespread engagement of large numbers of private citizens, often with little in the way of formal qualifications, in the creation of geographic information, a function that for centuries has been reserved to official agencies.' (p.2). The collection and use of this type of crowdsourced geographic data have grown rapidly with amateurs mapping the earth's surface for all kind of purposes (e.g. collecting and disseminating information about accessibility in urban centres, for crisis and emergency response purposes, mapping illegal logging in remote areas and so on). A subset of these activities has been described as 'geographic citizen science' and includes scientific activities in which amateur scientists (volunteers) participate in geographic data collection, analysis and dissemination within the context of a scientific project (Haklay, 2013) or simply by using scientific methods and equipment. Although, there is an extensive discussion in the VGI and geographic citizen science literature about opportunities as well as implications (e.g. data coverage, data quality and trust issues, motivation and retainment of volunteers and so on), examples from the actual interaction are not so widely discussed, neither has evidence been collected from a broad spectrum of case studies to demonstrate how volunteers interact with those technologies and applications, what they are looking for and what it is that they need/try to accomplish (at a scientific, project and personal level) and what are the common design mistakes that influence interaction." The following is a summary of the talks and presentations:

Welcome & Instructions – Artemis Skarlatidou the workshop is linked to our ERC funded project Intelligent Maps (ECSAnVis) and  EU funded Doing It Together science (DITOs) and the COST action – our work deal with geographical applications of citizen science and data collection. There is the COST Action CA15212 which got 243 members in 39 countries – all exploring aspects of citizen science – Work Group 1 (WG1) for scientific quality, WG2 education, WG3 society-science policy, WG4 the role of volunteers in citizen science, WG5 data and interoperability, and the synergies in WG6. In WG4, which Artemis lead. we’re looking at stakeholder mapping, motivation, needs and interaction issues, and mapping citizen science across Europe. Another relevant group is the ICA Commission on use user and usability issues, the International Society for Photogrammetry & remote sensing that have a WG V/3 that look at citizen science and crowdsourced information. Sultan Kocaman explained the ISPRS link – WG V/3 focus on the promotion of regional collaboration in citizen science and geospatial technologies within the focus of ISPRS area of education and outreach.

Louis Liebenberg presents Smartphone Icon User Interface design for Oralate Trackers. Louis has spent three decades developing software to allow hunter-gatherers to protect their knowledge of tracking. One of the challenges Louis addresses is understanding how our scientific thinking evolved: he suggests that tracking by hunter-gatherers is an example of the hypothesis testing and rational thinking from which science evolved. He has worked with !Nate of the San people since 1985 – the San have a long history of technology use. Already 100 years ago, hunters discovered that arrow points could be made from fence wire and started using them – an example of how hunter-gatherers adapt the technologies around them; hunter-gatherers are not isolated and have always interacted and traded. Developing software for a smartphone (you can get an Android phone for $10 in South Africa today) is similar to adopting the fence wire for arrows 100 years ago. He learned from master trackers – the level of sophistication of trackers has astonished him since the mid-1980s. In the Kalahari, dogs were introduced in the 1960s, and the knowledge of tracking and the practices of hunting changed as a result. He used tracking, and certification in it, to secure employment for trackers. Master trackers in an egalitarian society are expected to show humility, so it is possible to miss them if you go and ask "who's the best tracker here?" – certification is a way to provide recognition and work. Tracking provided employment in the 1990s in surveying the movement of animals in the Kalahari. The persistence hunt – running animals down without any equipment until they die from exhaustion – is an adaptation that humans have. Karoha was one of the persistence hunters, but was also able to use CyberTracker. In parallel to the software, Louis developed the tracker certification, to know whether the data is reliable. As master trackers die, the knowledge is lost, so the certification provides an opportunity to encourage the younger generation to develop the knowledge and benefit from it. The level of detail in animal tracks is very high. There is a lot of ambiguity in tracking and a requirement to learn about claw marks; knowing the possibilities makes it possible to identify which animal it was with high certainty. Trackers also develop hypotheses about why the shape of hoofs is the way it is, and interpret the activities of animals from the track – for example, identifying new interpretations of an animal's behaviour that had not been observed before, such as inferring that caracals jump upright in an attempt to catch birds. CyberTracker started on the early Apple Newton with a GPS module, then evolved onto the Palm Pilot and continues to evolve. The early interfaces were very limited for drawing icons – some icons are phonetic symbols (e.g. using a wheelbarrow to describe an item that sounds similar to the word in Afrikaans). The details can be very extensive – species, age, number, male/female, and so on. The data can provide information on abundance and potential work for the communities. In a project in the Congo, they followed the tracks of different animals and could show how Ebola impacted chimpanzees and gorillas, as well as other animals – this was important in understanding that you can identify Ebola in wildlife before it spreads into the human population.
There is also wide use of CyberTracker in citizen science for monitoring endangered species, and in different projects by indigenous communities in Australia. These can also show results that differ from what ecologists identify. A paper from 1999 about rhino was co-authored by a tracker, demonstrating a different model of publishing with citizen scientists; the first high-impact paper co-authored by trackers was published recently in Biological Conservation. Questions: how do hypotheses from hunter-gatherers get communicated to the scientific sphere? The answer is collaboration: data is collected and organised by the trackers, and then the scientists write the report, though producing a report is challenging. The reality is co-authoring, as there is always a need for mentoring and a reciprocal approach between scientists and trackers. Louis also circulates papers among experienced scientists to improve them – we all need peer-review support. In terms of consent and engagement: there is a need to develop a relationship of trust and understanding – the first people involved in CyberTracker had worked with Louis for 5 years, and Louis was engaged as a tracker before they were willing to work with him. Some of the early papers on the Kalahari used trackers without mentioning their names, even though the trackers carried out the research. Scientific institutions are among the last authoritarian institutions, and scientific elitism is intransigent – this is what makes citizen science exciting.

Lessons from supporting non-literate forest communities in the Congo Basin to record their Traditional Ecological Knowledge – Michalis Vitos & Julia Altenbuchner. The context is the Congo Basin, the second-largest rainforest. The forest is home to 29 million people, including at least 500,000 members of nomadic communities who rely on its resources. The forest is divided into concessions which are then used for resource extraction – so how can local groups be heard? Local communities are also excluded from protected areas. In the last few years, some legislation has been changing – e.g. the EU's FLEGT, to control timber imports and require social payback and responsibility. ExCiteS collaborated with communities to support such processes with technology. The challenge is working with groups who are non-literate and also non-technologically literate. We use pictures as a way to communicate: the application works in a simple fashion, showing categories of things that people want to map, with each category leading to more specific options – information can be captured, the user decides whether to save it, and geotagged video and audio can be collected. In three simple steps, information can be captured. The process starts with a dialogue about what is important to the communities, followed by an agreement on what will be collected. We explored the usability of the application: about 70% of participants can use it, but 30% have a problem with the hierarchical categories – mapping banana, avocado, and cacao requires following a path through categories, and some participants found that confusing. Adding more icons to a category makes it more complex. One approach was to test audio feedback in a local language, explaining the icons and what they mean; the audio feedback helped a bit, but not a lot. The next step was to skip the hierarchy and go directly to the final icons – printing each icon on a card with an NFC chip. The participant finds the specific icon and then touches the card with the phone. With Tap&Map, the success rate gets close to 100%.
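To illustrate the interaction difference the talk describes, here is a minimal sketch (hypothetical Python, not the actual Sapelli or Tap&Map code) contrasting hierarchical icon navigation, where a user must traverse category screens, with the flat Tap&Map mapping from one NFC card to one choice:

```python
# Hypothetical sketch of the two interaction styles described in the talk;
# not the actual Sapelli/Tap&Map implementation.

# Hierarchical navigation: the user must pick the right branch at every
# level before reaching a leaf - the step that confused ~30% of users.
icon_tree = {
    "food plants": {
        "fruit trees": ["banana", "avocado"],
        "cash crops": ["cacao"],
    },
    "important sites": {
        "water": ["spring", "stream"],
    },
}

def navigate(tree, choices):
    """Walk the category tree with a sequence of user choices."""
    node = tree
    for choice in choices:
        node = node[choice]  # a wrong category here loses the user
    return node

print(navigate(icon_tree, ["food plants", "fruit trees"]))  # ['banana', 'avocado']

# Tap&Map: one NFC card maps directly to one leaf - no navigation needed,
# which is why success rates approached 100%.
nfc_cards = {"card-01": "banana", "card-02": "avocado", "card-03": "cacao"}

def tap(card_id):
    """Resolve a tapped NFC card straight to the item being mapped."""
    return nfc_cards[card_id]

print(tap("card-03"))  # 'cacao'
```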

Julia – the next issue is making sure that communities can manage their data. The vision of Intelligent Maps is to have data collection, then a local data repository and management, and then visualisation. But there is a challenge of producing base maps, which was addressed by using UAVs to create high-resolution imagery within a short time. People don't need maps to know their own area – the maps are for communication. We tested how the maps are used, and people felt under a lot of pressure when using them, so the next experiment avoided putting them under pressure by running a treasure hunt instead: going and looking for data by trying to find German Christmas decorations. From the tracks of the people who participated in the study, we can see how they looked for information. What we know is that people can use and understand maps – the reference map. Next we want the thematic information, so that people can take ownership and correct issues: this was done using the icons that were used as a resource to correct information on the map. People did well in correcting information using a Tap&Map approach, with feature corrections over 90%. This is an ad-hoc approach: even without much exposure, we can allow people to be both the sensors and the brains behind the map.

Forest hunter-gatherers and Extreme Citizen Science: Reporting wildlife crime in collaboration with local and indigenous communities in Cameroon through community-led co-design – Simon Hoyte has worked in Cameroon for the last year and a half with Baka hunter-gatherers, in the south-east corner of the country around the Dja reserve, working with ZSL and five communities. In Cameroon, there are many issues with conservation – gorillas, chimpanzees, parrots, pangolins, and elephants. Indigenous communities are often forgotten, yet these groups are familiar with the forest, with knowledge accumulated over 50,000 years, while colonial approaches exclude them. The technologies used are the Sapelli data collection tool, the GeoKey data management tool, and Community Maps from Mapping for Change. The process starts with the community's free prior informed consent – beginning with the concerns of the community, and building trust by staying overnight in the village and connecting on a personal level: that is an important recommendation. Icons are drawn first in the sand, then on paper, and then put into the app. Functional actions changed through co-design, e.g. from a tick to a thumbs-up, and the recording icons changed; the XML layout of the project allows changes in the field. The second recommendation is co-design, which increases motivation. Audio and video allow information to be shared, including animal tracks, which allows verification; audio provides more information, describing what people found. Indicators on the device are important – when recording is active, a red icon lets you see that something is working. The phone checks for a connection every 4 minutes. An ID screen is used to recognise reporters anonymously – an approach that can be used elsewhere. The community protocol also addresses who will manage the phone and look after it. Reports are uploaded and shared with the authorities – diverse outcomes are needed. In summary: trust building, co-design, media, feedback, simple tools, anonymous IDs, community leadership, and diverse outcomes. The map provides further information.

Community-based monitoring of tropical forests using information and communication technology (ICT) – Søren Brofeldt gave an example of a study that relies on Sapelli, expanding the software to create the Prey Lang App. Working in Cambodia, in Prey Lang – where 200,000 people rely on the forest – there is huge pressure from deforestation, much of the logging is illegal, and the forest is supposed to be protected. The Prey Lang Community Network (PLCN), created around 2005-2007, is now a group of 600 people who have been working over the last 10 years, patrolling the area, confiscating chainsaws, and seizing wood and logs, trying to address logging in the area. In 2013 they tried to communicate the problem to international society – they wanted to set up a forest monitoring programme and create a system to document illegal logging and provide evidence-based advocacy. The issue is to compile information and document breaches. The data is captured with Sapelli, the information is validated by PLCN and scientists, and this then helps in compiling reports locally and globally. The platform was tweaked over time and captures information through a decision tree with different aspects. Among the unique functions they developed is choosing icons for activities: the first version had 9 basic functions with 614 end-points of activities, which was seen as too simple; by the third version they had 9 functions and 1,663 options – types of trees, types of information, species, and so on. They now have 10 functions (e.g. dropdowns, word completion). Complexity does not lead to incorrect use (if training is adequate and added functionality is introduced in a co-designed way); people who have used the app for 2 years can get into more complex functionality over time. Some of the issues with the data include poor documentation and double counting; over time, human errors decrease, as do technical issues. Poor connectivity and technical issues are a bigger problem than local ability to use the app. High quality is possible, but active data management is needed.

Designing Human-Computer Interaction for Citizen Science Initiatives in Rural Developing Regions – Veljko Pejovic & Artemis Skarlatidou. We need to understand how citizen science initiatives move from developed to developing regions. ICT4D points to environmental constraints – roads, electricity – as well as skills shortages in the workforce and cultural constraints, which clash with designers' assumptions. In the Extreme Citizen Science context, we need to identify solution adaptation through participatory design; there is a need for holistic implementation, making sure that we think about the whole process, from data collection to policy, which is challenging. Finally, we also need to consider champions and how to engage them (the book "Geek Heresy" by Toyama talks about this). The aim is to identify guidelines – this was done by examining participatory studies in similar rural developing-world settings and carrying out 9 interviews (an hour long, twice each) with researchers with extensive field experience. The questions explored different aspects, including interactions. The findings: there is a need to mobilise the community by taking into account its societal organisation (e.g. egalitarian aspects), and to find local champions. We need to consider the ecosystem of the technology: chargers, cables. We also need to consider how technology built for a different context works in the field: rough fingertips, reflections on the screens, and so on. There is also the issue of hierarchical icon organisation, which is fairly intuitive for educated users but challenging for these participants, as are navigation buttons; this matches evidence from Medhi et al. (CHI 2013). Juxtaposing this with illiterate users in urban Brazil, who managed to deal with hierarchical organisation and navigation, it might be that exposure to smartphones helps in developing these hierarchies. Icon design also differs: realistic icons shown in context are more suitable than a bare object, and there are issues with actions and how to represent them. Getting honest feedback on the spot is a challenge – users don't criticise to your face (Dell et al., CHI 2012 – "yours is better"); a long trust relationship helps in getting honest feedback, and participants lack the vocabulary to discuss HCI issues. To maintain motivation, there is a need to make data collection visible and to ensure the real-world impact of data collection. Recommendations: develop context-specific apps, not generic ones; consider an application interface that matches users' skills; and remember that geographical information is key.

Introducing user issues of the Global Forest Watch application – Jamie Gibson, Vizzuality, on developing better maps and visualisations. They are trying to create citizen-focused GIS, interacting with citizens in the design. Global Forest Watch (GFW) was developed over the last 3 years and allows you to see the world's forests and how they change. They wanted to tell a simple story: where forest is gained and lost – with a few clicks you can see the impact of conservation. GFW allows users to see how deforestation happens and how it is stopped. There is a need for global engagement – opening it to a whole crowd of people. Forests don't have a connection to the web, so Forest Watcher takes the data offline to the field: walk to the area, investigate recent forest loss, and report new areas – it has 4,000-5,000 users. They aim to integrate citizens into the design process. Forest Watcher is being used in important areas of the world, not where the most connected people are. They analyse where people use the app – when there are forest fires in Spain, people update GFW and explore – and use the analytics to find the places where they want more people to look. This is integrated with interviews and usability testing, working with experts who have been active for a long time, including the Jane Goodall Institute, Amazon Conservation Team, CAGDF, and BirdLife. As people use the application, they build ownership and provide better feedback and richer information. Among the things they learned: using personas to think about monitors; the need to handle the many things that try to sync after 14 days offline – the internet is slow, so they changed the app and the back end to make it faster. They use this to understand frustrations and to find ways to create "wow" moments. A face, name, and story improve the quality of the thinking about users and the understanding of their frustrations.

Lessons learned from Missing Maps – Jorieke Vyncke. Her personal background is an interest in work that links to humanitarian purposes, and since 2017 she has been the Missing Maps coordinator. She looks at the humanitarian organisation focus – more than 34,000 staff in MSF across about 470 locations around the world. In many parts of the world the maps are empty and there is no geographical data. They discovered OpenStreetMap and work with the American and British Red Cross, HOT, and over 40 partners, following Ostrom's principles for working with groups, and comparing rural and urban areas. On Idjwi in DRC, in the east of Congo, they worked amid a multitude of problems: violence, refugees, and more. Due to a measles outbreak, they needed population and mapping data. This involved 250 remote volunteers who mapped 28,000 buildings in about a week, which helped in creating a population estimate – critical for logistical planning. They managed to identify 94% of the population. Another example comes from the Hazaribagh informal settlement in Bangladesh: the area was mapped with both local and remote mapping – including factories and tanneries – locating the workers they wanted to reach, combining students from the university with workers reached through the union. The mapping was done by the technically skilled local students, using smartphones and the Field Papers process. Paper is still effective, and the data was then edited in pairs – the end result supported an occupational health survey. The process motivated the community, and they continue to use it. In different areas they use remote mapping, but the most important thing is to create a local mapping community, balancing empowerment against the importance of remote mapping for saving lives.
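The population estimate from remotely mapped buildings follows a simple multiplication; here is a minimal sketch of that reasoning (the persons-per-building multiplier is an assumed illustrative value, not a figure from the talk):

```python
# Minimal sketch of estimating population from remotely mapped buildings,
# as in the Idjwi example. The persons-per-building multiplier is a
# hypothetical illustrative value, not a figure from the talk.
buildings_mapped = 28_000      # mapped by ~250 remote volunteers in a week
persons_per_building = 5.0     # assumed average household size (illustrative)

estimated_population = buildings_mapped * persons_per_building
print(f"Estimated population: ~{int(estimated_population):,}")  # ~140,000

# In practice such estimates are cross-checked against field surveys;
# the talk reported that ~94% of the population was identified this way.
```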

Keynote: Approximated Reality: the use of digital tools by traditional communities in the Amazon – Vasco van Roosmalen has been working in Ecam – Equipe Conservacao Amazonia – in Brazil since 1999. The big challenge is how to reconcile different visions of what the world is. In the Xingu area in Brazil, there was a need to create an ethno-map of the region. The community discusses what they want to map and how they want to represent it, but the map also needs to be cartographically accurate, as this is how you communicate with external bodies. The whole map is created for the community: to use resources, to remember the dead, and to defend their land (using patterns of body paint). The result is visible in the protected areas of the Xingu. Another area that he was involved in mapping is near Surinam – in an area the size of Holland with 2,000 people, the community recorded information about their region. This helped in justifying the resources and the protection of the area. The area is very rough to access, yet the local survey by the community managed to cover it in 6 maps. The community collected much more data than the maps can show – over the following years, they mapped millions of hectares with different groups and developed a process for creating the maps. The collaboration with Google Earth Outreach led to the interaction with Chief Almir of the Surui. The link with Rebecca Moore and her commitment helped in filling in missing areas and attaching video and audio to the map. They then wanted to record illegal logging using mapping tools, and this was done with OpenDataKit (ODK) – the data collection challenges are accuracy, ease of use, speed, etc. In 2008 they started to understand REDD and developed the Surui Carbon Project, which needs a tremendous amount of data from the air and from the ground. Information such as the circumference of trees was collected with ODK. They used Garmin devices, but these weren’t scratch resistant; now they use Samsung smartphones, which are cheap and can be replaced easily. GPS is challenging in the rainforest, so they use barcodes on the trees. They used ODK Build but discovered that it is not an easy interface: they rely on a programmer on the staff, and that is a limitation in terms of allowing forms to be built easily. The project managed to demonstrate that indigenous people can collect data, but the REDD credits were more challenging, and they got them in 2013. Cultural maps were created in other indigenous lands in Brazil. The importance is not just in demarcating the land but in collecting data that helps to manage the area. Today there are many challenges – indigenous lands make up 13% of the Brazilian territory. In the Brazilian Amazon there are many communities – 25 million people, of which only 350,000 are indigenous – for example, Quilombola groups and many other groups. There was no information on these other groups, and some of them are disadvantaged – e.g. the Quilombola required the mapping of 7,000 communities. They are descendants of West African slaves – they were persecuted, faced a lot of violence, and when slavery was abolished they were forgotten; since the 1980s they have been recognised in the constitution, but not sufficiently recognised officially. His team was involved in creating a new map of the 7,000 communities, which only a team of 40 at the government level in Brasilia is looking after. They used approaches similar to the indigenous mapping in order to record information and manage the land.
They had people who became experts in mapping and then demonstrated to others how to map the land using Google Earth and how to collect data. The communities also collect socio-economic data – using ODK – to understand their community and develop a life plan for the area (a plan for the next 10-30 years). The question is not only who is listening to the information, but by whom it is told. A social network analysis of Facebook (which 83% of internet users in Brazil use), looking at interactions, shows that local associations are not linked to environmental or human rights organisations; there are missing links to health, and a specific campaign on the Belo Monte power plant is not linked to the community. Communities care about health, education, and income – the environment comes only fifth – so there is a need to talk about what matters to communities: how to put their concerns at the centre of the discussion and move beyond confining them to the corner of the environment. We need to engage with people and their communities in a way that makes sense to them.
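Given the talk’s description of barcoded trees and circumference measurements collected with ODK, a simple plausibility check on incoming records is easy to imagine. The sketch below is illustrative only – the field names, barcode format, and thresholds are all assumptions, not Ecam’s actual form design:

```python
# Sketch of validating an ODK-style tree-monitoring record, where each tree
# carries a barcode ID because GPS is unreliable under the canopy.
# Field names and the plausible-circumference range are hypothetical.
import re

BARCODE_PATTERN = re.compile(r"^TREE-\d{6}$")  # assumed label format

def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in a single submission."""
    problems = []
    if not BARCODE_PATTERN.match(record.get("tree_barcode", "")):
        problems.append("missing or malformed tree barcode")
    circumference = record.get("circumference_cm")
    if circumference is None or not (5 <= circumference <= 1500):
        problems.append("circumference outside plausible range")
    return problems

print(validate_record({"tree_barcode": "TREE-004217", "circumference_cm": 231}))  # []
```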

GSF-NESTI Open Science & Scientific Excellence workshop – researcher, participants, and institutional aspects

The Global Science Forum – National Experts on Science and Technology Indicators (GSF-NESTI) Workshop on “Reconciling Scientific Excellence and Open Science” (for which you can see the full report here) asked the question “What do we want out of science and how can we incentivise and monitor these outputs?”. In particular, the objective of the workshop was “to explore what we want out of public investment in science in the new era of Open Science and what might be done from a policy perspective to incentivise the production of desired outputs.” with an aim to explore the overarching questions of:
1. What are the desirable (shorter-term) outputs and (longer-term) impacts that we expect from Open Science and what are potential downsides?
2. How can scientists and institutions be incentivised to produce these desirable outcomes and manage the downsides?
3. What are the implications for science monitoring and assessment mechanisms?

The session that I was asked to contribute to focused on Societal Engagement: “The third pillar of Open Science is societal engagement. Ensuring open access to scientific information and data, as considered in the previous sessions, is one way of enabling societal engagement in science. Greater access to the outputs of public research for firms is expected to promote innovation. However, engaging with civil society more broadly to co-design and co-produce research, which is seen as essential to addressing many societal challenges, will almost certainly require more pro-active approaches.
Incentivising and measuring science’s engagement with society is a complex area that ranges across the different stages of the scientific process, from co-design of science agendas and citizen science through to education and outreach. There are many different ways in which scientists and scientific institutions engage with different societal actors to informing decision-making and policy development at multiple scales. Assessing the impact of such engagement is difficult and is highly context and time-dependent“.

For this session, the key questions were:

  • “What do we desire in terms of short and long-term outputs and impacts from societal engagement?
  • How can various aspect of scientific engagement be incentivised and monitored?
  • What are the necessary skills and competencies for ‘citizen scientists’ and how can they be developed and rewarded?
  • How does open science contribute to accountability and trust?
  • Can altmetrics help in assessing societal engagement?”

In my talk, I decided to address the first three questions by reflecting on my personal experience (the story of a researcher trying to balance the concepts of “excellence” and “societal engagement”), then considering the experience of participants in citizen science projects, and finally the institutional perspective.


I started my presentation [Slide 3] with my early experiences in public engagement with environmental information (and participants’ interest in creating environmental information) during my PhD research, 20 years ago. This was a piece of research that set me on the path of societal engagement and open science – for example, the data that we were showing was not accessible to the general public at the time, and I was investigating how the processes that follow the Aarhus convention and the use of digital mapping information in GIS can increase public engagement in decision making. This research received a small amount of funding from UCL, and later from ESRC, but nothing significant.

I then secured an academic position in 2001, and it took until 2006 [Slide 4] to develop new systems – for example, this London Green Map was developed shortly after the Google Maps API became available, and while it is one of the first participatory GIS applications on top of this novel API, it was inherently unfunded (and was done as an MSc project). Most of my funded work at this early stage of my career had no link to participatory mapping and citizen science. This was also true for the research into OpenStreetMap [Slide 5], which started around 2005 and, apart from a small grant from the Royal Geographical Society, was not part of the main funding that I secured during the period.

The first significant funding specifically for my work came in 2007-8, about 6 years into my academic career [Slide 6]. Importantly, it came because the people who organised a bid for the Higher Education Innovation Fund (HEIF) realised that they were weak in the area of community engagement, and the work that I was doing in participatory mapping fitted into their plans. This became a pattern, where people approach me with a “community engagement problem” – a signal that awareness of societal engagement was starting to grow, although in terms of budget and place in the projects, it remained at the edge of the planning process. By 2009, the investment led to the development of a community mapping system [Slide 7] and the creation of Mapping for Change, a social enterprise that is dedicated to this area.

Fast forward to today [Slides 8-10], and I’m involved in creating software for participatory mapping with non-literate participants, which supports the concept of extreme citizen science. In terms of “scientific excellence”, this development towards a mapping system that anyone, regardless of literacy, can use [Slide 11] is funded as “challenging engineering” by EPSRC and as “frontier research” by the ERC, showing that it is possible to completely integrate scientific excellence and societal engagement – answering the “reconciling” issue in the workshop. A prototype is being used with ZSL to monitor illegal poaching in Cameroon [Slide 12], demonstrating the potential impact of such research.

It is important to demonstrate the challenges of developing societal impact by looking at the development of Mapping for Change [Slide 13]. Because it was one of the first knowledge-based social enterprises that UCL established, setting it up was not simple – despite sympathy from senior management, it didn’t easily fit within the spin-off mechanisms of the university. However, by engaging in efforts to secure further funding – for example, through a cross-university social enterprise initiative – it was possible to support the cultural transformation at UCL.

There are also issues with the reporting of the impact of societal engagement [Slide 14]; Mapping for Change was reported among the REF 2014 impact case studies. From the university’s perspective, using these cases is attractive; however, if you recall that this research is mostly done with limited funding and resources, the reporting is an additional burden that does not come with appropriate resources. This lack of resources is demonstrated by Horizon 2020, which, with all the declarations on the importance of citizen science and societal engagement, dedicated only 0.60% of its budget to Science with and for Society [Slide 15].

Participant experience

We now move to look at the experience of participants in citizen science projects, pointing out that we need to be careful about indicators and measurements.

We start by pointing to the wide range of activities that are included in public engagement in science [Slides 17-18] and the need to provide people with the ability to move into deeper or lighter engagement at different life stages and with different interests. We also see that as engagement deepens, the number of people who participate drops (this is part of participation inequality).

For specific participants, we need to remember that citizen science projects are trying to achieve multiple goals – from increasing awareness, to having fun, to getting good scientific data [Slide 19] – and this complicates what we are assessing in each project and the ability to have generic indicators that hold true for all projects. There are also multiple kinds of learning that participants can gain from citizen science [Slide 20], including personal development, as well as attraction and rejection factors that influence engagement and enquiry [Slide 21]. This can also be demonstrated through a personal journey – in this example, Alice Sheppard’s journey from someone with an interest in science to a citizen science researcher [Slide 22].

However, we should not look only at the individual participant, but also at the communal level. An example of that is provided by the noise monitoring app in the EveryAware project [Slide 23] (importantly, EveryAware was part of Future and Emerging Technologies – part of the top excellence programme of EU funding). The application was used by communities around Heathrow to signal their experience and to influence future developments [Slide 24]. Another example of communal-level impact is in Putney, where the work with Mapping for Change led to a change in the type of buses in the area [Slide 25].

In summary [Slide 26], we need to pay attention to the multiplicity of goals, objectives, and outcomes of citizen science activities. We also need to be realistic – not everyone will become an expert, and we shouldn’t expect mass transformation. At the same time, we shouldn’t assume it won’t happen and give up. It won’t happen without funding (including for participants and people who are dedicating significant time).

Institutional aspects

The linkage of citizen science to other aspects of open science comes through participants’ right to see the outcomes of work that they have volunteered to contribute to [Slide 28]. Participants are often highly educated and can also access open data and analyse it. They are motivated by contributing to science, so a commitment to open access publication is necessary. This and other aspects of open science and citizen science are covered in the DITOs policy brief [Slide 29]. A very important recommendation from the brief is the recognition that “Targeted actions are required. Existing systems (funding, rewards, impact assessment and evaluation) need to be assessed and adapted to become fit for Citizen Science and Open Science.”

We should also pay attention to recommendations such as those in the League of European Research Universities (LERU) report from 2016 [Slide 30]. In particular, there are recommendations to universities (such as setting up a single contact point) and to funders (such as setting criteria to evaluate citizen science properly). There are various mechanisms that allow universities to provide an entry point to communities that need support. One such mechanism is the “science shop”, which provides a place where people can approach the university with an issue that concerns them and identify researchers who can work with them. Science shops require coordination and funding for the students who do their internships with community groups. Science shops and centres for citizen science are a critical part of opening up universities and making them more accessible [Slide 31].

Universities can also contribute to open science, open access, and citizen science through learning – for example, with the MOOC that we run at UCL, designed to train researchers in the area of citizen science and crowdsourcing [Slide 32].

In summary, we can see that citizen science is an area that is expanding rapidly. It has multifaceted aspects for researchers, participants, and institutions, and care should be taken when considering how to evaluate them and how to provide indicators about them – mixed methods are needed to evaluate and monitor them.

There are significant challenges of recognition: as valid, excellent research; as something deserving sustainable institutional support; and in the most critical indicator – funding. The current models, in which these activities are hardly funded (<1% in NERC, for example), show that funders still have a journey to make between what they are stating and what they are doing.


Reflection on the discussion: from attending the workshop and hearing about open access, open data, and citizen science, I left realising that “societal engagement” is a very challenging aspect of the open science agenda – and citizen science practitioners should be aware of that. My impression is that with open access, as long as the payment is covered (by the funder or the institution), and as long as the outlet is perceived as high quality, scientists will be happy to publish openly. The same can be said about open data – as long as funders are willing to cover the costs and provide mechanisms and support for skills (for example, through libraries), we can potentially have progress there, too (although over-protection of data by individual scientists and groups is an issue).

However, citizen science opens up challenges and fears about expertise, and perceptions that it risks current practices, societal status, etc. – especially when considering the very hierarchical nature of scientific work: at the very local level through the different academic job rankings, and within disciplines, with specific big names setting the agenda in each field. These cultural aspects are more challenging.

In addition, there seems to be a misunderstanding of what citizen science is, mixing it up with more traditional public engagement, plus some views that it can do fine by being integrated into existing research programmes. I would not expect to see major change without a clear signal, through significant funding over a period of time, that indicates to scientists that the only way to unlock such funding is through societal engagement. This is not exactly “moonshot”-type funding – rather: pursue any science that you want, but open it up. This might lead to the necessary cultural change.

OECD Open Science and Scientific Excellence Workshop – Paris

The OECD organised and hosted a Global Science Forum (GSF) and National Experts on Science and Technology Indicators (NESTI) Workshop on “Reconciling Scientific Excellence and Open Science: What do we want out of science and how can we incentivise and monitor these outputs?” (9 April 2018, OECD). In agreement with the OECD Secretariat, the information here is not attributed to anyone specific (here is the blog post about my own presentation).

The workshop opened with the point that speaking about reconciling open science and scientific excellence seems contradictory. Scientific excellence was based on the value of publications, but the digital transformation and the web have changed things – from elite access to a library where outputs are held to availability for everyone over the web, with citizens accessing data directly. We also need to look at the future – opening up even more, which is something positive, but there are challenges in measuring it, and in the impact of different bibliometrics and other indicators.

This opening up is happening quickly, and we need to understand the transformation and then think about the statistical aspects of this information. There is an effort to develop a roadmap for the integration of open science across science policy initiatives.

The area is fairly complex: excellence, how science is changing, and how to incentivise and measure science are all tightly related to each other. Some of the fundamental questions: what do we want from science – only excellence, or other things? How can we incentivise the academic community to move in the direction of open science, and what does the science policy community need to do about it? The national statistical communities and the Global Science Forum are two important groups that can influence this, in terms both of policy and of measuring the impacts and processes.

The meeting is looking at open science, publishing, open data, and engagement with society, as well as indicators and measurement.

The slides from all the talks are available here. 

Session 1. Scientific excellence through open science or vice versa? What is excellence and how can it be operationalised in the evidence and policy debate?

Paula Stephan (Georgia State University, USA) addressed the challenges of science – the lack of risk-taking in research and the lack of career opportunities for Early Career Scientists – the factors behind them, especially short-term bibliometrics, and how open science can help in dealing with these issues.

The original rationale for government support of science is the high risk that is associated with basic research. Competitive selection procedures reduce risk and lead to safer options for securing funding (including at NIH or the ERC). James Rothman, who won the Nobel Prize in Physiology, pointed out that in the 1970s a much higher level of risk was tolerated, which allowed him to explore things for 5 years before he started being productive. Concerns about these aspects appeared in the AAAS ARISE report in 2008, while NASA and DARPA became much more risk-averse.

In addition, there is a lack of career opportunities for ECRs – the number of PhDs is growing, but the number of research positions is declining, both in industry and academia. Positions are scarce, and working in universities is an alternative career. Because scarce jobs and research applications are assessed on short citation windows, a high-impact journal paper is critical for career development – postdocs are desperate to get a Nature or Science paper. An assessment of novel papers (papers that use combinations of references never made together before) showed that only 11% of papers are novel, and that highly novel papers are associated with risk: a disproportionate concentration at the top and bottom of the citation distribution, and citations outside the field. The more novel the paper, the less likely it is to appear in a high-ranking journal. Bibliometrics thus discourage researchers from taking risks with novel papers.

Open science gives an opportunity here – citizen science offers new ways of addressing some of these issues, e.g. crowdfunding can accommodate risky research. In addition, open access publication can support these novel-paper strategies.

Richard Gold (McGill University, Montreal, Canada) looked at why institutions choose open science: the costs of research are increasing exponentially, but even that is not enough and there are requests for more funding; productivity is declining, measured by the number of papers per investment; and firms are narrowing their research focus.

We can, therefore, consider open science partnerships – OA publications, open data, and no patents on co-created outputs – as a potential way to address these challenges. These can be centred on academic and not-for-profit research centres, and generally on basic understanding of scientific issues, with data at the centre. Institutions see this as a partial solution – decreasing duplication, since there is no need to replicate; providing quality through many eyes; and providing synergies because there is a more diverse set of partners. It can increase productivity because data can be used in different fields, drawing on wider networks of ideas and the ability to search through a pool of ideas. Across fields we can see more researchers but fewer outputs. In patent applications, we also see that the 1950s were the recent peak in novelty in terms of linking unrelated fields, and novelty has been dropping since.

An alternative to this is a system like the Structural Genomics Consortium – attracting philanthropic and industrial funding. There is also a citizen science aspect – the ability to shape the research agenda in addition to providing the data. Secondly, the data can be used with the relevant communities – patients and indigenous groups are more willing to be involved. Open science better engages and empowers patients in the process – it is easier to get consent.

Discussion: during the selection of projects, bibliometric indicators need to be removed from the application and from funding decisions. People are needed to read the research ideas, and there is a need to move away from funding only a single person as the first author – to incentivise teams and support them. We also need to think about how to deal with the impact of research beyond the original work (someone might use a dataset that was produced through open science for a publication, without being the person who did the work).

There is a sense that the lack of risk-taking is an issue, but there is a need for measuring it and showing whether it is happening. Many scientists are censoring their own work, and there is a need to document this happening. The global redistribution of people concerns which areas people concentrate on – e.g. between physics and agriculture.

Session 2 – Open access publication and dissemination of scientific information

Rebecca Lawrence (Faculty of 1000) described how F1000 is aiming to develop a different model of publication – separating publication from evaluation. Publication exists because of funders, and researchers evaluate others by where they publish. There are all sorts of manipulations: overselling, p-value fishing, creative outliers, plagiarism, non-publication by journals that don’t want low-impact papers, and more. There is a growing call for a move towards open access publication – e.g. the Open Science Policy Platform, the European Open Science Cloud, principles such as DORA and FAIR (Findable, Accessible, Interoperable, Reusable), and an increase in pre-print sources. There is also a new range of ways in which science is being organised – how to make it sustainable in areas that aren’t receiving much funding – the use of pre-print services, and exploring the funding of peer review. F1000 is about the speed of sharing findings. The model was developed with Wellcome and the Gates Foundation, creating a platform that is controlled by funders, institutions, and researchers. In this model, publishers are service providers. F1000 supports a wide range of outputs: research articles, data, software, methods, case studies. They check the paper technically: that the data behind it is accessible and that it was not published before. Publication is followed by completely open peer review – you can see who is reviewing and what was done by the author. Within the article, you can see the stage in the research – even before peer review – making the paper a living document. There are usually 14 days between submission and publication, and usually a month including review. The peer review here is transparent and the reviewers are cited, which is good for ECRs gaining experience.

The indicators need to take into account career levels and culture (technical and reflective), not only fields, and to consider different structures – individual, group, institution. We need open metrics, badges that tell you what you are looking at, and also qualitative measures – traditional publications can curate articles.

Vincent Tunru (Flockademic, Netherlands) explored the issue of incentivising open science and making science more inclusive – enabling more people to contribute to the scientific process. Open access can become the goal instead of the means of becoming more inclusive. If the information is free, people can read the results of publicly funded research, but there is a barrier to publishing research within the OA model – publication costs should be much lower: other areas (music, news) have seen costs go down because of the internet. In some disciplines there is a culture of sharing pre-prints and getting feedback before submission to journals – places like arXiv are doing this work. The primary value of submission to a journal is the credentialing; high-level journals can create scarcity to justify the demand. Nature Scientific Reports is taking over from PLOS ONE because of that. We need to decouple credentialing from specific journals. Different measures of excellence are possible, but we need to consider how we do it today – with reviewers and editors being the ones who decide what excellence means. The focus needs to be on inclusivity and affordability. [See Vincent’s blog post here]

Kim Holmberg (University of Turku, Finland) focused on altmetrics. Robert Merton pointed out already in the 1950s that the referencing system is not only about finding work that wasn’t known before but also about recognition of other researchers. This leads to how the journal impact factor and the h-index became part of research assessment. These have been used more and more in research evaluation, especially in the past 15 years, although earlier research has pointed out many flaws in them. In addition, they fail to take into account the complexity of scientific activities, nor do they tell you anything about the societal impact of research. One way to look at this complexity is the Open Science Career Assessment Matrix (OS-CAM).

We can think about the traces that people leave online as they go through the research process – discussing research ideas, collecting data, analysing, disseminating results. These traces can become altmetrics – another view of research activities. It is not just social media: the aim is to expand the view of what impact is about. With altmetrics we can analyse the networks that a researcher is involved in, which can give insights into new ways of interaction between researchers and society. Citations show that a paper has been used by another researcher, while altmetrics can indicate how it has been disseminated and discussed among a wider audience. But there are still lots of questions about the meaning and applicability of altmetrics.

There are reports from the Mutual Learning Exercise (europa.eu/!bj48Xg) looking at altmetrics, incentives, and rewards for open science activities. For instance, in the area of career and research evaluation, researchers need specific training and education about open science; in the area of evolving authorship, ways of identifying and rewarding peer review and the publishing of negative results need to be developed. The implementation of open science needs to guarantee long-term sustainability and reward role models who can demonstrate this new approach to science. The roadmap from the MLE suggests a process for this implementation.

Discussion: there is the issue of finding a good researcher within a group of researchers, and publications are a way to see the ideas, but the link to open science and how it can help with that is unclear. Finding a good researcher does not happen through all these metrics – it’s a human problem, not only a metric one. Will originality be captured by these systems? Publication is only a small part of research activity – in every domain there is a need to change and reduce publication, not to assume that someone will read the same paper again and again (after each revision). Attention is the scarce resource that needs to be managed and organised – we cannot assume that more readers will somehow find a way to filter the information.

The response to this pointed out that because research funding is public, we should encourage publishing as much as possible so others can find the information, but we need good tools for searching and evaluating research so that it can be found.

Another point of confusion – the link between open access publication and open science. Open access can exist within the publish-or-perish structure, so what is it in OA that offers an alternative to the closed publishing structure? How can it lead us to different insights into researchers’ activities? In response, it was pointed out that it is important to understand the difference between Open Access and Open Science (OA = openly available research publications; OS = all activities and efforts that open up the whole research process, including the publishing of research results).

There is growing pressure for people to become media savvy and that means taking time from research.

Altmetrics: originally thought of as a tool that can help researchers find interesting and relevant research, not necessarily for evaluation (http://altmetrics.org/manifesto/).

Session 3. Open research data: good data management and data access

Simon Hodson (CODATA) – Open Science and FAIR data. On the reconciling element: the case for open science is the light that it shines on the data, making it useful; it allows reuse, reproducibility, and replicability – these very much go together. CODATA is part of the International Council for Science, focusing on capacity building, policy, and coordination. The case for open science: good scientific practice depends on communicating the evidence. In the past, a table or a graph summarising some data was an easy way of sharing information, but as data and analysis grew, we needed to change the practice of sharing results. The publication of “Science as an Open Enterprise” (2012) pointed out that the failure to report the data underlying the science is seen as malpractice. Secondly, open data practices transform certain areas of research – genomics, remote sensing in earth systems science. Can we replicate this in other research areas? Finally, can we foster innovation and the reuse of data and findings within and outside the academic system – making them available to the public at large?

Open science has multiple elements – it is not only open access and open data. We need data to be interoperable and reusable, available for machine learning, and open to discussion. There are perceptions about the reproducibility of research, but also changes in attitudes. We need to think about culture – how scientific communities have established their practices. Different research areas take very different approaches – e.g. biomedical research is open, but in social science there is little experience of data sharing and reuse, and researchers don’t see the benefits. There is a need for a sociology-of-science analysis of these changes. Some of the major changes – the meetings about genome research in Bermuda and the Fort Lauderdale agreement – came about because of certain pressures. There is significant investment in creating data that is used more than once – e.g. Hubble. Why is data from small experiments not open to reuse? We need to find ways of making this happen.

The FAIR principles allow data to be reusable. FAIR came from OECD work, the Royal Society report of 2012, and the G8 statement. What we need to address: skills, the limits of sharing, and clearer guidelines for openness. We need standards and skills, and we need to reward data stewardship. We need to see citation of data. There is a need for new incentives – cultural change has happened when prominent people in the field set up an agreement.

Fiona Murphy (Fiona Murphy Mitchell Consulting, UK) works in the area of data publishing and provided the perspective of someone who is exploring how to practise open science. There are cultural issues: why share, with whom, with what rewards, and what are the risks? Technical issues: how it is done, and what the workflows, tools, capacity, and time investment are. There are also issues of roles and responsibilities, and whose problem it is to organise the data.

Examples of projects: SHARC, an interest group within the Research Data Alliance – international and multi-stakeholder – aims to grow the capacity to share data; the group is working on a White Paper of recommendations. The main issues are standards for metrics: they need to be transparent, to address reputation, and to consider impact on a wider area. Also, what will be the costs of non-sharing? There are different standards in terms of policies; persistent identifiers and the ability to reproduce are also needed. Equality of access to services is needed – how to manage peer-to-peer sharing and how it is integrated into promotion and rewards. The way to explore this is by carrying out pilot projects to understand side effects. There is also a need to develop ethical standards.

The Belmont Forum Data Publishing Policy is looking at making data accessibility part of a digital publication, developing consistency of message so researchers will know what they are facing. There are lots of issues – some standard wording is emerging, as are ways of capturing multiple datasets, clarifying licensing, etc.

We can also think about what scholarly practice would look like if it were started today with all the current technology in place – scholarlycommons.org suggests principles for how “born digital” scientific practice should evolve. In thinking about the commons, they have created decision trees to help with the process. Working as an open scientist is a challenge today – for example, the need to develop decision-tree software and other tools is proving challenging for anyone trying to act as a completely open scientist. It’s a busy space, and there is a gulf between high-level policy and principles and their delivery.

Jeff Spies (Centre for Open Science, Virginia) [via video-link] covered open research data, urgent problems, and incremental solutions, looking at the strategies that are the most impactful (a personal view, distinct from that of the Centre for Open Science). We need to broaden the definition of data – we need context: more than just the data itself or the metadata, this is critical for assessment and for metascience work. We can think of a knowledge graph – more than the semantic information of the published text: the relationships of people, places, data, methods, software. But the situation with incentives is that, from a psychological perspective, the reward for specific publications is so strong that it focuses attention on what is publishable. Retraction rates go up as impact factor goes up. There is urgency, and there is lock-in: publishers are trying to capture the life-cycle of research. The problem is that culture change is very slow, and we need to protect the data – funders and policymakers can make a difference. Researchers don’t have the capacity to curate data, but libraries are the people with the resources and focus for that. One potential lever: asking researchers to link to promotion policies, which will force universities to share them – and if the policies mention data sharing, this is a way to force universities to change.

Discussion: there is concern about the ability of researchers to deal with data – there is a problem of basic data literacy.

The problem with making data FAIR is that it costs about 10% of project costs, and there are questions about where it is useful and where it is not enough or too much – just organising the data with librarians is not enough, as data requires a lot of domain knowledge. There are significant costs; however, in the same way that the total costs of science include the effort of peer review and of getting to publication (either subscription or publication fees), we should also pay for data curation. There is a need for appraisal and for decisions about how data and processes will be handled.

We need to think about the future use of data – as with natural history specimens, we can never know what will be needed. Questions about the meaning of data are very important – it’s not only specimens but also photographs, and not necessarily digital material.

Libraries can adapt and can earn respect – they are experts in curation and archiving.

Session 4. Societal engagement 

Kazuhiro Hayashi (NISTEP, Tokyo, Japan) spoke on open science as social engagement in Japan. His background is in science and technology; he has been involved in open access journals, is keen on altmetrics, and is now involved in open science policy. He generally plays a multi-level role – top-down and bottom-up – from working in the G7 science expert group on open science to creating software and journals. He is involved in citizen science through the NISTEP journal and lectures, and in altmetrics, multi-stakeholder workshops, and Future Earth. He presented several case studies:

Citizen science – the funding system for science in Japan comes mainly from the state, and institutions have a difficult time doing public engagement – hence spontaneous researchers, or “wild researchers”. He suggests a more symmetrical system – also creating independent researchers who get their budget from business and publish in online journals. Wild researchers are based on crowdfunding and rely on the engagement of citizens. From his experience, he recognises a new relationship between citizens and scientists: new research styles, new career paths, and new funding. Negative aspects of citizen science include populism in crowdfunding – projects need to be popular, and some research is not suitable for the crowd. A new scheme for ECRs is also needed and should be included. In addition, there is a potential for misuse and plagiarism because of a lack of data and science literacy.

Altmetrics – he contributed to the NISO Altmetrics Initiative working group. Altmetrics are difficult to define, and current altmetrics scores in Japanese literature are closely related to Maslow’s hierarchy of needs. There are plenty of institutional repositories, and access to journal articles on repositories is more “social” – readers are non-researchers, unlike those who go to journal websites. There is a need to look at social impact – at mentions and network analysis – but it is difficult to analyse. There is also a need to look at the flow of data across the web.

Multi-stakeholder workshop – considering the future of open science and society, with environmental sciences and informatics. The outcome is to think about erasing the influence of different socio-economic statuses on participants, co-developing data infrastructure, and acting for social transformation. Capacity building is important, and we need to see how open science and transdisciplinary work co-evolve. Societal engagement is very time-consuming and needs to be funded, and it needs to be open to creative activities by citizens and scientists. We should think about new relationships between science and society, and use tentative indicators to transform society and culture – creating a future of open science and society, moving from “publish or perish” to “share or perish”. Japan will have 2 citizen science sessions at the Japan Open Science Summit on June 18-19, 2018.

Muki Haklay (UCL, London, UK) [see my separate blog post]

Cecilia Cabello Valdes (Foundation for Science and Technology, Madrid, Spain) spoke on societal engagement in open science. The foundation aims to promote the link between science and society – originally with the goal of increasing the interest of Spanish citizens in science. They manage calls and fund different activities (about 3,250K Eur), with more than 200 projects. They run activities such as FameLab – events to promote science and technology in an open way. The science news agency SiNC addresses the lack of awareness of scientific research – its papers are taken up by the general media, with over 1,000 journalists using the information. They carry out summer science camps: 1,920 funded students in 16 universities. They also manage the national museum of science and technology (Muncyt), where they share the history of science and technology in Spain – a unique type of science museum.

In citizen science, they have done a lot of work on public awareness of science and technology, and on keeping public support for science investment. More recently they created a council of foundations for science – there had been little awareness among social foundations, which invest in cultural activities but not in science. Three foundations are involved with the council, and they have direct contact with the minister to develop this area of funding. The second initiative is crowdfunding for science – they help to carry out campaigns that support activities; it is also a tool of engagement.

Outreach is difficult – the council supports policymakers, and the general public is aware of the issues. So there are challenges – how do we transform this, and how do we measure it? One of the roles of the council is to incentivise policymakers to understand what they want to achieve and then to have indicators that help to see whether the goals are achieved. They participated in the process of policy recommendations about open science, and then translated them into action – for policymakers and society. In Fecyt they also provide resources: access to WoS/Scopus, evaluation of journals, standardised CVs of researchers, and open science support. Finally, they participate in studies that look at the measurement of science and its results.

Discussion: Science shops – are there examples that link to makerspaces? Yes, there are examples of activities such as Public Lab, but also the Living Knowledge network.

Much societal engagement is not open science – it treats society as a separate entity: there is a struggle in making citizen science into open science, as data often remains closed. What are the aspects that lend themselves to open science and citizen science? There are many definitions and different ways to define the two but, for example, the need to access publications, participation in the analysis of open data, or the production of open data are all examples of the overlap.

Part of the discussion was about sharing knowledge, and the view that a researcher is like anyone else – or is there a big difference between the scientific community and everyone else? The effort is not recognised in society, and if you remove the prestige, might no one want to participate in science?

On public interest – why do citizens want to participate in research? Citizens want the results of public research to help people improve their quality of life; science should address social problems.

On levels of participation – Precipita is a new crowdfunding project; funds are not matched, but the foundation provides technical help, and promotion is done through campaigns across different institutions.

Should citizen science democratise science? This is controversial – but when information becomes more accessible, as with Gutenberg, ability increases. We need to make citizen science a way to increase access to science.

Rather than keeping science in separate pockets, there is a need to find a way to integrate these things together. There is a package that needs to be supported as a whole – access, data, and public engagement – and we need to focus on them together.

Citizen science needs to be integrated into all of science, and it needs to deliver results.

Session 5. Scientific Excellence re-visited

David Carr (Wellcome Trust, London, UK) – Wellcome is committed to making its research outputs available, seeing this as part of good research practice. As a funder, it has had a long-standing policy on open access publications (since 2005) and other research outputs. The costs of carrying out public engagement and of open access publication should be part of the funding framework, and reviewers are asked to recognise and value a wide range of research outputs. There is still a need to think about reward and assessment structures, about sustaining the infrastructures that are needed, and about creating data specialists and managing the process of increasing their numbers. There are concerns in the research community about open access. Wellcome established an open research team, looking at funder-led and community-led activities, as well as policy leadership. They now have the WellcomeOpenResearch.org publishing platform, which uses the F1000 platform, and they also ran the Open Science Prize. On policy leadership, they support, e.g., the San Francisco Declaration on Research Assessment (DORA). They are looking at changes to application forms to encourage other forms of outputs, and at providing guidance to staff, reviewers, and panel members. They celebrate with applicants when they do open research, and inform them about the criteria and options. They also carry out efforts to evaluate whether open science indeed delivers on its promises, through projects in different places – e.g. the McGill project.

Citizen Science & Scientific Crowdsourcing – week 5 – Data quality

This week, in the “Introduction to Citizen Science & Scientific Crowdsourcing“, our focus was on data management, to complete the first part of the course (the second part starts in a week’s time since we have a mid-term “Reading Week” at UCL).

The part that I enjoyed developing most was the segment that addresses the data quality concerns that are frequently raised about citizen science and geographic crowdsourcing. Here are the slides from this segment, and below them the rationale for the content, with detailed notes.

I’ve written a lot on this blog about data quality, and in many talks that I gave about citizen science and crowdsourced geographic information, the question about data quality is the first one to come up. It is a valid question, and it has led to useful research – for example on OpenStreetMap – and I recall the early conversations, 10 years ago, during a journey to the Association for Geographic Information (AGI) conference, about the quality and the longevity potential of OSM.

However, when you are being asked the same question again, and again, and again, at some point, you start considering “why am I being asked this question?”. Especially when you know that it’s been over 10 years since it was demonstrated that the quality is beyond “good enough”, and that there are over 50 papers on citizen science quality. So why is the problem so persistent?

Therefore, the purpose of the segment was to explain the concerns about citizen science data quality and their origin, then to explain a core misunderstanding (that the same quality assessment methods that are used in “scarcity” conditions work in “abundance” conditions), and then cover the main approaches to ensure quality (based on my article for the international encyclopedia of geography). The aim is to equip the students with a suitable explanation on why you need to approach citizen science projects differently, and then to inform them of the available methods. Quite a lot for 10 minutes!

So here are the notes from the slides:

[Slide 1] When it comes to citizen science, it is very common to hear suggestions that the data is not good enough and that volunteers cannot collect data at a good quality because, unlike trained researchers, we don’t know who they are – a perception that we know little about the people involved and therefore about their abilities. There are also perceptions that, like Wikipedia, it is all very loosely coordinated and therefore there are no strict data quality procedures. However, even in the Wikipedia case, the scientific journal Nature showed over a decade ago (2005) that Wikipedia articles are of similar quality to Encyclopaedia Britannica’s, and we will see that OpenStreetMap produces data of a similar quality to professional services.
In citizen science where sensing and data collection from instruments are included, there are also concerns over the quality of the instruments and their calibration – the ability to compare the results with high-end instruments.
The opening of the Hunter et al. paper (which offers some solutions) summarises the concerns that are raised over data quality.

[Slide 2] Based on conversations with scientists and on concerns that appear in the literature, there is also a cultural aspect at play, which is expressed in many ways – with data quality being used as an outlet to express it. This can be similar to the concerns that were raised in The Cult of the Amateur (which we’ve seen in week 2 regarding the critique of crowdsourcing): protecting the position of professional scientists and avoiding the need to change practices. There are also special concerns when citizen science is connected to activism, as this seems to “politicise” science or make the data suspicious – we will see in the next lecture that the story is more complex. Finally, and more kindly, we can also notice that because scientists are used to top-down mechanisms, they find alternative ways of doing data collection and ensuring quality unfamiliar and untested.

[Slide 3] Against this background, it is not surprising to see that checking data quality in citizen science is a popular research topic. Caren Cooper has identified over 50 papers that compare citizen science data with data collected by professionals – as she points out: “To satisfy those who want some nitty gritty about how citizen science projects actually address data quality, here is my medium-length answer, a brief review of the technical aspects of designing and implementing citizen science to ensure the data are fit for intended uses. When it comes to crowd-driven citizen science, it makes sense to assess how those data are handled and used appropriately. Rather than question whether citizen science data quality is low or high, ask whether it is fit or unfit for a given purpose. For example, in studies of species distributions, data on presence-only will fit fewer purposes (like invasive species monitoring) than data on presence and absence, which are more powerful. Designing protocols so that citizen scientists report what they do not see can be challenging which is why some projects place special emphasize on the importance of “zero data.”
It is a misnomer that the quality of each individual data point can be assessed without context. Yet one of the most common way to examine citizen science data quality has been to compare volunteer data to those collected by trained technicians and scientists. Even a few years ago I’d noticed over 50 papers making these types of comparisons and the overwhelming evidence suggested that volunteer data are fine. And in those few instances when volunteer observations did not match those of professionals, that was evidence of poor project design. While these studies can be reassuring, they are not always necessary nor would they ever be sufficient.” (http://blogs.plos.org/citizensci/2016/12/21/quality-and-quantity-with-citizen-science/)

[Slide 4] One way to examine the issue of data quality is to think of the clash between two concepts and systems of thinking about how to address quality issues. We can consider the conditions of standard scientific research as ones of scarcity: limited funding, a limited number of people with the necessary skills, limited laboratory space, and expensive instruments that need to be used in a very specific way – sometimes unique instruments.
The conditions of citizen science, on the other hand, are of abundance – we have a large number of participants, with multiple skills, but the cost per participant is low; they bring their own instruments, use their own time, and are also distributed in places that we usually don’t get to (backyards, across the country – we talked about this in week 2). Conditions of abundance are different and require different thinking for quality assurance.

[Slide 5] Here are some of the differences. Under conditions of scarcity, it is worth investing in long training to ensure that the data collection is as good as possible the first time it is attempted, since time is scarce. We would also try to maximise the output from each activity that our researcher carries out, and we put procedures and standards in place to ensure “once & good” or even “once & best” optimisation. We can also require all the people in the study to use the same equipment and software, as this streamlines the process.
Under conditions of abundance, on the other hand, we need to assume that people come with a whole range of skills and that training can be variable – some people will get trained on the activity over a long time, while others join with only light training so they can start quickly. We also think of activities differently – e.g. conceiving the data collection as micro-tasks. We might also have multiple procedures and even different ways to record information to cater for different audiences. We will also need to expect a whole range of instrumentation, sometimes with limited information about the characteristics of the instruments.
Once we understand these new conditions, we can come up with appropriate data collection procedures that ensure data quality that is suitable for this context.

[Slide 6] There are multiple ways of ensuring data quality in citizen science. Let’s briefly look at each one of them. The first three methods were suggested by Mike Goodchild and Lina Li in a paper from 2012.

[Slide 7] The first method for quality assurance is crowdsourcing – the use of multiple people who carry out the same work, in effect performing peer review or replication of the analysis, which is desirable across the sciences. As Watson and Floridi argued, using the example of Zooniverse, the approaches that are used in crowdsourcing give these methods a stronger claim on accuracy and scientifically correct identification, because they compare multiple observers who work independently.
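To make the idea concrete, here is a minimal sketch (not from the lecture, and not how Zooniverse actually implements it) of combining independent volunteer classifications by majority vote; the record structure and the 70% agreement threshold are illustrative assumptions.

```python
from collections import Counter

def consensus(classifications, threshold=0.7):
    """Combine independent volunteer classifications by majority vote.

    classifications: one label per volunteer, e.g. ["spiral", "elliptical"].
    threshold: minimum share of votes the majority label needs to be accepted.
    Returns the agreed label, or None if agreement is too weak.
    """
    if not classifications:
        return None
    label, votes = Counter(classifications).most_common(1)[0]
    return label if votes / len(classifications) >= threshold else None

# Five independent volunteers classify the same galaxy image
print(consensus(["spiral", "spiral", "spiral", "elliptical", "spiral"]))  # -> spiral
```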

[Slide 8] The social form of quality assurance relies on more experienced participants checking the contributions of less experienced ones to ensure that the data is correct. This is fairly common in many areas of biodiversity observation and is integrated into iSpot, but it also exists in other areas, such as mapping, where some information gets moderated (we’ve seen that in Google Local Guides, when a place is deleted).
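A simplified sketch of how such moderation might be represented in a data management system; the status values and the rule that only experienced reviewers change the status are illustrative assumptions, not iSpot’s actual mechanism.

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    species: str
    observer: str
    status: str = "pending"             # pending -> confirmed / rejected
    reviews: list = field(default_factory=list)

def review(obs: Observation, reviewer_is_expert: bool, agrees: bool) -> None:
    """Record a review; only experienced reviewers change the record's status."""
    obs.reviews.append((reviewer_is_expert, agrees))
    if reviewer_is_expert:
        obs.status = "confirmed" if agrees else "rejected"

obs = Observation(species="Bombus terrestris", observer="new_user_42")
review(obs, reviewer_is_expert=True, agrees=True)
print(obs.status)  # -> confirmed
```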

[Slide 9] Geographical rules are especially relevant to information about mapping and locations. Because we know things about the nature of geography – the most obvious being land and sea in this example – we can use this knowledge to check that the information that is provided makes sense, such as this example of two bumblebees recorded in OPAL in the middle of the sea. While it might be the case that someone saw them while sailing or on some other vessel, we can integrate a rule into our data management system and ask for more details when we get observations in such a location. There are many other such rules – about streams, lakes, slopes and more.
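A minimal illustration of such a rule using the shapely library; the rectangular “land” polygon is a toy stand-in for real coastline data, and the flag message is our own invention.

```python
from shapely.geometry import Point, Polygon

# Toy stand-in for a real coastline polygon (lon/lat pairs)
land = Polygon([(-6.0, 50.0), (2.0, 50.0), (2.0, 59.0), (-6.0, 59.0)])

def check_on_land(lon, lat):
    """Flag records whose coordinates fall outside the land polygon."""
    if land.contains(Point(lon, lat)):
        return "ok"
    return "flag: ask the contributor for more details"

print(check_on_land(-1.5, 52.5))   # inland -> ok
print(check_on_land(-10.0, 54.0))  # at sea -> flagged
```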

[Slide 10] The ‘domain’ approach is an extension of the geographical one: in addition to geographical knowledge, it uses specific knowledge that is relevant to the domain in which the information is collected. For example, in many citizen science projects that involve collecting biological observations, there will be some body of knowledge about species distribution, both spatially and temporally. A new observation can therefore be tested against this knowledge, again algorithmically, helping to ensure that new observations are accurate. If we see a monarch butterfly within the marked area, we can assume that it will not harm the dataset even if it was a misidentification, while an outlier (temporal, geographical, or in other characteristics) should stand out.
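A sketch of such an algorithmic check; the bounding box and flight season below are invented illustrative values, not real monarch distribution data.

```python
from datetime import date

# Illustrative domain knowledge: a crude bounding box and flight season
KNOWN_RANGE = {
    "monarch": {"lat": (25.0, 50.0), "lon": (-125.0, -65.0),
                "months": range(3, 11)},  # roughly March to October
}

def domain_check(species, lat, lon, when: date):
    """Flag observations outside the species' known spatial/temporal range."""
    info = KNOWN_RANGE.get(species)
    if info is None:
        return "flag: unknown species"
    in_range = (info["lat"][0] <= lat <= info["lat"][1]
                and info["lon"][0] <= lon <= info["lon"][1])
    in_season = when.month in info["months"]
    return "ok" if in_range and in_season else "flag: outlier, request review"

print(domain_check("monarch", 40.0, -100.0, date(2023, 7, 1)))   # -> ok
print(domain_check("monarch", 40.0, -100.0, date(2023, 1, 15)))  # -> flagged
```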

[Slide 11] The ‘instrumental observation’ approach removes some of the subjective aspects of data collection by a human who might make an error, and relies instead on the equipment that the person is using. Because of the increased availability of accurate-enough equipment, such as the various sensors that are integrated into smartphones, many people keep in their pockets mobile computers with the ability to record location, direction, imagery and sound. For example, image files captured on smartphones include the GPS coordinates and a time-stamp in the file, which for the vast majority of people are beyond their ability to manipulate. Thus, the automatic instrumental recording of information provides evidence for the quality and accuracy of the information. This is where the metadata of the information becomes very valuable, as it provides the necessary evidence.
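A minimal sketch of extracting this metadata from a photo’s EXIF record using the Pillow library; the file name is a placeholder, and real projects would validate these tags further.

```python
from PIL import Image
from PIL.ExifTags import GPSTAGS

def extract_evidence(path):
    """Pull the timestamp and GPS tags that the camera wrote automatically."""
    exif = Image.open(path).getexif()
    timestamp = exif.get(306)          # EXIF tag 306 = DateTime
    gps_ifd = exif.get_ifd(0x8825)     # tag 0x8825 = GPSInfo sub-directory
    gps = {GPSTAGS.get(tag, tag): value for tag, value in gps_ifd.items()}
    return timestamp, gps

# timestamp, gps = extract_evidence("observation.jpg")
```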

[Slide 12] Finally, the ‘process oriented’ approach brings citizen science closer to traditional industrial processes. Under this approach, the participants go through some training before collecting information, and the process of data collection or analysis is highly structured to ensure that the resulting information is of suitable quality. This can include the provision of standardised equipment, online training or instruction sheets, and a structured data recording process. For example, volunteers who participate in the US Community Collaborative Rain, Hail & Snow network (CoCoRaHS) receive a standardised rain gauge, instructions on how to install it, and online resources to learn about data collection and reporting.
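One way a structured recording process can be enforced in software is by validating each submission against the protocol; this sketch uses an invented, simplified schema for a precipitation report, not CoCoRaHS’s actual system.

```python
# Illustrative, simplified schema for a daily precipitation report
REQUIRED_FIELDS = {"station_id": str, "date": str, "precipitation_mm": float}

def validate_report(report: dict):
    """Check a submission against the structured recording protocol."""
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in report:
            errors.append(f"missing field: {name}")
        elif not isinstance(report[name], expected_type):
            errors.append(f"wrong type for {name}")
    if not errors and not (0.0 <= report["precipitation_mm"] <= 500.0):
        errors.append("precipitation outside plausible range")
    return errors  # empty list = report accepted

print(validate_report({"station_id": "KS-123", "date": "2023-05-01",
                       "precipitation_mm": 12.4}))  # -> []
```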

[Slide 13] What is important to be aware of is that these methods are not used alone but in combination. The analysis by Wiggins et al. in 2011 offers a framework of 17 different mechanisms for ensuring data quality. It is therefore not surprising that, with appropriate design, citizen science projects can provide high-quality data.
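As a closing sketch of what “in combination” can mean in practice, here is a hypothetical pipeline that runs a record through several of the checks discussed above in sequence; the check functions are trivial stand-ins for the real geographical, domain, and metadata rules.

```python
def quality_pipeline(record, checks):
    """Run a record through several quality-assurance checks in sequence."""
    flags = []
    for name, check in checks:
        if not check(record):
            flags.append(name)
    return flags  # empty list = passed all checks

# Hypothetical combination of the approaches discussed above
checks = [
    ("geography", lambda r: r["lat"] is not None),       # stand-in for land/sea rule
    ("domain",    lambda r: r["species"] != "unknown"),  # stand-in for range check
    ("metadata",  lambda r: "timestamp" in r),           # stand-in for EXIF evidence
]
record = {"lat": 51.5, "species": "Bombus terrestris",
          "timestamp": "2023-06-01T10:00"}
print(quality_pipeline(record, checks))  # -> []
```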