At this point in time, there are 497 people who went to the trouble of accessing UCLeXtend and creating a profile. They are a small group of the people who saw the blog post (about 1,100) or engaged with the tweet about it (about 600 likes, retweets or clicks on the link). A further 400 people filled in the online form that I set up before the course opened and stated their interest in it.
The course is structured as a set of lectures, each of them broken into segments of 10 minutes each, and although the annotated slides are available and it is likely that many people prefer them over listening to a PowerPoint video (it’s better in class!), the rate of viewing of the videos gives an indication of engagement.
Here are our viewing statistics for now:
We can start seeing how the sub-tasks (viewing a series of videos) are already creating the inequality – lots of people watch part of the first video, and either give up (maybe switching to the notes) or leave it for another time. By part 4 of the first lecture, we are already at very few views (the “Lecture 3 Part 2” video is the one that I’ve integrated in the previous blog post).
What is interesting to see is how fast participation inequality emerges within the online course. Notice that there is now a core of about 5-10 people (about 1% to 2%) who are following the course at the same rate as the 9 students who are in the face-to-face class. I expect people to also follow the course over a longer period of time, so I wouldn’t read too much into the pattern, and I will wait until the end of the course and a bit after it to do a full analysis.
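As a rough check on the percentages above, the figures quoted so far (497 profiles and a core of 5 to 10 followers) can be turned into shares. This is only a minimal sketch: the function and variable names are my own illustration and are not part of UCLeXtend or any course tooling.

```python
def share(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


REGISTERED = 497              # profiles created on UCLeXtend (figure from the post)
CORE_LOW, CORE_HIGH = 5, 10   # people keeping pace with the class (figure from the post)

core_low_pct = share(CORE_LOW, REGISTERED)
core_high_pct = share(CORE_HIGH, REGISTERED)

print(f"Core followers: {core_low_pct:.1f}% to {core_high_pct:.1f}% of registered learners")
```

With these numbers the core group comes out at roughly 1% to 2% of those who registered, which matches the estimate given above.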
When I was considering setting up the course as a hybrid online/offline, I was expecting this, since the amount of time that is required to follow the course is about 4-5 hours a week – something reasonable for an MSc student during a course, but tough for a distance learner (I have a huge appreciation for these 10 people who are following!).
A new paper that is based on the PhD work of Valentine Seymour is out. Valentine has been researching the patterns of volunteering in environmental projects at the organisation The Conservation Volunteers. In the paper, we draw parallels between the activities of environmental volunteers and citizen science participants. The analysis demonstrates that the patterns of participation are similar.
Environmental volunteering and environmental citizen science projects both have a pivotal role in civic participation. However, one of the common challenges is recruiting and retaining an adequate level of participant engagement to ensure the sustainability of these projects. Thus, understanding patterns of participation is fundamental to both types of projects. This study uses and builds on existing quantitative approaches used to characterise the nature of volunteer engagement in online citizen science projects, to see whether similar participatory patterns exist in offline environmental volunteering projects. The study uses activity records of environmental volunteers from a UK environmental charity “The Conservation Volunteers,” and focuses on three characteristics linked to engagement: longevity, frequency, and distance travelled. Findings show differences in engagement patterns and contributor activity between the three UK regions of Greater London, Greater Manchester, and Yorkshire. Cluster analysis revealed three main types of volunteer engagement profiles which are similar in scale across all regions, namely participants can be grouped into “One-Session,” “Short-Term,” and “Long-Term” volunteer. Of these, the “One-Session” volunteer accounted for the largest group of volunteers.
In this fairly short chapter, what I am trying to communicate is that while we know that participation inequality is happening and is part of crowdsourced information, we need to consider how it influences issues such as data quality, and think about how it comes about. I am trying to suggest how we ended up with skewed contributions – after all, at the beginning of most projects, everyone is at the same level – zero contribution, and then participation inequality emerges.
‘Citizen Science as Participatory Science‘ is one of the most popular posts that I have published here. The post is the core section of a chapter that was published in 2013 (the post itself was written in 2011). For the first European Citizen Science Association conference I was asked to give a keynote on the second day of the conference, which I have titled ‘Participatory Citizen Science‘, to match the overall theme of the conference, which is ‘Citizen Science – Innovation in Open Science, Society and Policy’. The abstract of the talk:
In the inaugural ECSA conference, we are exploring the intersection of innovation, open science, policy and society and the ways in which we can establish new collaborations for a common good. The terms participation and inclusion are especially important if we want to fulfil the high expectations from citizen science, as a harbinger of open science. In the talk, the conditions for participatory citizen science will be explored – the potential audience of different areas and activities of citizen science, and the theoretical frameworks, methodologies and techniques that can be used to make citizen science more participatory. The challenges of participation include designing projects and activities that fit with participants’ daily life and practices, their interests and skills, as well as the resources that they have, their self-beliefs and more. Using lessons from EU FP7 projects such as EveryAware and Citizen Cyberlab, and UK EPSRC projects Extreme Citizen Science and Street Mobility, the boundaries of participatory citizen science will be charted.
As always, there is a gap between the abstract and the talk itself – as I started exploring the issues of participatory citizen science, some questions about the nature of participation came up, and I was trying to discuss them. Here are the slides:
After opening with an acknowledgement of the people who work with us (and funded us), the talk turns to the core issue – the term participation.
Type ‘participation’ into Google Scholar, and the top paper, with over 11,000 citations, is Sherry Rubin Arnstein’s ‘A ladder of citizen participation’. In her ladder, Sherry offered 8 levels of participation – from manipulation to citizen control. Her focus was on political power and the ability of the people who are impacted by the decisions to participate and influence them. Knowingly simplified, the ladder focuses on political power relationships, and it might be this simple presentation and structure that explains its lasting influence.
Since its emergence, other researchers have developed versions of participation ladders – for example Wiedemann and Femers (1993), here from a talk I gave in 2011:
These ladders come with baggage: a strong value judgement that the top is good, and the bottom is minimal (in the version above) or worse (in Arnstein’s version). The WeGovNow! project is part of the range of ongoing activities that use digital tools to increase participation and move between rungs in these concepts of participation, with an inherent assumption about the importance of high engagement.
At the beginning of 2011, I found myself creating a ladder of my own. Influenced by the ladders that I learned from, the ‘levels of citizen science’ make an implicit value judgement in which ‘extreme’ at the top is better than crowdsourcing. However, the more I’ve learned about citizen science, and the more time I have had to reflect on what participation means and who should participate and how, the more I feel that this strong value judgement is wrong and that a simple ladder can’t capture the nature of participation in Citizen Science.
There are two characteristics that demonstrate the complexity of participation particularly well: the levels of education of participants in citizen science activities, and the way participation inequality (AKA the 90-9-1 rule) shapes the time and effort investment of participants in citizen science activities.
We can look at them in turn, by examining citizen science projects against the general population. We start with levels of education – across the EU28 countries, we are now approaching 27% of the population with tertiary education (university). There is wide variability, with the UK at 37.6%, France at 30.4%, Germany at 23.8%, Italy at 15.5%, and Romania at 15%. This is part of a global trend – with about 200 million students studying in tertiary education across the world, of which about 2.5 million (about 1.25%) are studying at a doctoral level.
However, if we look at citizen science projects, we see a different picture: in OpenStreetMap, 78% of participants hold tertiary education, with 8% holding doctoral level degrees. In Galaxy Zoo, 65% of participants have tertiary education and 10% have doctoral level degrees. In Transcribe Bentham (TB), 97% of participants have tertiary education and 24% hold doctoral level degrees. What we see here is much more participation by people with higher degrees – well above their expected rate in the general population.
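To make the gap concrete, the tertiary education shares quoted above can be divided by the EU28 figure of roughly 27%. This is a sketch under a loud assumption: the projects have participants from around the world, so the EU28 share is only a convenient baseline, not the true reference population. The doctoral figures are left out because the 1.25% quoted earlier is a share of current students rather than of the population. The function name is my own.

```python
EU28_TERTIARY_PCT = 27.0  # approximate EU28 share quoted in the text (assumed baseline)

# Share of participants with tertiary education, as quoted in the text.
project_tertiary_pct = {
    "OpenStreetMap": 78.0,
    "Galaxy Zoo": 65.0,
    "Transcribe Bentham": 97.0,
}


def over_representation(project_pct: float, baseline_pct: float) -> float:
    """Ratio of a project's share to the baseline share (1.0 means parity)."""
    if baseline_pct <= 0:
        raise ValueError("baseline_pct must be positive")
    return project_pct / baseline_pct


ratios = {
    name: over_representation(pct, EU28_TERTIARY_PCT)
    for name, pct in project_tertiary_pct.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the EU28 baseline")
```

Against this baseline, people with tertiary education appear at roughly 2.4 to 3.6 times their expected rate in the three projects.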
The second aspect, participation inequality, has been observed in OpenStreetMap volunteer mapping activities, in iSpot – in both the community of those who capture information and those who help classify the species – and even in the offline conservation volunteering activities of the Trust for Conservation Volunteers. In short, it is a very persistent aspect of citizen science activities.
For the sake of the analysis, let’s look at citizen science projects that require high skills from participants and significant engagement (like TB), those that require high skills but not necessarily a demanding participation (as many Zooniverse projects do), then the low skills/high engagement projects (e.g. our work with non-literate groups), and finally low skills/low engagement projects. There are clear benefits for participation in each and every block of this classification:
high skills/high engagement: These provide a way to include highly valuable effort with the participants acting as virtual research assistants. There is a significant time investment by them, and opportunities for deeper engagement (writing papers, analysis).
high skills/low engagement: The high skills might contribute to data quality, and allow the use of disciplinary jargon, with opportunities for lighter or deeper engagement to match time/effort constraints.
low skills/high engagement: Such activities provide an opportunity for education, awareness raising, increased science capital, and other skills. They require support and facilitation but can show high potential for inclusiveness.
low skills/low engagement: Here we have an opportunity for active engagement with science with limited effort. There is also a potential for family/cross-generational activities, and outreach to marginalised groups (as Open Air Laboratories has done).
In short – in each type of project, there are important societal benefits for participation, and it’s not only the ‘full inclusion at the deep level’ that we should focus on.
Interestingly, across these projects and levels, people are motivated by science as a joint human activity of creating knowledge that is shared.
So what can we say about participation in citizen science? Well, it’s complex. There are cases where the effort is exploited, and we should guard against that, but outside these cases, the rest is a much more complex picture.
The talk moves on to suggest a model of allowing people to adjust their participation in citizen science through an ‘escalator’ that we are aiming to conceptually develop in DITOs.
Finally, with this understanding of participation, we can understand better the link to open science, open access and the need of participants to potentially analyse the information.
Following the two previous assertions, namely that:
‘you can be supported by a huge crowd for a very short time, or by few for a long time, but you can’t have a huge crowd all of the time (unless data collection is passive)’ (original post here)
‘All information sources are heterogeneous, but some are more honest about it than others’ (original post here)
The third assertion is about the pattern of participation. It is one that I’ve mentioned before and in some ways it is a corollary of the two assertions above.
‘When looking at crowdsourced information, always keep participation inequality in mind’
Because crowdsourced information, either Volunteered Geographic Information or Citizen Science, is created through a socio-technical process, it is all too easy to forget the social side – especially when you are looking at the information without the metadata of who collected it and when. So when working with OpenStreetMap data, or viewing the distribution of bird species in eBird (below), even though the data source is expected to be heterogeneous, each observation is treated as similar to other observations and assumed to be produced in a similar way.
Yet, data is not only heterogeneous in terms of consistency and coverage, it is also highly heterogeneous in terms of contribution. One of the most persistent findings from studies of various systems – for example in Wikipedia, OpenStreetMap and even in volunteer computing – is that there is a very distinctive heterogeneity in contribution. The phenomenon was termed ‘Participation Inequality’ by Jakob Nielsen in 2006 and it is summarised succinctly in the diagram below (from Visual Liberation blog) – a very small number of contributors add most of the content, while most of the people who are involved in using the information will not contribute at all. Even when examining only those who actually contribute, in some projects over 70% contribute only once, with a tiny minority contributing most of the information.
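One way to keep this in view when handed a dataset is to summarise contributions per contributor before doing anything else. The sketch below is illustrative only: the counts are made up, the function name is hypothetical, and the Gini coefficient is my own choice of summary measure rather than something taken from the studies mentioned above.

```python
def participation_summary(counts):
    """Summarise inequality in per-contributor contribution counts.

    `counts` holds one positive integer per contributor (lurkers excluded).
    """
    if not counts or any(c < 1 for c in counts):
        raise ValueError("counts must be a non-empty list of positive integers")

    ascending = sorted(counts)
    n = len(ascending)
    total = sum(ascending)

    # Share of contributors who contributed exactly once.
    once_share = sum(1 for c in ascending if c == 1) / n

    # Share of all content produced by the most active 10% (at least one person).
    top_n = max(1, n // 10)
    top_decile_share = sum(ascending[-top_n:]) / total

    # Gini coefficient: 0 means equal contributions, values near 1 mean
    # that a few contributors produce almost everything.
    weighted = sum(rank * c for rank, c in enumerate(ascending, start=1))
    gini = 2 * weighted / (n * total) - (n + 1) / n

    return {
        "once_share": once_share,
        "top_decile_share": top_decile_share,
        "gini": gini,
    }


# Hypothetical counts for ten contributors, not data from any real project.
example_counts = [1, 1, 1, 1, 1, 1, 1, 2, 3, 88]
summary = participation_summary(example_counts)
print(summary)
```

In this made-up example, 70% of contributors contribute once and a single person provides 88% of the content, which is the kind of pattern to look for before treating every observation as equivalent.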
Therefore, when looking at sources of information that were created through such a process, it is critical to remember the nature of contribution. This has far-reaching implications for quality, as it is dependent on the expertise of the heavy contributors, on their spatial and temporal engagement, and even on their social interaction and practices (e.g. abrasive behaviour towards other participants).
Because of these factors, it is critical to remember the impact and implications of participation inequality on the analysis of the information. There will be some analyses on which it will have less impact and some where it will have a major one. In either case, it needs to be taken into account.