Methodical Snark critical reflections on how we measure and assess civic tech

Apples, oranges and open data


Open Knowledge International recently asked for feedback on survey questions for the 2016 Open Data Index. This is great, and has produced a modest but likely useful discussion to improve Index processes for national research, as well as the resulting data. But regardless of how much effort goes into fine-tuning the survey questions, there’s a fundamental problem underlying the idea of an international open data index. There’s a good argument to be made that you simply can’t compare the politics of #open across countries. Open Knowledge should think carefully about what this means when refining how they present the Index, and see what can be learned from the last 15 years of experience with international indices on human rights and governance.

The problem with indices

Western experts have been ranking the political performance of countries for a long time, from well-known initiatives like Transparency International’s Corruption Perceptions Index and the Freedom in the World report to more niche efforts like the Environmental Performance Index. The UNDP mapped 35 comparative assessment initiatives in 2007, and there seem to be more all the time.

These efforts provoked a lot of controversy. Much of it centered on the fact that western experts don’t always fully understand political realities in the countries they are evaluating, and on the popular assertion that “assessments should be conducted by citizens of the country being assessed and not by outsiders sitting in judgment upon it.” This produced a useful debate, arguably leading directly to other imperfect measurements like the African Peer Review Mechanism, and to efforts by thought leaders like Global Integrity to promote more nationally owned and implemented assessments.

The (more fundamental) problem with international indices, and the one that’s relevant to the Open Data Index, is that countries don’t really compare. You can compare which countries have adopted international human rights treaties, and those treaties can in turn very clearly define what human rights are, but when you try to produce data on the policies and practices for protecting and implementing those rights, things get messy real quickly, because, well, politics are messy and systems are different (the critique remains active today).

This is important from a methodological perspective. Can we really use a single-digit number to represent the presence of a complex concept like the rule of law? Can we really compare the implementation of freedom of religion, by policy and jurisprudence, across countries with secular and religious constitutional identities?

It’s also important from an advocacy perspective. At the end of the day, all these indices aim somehow to improve what they measure, be it national integrity, protection of minorities or access to open data. But in the late 2000s it became clear that international rankings have backlash potential. Methodological weaknesses can be used to discredit advocates who use rankings to push for national change, or can be manipulated for window dressing that frustrates meaningful change. Global Integrity was again the thought leader in this regard, and abandoned its Global Integrity Index in favor of independent national assessments in 2011, arguing that the latter are better tools for national advocacy and change.

There’s a lot to learn from this in the open data context, and without getting too deep into the weeds, we can make two assumptions:

  1. comparing and ranking countries’ politics isn’t easy, and we have to think carefully about what kind of comparisons we can actually and defensibly make, and
  2. we have to think carefully about what kind of political debate and political incentives different methods and analyses can provoke in countries.

Comparing open data indicators

So comparison is hard, and this is just as true for open data. Most immediately, there are the indicators. For example:

  • Is data published by a third party related to government?
  • Are there bureaucratic hurdles to accessing the data?
  • Is data hard to find?

Concepts like “related to”, “hurdle” and “hard” are the survey designer’s nightmare, because everyone understands them differently, and so when you have a bunch of different people providing the answers, you can’t compare them (what I think is hard, you might think is easy). This is tricky for the ODI because the field is so diverse and the idea of open is so recently developed that a lot of the things we want to measure are articulated as principles rather than grounded in government practice.

It seems as though ODI will address these methodological challenges by adding detail, elaborating for example on the multitude of ways in which finding data can be “hard”, or elaborating the “characteristics included in the downloadable file,” then asking researchers to identify those characteristics, so that comparisons can be made once the data is in. This is a good approach, because it avoids building normative assumptions into the survey instrument (only the analysis), and because it will likely tell us something useful about the ways in which governments should make it easier to find open data. But it’s more work for everyone involved, and it doesn’t make things any less messy.

But even if we can implement methods to secure inter-coder reliability, there are definitional problems, such as those surrounding the identification of open licenses or determining whether data is in the public domain. Because countries often develop and implement their own licensing schemes for open data, researchers for the Index are asked to evaluate whether licences are compatible with the Open Definition (which is clearly defined). But differences in terms and language, not to mention technical and legal irregularities, will often necessitate a judgement call, and there’s simply no way to say that 100+ licensing schemes are equal, or to succinctly illustrate the distinctions. Even if you could, doing so would require some kind of normative assessment of what those differences imply, which implies a degree of nuance and density that is less obviously useful.
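To make “inter-coder reliability” concrete: one standard check is Cohen’s kappa, which measures how often two coders agree beyond what chance alone would predict. A minimal sketch in Python, using entirely hypothetical data (the two coders, their licence judgements, and the threshold mentioned below are illustrative assumptions, not part of the Index methodology):

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders: (observed - chance) / (1 - chance)."""
    assert len(codes_a) == len(codes_b) and codes_a
    n = len(codes_a)
    # Proportion of items where the two coders agree outright.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's marginal frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two hypothetical researchers coding whether eight national licensing
# schemes qualify as "open" under the Open Definition.
coder_1 = ["open", "open", "closed", "open", "closed", "open", "closed", "open"]
coder_2 = ["open", "closed", "closed", "open", "closed", "open", "open", "open"]
print(round(cohens_kappa(coder_1, coder_2), 2))  # → 0.47
```

Values near 1 indicate strong agreement; a kappa this low (conventionally, below roughly 0.6) would suggest the coding instructions are too ambiguous to support cross-country comparison, which is exactly the worry with judgement calls about licences.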

At bottom, perhaps the most challenging aspect of comparability simply has to do with the vast variation in how open data regimes are conceptualized, institutionalized and implemented across countries. Land data is a good example. I don’t even know how to succinctly describe the tremendous differences in the agencies that collect it, their legal mandates for collecting it (or not), the conceptual frameworks according to which the data is compiled, or the technical characteristics of the data. But all these differences confound a simple assessment of whether “land data is open.”

What’s it mean for open data

Open data measurement initiatives like the Open Data Index and the comparable Open Data Barometer make a big splash in the international open data community when they come out, arguably because the visualizations are nice and open data enthusiasts tend to be already excited. It’s less clear whether the rankings have provoked meaningful advances in open data policy, and what role the comparative component has played when they have.

Don’t get me wrong, it’s clear that governments notice how they are ranked and that rankings provoke a response. In my experience as a national researcher for the Open Data Barometer and the Open Government Partnership IRM for Norway, I’ve seen governments reference their own ranking a lot, generally to deflect criticism. I’ve heard representatives of other countries with less favorable rankings denounce the whole affair and express an intention not to participate in future research. What I haven’t seen is a case where ranking countries’ open data against one another has motivated progressive policy (except maybe the UK, which really seems to enjoy being #1?). I’d love to hear comments or corrections on this.

But even if we have examples to the contrary, I think the open data community needs to think carefully about what kind of advocacy and progressive change we are seeking to support, and make sure that our methodologies and our research outputs align with this. My experience with advocacy targeting bureaucrats tells me that deep contextual analysis can be a lot more meaningful than rankings. And there’s enough anecdotal evidence and conference gossip about advocacy initiatives whose overzealous presentation of their data led to their being discredited by statistical offices or to smear campaigns, that I think we want to be careful with tricky issues like comparability.

Refining open data measurement

So what’s this mean for measuring the quality of open data across countries in a meaningful and productive way? Well, there’s a lot to do, and leading organizations like Open Knowledge should prioritize learning about what rankings have actually provoked in early cases, both within the open data community and in related fields like governance assessment.

It will be especially important to think carefully about who the government actors are that these efforts are meant to influence, and how. Ranking can be a great attention-getter, but it might not be the best tool to drive reform. And for getting attention, you don’t need nuanced metrics like comparisons of licensing modalities or of the bureaucratic hurdles to access. These details should be researched for our own learning, and presented to support advocacy and dialogue between national actors, but they don’t need to be comparative.

If a ranking makes sense in the open data context, it could include only a handful of cruder, more objective core indicators, while each country report could carry a handful of satellite indicators and analyses addressing the details of #open that can be used in national media and civil society debates, without compromising comparative methodologies or making the index unnecessarily vulnerable to critique.

Admittedly, a simplified ranking would make the challenge of #openwashing particularly acute, but combining a simplified ranking with ancillary, non-comparative data, potentially in a traffic-light format, could offer effective tools for managing this.

It’s early days, and we’re still learning about how open data is implemented and how to measure it. Open Knowledge’s effort to solicit feedback is a great step towards doing that better. But there are a couple of things they could do to make that effort even more meaningful.

  • Make an effort to bring methodological experts into the conversation. There are a lot of experts out there, and the people behind efforts like the Cingranelli-Richards Human Rights index or the Democracy Barometers or the Media Sustainability Index are only barely on the periphery of the open data movement, and would have a lot to contribute in terms of both methods and advocacy outcomes. The conversation so far is very much driven by the usual suspects.
  • Follow the open data gospel, and make the resources and methods you want comments on easier to access. A Google Doc listing new and old survey questions was inconspicuously linked in the Open Data blog post, but I didn’t see the full survey methodology or questionnaire. I know it’s out there, and I could dig it out of the ODI website, but making these easily accessible for comment would make it easier to get comments from people who aren’t ODI insiders.
  • Reduce redundancy and participation costs. There are a lot of redundant international research efforts around open government and open data. Beyond the Open Data Index and the Open Data Barometer (whose methods are profoundly similar), there’s the Open Government Partnership Independent Reporting Mechanism and a handful of similar annual assessments in the field of democracy promotion and development. In addition to that, there are ad hoc studies by the OECD, the World Bank, the ITU and others. As a national researcher I’ve been refused an interview explicitly out of assessment fatigue, and it’s a waste of resources as well. As a community established around the coordinating power of digital communications tools, we should do better than this.

