Methodical Snark critical reflections on how we measure and assess civic tech

Designing useful civic tech research at scale: why methods matter


The Hewlett Foundation has asked for help in crowdsourcing research design for citizen reporting on public services. This is great; it’s a fantastic way to design useful research, and shows that Hewlett is maintaining the strong commitment to evidence and rigorous question asking that is so important to this field. The post has already generated some useful discussion, and I’m sure that they are going to be asking the right questions. This post has some methodological recommendations on how those questions get asked.

The problem

The tech for transparency and accountability community has this funny intellectual tic. We are constantly insisting on the need for evidence, but we don’t think about what kind of evidence we want or how to produce it. We say we want to know what works and we want to demonstrate the impact our work is having. We say we want evidence to design better programs. We snarkily question whether the hipster tech emperor has any clothes. And so a significant amount of time, energy and resources gets poured into relevant research. There’s been MAVC, the Harvard/R4D RCTs, the host of OGP-funded work, and a host of research-focused boutique NGOs supported by this general hype.

Ironically, though, none of this research is really structured to produce the kinds of knowledge we actually need, either to demonstrate impact or to inform project design, because the individual studies are simply too specific and too unrelated to one another.

Demonstrating impact

Contemporary research efforts fail to provide concrete evidence that cutting edge approaches “work” because they tend either to carefully measure fine details of program design (Guy Grossman’s experimental work on citizen reporting platform use in rural Uganda is powerful and obviously useful for the design of citizen reporting platforms in rural Uganda, and it’s maybe useful in rural Kenya, but less clearly so in urban Indonesia or Argentine slums), or they are unstructured and method-less case studies that play fast and loose with attribution. Either way, we fall short (and almost always will fall short) of the big categorical statements that tech for transparency technique A results in outcome X, with fewer than two dozen caveats and qualifiers.

This is a problem of external validity and generalizability. It’s old news to stuffy academics, and poses just as significant a challenge to RCTs as it does to country researcher storytelling and action research.

Insights for project design

The problem with producing insights for project design stems from the same factors, but it’s more ironic, given how much this community emphasises user-focused design and meeting users “where they are.” One might expect this focus to translate seamlessly into thinking about research on tech and accountability. How will such evidence be used, and how can research be designed to be useful to project development?

But to my knowledge, there is not a single survey or diagnostic effort to determine what kinds of research outputs tech for accountability practitioners want, or would even use. I have yet to meet an accountability tech project lead who reads research articles for fun. I have yet to meet someone with a research mandate in this community who has tried to explore the vast landscape of relevant evidence. It’s all just too much work. But as we know from accountability programming, we can’t just build it and expect that users will come. Research to inform programming has to be designed on the basis of what the people developing programs will actually read and be able to use.

So, end screed. But what to do about it? Well it turns out that tech for accountability research is poised at a pretty classic stage in disciplinary evolution. It’s facing a field-wide challenge to methodological design, and it’s a challenge that an initiative like Hewlett’s is well positioned to jump into.

The Design Challenge

Firstly, large scale efforts to produce useful research should prioritize useful evidence for project design over evidence of impact. There is enough micro level evidence being generated on outputs (if not outcomes) to justify innovative work, especially given how much innovation is taking place in the M&E field. Though I’m a big fan of using home-grown monitoring indicators for adaptive programming, at the field level, monitoring and evaluation should be left to the M&E (RL) community; they’re making great advances. Focusing on inputs to program design will get large actors like Hewlett more bang for their buck in advancing the field.

For Hewlett’s work on citizen reporting, I think this has some significant implications, which I’ll frame below as two recommendations for prep work and four recommendations for case study design.

Methodological prep work

1. Conduct research on what platform practitioners will actually use.

This sounds simple (it can easily be tagged on to the kind of crowdsourcing Hewlett is doing now), but is easy to mess up. It’s important to structure this kind of work to ensure the quality of the data you get, and to watch out for response bias (no one wants to tell a funder how little time or interest they have for evidence). Getting a clear picture should probably rely more on objective indicators about what kind of evidence and information already gets used than on subjective questions about what might get used in a perfect world–though allowing users to “blue sky” respond can also reveal important preferences. Special attention should be paid to format and delivery mechanisms. My hunch is that for organizations lacking a research or evidence mandate, most of the information that influences project design is locally articulated or picked up anecdotally at conferences. This matters.

2. Build a typology of contextual factors that can inform consistent research design across cases, and open the way for comparative insights

The next big step for tech and transparency research as a field is likely matching its ever-abundant hordes of loosely structured case studies with the rigor of suddenly sexy quantitative methods. To do this we need typological theories that sketch out the kinds of things that matter for determining the relationship between inputs (reporting platforms use first names, deploy USSD, publicly display reported data, etc) and outputs (more reports, efficacy, government response, trust, etc). This should be drawn from a combination of insights and questions from practitioners, and from the body of literature that already exists on these dynamics. Importantly, this body of literature stretches way beyond what is often referenced in tech for accountability circles, and should include e-participation, public administration and political communication studies.

Case study design

The background research work described above can directly inform a smart set of country studies that carefully balance consistent design for comparability with bespoke methods to target context. This is where the rubber really hits the road, and four points are particularly important.

1. Using a consistent framework with flexibility.

Whether a large research project focuses on 3 cases or 25, applying a consistent analytical framework based on the typological theory described above will open the way to understanding how contextual factors affect different design choices. There’s a balance here. The more cases and the more similar they are, the easier to draw comparative conclusions, and the more specific and less generalizable those conclusions will be. But it’s a manageable balance, and having a typological theory driving it allows you to look for patterns across cases, while still allowing for methodological variation to target the most relevant data in each case.

2. Choose smart cases

There’s always a temptation to choose cases that are representative of the entire universe. As implied above, that comes at a cost, and decisions on the set of cases should put just as much emphasis on which cases will produce useful data, as which will represent every contingency. Modest wins, like producing useful insights on a specific type of monitorial platform in a specific type of context, would still be a huge win if external validity is built into the research design.

3. Use multimethod design for cases

Mixed methods are fashionable, but they’re quite specifically important in the context of tech for accountability research. Combining qual and quant methods doesn’t add any value in and of itself, but combining them specifically as a mechanism for identifying and explaining causal mechanisms is precisely where large scale research can add value to program design. In this sense, experimental or quasi-experimental research testing specific factors from the typological framework will likely reveal interesting results of the variety: X input correlates with Y output. Hopefully, it will also identify other factors as confounders or mediating variables, but it necessarily won’t tell us what is causing what or how it is happening. Stats simply do not do that. But rigorous qualitative methods can do that, and they can do it best when guided by quantitative analysis. Depending on how results populate, quantitative analysis can identify the case studies most likely to tell us how and why contextual factors are influencing outcomes. This might involve looking at the outliers or most exemplative cases (people, institutions or interactions). It might involve designing qualitative follow-up research that targets and prioritizes some contextual factors over others. Quant work can guide this and dramatically increase the insights.
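To make the idea of quant-guided case selection a little more concrete, here is a minimal sketch in Python. It is only an illustration, not a recommended design: the district names and scores are invented, and the single-predictor least-squares fit stands in for whatever model a real study would use. It fits a line relating one input to one output, then ranks cases by how far they sit from that line, so the best-fitting case (a candidate "typical" case) and the worst-fitting case (a candidate outlier) can be flagged for qualitative follow-up.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rank_cases_by_residual(cases):
    """Rank (name, input, output) cases by absolute residual, smallest first."""
    xs = [x for _, x, _ in cases]
    ys = [y for _, _, y in cases]
    intercept, slope = fit_line(xs, ys)
    residuals = [(name, y - (intercept + slope * x)) for name, x, y in cases]
    return sorted(residuals, key=lambda item: abs(item[1]))


# Hypothetical data: (case name, input score, output score).
# These values are made up purely for illustration.
cases = [
    ("district_a", 1.0, 2.1),
    ("district_b", 2.0, 3.9),
    ("district_c", 3.0, 6.2),
    ("district_d", 4.0, 7.8),
    ("district_e", 5.0, 3.0),
]

ranked = rank_cases_by_residual(cases)
typical_case = ranked[0][0]   # closest to the fitted line
deviant_case = ranked[-1][0]  # furthest from the fitted line

print("Candidate typical case:", typical_case)
print("Candidate outlier case:", deviant_case)
```

A residual ranking like this only says where to look. Why a given case departs from the pattern is exactly the question the qualitative work would then have to answer.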

4. Make qualitative work rigorous

There are a million and one ways to do this, and even more ways to not do it at all while saying that you do. As a community, we’ve gotten very good at name dropping methodologies without actually learning how to implement them. So there’s a lot to say here. But starting with a specific hypothesis, rigorously interrogating alternative explanations, and carefully balancing deductive and inductive methods are a good start. And people who know methods should review work, not just people who know how to reference methods.

The implications

So that’s a lot. And it was the consolidated and simplified version. The good news is that the preliminary work should be relatively easy to conduct by engaging with the people Hewlett’s research aims to serve. The last bit of running a set of theoretically consistent, multimethod case analyses is more demanding, but potentially game-changing. If there’s anyone in the wild west landscape of civic tech evidence with the intellectual rigor and ambition to do it, it’s likely Hewlett.

Of course there will be other challenges along the way too: the perennial balance between having practitioners conduct research to build their capacity, and watchdogging them to ensure rigor; the inevitable delays and pitfalls of data coordination and version management; problems of attrition and localization. No good thing ever came easy.

But this initiative might just have the stature, ambition, capacity and access to be a success. Hewlett has consistently been asking the tough and important questions, and they look poised to do so again, through close collaboration with the initiatives they hope to serve. I’m sure they’ll ask the right research questions. Careful attention to how those questions are asked can make all the difference.


  • Christopher, before I respond with a more substantive reply, including how we’ve incorporated your feedback into our grantmaking plans, let me first say thank you for taking the time and for such constructive input. Also, I’m glad to come across your website — a great resource for our field. More soon.
