Jacob Kaplan-Moss

Ethical Applications of AI to Public Sector Problems

In 2021, while working for Hangar, an investment studio building public-sector startups, I started seeing the early proposals to use AI in the public sector. In retrospect, this was the beginning of huge technical improvements in AI, and the subsequent hype bubble. Some of these showed clear paths towards public good (predicting wildfire behavior, helping underfunded public defenders sift through mountains of evidence), but others were snake oil, or just terrifying (there literally were – and still are – people claiming to be able to predict crime¹).

I took some notes at the time about what are and aren’t ethical applications of AI in the public sector. These thoughts aren’t particularly original – plenty of other people have made similar points about AI. And the technology certainly has changed dramatically since 2021 – back then I would never have predicted modern LLMs or diffusion models. But I’m still happy with where I landed: there’s some nuance here that I think is sometimes missing from modern discussions of AI, and I still find the distinctions between the different applications of AI useful. So I’ve turned these notes into this post. Do remember that I mostly sketched this out in 2021, so it mostly retains that perspective, though I have noted a few places where modern AI tooling changes my thinking.

To be clear: though I started thinking about this as an employee of Hangar, this is my position and doesn’t reflect Hangar’s position, then or now.


Ethical Applications of AI to Public Sector Problems

There have been massive developments in AI in the last decade, and they’re changing what’s possible with software. There’s also been a huge amount of misunderstanding, hype, and outright bullshit. I believe that the advances in AI are real, will continue, and have promising applications in the public sector. But I also believe that there are clear “right” and “wrong” ways to apply AI to public sector problems.

Background

To start, I’ll build off Arvind Narayanan’s excellent How to recognize AI snake oil (PDF). The term “AI” is – mostly unfortunately – an umbrella term for a bunch of only vaguely related technologies. To understand AI better, I’ll break it into the categories developed by Narayanan:

  1. Perception, e.g. image recognition, facial recognition, speech to text, etc. Progress in this area is real, and rapid, and there are obvious and immediate applications to public sector problems.
  2. Automated judgement, e.g. spam detection, automated grading, content recommendation, etc. Progress here has also been real and rapid, though AIs will never be perfect because these tasks involve judgement, and reasonable people will disagree in some cases.
  3. Predicting social outcomes, e.g. predicting job performance, predictive policing, predicting criminal recidivism, etc. It’s impossible to predict the future, and this area is mostly filled with snake oil. Narayanan’s presentation goes into much more depth about why this is.

To Narayanan’s list I’d add a fourth category:

  4. Generative algorithms like LLMs and diffusion models that can create “new” content (photos, text, video, audio) from short prompts. Progress here has been astounding, though practical applications are still somewhat unclear, obscured as they are by the massive hype bubble we’re in the middle of. This is a large enough category that I suspect we’ll find both ethical and unethical applications of generative AI to public sector problems, and I think they’ll break down along the same lines as the other areas; more on that below.

Predicting Outcomes is Unethical

It’s not ethical to predict social outcomes — and it’s probably not possible. Nearly everyone claiming to be able to do this is lying: their algorithms do not, in fact, make predictions that are any better than guesswork. Companies claiming to do this are using algorithms that are deeply racist, sexist, classist, and so forth – for some examples and details, see Carina C. Zona’s The Consequences of an Insightful Algorithm (video, slides here). Even if these predictions were accurate — which they are not — they’re tainted by their deeply flawed training data. Organizations acting in the public good should avoid this area like the plague, and call bullshit on anyone making claims of an ability to predict social behavior.

Assistive versus fully-automated AI

The remaining categories (perception, judgement, generation) are not snake oil, and can offer some pretty tantalizing glimpses of real public good. For example, assisted diagnosis from medical scans could reduce costs and increase accuracy. However, there are still serious ethical and applicability concerns in these areas. The framework I came up with to distinguish ethical from unethical applications breaks them into two categories:

  • Assistive AI, where AI is used to process and consume information (in ways or amounts that humans cannot) and present it to a human operator. For example, consider blind spot warnings in a car: the car’s computer is processing information a human literally can’t see, and providing a warning when the computer thinks it sees a car in the blind spot. But the human driver takes the ultimate action.
  • Automated AI, where the AI both processes and acts upon information, without input or oversight from a human operator. For example, consider a fully self-driving car without a steering wheel.

Unlike prediction, there’s no snake oil here – both categories are possible and real, and we’re confronted with examples of both regularly. Although self-driving cars are still partially snake oil, it’s not hard to imagine self-driving cars that are safer than human drivers. Looking broadly, neither category is by its nature more or less ethical than the other: a self-driving car isn’t inherently “good” or “bad”; that’s determined by how much safer it is (or isn’t) than the human driver it replaces.

We should use Assistive AI in the public sector

However, when it comes to the public sector, I believe that fully automated AI is unethical. All too often, AI algorithms encode human bias. And in the public sector, failure carries real life-or-death consequences. In the private sector, companies can decide that a certain failure rate is OK and let the algorithm do its thing. But when citizens interact with their governments, they have an expectation of fairness that AI cannot meet, because AI judgement will always be fallible. Algorithms can augment and optimize human judgement, but they shouldn’t replace it.

Going further

I think it’s likely that this framework applies more broadly, beyond public sector uses. For example, I find the assistive/automated distinction helpful in thinking about ethical applications of generative AI. I’m not too concerned about using LLMs to help me proofread a blog post — I did it for this one! — but I wouldn’t dream of publishing a post written entirely by an algorithm.

That said, I do think the framework is somewhat limited: it’s not right to say that one category is always “better” than the other, and it’s overly simplistic to claim that assistive is always more ethical than automated. That’s why I’ve couched this specifically in the context of public-sector applications and avoided drawing much broader conclusions.

What do you think?

I’d love to hear your thoughts! Please get in touch if you agree, disagree, or want to riff.


  1. My “this algorithm isn’t racist” T-shirt has people asking a lot of questions etc.