Making Middleware Work

Unlike many other proposals to curtail platform power, middleware does not violate the First Amendment of the U.S. Constitution. In the United States, that makes middleware a path forward in a neighborhood full of dead ends. Before we can execute on the middleware vision, however, at least four problems must be solved. First, how technologically feasible is it for competitors to remotely process massive quantities of platform data? Second, how is everyone going to get paid? Third, who will bear the enormous costs of content curation? And fourth, how does middleware address privacy issues regarding users’ friends’ data? We have to solve these problems to make middleware work—but this is also what makes the concept so promising.

This essay is a part of an exchange based on Francis Fukuyama’s “Making the Internet Safe for Democracy” from the April 2021 issue of the Journal of Democracy.

I am very much a fan of what my Stanford University colleague Francis Fukuyama has called “middleware”: content-curation services that could give users more control over the material they see on internet platforms such as Facebook or Twitter. Building on platforms’ stores of user-generated content, competing middleware services could offer feeds curated according to alternate ranking, labeling, or content-moderation rules.

The hope is that this more expansive set of options might break the chokehold that a very small number of platforms have on today’s information ecosystem, and instead allow different communities to draw on trusted sources for explanatory labels or enforce divergent rules for speech. Users might choose a racial-justice–oriented lens on YouTube from a Black Lives Matter–affiliated group, layer a fact-checking service from a trusted news provider on top of Google News, or browse a G-rated version of Facebook from Disney.

Fukuyama’s work, which draws on both competition analysis and an assessment of threats to democracy, joins a growing body of proposals that also includes Mike Masnick’s “protocols not platforms,” Cory Doctorow’s “adversarial interoperability,” my own “Magic APIs,” and Twitter CEO Jack Dorsey’s “algorithmic choice.” As Fukuyama explains, middleware can solve problems that traditional competition remedies, such as breaking up companies, probably cannot. Adding a middleware layer to platforms is, as I see it, broadly analogous to unbundling requirements for telecommunications providers. Both mechanisms aim at bringing competition into markets that are subject to network effects (wherein a service becomes more valuable to each consumer as its user base grows) by requiring incumbent players to license hard-to-duplicate resources to newcomers. For platforms, the resource in question is user data or content. A robust model for sharing such content would unlock technical and commercial innovation, starting with useful tools such as customizable apps to aggregate our various social-media feeds.

As Fukuyama writes, however, more than just competition is on the line. He and a team of Stanford collaborators put the issue starkly, comparing platforms’ control over public discourse to a loaded gun. “The question for American democracy,” they write, “is whether it is safe to leave the gun on the table.”¹ Middleware would reduce both platforms’ own power and their function as levers for unaccountable state power, as governments increasingly pressure platforms to “voluntarily” suppress disfavored speech.²

For those in the United States, middleware has another big selling point: Unlike many other proposals to curtail platform power, it does not violate the First Amendment. That makes middleware a path forward in a neighborhood full of dead ends. The First Amendment precludes lawmakers from forcing platforms to take down many kinds of dangerous user speech, including medical and political misinformation. That dooms many platform-regulation ideas from the political left. The Amendment also means that Congress cannot strip platforms of editorial control by compelling them to spread particular messages. That sinks many proposals from the right. Faced with these barriers, policy makers seeking bold change to platforms’ content-moderation practices may latch on to middleware as the best way forward.

But we have not yet reached that point. As Steve Jobs famously said, “real artists ship.” Before we can execute on the middleware vision, I see at least four problems to be solved. Two of those concern matters beyond my ken, but I will flag them here for others to consider.

First, how technologically feasible is it for competitors to remotely process massive quantities of platform data? Can newcomers really offer a level of service on par with incumbents? Twitter is actively working on a model for this, and technologist Stephen Wolfram told Congress that it can be done, so I will mark myself a cautious optimist.³

Second, how is everyone going to get paid? Without a profit motive for middleware providers, the magic will not happen, or it will not happen at large enough scale. Something about business models—or, at a minimum, the distribution of ads and ad revenue—will have to change.

That leaves the two thorny issues I do know a fair amount about: curation costs and user privacy.

Curation Costs

Facebook deploys tens of thousands of people to moderate user content in dozens of languages. It relies on proprietary machine-learning and other automated tools, developed at enormous cost. We cannot expect comparable investment from a diverse ecosystem of middleware providers. And while most providers presumably will not handle as much content as Facebook does, they will still need to respond swiftly to novel and unpredictable material from unexpected sources. Unless middleware services can do this, the value they provide will be limited, as will users’ incentives to choose them over curation by the platforms themselves.

But not all content curators should need to do so much work. We want middleware providers to exercise unique judgment about what content is appropriate or what priority to give it on a page. This does not mean that they need to be doing redundant work figuring out basic information. When a risqué Brazilian pop song goes viral, not every provider should pay to get that song’s lyrics translated. Nor should every provider separately retain experts capable of explaining Kurdish-militant slang, the political significance of shirt colors in Thailand, or the meaning of Hawaiian shirts and A-OK hand gestures on the U.S. far right. In theory, that kind of information could be gathered once and then shared. Middleware providers could look up uniquely identifiable content in a database, review the information, and then use it to inform their own, independent judgments about how to rank, remove, or label that content.
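A minimal sketch of that lookup, in Python, may make the division of labor concrete. Everything here is hypothetical: the `SHARED_DB` store, the `ContextRecord` fields, the use of a SHA-256 hash as the unique identifier, and the two provider policies are illustrative assumptions, not a description of any existing system. The point is only that the shared record holds background facts, while each provider applies its own rule to them.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


def content_id(raw: bytes) -> str:
    """Derive a stable identifier so every provider refers to the same item.
    (Assumption: exact-match hashing; real content matching is harder.)"""
    return hashlib.sha256(raw).hexdigest()


@dataclass
class ContextRecord:
    """Shared background facts about one item. Deliberately holds no verdict."""
    language: str
    translation: str
    notes: list[str] = field(default_factory=list)


# Hypothetical shared database: facts gathered once, read by every provider.
SHARED_DB: dict[str, ContextRecord] = {}


def publish_context(raw: bytes, record: ContextRecord) -> str:
    """Store the shared facts for one item and return its identifier."""
    cid = content_id(raw)
    SHARED_DB[cid] = record
    return cid


def g_rated_provider(record: ContextRecord | None) -> str:
    """One provider's independent rule: remove anything flagged as innuendo."""
    if record is None:
        return "hold for review"  # nothing shared yet, so the cost falls on the provider
    return "remove" if "sexual innuendo" in record.notes else "show"


def music_news_provider(record: ContextRecord | None) -> str:
    """A different provider's rule: keep the item, but label translated material."""
    if record is None:
        return "hold for review"
    return "label" if record.language != "en" else "show"
```

Given one shared record for the viral song, the two providers would reach different outcomes from the same facts, and an item missing from the database would fall back to each provider's own review.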

That is the theory. No one has yet built this database, and the reality is complicated, as always. The line between objective information and subjective judgments can be fuzzy. So it is hard to say what information properly would belong in the database, or how to structure the database in order to make it useful. And this celestial reference book could never approach completion. New content and cultural contexts are endlessly resupplied by internet users, making ongoing assessment and curation costs significant no matter what happens.

More troublingly, there is a risk that this model will end up spreading not just information, but major platforms’ assessments of content. This could reimpose the very speech monoculture from which middleware was supposed to save us. Many critics have raised this concern with regard to existing cross-platform coordination, such as the controversial database used by platforms to identify violent-extremist content. If we cannot afford real, diverse, and independent assessment, we will not realize the promise of middleware.

Similar projects, such as the well-intentioned Internet Content Rating Association (ICRA), which failed in the early 2000s, have foundered on moderation costs before. ICRA’s mission, like Fukuyama’s, was to move control over content away from centralized gatekeepers, out closer to the edges of the network. The plan was to do so in part using customized browser settings, and in part by letting users choose from an ecosystem of trusted third-party curators. A user might subscribe to a block-list from her church, preventing her browser from displaying certain websites. Or she might block most sexual content, but use an add-list from Planned Parenthood to preserve access to health information.
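The layering of lists described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ICRA's actual mechanism: the function name, the host-based matching, and the `example` domains are all hypothetical. The one design choice it shows is precedence, with the add-list checked before the block-list so that a trusted curator can restore access to something a broader filter would hide.

```python
from urllib.parse import urlparse


def is_allowed(url: str, block_list: set[str], add_list: set[str]) -> bool:
    """Decide whether a browser should display a URL.

    The add-list takes precedence: a host named there is shown even if a
    block-list also names it. Hosts on neither list are shown by default.
    """
    host = urlparse(url).hostname or ""
    if host in add_list:
        return True
    return host not in block_list


# Hypothetical lists a user might subscribe to from different curators.
church_block_list = {"adult.example.com", "clinic.example.org"}
health_add_list = {"clinic.example.org"}
```

With these lists, a page on `clinic.example.org` would still be displayed, a page on `adult.example.com` would not, and any unlisted site would pass through untouched.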

ICRA’s third-party curators never showed up. Assessing all that content was hard and thankless, even when the internet was much smaller. Of course, many things were different then—and, significantly, ICRA offered curators no way to make money. But its demise is a sobering reminder that curation costs matter.

Getting Privacy Right

If I want to stop seeing content curated by Facebook, and start seeing that same content curated by a middleware vendor, I can give the vendor permission to access and organize my own private data. The harder question concerns my friends’ data. When my cousin posts breastfeeding pictures visible only to a limited list of friends on Facebook, or explains why covid-19 vaccines are a hoax in her comments on my privately shared post, does my middleware provider get to see those things?

Fukuyama’s answer is no. Middleware providers will not see privately shared content from a user’s friends. This is a good answer if our priority is privacy. It lets my cousin decide which companies to trust with her sensitive personal information. But it hobbles middleware as a tool for responding to her claims about vaccines. And it makes middleware providers far less competitive, since they will not be able to see much of the content we want them to curate. On Facebook, they may miss out on content from over 80 percent of users (though good data on this are hard to come by).⁴ Even on Twitter, which is generally more public-facing, about 13 percent of posts are private.⁵ A vendor lacking access to such large shares of content will have trouble offering useful curation, and even more trouble competing with Facebook itself, which can see, assess, rank, and label everything in a user’s feed.
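The Facebook figure can be checked against the numbers reported in note 4. The short Python calculation below is only a back-of-the-envelope sketch: it treats the share of a user's friends without “public” statuses as a proxy for the share of content a middleware provider could not see, which is an assumption rather than a direct measurement. The function name is illustrative.

```python
def hidden_share(public_friends: float, total_friends: float) -> float:
    """Fraction of a user's friends whose statuses are not public."""
    return 1 - public_friends / total_friends


# Averages reported in note 4: 53.9 friends with "public" statuses out of 328.
facebook_hidden = hidden_share(53.9, 328)
print(f"{facebook_hidden:.1%}")  # prints 83.6%
```

On those averages, roughly five of every six friends would be invisible to a provider limited to public content, consistent with the “over 80 percent” estimate above.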

Middleware is not alone in confronting this problem. Many a great interoperability idea has foundered on it. A company called Power Ventures, for example, allowed users to aggregate feeds from different social-media platforms in a single interface. Facebook sued this service out of existence, saying that it violated criminal hacking statutes. European regulators are now trying to make interoperability along these lines possible, but they are running up against the same dilemma outlined above about users sharing their friends’ data.

Journalists and public-interest researchers encounter this problem, too. In 2020, election researchers at New York University designed a tool that worked much as middleware would: a browser extension to report on campaign ads in consenting users’ Facebook accounts. Facebook threatened to block researchers’ access, because the tool could also “see” posts from those users’ friends.

Some of these legal barriers could be surmounted with a stroke of the legislative pen. Changing the hacking laws that hinder interoperability, for example, would not be that hard. But the underlying privacy question is not so easily resolved. Should we be able to hand our friends’ data over to a startup operating via a Facebook API? That is, after all, exactly what happened in the 2018 Cambridge Analytica scandal. There are compelling privacy reasons why the law should not sanction that kind of profligacy with other people’s data. There are compelling competition and speech-policy reasons why it should.

Many proposed interoperability mandates gloss over this question. Others suggest clunky fixes. We could share data only from friends who consent—meaning, if middleware takes off, users may receive a constant barrage of requests for such consent from companies that they have never heard of. Or users could sue middleware providers once the horse is already out the barn door, users’ data are loose on the internet, and the provider itself is perhaps safely outside the relevant jurisdiction. Or an incumbent such as Facebook could audit and police its own competitors—a less-than-ideal setup for a new market. Or perhaps there are technical solutions. I personally have not seen convincing ones so far, however.

Every year, in my platform-regulation class, I draw a Venn diagram on the board with three interlocking circles: privacy, speech, and competition. Then we identify all the issues that fall at the intersection of two or more circles. Interoperability, including for content-moderation purposes, is always smack in the middle. It touches every circle. This is what makes it hard. We have to solve problems in all those areas to make middleware work. But this is also what makes the concept so promising. If—or when—we do manage to meet this many-sided challenge, we will unlock something powerful.

NOTES

1. Francis Fukuyama et al., Middleware for Dominant Digital Platforms: A Technological Solution to a Threat to Democracy, Stanford Cyber Policy Center, 3, https://fsi-live.s3.us-west-1.amazonaws.com/s3fs-public/cpc-middleware_ff_v2.pdf.

2. Daphne Keller, “Who Do You Sue? State and Platform Hybrid Power Over Online Speech,” Hoover Institution, Aegis Series Paper No. 1902, 29 January 2019, www.hoover.org/research/who-do-you-sue.

3. U.S. Senate Subcommittee on Communications, Technology, Innovation, and the Internet, “Optimizing for Engagement: Understanding the Use of Persuasive Technology on Internet Platforms,” 25 June 2019, www.commerce.senate.gov/2019/6/optimizing-for-engagement-understanding-the-use-of-persuasive-technology-on-internet-platforms.

4. Michael Bailey et al., “Peer Effects in Product Adoption,” National Bureau of Economic Research Working Paper 25843, May 2019, www.nber.org/papers/w25843, Table 1, (in 335 million user-weeks of data, the average user had 53.9 friends with “public” statuses out of 328 total friends).

5. Emma Remy, “How Public and Private Twitter Users in the U.S. Compare—and Why It Might Matter for Your Research,” Medium, 15 July 2019, https://medium.com/pew-research-center-decoded/how-public-and-private-twitter-users-in-the-u-s-d536ce2a41b3.

‍
