Architecture of TPB and WikiLeaks
19 Jan 2021
For obvious reasons, I recently got interested in how to build websites that are widely accessible but also resistant to censorship. Naturally, my first instinct was to run off and come up with my own blue-sky designs of the most resilient, censorship-resistant website in the world. But censorship is not new, and I realized it would be smart to learn from the past: in particular, from The Pirate Bay and WikiLeaks, which both continue to operate even under immense pressure to shut down.
The first problem to sort out for any website is the domain name, and both these websites have had recurring issues with their domain names being seized. TPB tried a ‘hydra’ approach, where they registered thepiratebay under many different regional TLDs with different small registrars, and they would direct users to whichever ones still worked. They hoped this would help protect them from domain seizures, as well as circumvent ISPs that blocked their main .org domain. Unfortunately, their regional TLDs were consistently seized, leaving them with only their .org domain. TPB is able to keep their .org domain because they do technically follow US copyright law: they respond to DMCA takedowns. WikiLeaks also uses a .org domain, but has had legal help from the EFF with keeping their domain operating.
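To make the 'hydra' idea concrete, here is a minimal sketch of how a client or landing page might pick a working mirror. It is only an illustration: the domain names are placeholders under the reserved `.example` TLD, and `first_reachable` and the `is_reachable` probe are hypothetical names of mine, not anything TPB is known to run.

```python
from typing import Callable, Iterable, Optional

# Placeholder mirror list (assumption): regional mirrors first, main domain last.
MIRRORS = ["mirror-a.example", "mirror-b.example", "main.example"]


def first_reachable(
    domains: Iterable[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first domain the probe reports as reachable, or None.

    `is_reachable` is supplied by the caller (for example a DNS lookup or
    an HTTP health check), so this function itself does no network I/O.
    """
    for domain in domains:
        if is_reachable(domain):
            return domain
    return None


if __name__ == "__main__":
    # Simulate one seized mirror instead of touching the network.
    seized = {"mirror-a.example"}
    print(first_reachable(MIRRORS, lambda d: d not in seized))
```

The weakness described above shows up directly in the sketch: once every regional entry is seized, the list collapses to the single main domain, and the fallback logic no longer buys anything.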
The Pirate Bay
The Pirate Bay uses Cloudflare as its edge network and splits traffic between a number of cloud providers. Any one cloud provider suspending them won't cause them to go down, because they can easily move the traffic to other clouds or add new ones. It's also unlikely to cause data loss, as they can replicate data between clouds. In addition, Cloudflare's edge network functions like an anonymizing layer: it prevents the general public from identifying which cloud providers they use and then pressuring those providers, which almost certainly makes it a lot easier for them to retain services.
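The traffic-splitting idea can be sketched as a weighted pool of origins sitting behind the edge. This is a toy model under my own assumptions: `OriginPool`, the provider names, and the weights are hypothetical, and it says nothing about how TPB or Cloudflare actually implement routing.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class OriginPool:
    """Toy model of origins hidden behind an edge network (hypothetical)."""

    weights: Dict[str, float] = field(default_factory=dict)
    suspended: Set[str] = field(default_factory=set)

    def add(self, name: str, weight: float = 1.0) -> None:
        """Register a provider, or re-enable one that was suspended."""
        self.weights[name] = weight
        self.suspended.discard(name)

    def suspend(self, name: str) -> None:
        """Mark a provider as unavailable, e.g. after an account suspension."""
        self.suspended.add(name)

    def healthy(self) -> Dict[str, float]:
        """Return the weights of providers that can still take traffic."""
        return {n: w for n, w in self.weights.items() if n not in self.suspended}

    def pick(self, rng=random) -> str:
        """Choose an origin for one request, in proportion to its weight."""
        live = self.healthy()
        if not live:
            raise RuntimeError("no origins available")
        names = list(live)
        return rng.choices(names, weights=[live[n] for n in names], k=1)[0]


if __name__ == "__main__":
    pool = OriginPool()
    for provider in ("cloud-a", "cloud-b", "cloud-c"):  # placeholder names
        pool.add(provider)
    pool.suspend("cloud-b")   # one provider drops them
    pool.add("cloud-d")       # a replacement is added
    print(sorted(pool.healthy()))
```

Because visitors only ever talk to the edge, suspending one entry in the pool shifts its share of requests to the remaining origins without any change visible from the outside.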
WikiLeaks, on the other hand, has chosen to build out what's essentially their own edge network. They rent space in datacenters to colocate their own machines, and these machines serve their website's traffic. New documents are submitted through a Tor hidden service, and from there they likely get stored in the cloud for review. Since document submission is anonymous and relatively low traffic, it would also be possible to anonymize payment for these services by getting someone sufficiently far removed from the project to pay, and reimbursing them in cash. Note that I don't have any evidence they do this, but it makes sense to me that they would: keeping documents in the cloud prevents them from being physically seized, and keeping payment sufficiently far removed from key project members prevents their systems from being identified through financial records.
- Operating as a non-profit under a .org domain seems most likely to minimize domain seizures.
- Anonymity is centrally important to censorship-resistance. It prevents retribution against the clients of a service, and allows the service itself to blend in among its vendors' customers.
- Censorship-resistance is one of the many network effects of edge CDNs, and it comes from the diversity of their points of presence. Even if content is taken down in one region, it will likely still be available in others, and therefore reachable through a VPN even from places where it's blocked.