Imgur Architecture and Technology Stack
Adam Rifkin stashed this in Imgur!
The clusters we have are: WWW, API, Upload, HAProxy, HBase, MySQL, Memcached, Redis, and ElasticSearch, for an average total of 80 instances. Each cluster handles the job that its name describes, all working together for the common goal of giving you your daily (hourly?) dose of image entertainment.
A walk through the typical Imgur request:
Every request for Imgur first has to go through the HAProxy cluster. The first thing that happens when it reaches this cluster is that Nginx checks whether it already has a cached version of the response available. Every single page on Imgur is cached for 5 seconds, a technique commonly called microcaching. If you’re not signed into Imgur and you’re accessing a popular page, then chances are this is where your request will end. If no cached version of the page is available, the request goes to HAProxy, which decides which cluster will handle the rest of it. Requests for imgur.com go to the WWW cluster, requests for api.imgur.com go to the API cluster, and if you’re uploading or editing an image, you’ll go to the Upload cluster.
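To make the front-door flow concrete, here is a minimal Python sketch of a 5-second microcache followed by host-based routing. It is not Imgur's configuration (which lives in Nginx and HAProxy); the names `MicroCache`, `pick_cluster`, and `handle`, the `/upload` and `/edit` path prefixes, and the order of the routing checks are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class MicroCache:
    """Tiny TTL cache: an entry expires `ttl` seconds after it is stored."""

    def __init__(self, ttl: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so the expiry logic can be tested
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, body = entry
        if self.clock() >= expires_at:
            del self._store[key]  # stale: drop it and report a miss
            return None
        return body

    def put(self, key: str, body: str) -> None:
        self._store[key] = (self.clock() + self.ttl, body)


def pick_cluster(host: str, path: str) -> str:
    """Hypothetical routing rule; the path prefixes are assumptions."""
    if host == "api.imgur.com":
        return "API"
    if path.startswith("/upload") or path.startswith("/edit"):
        return "Upload"
    return "WWW"


def handle(cache: MicroCache, host: str, path: str, logged_in: bool,
           backend: Callable[[str, str], str]) -> str:
    """Serve from the microcache when possible, else route to a cluster."""
    key = host + path
    if not logged_in:  # personalized pages skip the shared cache
        cached = cache.get(key)
        if cached is not None:
            return cached
    body = backend(pick_cluster(host, path), path)
    if not logged_in:
        cache.put(key, body)
    return body
```

Even with a TTL this short, a popular page is rendered by the backend at most once per window for logged-out visitors, which is why most anonymous requests can end at the cache.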
When you hit the WWW cluster you’ll be round-robin’d to an instance which will handle the request. This cluster is hooked up to the Memcached, Redis, MySQL, HBase, and ElasticSearch clusters. Since the site is coded in PHP, you’ll first reach Nginx, which will send you off to php-fpm. Unless all the data for the page is already cached in Memcached (which is highly likely), you’ll be getting the data from the MySQL cluster. If your request is for a gallery search, then you’ll get the data from the ElasticSearch cluster, and some specific data is also stored in Redis and HBase. By this time, the request should have everything it needs to form the page. It pieces it all together, travels back out to the HAProxy cluster, is microcached by Nginx, and your browser renders the page. All of this happens in mere milliseconds.
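The two mechanisms in that paragraph, round-robin instance selection and "check Memcached first, fall back to MySQL", can be sketched in a few lines of Python. This is an illustration only: a plain dict stands in for Memcached, the `query_db` callable stands in for a MySQL query, and the instance names are hypothetical.

```python
import itertools
from typing import Callable, Dict, Iterator, List


def round_robin(instances: List[str]) -> Iterator[str]:
    """Yield instances in order, wrapping around forever."""
    return itertools.cycle(instances)


def load_page_data(key: str, cache: Dict[str, dict],
                   query_db: Callable[[str], dict]) -> dict:
    """Cache-aside read: try the cache, fall back to the database."""
    data = cache.get(key)
    if data is None:
        data = query_db(key)  # slow path: cache miss
        cache[key] = data     # populate so the next request is a hit
    return data


# Usage: hypothetical instance names, with a dict as the cache stand-in.
picker = round_robin(["www-1", "www-2", "www-3"])
instance = next(picker)
page = load_page_data("gallery:hot", {}, lambda k: {"key": k})
```

Because the cache is only an accelerator here, clearing it loses nothing: the next request for each key simply takes the slow path once.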
For image optimization they use GraphicsMagick: "It’s just like ImageMagick except it’s been much faster for us. All images are processed through GM to strip metadata, resize, and even crop."
Microcaching only works for logged-out users, since every page served to a logged-in user is personalized.
They use just two instances of Redis, a master and a slave:
Redis is a bit different from Memcached in that it’s persistent. We use memcached for caching temporary things. We wouldn’t care if the entire memcached cache was cleared (although the site would be slow for a few minutes while the cache built up again). Redis is for things that we don’t want to lose, such as rate limiting for the API and keeping track of how many times a user did something.
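As a rough picture of what "rate limiting for the API" on top of persistent counters could look like, here is a fixed-window limiter in Python. The source does not say which algorithm or limits Imgur uses, so the fixed-window approach, the class name `FixedWindowLimiter`, the key format, and the numbers are all assumptions; a dict stands in for Redis, where the same pattern is usually built from the `INCR` and `EXPIRE` commands.

```python
import time
from typing import Callable, Dict


class FixedWindowLimiter:
    """Allow at most `limit` calls per client in each fixed time window."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        # Stand-in for Redis; a real store would also expire old keys.
        self.counters: Dict[str, int] = {}

    def allow(self, client_id: str) -> bool:
        # Every request in the same window shares one counter key.
        window_index = int(self.clock() // self.window)
        key = f"rate:{client_id}:{window_index}"
        count = self.counters.get(key, 0) + 1  # INCR in Redis terms
        self.counters[key] = count
        return count <= self.limit
```

Counters like these are exactly the kind of data the quote says should not vanish: if they lived in a cache that could be flushed, every client's quota would silently reset.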
For CDN, they used to use EdgeCast:
But now they use Cloudflare:
By the way, only 189 of the top 10,000 sites are using Cloudflare:
One of Cloudflare’s partners in testing Railgun over the last few months was Imgur, the massively popular photo-hosting site. As Imgur founder and CEO Alan Schaaf told me, “speed is an important feature at Imgur, and the entire site is designed to run as fast as possible. We’re really excited to use Railgun to take it even further. Not only has it helped speed up the delivery of our HTML content so far, it’s saving us 50% on our HTML bandwidth.”