Recently in Operations Category

Web operations and infrastructure

  1. Apr05

    Facebook's photo storage rewrite

    This week Facebook will complete its roll-out of a new photo storage system designed to reduce the social network's reliance on expensive proprietary solutions from NetApp and Akamai. The new large blob storage system, named Haystack, is a custom-built file system solution for the over 850 million photos uploaded to the site each month (500 GB per day!). Jason Sobel, a former NetApp engineer, led Facebook's effort to design a more cost-effective and high-performance storage system for their unique needs. Robert Johnson, Facebook's Director of Engineering, mentioned the new storage system rollout in a Computerworld interview last week. Most of what we know about Haystack comes from a Stanford ACM presentation by Jason Sobel in June 2008. Haystack will allow Facebook to operate its massive photo archive from commodity hardware while reducing its dependence on CDNs in the United States.

    The old Facebook system

    Facebook photo serving architecture 2008

    Facebook has two main types of photo storage: profile photos and photo libraries. Members upload photos to Facebook and treat the transaction as a digital archive, with very few deletions and intermittent reads. Profile photos are a per-member representation stored in multiple viewing sizes (150px, 75px, etc). The past Facebook system relied heavily on CDNs from Akamai and Limelight to protect its origin servers from a barrage of expensive requests and to improve latency.

    Facebook profile photo access is accelerated by Cachr, an image server powered by evhttp with a memcached backing store. Cachr protects the file system from new requests for heavily-accessed files.

    The old photo storage system relied on a file handle cache placed in front of NetApp to quickly translate file name requests into an inode mapping. When a Facebook member deletes a photo, its index entry is removed but the file still exists within the backing file system. The file handle cache for Facebook photos is powered by lighttpd with a memcached storage layer to reduce load on the NetApp filers.

    No need for POSIX

    Facebook photographs are viewable by anyone in the world aware of the full asset URL. Each URL contains a profile ID, photo asset ID, requested size, and a magic hash to protect against brute-force access attempts.

    /[pvid]_[key]_[magic]_[size].jpg
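
    As a rough illustration of how a server might pull those fields out of a request path, here is a minimal sketch. The field names come from the template above; the character rules for each field (any run of characters other than underscores, slashes, and dots) and the names parse_photo_path and PhotoRequest are assumptions, not Facebook's actual format or code.

```python
import re
from typing import NamedTuple, Optional

# Assumption: each field is a run of characters other than "_", "/" and ".";
# the post only gives the template /[pvid]_[key]_[magic]_[size].jpg.
PHOTO_PATH = re.compile(
    r"^/(?P<pvid>[^_/.]+)_(?P<key>[^_/.]+)_(?P<magic>[^_/.]+)_(?P<size>[^_/.]+)\.jpg$"
)


class PhotoRequest(NamedTuple):
    """Fields named after the placeholders in the URL template."""
    pvid: str
    key: str
    magic: str
    size: str


def parse_photo_path(path: str) -> Optional[PhotoRequest]:
    """Return the four URL fields, or None if the path does not fit the template."""
    match = PHOTO_PATH.match(path)
    if match is None:
        return None
    return PhotoRequest(**match.groupdict())
```

    A path that is missing a field, such as /123_456.jpg, simply fails to parse and returns None.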

    Traditional file systems follow the POSIX standard, which governs metadata and access methods for each file. These file systems are designed for access control and accountability within a shared system. An Internet storage system whose files are written once and never deleted, with access granted to the world, has little need for such overhead. A POSIX-compliant inode must specifically contain:

    • File length
    • Device ID
    • Storage block pointers
    • File owner
    • Group owner
    • Access rights on each assignment: read, write, execute
    • Change time
    • Modification time
    • Last access time
    • Reference counts

    Only the top three POSIX requirements matter to a photo store such as Facebook's. Its servers care where the file is located and its total length but have little concern for file owners, access rights, timestamps, or the possibility of linked references. The additional overhead of POSIX-compliant metadata storage and lookup on NetApp filers led to three disk I/O operations for each photo read. Facebook simply needed a fast blob store but was stuck inside a file system.

    Haystack file storage

    Facebook Haystack diagram

    Haystack stores photo data inside 10 GB buckets, with 1 MB of metadata for every GB stored. Metadata is guaranteed to be memory-resident, so each photo read needs only one disk seek. Haystack servers are built from commodity servers and disks assembled by Facebook to reduce costs associated with proprietary systems.
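
    To put those ratios in concrete terms, the following back-of-the-envelope sketch uses only the figures quoted in this post: 10 GB buckets, 1 MB of metadata per GB, three disk I/Os per read on the old filers, and one seek per read on Haystack. The function names and the one-million-read workload are illustrative assumptions.

```python
# Figures quoted in the post; the helper names are illustrative.
BUCKET_GB = 10            # size of one Haystack bucket
METADATA_MB_PER_GB = 1    # metadata kept in memory per GB stored
OLD_IOS_PER_READ = 3      # disk I/Os per photo read on the NetApp filers
NEW_IOS_PER_READ = 1      # disk seeks per photo read with Haystack


def metadata_mb(stored_gb: float) -> float:
    """Memory-resident metadata, in MB, needed for stored_gb of photos."""
    return stored_gb * METADATA_MB_PER_GB


def disk_ios(photo_reads: int, ios_per_read: int) -> int:
    """Total disk operations needed to serve photo_reads uncached reads."""
    return photo_reads * ios_per_read


# One bucket needs about 10 MB of metadata in memory.
per_bucket_mb = metadata_mb(BUCKET_GB)

# Hypothetical workload: one million reads that miss every cache.
old_total = disk_ios(1_000_000, OLD_IOS_PER_READ)
new_total = disk_ios(1_000_000, NEW_IOS_PER_READ)
```

    Under these numbers the same uncached workload costs a third of the disk operations, and the index for a full bucket fits comfortably in RAM.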

    The Haystack index stores metadata about the one needle it needs to find within the Haystack. Incoming requests for a given photo asset are interpreted as before, but now contain a direct reference to the storage offset containing the appropriate data.
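
    The general idea of a memory-resident index pointing into one large file can be sketched in a few lines. This is a toy illustration of the technique, not Haystack's actual on-disk format: the class name NeedleStore, the string keys, and the (offset, length) index layout are all assumptions, and an in-memory BytesIO stands in for the 10 GB bucket file.

```python
import io
from typing import BinaryIO, Dict, Tuple


class NeedleStore:
    """Toy append-only blob store with a memory-resident index (hypothetical)."""

    def __init__(self, backing: BinaryIO) -> None:
        self._file = backing
        # key -> (offset, length); lives entirely in memory
        self._index: Dict[str, Tuple[int, int]] = {}

    def put(self, key: str, data: bytes) -> None:
        """Append the blob to the end of the file and record where it landed."""
        self._file.seek(0, io.SEEK_END)
        offset = self._file.tell()
        self._file.write(data)
        self._index[key] = (offset, len(data))

    def get(self, key: str) -> bytes:
        """Look up the offset in memory, then do a single seek and read."""
        offset, length = self._index[key]
        self._file.seek(offset)
        return self._file.read(length)


store = NeedleStore(io.BytesIO())
store.put("photo-a", b"first photo bytes")
store.put("photo-b", b"second photo bytes")
```

    Because the index lookup never touches the disk, the only disk work per read is the seek to the stored offset.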

    Cachr remains the first line of defense in front of Haystack lookups, quickly processing requests and loading images from memcached where appropriate. Haystack provides a fast and reliable file backing for these specialized requests.
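
    The pattern described here, a cache that answers repeat requests and only falls through to the backing store on a miss, looks roughly like the sketch below. It is not Cachr's code: a plain dict stands in for memcached, and the PhotoCache class and its fetch_from_store callback are hypothetical names.

```python
from typing import Callable, Dict


class PhotoCache:
    """Cache-aside wrapper; a dict stands in for memcached (assumption)."""

    def __init__(self, fetch_from_store: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_store
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        """Serve from the cache when possible, else read the store once and remember it."""
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        data = self._fetch(key)
        self._cache[key] = data
        return data
```

    A heavily requested photo therefore reaches the backing store once, and every later request for it is answered from memory.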

    Reduced CDN costs

    The high performance of Haystack, combined with a new data center presence on the east and west coasts of the United States, reduces Facebook's reliance on costly CDNs. Facebook does not currently have the points of presence to match a specialist such as Akamai, but the combined latency of speed-of-light transit plus file access should be low enough to reduce CDN use in areas where Facebook already has data center assets. Facebook can partner with specialized CDN operators in markets such as Asia, where it has no foreseeable physical presence, to improve access times for those markets.

    Summary

    Facebook has invested in its own large blob storage solution to replace expensive proprietary offerings from NetApp and others. The new server structure should reduce Facebook's total cost per photo for both storage and delivery moving forward.

    Big companies don't always listen to the growing needs of application specialists such as Facebook. Yet you can always hire away their engineering talent to build you a new custom solution in-house, which is what Facebook has done.

    Facebook has hinted at releasing more details about Haystack later this month, which may include an open-source roadmap.

    Update April 30, 2009: Facebook officially announced Haystack and further details.

  2. Mar29

    Facebook's growing infrastructure spend

    On Thursday BusinessWeek reported Facebook is seeking new financing for its data center operation growth in 2009. Facebook continues to add new members and their associated content at an extremely fast pace, with most new growth coming from international markets. Facebook needs to expand its abilities to serve these markets by bolstering current infrastructure offerings and cutting latency to its members through new international points of presence. In this post I will take a deeper look at Facebook's current computing infrastructure and related expenses and examine likely new areas of investment in 2009.

    Facebook members

    Facebook currently has over 160 million members in its top 30 markets. Facebook enjoys a 24% market penetration across all 30 countries, including complete domination in Chile and Turkey, where 76% and 66% of all Internet users, respectively, are members of Facebook. Facebook member numbers are taken from Facebook.com; total Internet users for each country are as reported by the CIA World Factbook.

    Country             Facebook      Internet        %
    United States     54,739,960   223,000,000   24.55%
    United Kingdom    17,308,040    40,200,000   43.05%
    Canada            11,117,600    28,000,000   39.71%
    France             8,888,140    31,295,000   28.40%
    Turkey             8,720,680    13,150,000   66.32%
    Italy              8,469,220    32,000,000   26.47%
    Australia          5,245,040    11,240,000   46.66%
    Colombia           4,404,900    12,100,000   36.40%
    Chile              4,211,700     5,570,000   75.61%
    Spain              4,125,420    19,690,000   20.95%
    Argentina          3,388,020     9,309,000   36.40%
    Venezuela          2,533,400     5,720,000   44.29%
    Indonesia          2,311,340    13,000,000   17.78%
    Belgium            2,041,160     5,220,000   39.10%
    Mexico             2,006,840    22,812,000    8.80%
    Denmark            1,998,600     3,500,000   57.10%
    Sweden             1,937,600     7,000,000   27.68%
    Germany            1,883,340    42,500,000    4.43%
    Norway             1,729,640     3,800,000   45.52%
    Hong Kong          1,584,600     3,961,000   40.01%
    India              1,559,000    80,000,000    1.95%
    South Africa       1,390,440     5,100,000   27.26%
    Switzerland        1,344,280     4,610,000   29.16%
    Greece             1,287,620     2,540,000   50.69%
    Egypt              1,182,360     8,620,000   13.72%
    Malaysia           1,143,360    15,868,000    7.21%
    Philippines        1,011,360     5,300,000   19.08%
    Singapore          1,001,800     3,105,000   32.26%
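
    The percentage column is simply Facebook members divided by Internet users for each country. A minimal sketch of that calculation, using the Chile and Turkey rows from the table (the function name penetration is my own):

```python
def penetration(members: int, internet_users: int) -> float:
    """Facebook members as a percentage of a country's Internet users."""
    return round(100 * members / internet_users, 2)


# Rows copied from the table above.
chile = penetration(4_211_700, 5_570_000)
turkey = penetration(8_720_680, 13_150_000)
```

    Those two rows are the source of the 76% and 66% figures quoted for Chile and Turkey.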

    Facebook's current infrastructure serves North America with minimal latency. Future expansion into Europe and southeast Asia is likely as Facebook tries to expand its international audience.

    Data centers

    Facebook data center map March 2009

    Facebook currently operates out of four data centers in the United States: three on the west coast and one on the east coast. Facebook leases at least 45,000 square feet of data center space.

    Switch & Data's PAIX at 529 Bryant Street in Palo Alto is just around the corner from the Facebook offices and a long-time home to Facebook servers. It's unclear how much of the 100,000 square-foot, liquid-cooled data center is currently occupied by Facebook.

    Facebook has been with Terremark's NAP West in Santa Clara since November 2005. Facebook originally leased 10,000 square feet but may have grown larger over the years. Facebook is still listed as a Terremark customer in Santa Clara but the company might be consolidating its operations into its new local data center.

    Facebook geo-distributed its web operations in 2008 with DuPont Fabros' ACC4 in Ashburn, Virginia. Facebook leased 10,000 square feet in 2007 and occupied the space in 2008 after extensive reworking of the Facebook backend. Facebook shares ACC4 with MySpace, Google, and other competitors.

    In January 2009 Facebook moved into its first exclusive data center, Digital Realty Trust's 1201 Comstock Street in Santa Clara. The 24,000 square feet of data center space operates at a PUE of 1.35, a respectable mark against reported 1.22 marks for Google and Microsoft. Facebook leases the center from Digital Realty as its sole occupant.
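
    PUE (power usage effectiveness) is total facility power divided by the power delivered to IT equipment, so the gap between 1.35 and 1.22 can be expressed as overhead per unit of server load. The sketch below assumes a purely hypothetical 1,000 kW IT load; only the two PUE values come from this post.

```python
def facility_power_kw(it_load_kw: float, pue: float) -> float:
    """Total facility power implied by a PUE figure (PUE = total power / IT power)."""
    return it_load_kw * pue


def overhead_kw(it_load_kw: float, pue: float) -> float:
    """Power spent on cooling, distribution, and other non-IT overhead."""
    return it_load_kw * (pue - 1.0)


IT_LOAD_KW = 1000.0  # hypothetical IT load, for illustration only

comstock_overhead = overhead_kw(IT_LOAD_KW, 1.35)    # about 350 kW
reported_best_overhead = overhead_kw(IT_LOAD_KW, 1.22)  # about 220 kW
```

    At that assumed load, the difference between the two PUE marks is roughly 130 kW of overhead.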

    Facebook is rumored to be adding an additional 20,000 square feet of data center space in Ashburn, Virginia in DuPont Fabros' ACC5. Facebook is expected to move into ACC5 in September 2009 and place new servers online by the end of the year.

    Facebook recently announced an international headquarters in Dublin, Ireland that will include "operations support" across Europe, the Middle East, and Africa. A European data center is a likely expansion point for Facebook as they try to solidify their European offerings.

    Server loans

    Facebook loan growth

    Facebook paid for part of its infrastructure expansion through specialized debt financing from TriplePoint Capital. Facebook drew down $30 million in 2007 followed by another $60 million in 2008. BusinessWeek reports Facebook is currently trying to secure as much as $100 million in debt financing for its next round of growth.

    Debt financing against physical assets such as servers and office buildings offers lower rates than a traditional venture capital round. Facebook's server expenditures have a recoverable resale value mapped over a depreciating lifespan, unlike direct and unrecoverable payments to employees and service providers. Lenders such as TriplePoint are a specialized type of real estate investor, a sector carrying huge risk premiums in the current market. Facebook's $100 million debt financing is bigger than TriplePoint's typical investments, placing Facebook's new expansions beyond the scope of TriplePoint's investment strategy during a time of real-estate investment turmoil. Facebook needed to look to other lenders for bigger infrastructure loans, an expected move for the growing company.

    Facebook spent $68 million on Rackable servers in 2007 and early 2008, likely as a result of its Virginia data center build-out. Facebook is also rumored to be a large consumer of premium-priced proprietary hardware such as NetApp storage appliances and Force10 networking gear.

    Facebook's debt financing agreement with TriplePoint Capital expired a few months ago, leading the company to seek new sources of financing for its new Santa Clara data center and other expansion plans. Facebook is in discussions with Bank of America for additional loans against this capital expenditure according to BusinessWeek.

    How many servers?

    Facebook had over 10,000 servers as of August 2008 according to Wall Street Journal coverage of a presentation by Jonathan Heiliger, Facebook's VP of Technical Operations. Facebook signed an infrastructure solutions agreement with Intel in July 2008 to optimally deploy "thousands" of servers based on Intel Xeon 5400 4-core processors in the next year.

    memcached
    ~800 memcached servers supplying over 28 TB of memory.
    Hadoop
    ~600 servers with 8 CPUs and 4 TB of storage per server. That's 4800 cores and about 2 PB of raw storage!
    Storage
    Facebook adds more than 850 million photos and 7 million videos to its data store each month. That's a lot of Filers.
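
    The per-cluster totals above follow from simple multiplication, and the same figures yield a couple of derived numbers. This sketch uses only the counts quoted in this list; it treats "8 CPUs" as 8 cores, uses decimal units (1 TB = 1,000 GB), and assumes a 30-day month, all of which are my assumptions.

```python
# Figures quoted in the list above.
MEMCACHED_SERVERS = 800
MEMCACHED_TOTAL_TB = 28
HADOOP_SERVERS = 600
HADOOP_CORES_PER_SERVER = 8     # the post's "8 CPUs", treated as cores
HADOOP_TB_PER_SERVER = 4
PHOTOS_PER_MONTH = 850_000_000

hadoop_cores = HADOOP_SERVERS * HADOOP_CORES_PER_SERVER         # 4800
hadoop_raw_tb = HADOOP_SERVERS * HADOOP_TB_PER_SERVER           # 2400 TB, about 2 PB
memcached_gb_per_server = MEMCACHED_TOTAL_TB * 1000 / MEMCACHED_SERVERS  # 35 GB

# Assumption: a 30-day month.
photos_per_day = PHOTOS_PER_MONTH / 30
```

    By this arithmetic each memcached server holds roughly 35 GB of cache, and the photo pipeline absorbs a little over 28 million uploads per day.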

    Facebook uses Akamai and other CDN providers to serve static content to visitors around the world. It's an expensive service offering not covered by Facebook's server debt financing.

    Summary

    Facebook faces difficult infrastructure challenges as the company tries to keep up with explosive growth around the world. Current shocks in the real estate investment market have made property financing difficult for all companies, including Facebook. New infrastructure moves from Facebook coming online this year should lower total operating costs per server thanks to new efficiencies in the cost of power and a decline in leasing price per square foot as Facebook buys in bulk. I expect new deals with foreign governments such as Ireland's will lead to new expansion by Facebook, heavily influenced by the ex-Googlers on staff who have paved this path before.

    Facebook is a privately-held company, offering limited insights into its expenses and other operations. The company seems to be repricing its server debt financing each year and has just crossed into the capital lending realm of big banks not easily able to take big risks in their property portfolios at the moment.

Niall Kennedy is a web technologist in San Francisco, California in the United States. I am very interested in the world of...
