July 8, 2011

Facebook's Database Architecture: Think ExaData; ExaLogic?



Good Morning,

While parsing through my blog feeds and the twitterati, one posting caught my eye, perhaps more so because I've been a database guy for a few years.

The issue is Facebook's MySQL database architecture, and comments made by Michael Stonebraker have been echoed all over the blogosphere. This article by Derrick Harris goes into the details of sharding, servers and transactions.

By the way, this is typical of every startup I've known. Costs have to be kept down, and we need something to "hold data and transactions" until the hockey stick catches up. It is a great problem to have from a business standpoint, but the technical limits of a system often get pushed.


So, every "Like" on Facebook is probably a database entry. Can you imagine the user population of Facebook, all over the world, posting pictures, comments, "Liking" and so on? Even if you kept pictures outside the database and maintained some kind of bridge function (pointers, if you will), the sheer number of transactions makes it a very interesting database challenge, perhaps even a distributed database challenge. FB connects Friends, and you cannot have some friends offline while others are online. Thus it would seem the entire Facebook database architecture would have to sit at one site. (Disclaimer: I haven't worked first-hand on the FB database architecture, and my comments are based on my reading of the twitterati and blogosphere.)
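To make the "Like as a database entry" and "pointers to pictures" ideas concrete, here is a minimal sketch in Python using the built-in sqlite3 module. It is only an illustration under my own assumptions: the table names, column names and helper functions are hypothetical, and nothing here reflects Facebook's actual schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical photos and likes tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE photos (
            photo_id    INTEGER PRIMARY KEY,
            owner_id    INTEGER NOT NULL,
            storage_url TEXT NOT NULL  -- pointer to the image; the bytes live elsewhere
        );
        CREATE TABLE likes (
            user_id  INTEGER NOT NULL,
            photo_id INTEGER NOT NULL REFERENCES photos(photo_id),
            PRIMARY KEY (user_id, photo_id)  -- at most one like per user per photo
        );
        """
    )
    return conn


def add_photo(conn, photo_id, owner_id, storage_url):
    """Record where a photo is stored, without storing the photo itself."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO photos (photo_id, owner_id, storage_url) VALUES (?, ?, ?)",
            (photo_id, owner_id, storage_url),
        )


def add_like(conn, user_id, photo_id):
    """Each like is one small row; a repeated like from the same user is ignored."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO likes (user_id, photo_id) VALUES (?, ?)",
            (user_id, photo_id),
        )


def like_count(conn, photo_id):
    """Count the likes recorded for one photo."""
    row = conn.execute(
        "SELECT COUNT(*) FROM likes WHERE photo_id = ?", (photo_id,)
    ).fetchone()
    return row[0]
```

Each "Like" is a tiny write, but multiply it by the whole user population and the transaction volume becomes the hard part, which is the point above.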

Coming from the Oracle Database world, where row-level locking, atomic transactions, liberal use of archive logging, rollback, and HA features like Oracle RAC are table stakes, I can't help but wonder how ExaData would fare in a replacement project for Facebook's database. MySQL is also shepherded by Oracle, but MySQL and the Oracle RDBMS are fundamentally different products, and I do not know of any intersection between the two.

Would a "few" Oracle ExaData boxes, clustered together by InfiniBand and RAC, take Facebook to a newer technology level? ExaData offers a healthy mix of solid-state and traditional disk lookups. Storage capacity may still be a problem, even with hybrid columnar compression, so we may have to architect an external storage solution. One could also argue that it may be optimal to blend the database architecture with a mix of ExaData and MySQL: leave the really fast, high-throughput transactions in ExaData and some of the less critical ones in MySQL. Sounds like an interesting challenge, right? Here's to hoping someone is already working on these challenges.
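The blended ExaData and MySQL idea boils down to a routing decision on every write. Here is a minimal sketch in Python of what that decision could look like. Everything in it is hypothetical: the Store class is just a stand-in for a database tier, and which kinds of writes count as "critical" is purely my assumption for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Stand-in for a database tier; it only records the writes it receives."""
    name: str
    writes: list = field(default_factory=list)

    def write(self, record: dict) -> None:
        self.writes.append(record)


# Hypothetical split: which kinds of write are "critical" is an assumption.
CRITICAL_KINDS = {"like", "comment"}


def route_write(record: dict, fast_tier: Store, bulk_tier: Store) -> str:
    """Send critical writes to the fast tier and everything else to the bulk tier.

    Returns the name of the tier that received the write.
    """
    target = fast_tier if record.get("kind") in CRITICAL_KINDS else bulk_tier
    target.write(record)
    return target.name
```

The catch with any split like this is that a single user action may touch both tiers, so keeping the two consistent becomes part of the design problem.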

Looking at the application architecture, web and application server scaling does not seem to be a problem here. In most cases Tomcat and Apache scale without trouble, and my impression is that the level of Java used at Facebook is relatively light: POJOs and no Java Beans, if any Java at all. Thus ExaLogic may or may not have a play here. It really depends on how the application and architecture are rewritten or re-engineered, and that can change quickly.

However, ExaData and ExaLogic handling Facebook together may be a really elegant solution: minimize sprawl, provide a high-speed InfiniBand connection, and rock together.

I do not know of any company of Oracle's stature (maybe IBM or Microsoft, just maybe) putting significant investment capital into a database solution that can handle transactions on the scale of Facebook's.

This thread promises to be an interesting watch across the different channels. The often-heard viewpoint that Facebook wants to stay open-source may get in the way.

3 comments:

social media planner said...

Something I have learned about social media in the past 5 months I've been involved is that it deals with a lot of karma.

Technogies said...

Nice post. I want to recommend to publish your New Technology post on Technogies

Technosalons said...

Like it something useful to read about facebook!!!