Facebook unveils Presto: exabyte scale query engine

At a developer conference at Facebook headquarters on Thursday, engineers at the social networking giant revealed that the company is using a new in-house query engine called Presto to run fast interactive analysis on its already enormous, 250-petabyte-and-growing data warehouse. More than 850 Facebook employees now use Presto every day, scanning 320 TB each day.

Facebook created Hive several years ago to give Hadoop some data warehouse and SQL-like capabilities, but Hive is showing its age in terms of speed because it relies on MapReduce. Scanning an entire dataset can take anywhere from many minutes to hours, which isn’t ideal if you’re trying to ask and answer questions in a hurry.

With Presto, however, simple queries can run in a few hundred milliseconds, while more complex ones run in a few minutes. It runs in memory and never writes to disk, making it possible to analyze a data warehouse that is 4,000 times bigger than it was four years ago.

Read the entire article on Gigaom.

