Metamarkets Open Sources Druid, Streaming Real-Time Data Store

Groundbreaking In-Memory, Distributed Columnar Data Store Processes Billions of Records With Sub-Second Response Times

SAN FRANCISCO, CA--(Marketwire - Oct 24, 2012) - Metamarkets, the leader in big data analytics for web-scale companies, announced today that it is open sourcing Druid, the streaming, real-time data store component of its analytics platform.

Delivering powerful data analytics and interactive dashboards, the Metamarkets solution is built on a big data stack for processing, querying, and visualizing high-volume, high-frequency event streams. The data store component, Druid, enables ongoing analysis of streaming big data, and is architected as an in-memory, distributed columnar data store. With the Druid data store, Metamarkets' Software-as-a-Service offering provides sub-second query response times across billions of records.

While many companies have embraced Hadoop as their first big data tool, they are finding that even though Hadoop is ideal for batch processing large data volumes, it does not support real-time data queries.

"Druid is the industry's first open source, fully distributed analytical data store," said Metamarkets' CEO Mike Driscoll. "By sharing the Druid data store with the open source community, we feel we're contributing a critical missing piece to the big data ecosystem. Releasing Druid as an open source project is a natural step for Metamarkets, as open source has been an integral part of our culture."

"When we started building Metamarkets' analytics solution, we tried several commercially available data stores, but they could not deliver sub-second queries at the volumes seen by our online advertising customers -- upwards of hundreds of billions of events per month. It became clear that Metamarkets needed to innovate and build our own data store," said Lead Architect Eric Tschetter. "Now we are excited to see how the open source community will apply Druid to their own applications."

Key Capabilities of Druid

  • Real-time. Druid was architected from the ground up for real-time data queries as well as continuous data updates. Most existing data stores were built for data uploads on a periodic basis. Druid was designed to support the ingestion and analysis of constant incoming streams of big data.

  • Distributed. Druid is the first open source fully distributed analytical data store. This architecture enables horizontal scalability, fault tolerance, and deployment in the cloud or across multiple data centers.

  • Operational simplicity. Druid's architecture is self-healing, enables rolling restarts, and allows simple scaling up and down by adding or removing nodes.

Metamarkets has engaged with multiple large internet businesses, like Netflix and Riot Games, by providing early access to the code for evaluation purposes. Metamarkets anticipates that a complete open sourcing of Druid will help other organizations also solve their real-time data analysis and processing needs.

Availability
Druid is currently available as an open source project on GitHub at https://github.com/metamx/druid. Developers can freely download and use the data store, as well as contribute features and code back to the project.

About Metamarkets
Metamarkets is pioneering a revolutionary approach to big data analytics. The groundbreaking solution offers business users the power and scale of the cloud to achieve fast insights across big data. Metamarkets enables a dramatic increase in revenue, improved user engagement, and the ability to avoid operational surprises by providing its customers with real-time insight across billions of records, while eliminating the need to integrate multiple disparate software solutions. To find out more about the San Francisco-based company's industry-leading analytics platform, please visit www.metamarkets.com.