By Andy Feng (@afeng76)
As reported in our earlier blog posts, Yahoo has adopted Storm as a key technology for low-latency, big-data processing. We are very pleased that Storm has been accepted as an Apache incubator project this week.
Yahoo started to play with Storm in Q3 2012, and we found its design attractive for several use cases. We were concerned, though, that Storm was being managed by a single individual on GitHub, and thus suggested to Nathan Marz that Storm could move to Apache. Nathan was extremely supportive of the suggestion, especially after our meeting with Doug Cutting in November 2012. To prepare for the move, Storm’s contributor agreement document was revised to be compliant with the Apache CLA (per a suggestion from Gil Yehuda, Yahoo). In August 2013, Nathan asked me to help out with the Storm proposal for Apache incubation. I was glad to have assistance from many Apache veterans at Yahoo (including Bobby Evans, Nathan Roberts and Gil Yehuda) in the effort.
On September 4, Nathan Marz submitted the Storm proposal to Apache. The voting proposal was put forward on September 12 by Doug Cutting, who also serves as the Champion for the Storm incubator project. Apache incubator PMC members cast many “+1” votes and no “-1” votes, and Storm was accepted for Apache incubation on September 18.
In my opinion, this is a great step forward for the Storm community. In the big-data processing ecosystem, Storm is a very popular low-latency platform, while Hadoop is the primary platform for batch processing. Having both Hadoop and Storm aligned within the Apache Software Foundation will help grow the big-data community further.
With Storm accepted into Apache, Yahoo is looking forward to working with the community to enhance Storm further. In the near term, we plan to make the following Storm enhancements available to the community:
- Netty-based messaging (instead of 0MQ)
- Authentication (Kerberos, Digest, etc.), authorization and audit
- Multi-tenancy support with user-based resource isolation
Yahoo also plans to move the Storm-on-YARN code from github.com/yahoo/storm-yarn to become a subproject of the Apache Storm project in the near future. Storm-on-YARN is currently licensed under Apache License 2.0 and receives contributions under an Apache-style CLA. Storm-on-YARN enables Storm topologies to be deployed onto Hadoop nodes with access to Hadoop data sets in HDFS, HBase, etc.
Andy Feng is a Distinguished Architect at Yahoo, and a Core Contributor/Committer of Storm. He leads the design and architecture for Yahoo’s big-data platform.