IBM fully endorses Apache’s Spark open-source initiative
June 15, 2015
IBM said earlier this morning that it fully endorses Apache’s new Spark open-source cluster computing framework.
Spark will form the basis of all of IBM’s enterprise analytics and commerce platforms, as well as its
Watson Health Cloud. The new offerings will be sold as services on its Bluemix cloud.
Big Blue will also commit more than 3,500 of its own researchers and developers to Spark-related
projects, and it promised a Spark Technology Center in San Francisco, where data scientists and developers
can work with IBM's top designers and system architects.
The IT behemoth also committed to release under open source terms its System-ML family machine-learning
Spark was created in 2009 by researchers at the University of California, Berkeley, led by Matei
Zaharia, and was donated to the Apache Software Foundation in 2013.
Written in Java, Scala and Python, Spark is an in-memory system for processing large data
It consists of a core that handles scheduling and dispatching, an SQL-style query language, a
machine-learning framework and a distributed graph processing framework.
Spark can scale to more than 8,000 production nodes and, while it works with Hadoop and MapReduce,
it is also claimed to be faster than MapReduce on certain workloads.
Up until late 2014, Spark had just 465 contributors. Now it is above the 1,000 mark.
The weight of IBM can make or break just about any open-source project, and this one is no
exception. Big Blue adopted the Eclipse framework early on, making it the basis of its Rational
Overall, serving as the foundation of IBM’s tools helped establish Eclipse as one of the industry’s
biggest development environments, behind Microsoft’s Visual Studio, and guaranteed an entire ecosystem
of ISVs building Eclipse plug-ins.
It has been a virtuous circle: IBM is freed from having to maintain the IDE integration,
ISVs and developers get an open, pluggable tools platform, and Big Blue benefits from technology advances.
At the other extreme is Harmony, also an Apache project, which set out to build an independent alternative
to the Java implementation from the now-defunct Sun Microsystems (acquired by Oracle several years ago).
IBM made sure it was present because it vied with Sun for stewardship over Java back in the early
When Sun ceased to exist, IBM withdrew from Harmony in October 2010 to join the OpenJDK project
with Apple and Oracle.
Understandably, drained of its biggest backer, Harmony shut down 12 months later, as many had
Oracle then sought to make amends with Apache in 2011 by making its OpenOffice productivity
suite available under an open source arrangement.
Today, in announcing its backing for Apache's Spark initiative, IBM paints the project as a
platform for data and analytics, the analogy being Linux, which IBM also contributes to, as a platform
for enterprise apps. On closer inspection, though, the nearer parallel is Eclipse.