Information Technology News.

Gartner offers the IT community some tips on handling Hadoop


February 21, 2017

Research firm Gartner says that your best attempts at putting Hadoop to work might not go as expected, and you'll be partly to blame. But all is not lost. There is hope.

That's basically the conclusion of a talk delivered by Gartner Research director Nick Heudecker at the company's recent 2017 Sydney Data & Analytics Summit.

Heudecker opened with the bleak prediction that about 70 percent of Hadoop deployments made this year will fail to deliver either the expected cost savings or the hoped-for new revenue.

He said a lack of trained, experienced people will be to blame for most of those failures, and will also cause technical problems once Hadoop is up and running.

Heudecker said the first question he typically hears from new Hadoop users is how to actually get data into and out of their new cluster.

He also advised attendees to sort out their data quality and security plans before starting any implementation; retrofitting them, in his words, is common but ill-advised.

According to Heudecker, many organizations get into Hadoop with very optimistic expectations about what it can do. Hadoop is not a replacement for databases or existing analytics tools, he asserted.

“For example, one enterprise client calls me every seven months and says they are replacing their data warehouse with Hadoop, and I reply: I hope you have your CV ready,” Heudecker half-joked.

To succeed with Hadoop, his advice was to learn what it's good at and give it a role your current analytics tools don't fill well. Be firm with developers too, as they are “always chasing the new shiny thing” with little regard for wider concerns. The bottom line: you may not need Hadoop at all.

Hadoop is very good at performing extract, transform and load (ETL) operations quickly, but its SQL-handling features are less than stellar. It also struggles with machine learning and other advanced analytics tasks because it is storage-centric.
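One common way such ETL jobs run on Hadoop is via Hadoop Streaming, which pipes records through any executable as map and reduce steps. The sketch below simulates that pattern locally in plain Python; the input layout (region,product,amount) and the sample records are assumptions for illustration, not anything from the talk:

```python
from itertools import groupby

def mapper(lines):
    """Extract step: parse CSV sale records, emit (region, amount) pairs."""
    for line in lines:
        # Assumed input layout: region,product,amount
        region, _product, amount = line.strip().split(",")
        yield region, float(amount)

def reducer(pairs):
    """Transform/load step: sum amounts per region. Input is sorted by key,
    mirroring the grouping Hadoop's shuffle phase provides to reducers."""
    for region, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield region, sum(amount for _, amount in group)

if __name__ == "__main__":
    # Local simulation of a streaming job; on a real cluster the mapper
    # and reducer would be separate scripts reading stdin and writing stdout.
    records = ["east,widget,10.0", "west,widget,5.5", "east,gadget,2.5"]
    for region, total in reducer(mapper(records)):
        print(f"{region}\t{total}")
```

On an actual cluster, the same mapper and reducer logic would be submitted with the `hadoop jar .../hadoop-streaming.jar` launcher, and the framework would handle the sort-and-shuffle between the two steps.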

This means it's rather expensive to implement on-premises, where you'll need to acquire memory, compute and storage together. In the cloud, by contrast, you can buy compute and storage separately and save some money.

Heudecker therefore believes the cloud is the natural place to run Hadoop, though not everyone agrees.

The same goes for Spark, a closely related tool that is designed for in-memory processing and therefore demands pricey hardware. Spark, however, is also good at machine learning, a workload other analytics tools just weren't designed to handle.

Another factor to consider is that Spark evolves quickly, with point releases arriving in as little as five weeks. Adopting it can therefore also mean performing frequent upgrades in order to stay secure.

Update on your schedule, not your vendor's, Heudecker advised, and don't let vendors push you around.

One dead-end for young players that he identified is letting vendors sell you the complete Hadoop or Spark stacks, which comprise many complex packages, not all of which are necessary for basic operations.

Leading distributions of both tools now include pared-back bundles, and Heudecker's basic advice to users is to pay for just the bits they need.

There's another risk there, he said, because Red Hat remains the only pure-play open source business to have cracked the billion-dollar revenue mark. Volatility is therefore to be expected in the Hadoop and Spark segment.

But once you train your people, find a worthy project, get on top of cloud versus on-premises costs, master security and data quality, keep your developers sensible and build a good relationship with a stable vendor, you have a decent chance of succeeding.

Hadoop isn't for everyone, but if you take the time to follow the above recommendations, you'll put the odds of success on your side. That way, you might not have to update your CV...

Source: Gartner.



© IT Direction. All rights reserved.