Information Technology News.

Human error at cloud operator Joyent causes system crash

May 28, 2014

Cloud systems operator Joyent went through a catastrophic failure late yesterday when an absent-minded administrator brought down an entire data center's computing assets.

The cloud services provider began reporting "transient availability issues" for its US-East-1 data center at around 6:30 PM EST.

"Due to an internal operator error, all computing nodes in our US-East-1 data center were simultaneously rebooted," Joyent wrote.

"Some computing nodes are already back up, but due to the very high loads on the control plane, this is taking some time to reboot the whole system. We are dedicating all operational and engineering resources to getting this issue resolved, and will be providing a full report on this failure once every computing node and customer virtual machine is back online and operational," the company added.

Some of the issues were fixed about an hour later. A forced reboot of every server in a data center is just about the worst thing that can happen to a provider, short of deleting customer data or losing multiple data centers simultaneously.

"While the immediate cause was operator error, there are broader systemic issues that allowed a 'fat finger' to take down a datacenter," explained Joyent's chief technology officer Bryan Cantrill.

"As soon as we reasonably can, we will analyze how this was architecturally possible, what exactly happened, how the system recovered, and what improvements we will be making to both the software and to operational procedures to assure that this doesn't happen again in the future," added Cantrill.

Joyent has service-level agreements in place that will compensate customers for downtime. In going through such a stomach-churning fault, Joyent has joined an illustrious group of service providers, including Rackspace, Microsoft, Google and Amazon, that have all had similarly catastrophic failures.

"Anything that allows you to administer many servers and VMs will allow you to do this," Cantrill added. "There was a silver lining here in the sense that it was an opportunity to see how the system behaved. There are lots of ways it could have been much worse."

"The system admin that made the error is mortified, there is nothing we could do or say for that operator that is going to make it any worse, frankly," Cantrill said.

The goal for the company is to learn from the problem and get better. "You don't teach dolphins with a shock collar," Cantrill explained. What will happen to that system admin is anybody's guess for now.

In other IT news

Facebook's wish that its Open Compute Project (OCP) could bring hyperscale-style innovation to the wider internet community is starting to bear fruit, with an Australian company revealing a range of converged infrastructure and virtual SAN products using its server designs.

The company in question, Infrx, is a very small business with just four people. But that hasn't stopped it from working with Facebook, striking up a relationship with server makers Quanta and Wiwynn and releasing a range of products.

They include a pre-configured SAN based on VMware's vSAN, plus stack-in-a-box rigs running either Hyper-V, OpenStack or Hadoop, with Cumulus in the background handling software-defined networking chores.

The products have been designed in close collaboration with software vendors: Whitehouse said senior VMware staff assisted with the design of the vSAN equipment, while Infrx's Metacloud offering is based on templates used to deploy the stack at Disney and Australian telco Telstra.

Founder Mark Whitehouse says that users in Australia know of OCP, appreciate the low acquisition and operating costs it offers and feel it represents a chance to improve their operations.

Price is also a big factor. Whitehouse is a veteran of a few enterprise storage vendors and says that, in his experience, Australian companies pay $1.98 per gigabyte. Infrx can deliver at 30 cents a gigabyte, he claims.

Whitehouse says he expects a couple of sales in the next week, although there have been none so far.

The prospects operate at substantial, but not hyper scale, reflecting Infrx's belief that OCP equipment can make it into smaller data centres.

Perhaps as interesting as Infrx's offerings is Whitehouse's trip to visit Facebook to research the company.

On that trip he says he saw an assembly facility where Intel personnel told him they were installing 1,000 CPU sockets each week to feed Facebook's server farms.

Infrx is selling in Australia and New Zealand for now, but Whitehouse says the company's links in the OCP community mean sales beyond the South Pacific may be possible.

The company is currently financed from the founders' pockets and while discussions with investors are welcome, they're not being actively pursued as Whitehouse and his colleagues feel that OCP-based infrastructure's main attraction is price.

If investors become involved, he fears they'll force higher prices on the company and destroy the advantages OCP confers.

Infrx is not alone in offering stack-in-a-box products: NetApp and Cisco's FlexPod, Oracle's engineered systems and VCE all have similar offerings.

The likes of Scale Computing do likewise with a 'white (no name) server' offering. We'll keep you posted on this and other stories.

In other IT news

According to various reports we've seen in the blogosphere this morning, IBM will end its reseller agreement with NetApp.

Citing an internal memo reviewed by Bloomberg, the newswire says IBM has simply decided to offer enterprise customers its own solutions rather than continuing to resell the N-series network attached storage devices it gets from NetApp.

IBM's data storage sales aren't especially strong, so it makes sense for the company to concentrate on shifting its own infrastructure and taking as much profit margin as it can rather than sharing it with NetApp.

That its own V3500, Storwize V5000 and Storwize V7000 Unified are reasonable replacements for the N3000 Express, N6000 and N7000 it gets from NetApp means that the decision can't have been all that difficult in the first place.

Understandably, that doesn't make the decision any easier for NetApp, which draws about two percent of its revenue and more than a little credibility from its IBM alliance.

With its balance sheet challenged in several ways, losing some easy revenue is perhaps the last thing it needs.

It's also worth reflecting on what IBM's move says about the NAS market in general. The likes of Dropbox for Business offer NAS-like functions (as end-users perceive them) without all the hassle of maintaining a device.

Such services aren't going to become less sophisticated any time soon, so they represent a real threat to NAS among users who only need access to files.

To be sure, SaaS (Software-as-a-Service) also poses a parallel threat by removing the need for data storage solutions capable of serving the transactional needs of on-premises applications.

In fact, those new and emerging technologies threaten both IBM and NetApp at the same time. IBM at least has a cloud offering that it can use to compensate.

But NetApp looks instead like less of a good catch after years of acquisition speculation. Only time will tell how this will pan out in the next year or two.

In other IT news

Researchers at Microsoft Research (MSR) have implemented a new method to automatically check code for compliance with privacy laws, and Microsoft claims that it's simple to use.

Legalease is a language for specifying restrictions on how data is handled. One of the main drivers behind its development was that software developers and those setting companies' privacy policies don't share a common language.

As an example, MSR says that more than 20 percent of the code in its Bing search engine changes on a daily basis, with changes made by thousands of programmers.

Even some small changes in code might affect how data is used or who views it, potentially violating company, government or regulatory privacy policies.

Keeping tabs on changes in very large systems, like the Bing search engine, using manual audits is difficult and very time-consuming.

According to MSR, automated testing is the best way to verify compliance with privacy rules and laws on the massive scale demanded in environments like Bing.

Legalease uses allow/deny rules, with exceptions. This reflects privacy policy frameworks like the U.S. Health Insurance Portability and Accountability Act (HIPAA).
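
To make the allow/deny-with-exceptions idea concrete, here is a minimal Python sketch. It is not Legalease syntax or Microsoft's implementation; the `Clause` class, the `evaluate` function, and the attribute names (`data_type`, `use`) are hypothetical and exist only to illustrate how a rule with nested exceptions might be evaluated.

```python
from dataclasses import dataclass, field


@dataclass
class Clause:
    """One rule: an effect, the attributes it applies to, and nested exceptions."""
    effect: str                      # "ALLOW" or "DENY"
    attrs: dict                      # attribute name -> value the request must have
    exceptions: list = field(default_factory=list)

    def matches(self, request: dict) -> bool:
        # The clause applies only if every listed attribute matches the request.
        return all(request.get(key) == value for key, value in self.attrs.items())


def evaluate(clause: Clause, request: dict):
    """Return "ALLOW", "DENY", or None when the clause does not apply."""
    if not clause.matches(request):
        return None
    # An applicable exception overrides the enclosing clause's effect.
    for exception in clause.exceptions:
        verdict = evaluate(exception, request)
        if verdict is not None:
            return verdict
    return clause.effect


# Hypothetical policy: deny any use of IP addresses,
# except allow it for abuse detection.
policy = Clause(
    "DENY",
    {"data_type": "ip_address"},
    [Clause("ALLOW", {"use": "abuse_detection"})],
)

advertising = evaluate(policy, {"data_type": "ip_address", "use": "advertising"})
abuse = evaluate(policy, {"data_type": "ip_address", "use": "abuse_detection"})
unrelated = evaluate(policy, {"data_type": "query", "use": "advertising"})
```

Because exceptions are themselves clauses, they can carry their own exceptions, which is one simple way to model the layered "allowed, unless, except when" wording common in privacy policies.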

Grok, meanwhile, annotates existing code using a system that cross-references information from different sources, based on varying levels of confidence.

According to Microsoft, pattern-matching to column names across a database results in a low-confidence score, while annotations made manually by developers are deemed to be more trustworthy and thus get a high-confidence score.
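
The confidence idea can be sketched in a few lines of Python. The article states only that column-name pattern matching yields low confidence and manual developer annotations yield high confidence; the rule used here (the highest-confidence label wins for each column), along with the names `Confidence`, `SOURCE_CONFIDENCE` and `resolve_labels`, are assumptions made for illustration and not a description of how Grok works internally.

```python
from enum import IntEnum


class Confidence(IntEnum):
    LOW = 1
    HIGH = 2


# Assumed mapping from annotation source to confidence level,
# following the two cases described in the article.
SOURCE_CONFIDENCE = {
    "column_name_pattern": Confidence.LOW,
    "developer_annotation": Confidence.HIGH,
}


def resolve_labels(annotations):
    """Keep, for each column, the label from the most trusted source.

    `annotations` is an iterable of (column, label, source) tuples.
    Returns a dict mapping column -> (label, confidence).
    """
    best = {}
    for column, label, source in annotations:
        confidence = SOURCE_CONFIDENCE[source]
        # Replace the stored label only if this source is more trusted.
        if column not in best or confidence > best[column][1]:
            best[column] = (label, confidence)
    return best


# Hypothetical annotations for two columns.
labels = resolve_labels([
    ("client_ip", "IPAddress", "column_name_pattern"),
    ("client_ip", "HashedIPAddress", "developer_annotation"),
    ("uid", "UserID", "column_name_pattern"),
])
```

In this example the developer's annotation for `client_ip` overrides the pattern-matched guess, while `uid` keeps its low-confidence label because nothing more trustworthy is available.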

MSR says it developed Grok for use on Bing but found writing suitable policies very difficult, and this was what led to Legalease. Both were tested on Bing and are now running on the data analytics pipeline.

MSR presented Legalease and Grok at the 35th IEEE Symposium on Security and Privacy in San Jose, California this week.

Source: Joyent Inc.

All logos, trade marks or service marks on this site are the property of their respective owners.

       © IT Direction. All rights reserved.