Information Technology News.

Google admits that users of its Persistent Disks storage system have lost data

August 19, 2015

Late yesterday, Google admitted that some customers running its Persistent Disks storage system have lost data, and said that a combination of lightning and old storage disks was to blame.

For now, Google says the affected users are mostly those in the Europe-West-1-B zone.

The outage hit last Thursday and left some users unable to connect to Persistent Disks, a storage system that exists independently of a virtual machine.

The issue lasted for several hours, and problems persisted across the weekend. Google has now published its analysis of the outage and says that on August 13th, “four successive lightning strikes on the electrical grid of a European datacenter caused a brief loss of power to storage systems which host disk capacity for GCE instances in the Europe-West-1-B zone.”

“Although automatic auxiliary backup systems restored power fairly quickly, and the storage systems are designed with battery backup, some recently written data was located on storage systems which were more susceptible to power failure from extended or repeated battery drain,” Google admitted.

“In almost all cases, the data was successfully committed to stable storage, although manual intervention was required in order to restore the systems to their normal serving state. But in a few cases, recent writes were unrecoverable, leading to permanent data loss on the Persistent Disk system.”

About five to six percent of disks in the data centre recorded “at least one I/O read or write failure” during the incident.

Overall, read failures persisted into Monday for about 0.05 percent of users, and Google now says that about 0.000001 percent of disk space has proved impossible to recover.

Several customers were understandably inconvenienced by this mishap, and a few voiced their concerns.

“This outage is wholly Google's responsibility,” the document continues, before going on “to highlight an important reminder for our customers: GCE instances and Persistent Disks within a zone exist in a single Google datacenter and are therefore unavoidably vulnerable to datacenter disasters.”

In other words, should lightning strike twice, you should remember that a single datacentre can't beat two. “Full data protection, integrity and redundancy is critical to most business operations, and a disaster recovery solution is absolutely a must in these conditions,” says Jonathan Price, vice president of data center technology at Sun Hosting, a major data center services provider and disaster recovery specialist located in Montreal, Canada.

But Google's report also says the company “has an ongoing program of upgrading to storage hardware that is less susceptible to the power failure mode that triggered this incident. Most Persistent Disk storage is already running on this hardware.”

Google adds that it has conducted a review of the incident and that “several opportunities have been identified to increase physical and procedural resilience.”

Source: Google.

© IT Direction. All rights reserved.