The California Department of Motor Vehicles does not appear to have had an adequate disaster recovery system in place before a computer meltdown wiped out most operations for several days, two information technology experts say.
The outage began Monday and, at its peak, two-thirds of the DMV's 188 offices around the state were unable to process driver's license or vehicle registration matters. By late Thursday, six offices still had limited services available.
Based on limited information the DMV provided in response to questions submitted by The Associated Press, the experts said the DMV's technology infrastructure appears to fall short of industry best practices.
In particular, they said, the DMV's plan for recovering from a catastrophic failure should rely on distinct computer systems that are physically separated and have independent power supplies. That way, if one data center, or even one section of a data center, overheats or experiences a power surge, backup systems aren't affected.
The California DMV ran primary and backup systems side by side in the same hardware cabinet.
"If their definition of disaster recovery is having primary and backup systems in the same hardware chassis, that's grotesque," said Richard Fichera, a vice president of Forrester Research who advises large companies on servers, storage and data centers. "That is completely inadequate for a critical statewide agency like the DMV."
The DMV said Friday that three offices were still experiencing problems but it expected all to return to normal operations by the end of the day. The department says its systems were not hacked or targeted and it did not lose customer data.
DMV officials have said the outage was triggered by the failure of hard drives in both primary and backup systems at one of its two server facilities, but have not said what caused them to fail.
The DMV's system was designed to withstand failures in the primary or the backup system but not both, said Jaime Garza, a DMV spokesman. The ability to recover from a disaster was degraded because the systems in two separate locations were used simultaneously, he said.
"The computer system has redundancy," Garza said. "Unfortunately, with the loss of several hard disks in a short time period, that redundancy was lost."
Maintaining primary and backup drives within the same physical location and even the same equipment cabinet is "insane," said Stuart Lipoff, president of IP Action Partners, a technology consultant based in Newton, Massachusetts, who reviewed the DMV's explanation of the failure.
"It's inexcusable that the entire system would come down," Lipoff said. "They were not observing what are good practices and good guidelines that I think any modern IT department would practice."
The outage forced DMV customers to wait out the troubles. Some said their lives were put on hold while they waited to replace stolen identification cards or renew vehicle registrations on the verge of expiring.
DMV officials say the department may waive late fees for customers affected by the outage. Those customers will have to fill out a form or write a letter explaining why they're late.
Assemblyman Mike Gatto, D-Los Angeles, said the DMV's computer issue is unlikely to have put personal information at risk, but is unsettling nonetheless.
"This is an agency that starts with a very skeptical public," Gatto said. "This is way beyond the typical frustrations that people expect from the DMV."
The DMV's issue illustrates ongoing concerns about disjointed state cybersecurity protocols, he said. He advocates that agencies make government contracts "more accessible" to Silicon Valley.
AP writer Alison Noon contributed to this report.