BlackBerry maker Research In Motion Ltd. (RIM) today blamed this week's massive BlackBerry outage on "the introduction of a new, non-critical system routine that was designed to provide better optimization of the system's cache."
In a seven-paragraph statement, RIM said the diagnostic analysis of the BlackBerry service interruption -- which saw mobile email grind to a halt -- is progressing, and more information will be released as it becomes available.
Around 8 p.m. EST Tuesday, millions of North American BlackBerry users found themselves unable to send or receive mobile emails on their BlackBerry devices. Lack of service or spotty service continued well into Wednesday morning. Once service was fully restored, most users were greeted by backlogged emails that had been on hold throughout the blackout.
Several mobility experts and BlackBerry users complained Wednesday that RIM did nothing to notify them of the outage. As of Friday morning, neither the RIM nor the BlackBerry Web site mentioned the outage.
It was still unclear how many of BlackBerry's 8 million worldwide users were affected, but BlackBerry has roughly 5 million users in the U.S. alone.
In a Web poll conducted Wednesday morning by ProfitLine, a telecom expense management firm, 80% of responding enterprise IT and telecom professionals said the BlackBerry outage disrupted operations. In addition, 44.5% reported a moderate or substantial impact on enterprise productivity. A smaller number, 18.2%, reported that the outage had no impact.
In a statement, ProfitLine's vice president of mobility strategies said, "These numbers show the critical role that wireless devices play in corporate America. Wireless communication has gone from a travel convenience to a mission-critical communications tool."
RIM added that it was able to definitively rule out security and capacity issues as root causes of the BlackBerry outage. RIM also found that the blackout was not caused by any hardware failure or by its core software infrastructure.
According to RIM's statement, the new system routine that caused the outage produced an "unexpected impact and triggered a compounding series of interaction errors between the system's operational database and cache. After isolating the resulting database problem and unsuccessfully attempting to correct it, RIM began its failover process to a back-up system."
Service restoration and processing of the message queue were delayed further because RIM's failover process did not work correctly, even though it had been tested repeatedly.
RIM's statement concludes: "RIM apologizes to customers for inconvenience resulting from the service interruption. RIM's root cause analysis and system enhancement process with respect to this incident is ongoing and RIM has already identified certain aspects of its testing, monitoring and recovery process that will be enhanced as a result of the incident and in order to prevent recurrence."