Wednesday, August 22, 2007
Converged Service Assurance and the Skype Outage


Rich Tehrani recently posted an interesting article about the Skype service outage over at his VoIP Blog. For those who don't know, the Skype VoIP service was down late last week. Skype, to its credit, has gone on the record about what happened on August 16th to cause the outage.

It turns out the outage was caused by a bug in Skype's peer-to-peer network resource allocation algorithm. When Microsoft released a recent Windows patch, millions of users updated and rebooted their PCs, which restarted their Skype software in the process. This unexpected network load, coupled with the bug in the algorithm, caused the network to crash. Skype in no way blames Microsoft for the issue, and says it has fixed the bug to prevent further service disruptions of this type.

In his VoIP Blog, Rich Tehrani makes some great points about ongoing testing and monitoring of your IP-based applications.  In the article, Rich states:

While some journalists have come to the conclusion that VoIP is no longer reliable because of this outage the reality is that a software bug stopped the network from functioning properly and this had nothing to do with the inherent reliability of VoIP.
 
Still, when you leverage the benefits of IP communications you must also be aware of the responsibilities that come with the technology. Yes, you now have some responsibilities you may not have been aware of. Much the same way you now know to put a UPS on your e-mail server, you need to ensure you have adequate network management and security in place when you use VoIP on a regular basis.
 
In other words, take this outage as a learning experience. Learn to test your VoIP network. Learn to monitor your IP communications network. Learn to have redundancy in your IP telephony network. It is far better to be prepared than to be left without your vital communications systems.
 
With converged services like VoIP and IPTV, it's imperative to continuously monitor the health of the IP communications infrastructure, right down to the customer premises. Continuous monitoring through passive testing, based on well-established thresholds for key performance metrics, will allow service providers to spot potential problems before they turn into outages. And, with active testing, problems can be pinpointed and service outages can be avoided.
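As a rough illustration of threshold-based passive monitoring, here is a minimal Python sketch. Everything in it is an assumption made for the example: the metric names, the threshold values, and the `Sample`, `breaches`, and `degrading` helpers are hypothetical and do not come from any particular service assurance product. It flags metrics that cross a limit and also checks whether a metric has been steadily worsening, which is the kind of early warning described above.

```python
from dataclasses import dataclass

# Hypothetical warning thresholds; illustrative values only.
# Real deployments would tune these per service and per network.
THRESHOLDS = {
    "latency_ms": 150.0,
    "jitter_ms": 30.0,
    "packet_loss_pct": 1.0,
}


@dataclass
class Sample:
    """One passive measurement of a call path (assumed metric set)."""
    latency_ms: float
    jitter_ms: float
    packet_loss_pct: float


def breaches(sample, thresholds=THRESHOLDS):
    """Return the names of metrics whose value exceeds its threshold."""
    return [
        name
        for name, limit in thresholds.items()
        if getattr(sample, name) > limit
    ]


def degrading(samples, metric, window=3):
    """Return True if the last `window` values of `metric` rise strictly."""
    recent = [getattr(s, metric) for s in samples[-window:]]
    if len(recent) < window:
        return False  # not enough history to judge a trend
    return all(a < b for a, b in zip(recent, recent[1:]))


if __name__ == "__main__":
    history = [
        Sample(80.0, 5.0, 0.1),
        Sample(100.0, 5.0, 0.1),
        Sample(130.0, 5.0, 0.1),
    ]
    # Latency is still under the limit but climbing on every sample.
    print(breaches(history[-1]))
    print(degrading(history, "latency_ms"))
```

The trend check matters because a metric can stay under its limit while still climbing steadily. An alert on that pattern gives the engineering team time to investigate before any threshold is crossed.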

In the case of the recent Skype outage, a converged service assurance solution might not have prevented the service disruption, but it could have given some warning that service levels were degrading beyond usefulness. That warning may have alerted the network engineering team earlier and helped them pinpoint where the issues were.

This is a lot of speculation on our part, but more information when it comes to mission-critical services like VoIP is definitely better than less information.  As IP communications extends its reach into more and more organizations, service providers and enterprises alike need to ensure there are ways to monitor the health of their network and their applications.  A comprehensive, end-to-end converged service assurance solution is the only way to effectively ensure the uptime of these applications.

You can check out Rich's VoIP Blog and the article about the Skype outage here.

Author: Author name | posted@ Wednesday, August 22, 2007 9:54 AM
