Before I leap to the Leap Second issue, I need to start at the beginning.
If you were watching a Netflix streaming movie last night, you could have noticed a two-hour interruption of the movie, because Netflix was down. Yes, down as in off the air, out of service, dark, AWOL, etc. A Forbes article describes things very nicely.
So were Pinterest and Instagram, two very popular web sites and services.
This is really big stuff for us nerds. You see, all these prestigious and bulletproof web sites and services are hosted by Amazon Web Services (AWS). Amazon has a bulletproof site of its own, and is leveraging its technology to host other businesses that need super reliable service. It is a marriage made in heaven for those really popular sites with super heavy traffic.
What happened is that these normally reliable services did not take full advantage of the reliability that the Amazon network offers. You see, Amazon has multiple zones of service, and customers can mirror their data across these zones so that if one zone crashes for any reason, another zone will pick up the slack and keep customers happy.
However, Netflix and friends got cheap. They bet their business on only one Amazon network zone, the one in Northern Virginia. When the power system faulted, Amazon’s backup generators picked up and then failed. Then, Amazon’s backup system to the first backup system failed. It was a perfect storm of problems. It was Murphy’s law deja vu, all over again.
So, those Amazon customers learned a lesson. No matter what data center you are using, and no matter how many backups it has for power and internet access, it will fail.
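For the nerds, the idea of letting another zone pick up the slack can be sketched in a few lines of Python. This is only an illustration of the failover idea, not how Amazon or Netflix actually do it. The zone names and the `is_healthy` check are made up for the example.

```python
def pick_zone(zones, is_healthy):
    """Return the first zone, in priority order, that passes the health check."""
    for zone in zones:
        if is_healthy(zone):
            return zone
    # Every zone failed its check, so there is nowhere left to send traffic.
    raise RuntimeError("no healthy zone available")


# Hypothetical zone names, for illustration only.
zones = ["zone-a", "zone-b", "zone-c"]
down = {"zone-a"}  # pretend the primary zone lost power

print(pick_zone(zones, lambda z: z not in down))  # prints "zone-b"
```

With only one zone in the list, a single failure leaves nothing to fall back on, which is the lesson above.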
Now, on to the second part of the saga.
At midnight UTC on Saturday, the world's clocks turned from June 30 to July 1. This was also a year in which one second was added to the world's official time. You might remember that this year is a Leap Year, but that is a coincidence: the extra second is added whenever the Earth's rotation drifts far enough from atomic time, not on the Leap Year schedule. We are talking about the Leap Second. Wired Magazine has a good article on this event.
The problem with this Leap Second adjustment is that a lot of computerized clocks were not ready for the event, and many software platforms could not handle it. So, we had web sites crashing, too. Things were not down long, but add these outages to the Amazon outage, and you get an Internet Outageddon.
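To see why one extra second is trouble, here is a minimal Python sketch. It is not what broke any particular site this weekend, just one example of software that assumes a minute never has more than 60 seconds. The leap second shows up as 23:59:60, and Python's own `datetime` type refuses that value. The `parse_utc` helper is a hypothetical workaround written for this post; it folds the leap second onto the first second of the next day.

```python
from datetime import datetime, timedelta


def parse_utc(stamp):
    """Parse 'YYYY-MM-DD HH:MM:SS', tolerating a leap second (SS == 60)."""
    date_part, time_part = stamp.split(" ")
    year, month, day = (int(x) for x in date_part.split("-"))
    hh, mm, ss = (int(x) for x in time_part.split(":"))
    if ss == 60:
        # datetime cannot hold second 60, so treat it as one second after :59.
        return datetime(year, month, day, hh, mm, 59) + timedelta(seconds=1)
    return datetime(year, month, day, hh, mm, ss)


# The naive approach fails on the leap second itself.
try:
    datetime(2012, 6, 30, 23, 59, 60)
except ValueError as err:
    print("naive clock choked:", err)

print(parse_utc("2012-06-30 23:59:60"))  # prints "2012-07-01 00:00:00"
```

Folding the extra second into the next one is a simplification, since two different instants end up with the same timestamp, but it keeps the program running instead of crashing.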
If you remember the hoopla about the Y2K problem, you can transfer that kind of knowledge to the Leap Second problem. Everybody was ready for Y2K, and nothing happened. Everybody knew about the Leap Second stuff, but thought they were OK.
It has been an interesting weekend. Now, back to your regularly scheduled blog posts.