Explanation of Recent Downtime June 16th, 2007
Greetings,

I would like to take a moment to further explain our recent downtime (in the past 1.5 weeks).

We have been doing some major upgrades to our infrastructure over the past couple of weeks. We invested in a large managed switch to replace the large unmanaged one we had been using for the past 5 or so years. We invested in remote power cycling equipment so we can cycle power on servers remotely, and we also spent a bit of time getting Wake-on-LAN (WOL) functionality set up and tested on ChanServ's main server.

All of these investments, along with other work, were meant to improve our overall uptime and speed up our ability to respond to any downtime. The outages we had during this time were a couple of 1-2.5 hour outages during installations and testing.

We still have not finished all of this work and will continue working on it soon, but we put everything on hold after a major stint of downtime on Wed 6/13/07. This downtime was completely unexpected and outside our control.

Long story short, the morons at AT&T messed up one of their backbone routers, affecting many of their customers in our area and leaving all of us unable to connect to the majority of the Internet, including most high-traffic sites such as Yahoo.

Our main office & ChanServ got caught in this mess, and both were likewise unable to connect to the rest of ETG. The outage was noticed immediately: pages went out, phone calls were made, and staff was dispatched to the office to see what was wrong. Sadly, upon getting there we discovered it was a bad internet outage. We had other plans in place had it been a power problem, but this was something we didn't have a good answer for.

We called AT&T around 12:30am and talked to them for the majority of the early morning hours. After struggling for a long time to get past Tier 1 techs who only knew how to read questions and answers off a flowchart, we finally got to Tier 2 support, which wasn't much more helpful. They didn't take our outage very seriously and treated it as something needing a machine or router restart on our side, despite us saying there had been no changes and the problem lay with them.

We were in the office until around 5:30am, when we finally gave up on getting AT&T to listen and decided to try again in a couple of hours.

Around 6-7am AT&T started getting calls from a number of its other business customers complaining of an outage. ONLY THEN did they get off their butts and take it seriously enough to even LOOK TO SEE IF THERE WAS AN ISSUE and try to find it. This is just completely insane. I understand, yes, you are a big company and normally if there is a problem on your side you will get a lot of people calling in. But if it is not during standard business hours, guess what, people won't be at their offices to see it and call in. So maybe, just MAYBE, if you didn't have lazy people, and instead had people who would take the initiative when a problem is reported at 12:30am, you wouldn't have had an outage until 5:30pm.

Anyways, after they found the problem, they then spent many hours trying to figure out how to fix what they messed up in the first place. =/ Rule #1 in router changes: back up your config before you make changes. If things don't work, you can revert to the old one. There is NO GOOD REASON that it would take almost 12 hours from when they realized there was a problem, or about 18 from the second it was first reported, to repair the problem. Especially for a big company like AT&T, which definitely had infrastructure in place they could have switched traffic over to while they repaired it.
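
To make that rule concrete, here is a minimal sketch of the back-up-then-revert idea, assuming the config is just a text file on disk. It is a generic illustration only, not how AT&T or ETG actually manages router configs; the function name `apply_with_backup`, the `.bak` suffix, and the `is_healthy` check are all hypothetical.

```python
import shutil
from pathlib import Path


def apply_with_backup(config_path, new_text, is_healthy):
    """Back up config_path, write new_text, and revert if is_healthy() fails.

    Returns True if the new config was kept, False if it was rolled back.
    All names here are hypothetical; a real router would need its own tooling.
    """
    config_path = Path(config_path)
    backup_path = config_path.with_name(config_path.name + ".bak")

    # Step 1: save a copy of the known-good config before touching anything.
    shutil.copy2(config_path, backup_path)

    # Step 2: apply the change.
    config_path.write_text(new_text)

    # Step 3: verify; if the check fails, put the old config back.
    if is_healthy():
        return True
    shutil.copy2(backup_path, config_path)
    return False
```

The point is only the ordering: the backup exists before the change is made, so reverting is a single copy rather than hours of reconstruction.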

Anyways, we will be contacting a customer rep this coming week to make sure our voice on this outage is heard.

We are very sorry for this downtime, and we hope AT&T will put policies in place to actually investigate outages more proactively as a result. Thank you for your understanding.


VD-WHiZ
Site design and contents copyright (c) EnterTheGame & Excelsior Media Studios, Inc. 1999-2017
No portion of this site or any content on it can be copied or otherwise used/reproduced elsewhere without explicit permission by Excelsior Media Studios, Inc.
Site Coded/Constructed/Designed by Dave Sherman @ Excelsior Media Studios, inc.
Original Site Concept & Design by Callidus @ Digital Impurity - www.digitalimpurity.com