Not so typical day
Since Rogue was whining at me (as per normal) I thought I’d go into what I did on a day a few weeks ago, a kind of “day in the life of CherryPopper”.
Got up at 8.30am, did the normal 3 S’s (shit, shower, shave), jumped in the car and set it on 100km/h with cruise control, since the pricks have put fixed speed cameras all the way along the Geelong/Melbourne highway.
I got a phone call on the way in that I need to divert to the Exhibition exchange to check out a Sun V240 that has a suspected failed disk. When I get there I find that it looks as though the disk is having intermittent errors, so I had better replace it just to be safe. I head back to the office to grab a spare disk. While I’m in the office, one of the more senior IVR engineers lets me know that an IVR in the Exhibition exchange has failed and has had a number of memory errors recently, so he wants me to replace the entire IVR Sun system (mb, ram, cpu). So I load up with a heap of spares and jump back in the car.
I head back into the exchange with the spares only to be called just before walking through the exchange door by my manager screaming “don’t go in the exchange”. I tell him it makes it hard to fix a severity 1 call in time if I can’t go into the exchange to fix it. He then proceeds to tell me that there are 4 high level Telstra managers (including the manager that is responsible for ALL of Telstra IVR’s/Call Centers and is only a few rungs under the CEO) and they are going to quiz me on why this 1 IVR is having so many issues. Now normally I’m fine with clients wanting to know why shit went wrong but I’m only just starting to learn the IVR’s & how they work. I know Sun systems well enough but the Sun systems in IVR’s are custom built Sun servers, and can be very different to normal Sun architecture. Meaning I can’t bullshit my way through high level managers without getting myself in trouble.
My manager tells me to hang 5 as he is on his way to the exchange, so he can run side screen and be the punching bag for the managers while I get the job done. We both go up and sure enough they are standing at the IVR waiting for me. I introduce myself and before I even get a chance to unload my arms full of parts I get “So what are we going to do to fix it?” (this has got to be the big Telstra manager). I explain we are going to replace the entire system with a new one and he seems a little less ready to beat me to death with his blackberry. I get to work while my manager keeps them off my back.
I get the system replaced and boot it up, only to get about 5 mins worth of running time out of it before it displays the same error. Fuck…. I’m now out of ideas & call the guru in the office. He too is out of ideas, but as we are talking he is searching the web for answers and asks me to check the serial number of the CPU. He then tells me that there is a dud batch of CPU’s and that serial number is in that range, same with the other CPU I just pulled out & the other spare I brought along. That’s all the spares we have and these CPU’s haven’t been made for 4 years. Fuck…screwed again. By now the managers are getting worried as they see me pull out the replacement system. I explain the situation and again the big manager asks “So how are we going to fix this?” This time I have no answer and he whips his blackberry out of its holder ready to shoot me down with emails to my CEO, but luckily my manager pipes up with the idea to use one of the test systems that have been working for ages.
I like that idea, so I go to one of the test systems and start shutting it down when the big Telstra manager stops me and says that’s not the same, pointing at the case. Now there are 2 cases for these systems; one is just a slightly different design to give better cooling, but everything else is the same. I try to explain that it’s just the case that is different and if he wants I can even swap the cases, but no, he wants me to use a different test system that our developers are currently using to test a new application that is due in a few days. So now not only am I in Telstra’s sights but our dev team are also taking aim.
I take the other test system the big Telstra manager wants me to use, shut it down and make the swap. It boots up ok and is still running after 5 mins. I boot the rest of the systems and the IVR comes back to life. The Telstra guys start doing testing on the IVR to make sure it has come back ok and I start to clean up my mess when all of a sudden the normally very noisy floor of the exchange goes quiet…. Fans stop spinning, server lights go out and everyone is looking at me. FUCK …. “It wasn’t me” I explain to the now very angry managers.
Within 1 minute there are 20 people on the exchange floor with us, all looking into why they just lost power to half the floor when the exchange has banks of UPS’s and generators. The power can actually trip 4 times as they have 2 redundant power feeds that have battery & generator backups. The first question I get asked is “What did you do?” Now all the equipment I was working on is DC powered and it was the AC power that tripped. I hadn’t even plugged my laptop into the mains power yet. They are constantly asking me what I did and searching around where I was working, looking for any incriminating evidence. But lucky for my ass I was clean, because someone was going to burn for the outage that is never meant to happen.
Within 5 mins the power is restored and that’s when my phone starts ringing. On top of the failed disk I was originally called in to fix, which at this point I haven’t even looked at, I now have 5 other Sun systems that haven’t booted up successfully from the power outage that I have to look at. By this time it’s already 6pm, all the managers have left, and now I can look at the first server I came here to fix.
I get the failed servers back up and running, as most just needed the partitions to be checked then mounted cleanly. But some of those partitions are very large and take quite a while to check. By the time I get the last server running it’s after 11pm, I haven’t had lunch let alone dinner, and I’m feeling shit from being in the dry air-conditioned air all day. But they are all working, so I head off home.
I get in late the next day and my boss lets me know that the Telstra managers that were busting my balls big time yesterday have sent him a very big thank you for the work I did to get the system restored so quickly, and my boss is loving me (enough love that he gives me a pay rise a few weeks later).
So maybe it wasn’t such a bad day after all.