Monday, September 1, 2008

Willy and Wally install a SAN

THOSE AMAZING USERS

By Nancy Hand

When we first moved to Notes, Willy and Wally (our hardware guys) put a disk array under the primary mail servers. A couple of years later, it was time for new hardware. The hardware guys were seat-of-the-pants types. To them, documentation wasn't something you read so much as something you used to prop up your monitor.

By then, Willy and Wally were moving all the servers to SANs (Storage Area Networks), trying to pack more into less space. As part of the consolidation, the primary mail servers were going on a SAN with six other servers.

Being unfamiliar with SAN technology, I didn't know what questions to ask. I was assured the new servers and disks would be faster. It was just the idea of putting eight servers on one set of disks that made me queasy.

The change was made. Everything seemed to be running fine. Except.

I watched the Domino console screens, puzzled. On the old servers, the console display scrolled by, too fast to read. Now, I could read where three messages had been delivered to Joe Smith, and Nadine Chung had sent something to AOL. I questioned Willy and was told everything was fine.

I persisted. Something wasn't right. It seemed to me the new, faster servers were actually processing mail more slowly. I was told to produce numbers -- a performance monitor report -- to show what was wrong. I ran reports several times over the next few months. They showed nothing. The servers seemed to be running optimally. Mail still didn't look to be running as it should, but Willy and Wally wouldn't listen to my complaints.

Then.

The site added 1,000 temporary employees in preparation for critical work on the physical plant. Regular work would shut down at 2pm on a Tuesday so repairs could be started. Email would be a critical path of communication.

Email worked fine on Monday. It worked fine until almost 4pm Tuesday. Then it slowed to a crawl. Single-line messages took 30 seconds to open. People with large mailfiles waited 10 and 20 minutes for their inboxes to open. A short message sent to 3,000 people caused the Domino servers to panic.

Wednesday morning, my boss stormed into my cubicle wanting to know what was wrong with mail and what I was doing to fix it. Then, one of the other servers on the SAN crashed. Half of a pair of servers holding engineering documents on the plant, it was critical to the current repairs. My boss ran down the hall to confront someone else.

Willy came by to complain about how slow his mail was. I stared at him, trying to think of a way to say "I told you so" without using those exact words. I was saved as my boss dashed in to ask if I'd fixed the problem yet. Before I could answer, he dashed back down the hall to explain the problem to his boss. Wally wandered in to complain about the mail. Yet another server crash called Willy and Wally away from my desk before I could find my voice.