What must an IT Department do to be successful? Running an IT Department requires diligence across many technology disciplines. Here are some criteria for IT Management that, if met, will bring IT Operations closer to success. An IT Department is succeeding:
- When the latest security patches have been applied to all servers. We use Windows Server Update Services (WSUS) to meet this requirement within Catapult. The details of how we evaluate and approve patches are described in our patch procedure.
- When all hardware is operational. There are no known failed components in the infrastructure. A streamlined process is in place to detect and respond to failed components. We also monitor the life cycle of equipment to make sure that critical systems are always under warranty. Microsoft System Center Operations Manager (SCOM) is used within Catapult to meet this requirement.
- When all critical devices are monitored 24/7 and IT staff is notified when a failure event occurs. SCOM is used within Catapult to meet this requirement.
- When line-of-business applications have sufficient bandwidth to perform their role. A monitoring solution should alert IT when network traffic exceeds 70% of link capacity, because WAN links approach saturation at this level and TCP retransmissions begin to occur, causing latency within applications. We use NetFlow Analyzer from ManageEngine to meet this requirement.
- When servers have sufficient hard drive space to perform their role. SCOM is used within Catapult to meet this requirement.
- When servers are protected from viruses. We use Microsoft Forefront Client Security to meet this requirement. The Forefront line of products also includes specific solutions tailored for Exchange, SharePoint, and Office Communications Server. We also use a hosted (cloud) offering called Forefront Online Security for Exchange, which scans emails for viruses and spam in the cloud before they ever reach our network.
- When servers are protected from data loss. We use Microsoft System Center Data Protection Manager (DPM) to perform backup and recovery at Catapult. An additional criterion for success is the ability to restore a server quickly. Since DPM stores its data on disk, you do not need to locate a tape. Tape is also less reliable than disk and eventually goes bad after much usage. You don't want to find out at the time of crisis that the tape is bad. Another criterion for success is the replication of this backup data to another, geographically separate data center to protect against theft or natural disaster.
- When servers are fast or adequately responsive to end user requests. We use SCOM to monitor the responsiveness of our servers and applications against thresholds.
- When servers have sufficient capacity to not only meet existing need, but to handle data and transactional growth for the next twelve months. We use Microsoft System Center Capacity Planner. Additionally, we use the trending capabilities within SCOM to forecast capacity.
- When all servers are provisioned with the smallest attack surface possible. Windows Server 2008 was designed to limit the attack surface (especially Server Core).
- When IT can respond to a request to provision a server in minutes. We use System Center Virtual Machine Manager (VMM) to instantly respond to requests for new servers. This leverages our fleet of Windows Server 2008 Hyper-V servers.
- When IT discusses and then tests changes before implementing them in a production environment. Using virtualization can help reduce the cost of implementing change management, since test copies of servers are cheap to create.
- When the most critical systems are clustered.
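Two of the criteria above come down to simple arithmetic: the 70% bandwidth alert and the twelve-month capacity target. The sketch below is only an illustration of those two checks. It is not how NetFlow Analyzer, SCOM, or Capacity Planner compute anything. All function names are hypothetical, and the forecast assumes growth is roughly linear, which real workloads may not follow.

```python
# Illustrative only: hypothetical helpers, not part of any monitoring product.

UTILIZATION_ALERT_THRESHOLD = 0.70  # the 70% figure from the bandwidth criterion


def link_utilization(bits_per_second: float, link_capacity_bps: float) -> float:
    """Return the fraction of a link's capacity currently in use."""
    if link_capacity_bps <= 0:
        raise ValueError("link capacity must be positive")
    return bits_per_second / link_capacity_bps


def should_alert(bits_per_second: float, link_capacity_bps: float,
                 threshold: float = UTILIZATION_ALERT_THRESHOLD) -> bool:
    """True when traffic exceeds the alert threshold for the link."""
    return link_utilization(bits_per_second, link_capacity_bps) > threshold


def meets_growth_target(used_gb: float, capacity_gb: float,
                        growth_gb_per_month: float,
                        horizon_months: int = 12) -> bool:
    """True when projected usage still fits after the planning horizon.

    Assumes linear growth (an assumption of this sketch, not of the tools).
    """
    projected_gb = used_gb + growth_gb_per_month * horizon_months
    return projected_gb <= capacity_gb


def months_until_full(used_gb: float, capacity_gb: float,
                      growth_gb_per_month: float):
    """Months of headroom left at the current growth rate, or None if not growing."""
    if growth_gb_per_month <= 0:
        return None
    return max(0.0, (capacity_gb - used_gb) / growth_gb_per_month)


if __name__ == "__main__":
    # A 10 Mbps WAN link carrying 8 Mbps is above the 70% threshold.
    print(should_alert(8_000_000, 10_000_000))
    # A 1000 GB volume at 400 GB, growing 60 GB per month.
    print(meets_growth_target(400, 1000, 60))
    print(months_until_full(400, 1000, 60))
```

In practice the inputs would come from whatever your monitoring tool exports. The point is that each criterion can be turned into a yes/no test that IT can report on.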
Other questions to be asked include:
- What executive-level reporting is available to keep IT accountable to the executives?
- Is there a continuous training plan in place to equip IT Staff to handle these diverse requirements?
Please leave a comment below if you have any other suggestions to add to this list.