There is an increasing number of hybrid environments where some of the systems are on-prem and others are running in Azure. For these types of environments, we need to use Operations Manager agents installed on the virtuals in IaaS and have them report to our on-prem Operations Manager environment (for information on how to monitor various types of Azure configuration see this blog post). In this blog post we will provide a method to determine a design which is optimal from a cost perspective depending on how many agents you have running in Azure. Please note, this blog post is based upon the concept that your Operations Manager already exists in your on-prem environment and the goal is to integrate your virtuals running in IaaS into your existing on-prem monitoring environment.
This blog post covers the following topics:
- Options available for monitoring IaaS virtuals in Azure
- Costs for VMs and outbound bandwidth in Azure
- Using the Excel calculator to determine costs of options to monitor agents in Azure
- How to capture the amount of data sent outbound from the Operations Manager agent
Options available for monitoring IaaS virtuals in Azure
There are two major methods available for Operations Manager agents running on virtuals in Azure to report to an on-prem Operations Manager environment. In both of these configurations we are working from the assumption that the virtuals in Azure are connected to the on-prem network through either a VPN connection or ExpressRoute. Using either of these types of connections, the agents in Azure on IaaS should be able to communicate directly with the on-prem management server(s) on port 5723. The first option is straightforward: have the virtuals in Azure report directly to the on-prem management servers, as in the example below: (Azure IaaS virtuals reporting directly to on-prem management servers)
Another option is to add two Azure Monitoring Gateways (AMGWs are regular Operations Manager gateways running in Azure) and have them report to the on-prem management servers as shown below: (Azure IaaS virtuals reporting to AMGWs which in turn report to on-prem management servers)
NOTE: For each of these designs, certificates are required only if the servers are outside the forest that Operations Manager is in.
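Both options depend on TCP port 5723 being reachable across the VPN or ExpressRoute link. As a rough sketch (not part of Operations Manager itself), the Python snippet below tries to open a TCP connection to a management server or gateway. The function name `can_reach` and the host name in the usage comment are hypothetical, and a successful connection only shows that the port is open, not that the agent is healthy.

```python
import socket


def can_reach(host: str, port: int = 5723, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (placeholder host name, replace with your own server):
#   can_reach("opsmgr-ms01.example.local")
```

Running this from one of the Azure virtuals against each management server (or AMGW) is a quick way to rule out firewall or routing problems before installing agents.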
Costs for VMs and outbound bandwidth in Azure
At this point you may be thinking that the normal Operations Manager design applies: treat Azure like a branch office and add a gateway (or two gateways for redundancy) if you have more than 10 or so systems in that remote location. However, one factor in designing for Azure may alter this design: cost. In Azure, you are charged for each virtual that you run in IaaS, so the two AMGWs will have a monthly cost. You are also charged for outbound network traffic from Azure, such as the traffic from the Operations Manager agents reporting to your on-prem Operations Manager environment. To complicate the decision further, gateway servers provide some bandwidth savings compared with agents reporting directly to management servers.
Cost for Gateway VMs
As an example, from the Azure Pricing Calculator we can see that the monthly costs on the two gateway servers vary depending on the level of hardware required (which will vary depending on how many agents are reporting to them).
For two extra-small basic VMs:
For two medium VMs:
For two large basic VMs:
We want to use two VMs in an availability set to avoid an outage when one of the two VMs is down. Since these are Operations Manager gateways that agents are reporting to, we do not need either load balancing or auto-scaling (although auto-scaling could be useful).
According to the Operations Manager sizer, the following is the minimum recommended hardware for up to 2000 agents in Operations Manager (which matches pretty closely to the Large VM configuration in Azure).
Cost for Outbound Bandwidth
From the Azure Pricing Calculator we can see that inbound data to Azure is free, and that the first 5 GB of outbound data has no additional cost.
0-5 GB free
If we look at the smallest tier available for bandwidth, we can see that the cost works out to approximately 8.33 cents per GB beyond the 5 GB included for free (the attached calculator provides the math backing this statement).
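To make the billing rule concrete, here is a minimal Python sketch of it. It assumes a single flat rate of 8.33 cents per GB after the 5 GB free allowance, which simplifies the figures quoted above. Real Azure pricing is tiered and changes over time, so treat both constants as placeholders, and note that the function name is my own.

```python
FREE_GB = 5.0         # outbound data included at no charge each month (from this post)
RATE_PER_GB = 0.0833  # approximate dollars per GB beyond the free allowance (from this post)


def outbound_cost(gb_per_month: float) -> float:
    """Estimate the monthly outbound bandwidth charge in dollars."""
    # Only the data above the free allowance is billable.
    billable_gb = max(0.0, gb_per_month - FREE_GB)
    return billable_gb * RATE_PER_GB


if __name__ == "__main__":
    for gb in (3, 5, 105):
        print(f"{gb} GB outbound -> ${outbound_cost(gb):.2f}")
```

Anything at or below 5 GB costs nothing under this model, and 105 GB is billed as 100 GB at the flat rate.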
Using the Excel calculator to determine costs of options to monitor agents in Azure
The Excel spreadsheet I put together for this takes the costing numbers above and combines them with several other factors (which size VMs to use, how many agents are in Azure, the gateway compression percentage, and the lower and upper outbound data values for your agents) to show approximate costs for monitoring these systems with and without gateway servers running in Azure.
To customize this spreadsheet for your environment, put in the number of VMs you are monitoring, choose the VM option you want (extra small, medium, large) and specify the lower and upper outbound traffic values for your specific environment (the last section of this blog post goes through the process to see how these values look for your specific environment). The most common items to change (VM option chosen and # of VMs monitored) are both shown in red, and the approximate cost ranges with and without gateways are shown in yellow.
To determine our estimated compression for the gateway, we drew from the following two blog posts:
- "In our performance tests we have noticed that the data sent from a gateway server to a management server is compressed by almost 50 percent." From http://blogs.technet.com/b/momteam/archive/2008/02/19/10-reasons-to-use-a-gateway-server.aspx
- The gateway would cut this in half approximately: http://blogs.technet.com/b/momteam/archive/2007/10/22/network-bandwidth-utilization-for-the-various-opsmgr-2007-roles.aspx
If you utilize gateway servers in your environment and can measure traffic before and after, you can determine whether this percentage is consistent with what you find for your environment.
In my environments, I found that the total data size for my servers varied strongly depending on what was being monitored on the server (which should not be any surprise). In the example below, the total data resulted in 1118 bytes per minute, or roughly .0011 MB/min, for a domain controller.
For a SQL server, I had a lower value of 500 bytes per minute, or .00048 MB/min.
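The comparison the spreadsheet makes can be sketched in a few lines of Python. This is an illustration of the idea under stated assumptions, not a copy of the spreadsheet's formulas. It assumes a 30-day month, a flat 8.33 cents per GB after 5 GB free, roughly 50 percent gateway compression, and that traffic between agents and gateways inside Azure is not billed as outbound data. The gateway VM cost is an input you supply from the Azure Pricing Calculator, and all function names are hypothetical.

```python
MINUTES_PER_MONTH = 60 * 24 * 30  # assumes a 30-day month
BYTES_PER_GB = 1024 ** 3


def monthly_gb(agents: int, bytes_per_min: float) -> float:
    """Total outbound data per month, in GB, for all agents combined."""
    return agents * bytes_per_min * MINUTES_PER_MONTH / BYTES_PER_GB


def bandwidth_cost(gb: float, free_gb: float = 5.0, rate: float = 0.0833) -> float:
    """Flat-rate estimate of the outbound charge in dollars (simplified pricing)."""
    return max(0.0, gb - free_gb) * rate


def compare(agents: int, bytes_per_min: float, gateway_vm_cost: float,
            compression: float = 0.5) -> tuple:
    """Return (direct_cost, gateway_cost) in dollars per month."""
    direct_gb = monthly_gb(agents, bytes_per_min)
    # Gateways forward the same data, reduced by the compression ratio.
    gateway_gb = direct_gb * (1.0 - compression)
    direct = bandwidth_cost(direct_gb)
    via_gateway = bandwidth_cost(gateway_gb) + gateway_vm_cost
    return direct, via_gateway


if __name__ == "__main__":
    # 100.0 is a made-up monthly price for the pair of gateway VMs.
    direct, via_gateway = compare(agents=1000, bytes_per_min=1118,
                                  gateway_vm_cost=100.0)
    print(f"direct: ${direct:.2f}/month, via gateways: ${via_gateway:.2f}/month")
```

Gateways only pay for themselves when the bandwidth they save exceeds the monthly cost of the two VMs, which is why the per-agent data rate and the agent count matter so much.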
How to capture the amount of data sent outbound from the Operations Manager agent
To identify the amount of traffic, I downloaded and installed NetworkTrafficView from http://www.nirsoft.net/utils/network_traffic_view.html. Another option would be to install NetMon to gather this information. Run NetworkTrafficView and pick the appropriate network interface.
NOTE: It may be required to disable IPv6 on the agent so that you can get the IPv4 connection information like I have done in the example below.
Under Options uncheck the Hide Closed TCP Connections option:
Find the name or IP address of the Operations Manager server that the agent is communicating with. It should show port 5723 as the destination port, as in the example shown below.
Total Data Size: The total size of the data of all packets (in bytes), excluding the Ethernet and TCP/IP headers, for the specified packets group.
Run the data capture for between one hour and 24 hours, then add up the Total Data Size values for all of the connections from the agent to the management server. That sum is the number of bytes sent from the agent to the management server. Dividing it by the number of minutes the capture ran gives bytes per minute, which is the value we are looking for.
In the example above we add the three values (29494 + 40908 + 5606) to get a total of 76008 bytes, and then we divide by the number of minutes the capture ran (60 in this example). That gives about 1266 bytes per minute, or roughly .0012 MB/min (http://www.bing.com/search?q=convert+bytes+to+mb).
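The same arithmetic in Python, as a small sketch using the three Total Data Size values from the example (the function names are my own, and 1 MB is taken as 1024 × 1024 bytes):

```python
from typing import Iterable


def bytes_per_minute(total_data_sizes: Iterable[int], capture_minutes: float) -> float:
    """Sum the Total Data Size of every agent connection and average per minute."""
    return sum(total_data_sizes) / capture_minutes


def to_mb(n_bytes: float) -> float:
    """Convert bytes to megabytes (1 MB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


# Three connections captured over 60 minutes.
rate = bytes_per_minute([29494, 40908, 5606], 60)
print(f"{rate:.1f} bytes/min, {to_mb(rate):.4f} MB/min")
# prints "1266.8 bytes/min, 0.0012 MB/min"
```

For a longer capture, pass the real duration in minutes (1440 for 24 hours) so the average reflects the whole period.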
The values used in the Excel spreadsheet will change over time, especially if there are changes to the costs for VMs or for outbound data from Azure. Additionally, the total outbound data will vary significantly in each environment, so I highly recommend running tests over a 24-hour period to see what average data traffic looks like for your specific workloads.
Summary: Your decision to have agents report either directly to a management server on-prem or to a gateway in Azure will depend on how much outbound data you have for your environment and how many VMs you will have running in Azure. In general, from what I have seen in my spreadsheet, it becomes more and more cost effective to add gateway servers once you have a significant number of VMs running in Azure (potentially saving hundreds of dollars on a monthly basis due to the data compression provided by the gateway). The spreadsheet discussed in this blog post is available for download at: http://www.systemcentercentral.com/download/excel-spreadsheet-calculate-monitoring-costs-iaas-virtuals-reporting-prem-opsmgr/