Let’s kick-start the new year with the third instalment of the “great technologies are not only for large enterprises” series. This time we’ll look into systems management, which helps IT departments centrally monitor the numerous servers, switches, routers, etc. in the company. In this field, the so-called Little 4 (OpenNMS, Zenoss, Hyperic, GroundWork) have made great advances in providing solid but free tools. Why are they called the Little 4? Because they are the 4 best open source systems management tools, in contrast to the 4 most well-known proprietary systems management vendors, called the Big 4 (HP, IBM, BMC, and CA).
This is why systems management is needed. The dashboard in a car tells the driver the most important indicators of the car’s performance at a glance. Similarly, a systems management dashboard quickly tells the IT department about the performance of all its critical equipment.
Good systems management software even allows the creation of customised dashboards.
A wealth of pre-configured metrics allows a quicker Return On Investment (ROI) by enabling system administrators to monitor key performance statistics out of the box. Add your router to the list of managed nodes and, voilà, you get router CPU and memory utilisation instantly. Add a network printer to this list and you instantly get the number of pages printed and the toner level. Add a server to this list and you’ll automatically see the number of users logged in as well as virtual memory usage, and so on.
Never know for sure how many servers, routers, and manageable switches are in your network? Lost track of which servers are acting as web servers, database servers, etc.? No problem. With the automatic discovery feature, system administrators can just ask the systems management software to find that information out.
It isn’t practical to monitor network health by continuously staring at the dashboards. If the network has been properly set up, then most of the time everything will work just fine. System administrators then only have to be alerted when something is out of the ordinary. Speedy alerts are important here. As the nodes being monitored are the critical ones, any disturbance will more likely than not affect multiple users. The alert will also help system administrators make a preliminary diagnosis in order to resolve the disruption. If e-mail is not quick enough, then other alert methods such as instant messaging would also be helpful.
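To give a feel for what automatic discovery does under the hood, here is a minimal Python sketch. It is not how any of the Little 4 actually implement discovery. It simply walks the addresses of a subnet and guesses each node’s role from the TCP ports that accept a connection. The `ROLE_PORTS` table, the `discover` function, and the example subnet are all assumptions made for illustration.

```python
import ipaddress
import socket

# Hypothetical port-to-role table; real products use far richer fingerprints.
ROLE_PORTS = {
    80: "web server",
    3306: "MySQL database",
    5432: "PostgreSQL database",
}


def tcp_port_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def discover(subnet, probe=tcp_port_open):
    """Map each host in `subnet` to the roles its open ports suggest.

    Hosts with none of the listed ports open are left out of the result.
    `probe` can be swapped for a fake when testing without a network.
    """
    inventory = {}
    for address in ipaddress.ip_network(subnet).hosts():
        host = str(address)
        roles = [role for port, role in ROLE_PORTS.items() if probe(host, port)]
        if roles:
            inventory[host] = roles
    return inventory
```

Calling `discover("192.168.1.0/24")` on a real network would attempt three connections per address, so a production tool would run the probes in parallel and combine them with other techniques such as ping sweeps and SNMP queries.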
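The alerting idea above boils down to comparing each reading against a limit and routing the message to a suitable channel. The Python sketch below shows one possible shape for that logic. The `Threshold` class, the metric names, and the channel labels are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str    # name of the reading, e.g. "cpu_percent"
    limit: float   # alert when the reading goes above this value
    channel: str   # where to send the alert, e.g. "email" or "im"


def evaluate(node, readings, thresholds):
    """Return (channel, message) pairs for every reading above its limit.

    Metrics missing from `readings` are skipped rather than treated as errors.
    """
    alerts = []
    for t in thresholds:
        value = readings.get(t.metric)
        if value is not None and value > t.limit:
            message = f"{node}: {t.metric} is {value}, above limit {t.limit}"
            alerts.append((t.channel, message))
    return alerts
```

Including the node name and the offending value in the message is what lets the administrator start a preliminary diagnosis straight from the alert. Urgent metrics can be mapped to the faster channel while the rest go to e-mail.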
Agent-less vs agent-based
In the agent-less approach, the systems management software finds out the current status of important network nodes by regularly querying the Simple Network Management Protocol (SNMP) server software residing within each node. Systems management usually queries only a few aspects of a few components in the network node. Each aspect of each component has a standardised Object IDentifier (OID) within the Management Information Base (MIB) catalogue.
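To make the OID idea concrete, here is a small Python sketch of one polling pass. The three OIDs are well-known entries from the standard MIB-II “system” group. Everything else is an assumption for illustration: `poll_node` is a made-up helper, and `snmp_get` stands in for whatever SNMP library call you actually use to fetch a single value.

```python
# Well-known scalar OIDs from the standard MIB-II "system" group.
SYSTEM_OIDS = {
    "sysDescr": "1.3.6.1.2.1.1.1.0",   # textual description of the node
    "sysUpTime": "1.3.6.1.2.1.1.3.0",  # time since the SNMP software started
    "sysName": "1.3.6.1.2.1.1.5.0",    # administratively assigned name
}


def poll_node(host, snmp_get, oids=SYSTEM_OIDS):
    """Ask one node for each OID and return {metric name: value}.

    `snmp_get(host, oid)` is a hypothetical callable supplied by the caller;
    in practice it would wrap an SNMP GET from a real SNMP library.
    """
    return {name: snmp_get(host, oid) for name, oid in oids.items()}


def timeticks_to_seconds(ticks):
    """Convert an SNMP TimeTicks value (hundredths of a second) to seconds."""
    return ticks / 100
```

The systems management software would run something like `poll_node` on a schedule for every managed node and store the results for dashboards and alerts.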
For security reasons, an SNMP server will only answer questions about its status from systems that belong to the same community, identified by a shared community name. By default, this community is called public. Wherever possible, this default community name should be changed in corporate networks, and the access level allowed should be read-only.
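A quick way to check this advice across many nodes is a small audit script. The Python sketch below assumes you already keep each node’s SNMP settings in a dictionary. That layout, the key names, and the `audit_snmp_config` helper are all hypothetical.

```python
DEFAULT_COMMUNITY = "public"


def audit_snmp_config(nodes):
    """Return (node name, problem) pairs for risky SNMP settings.

    `nodes` maps a node name to a dict with "community" and "access" keys
    (an assumed layout, not a standard format).
    """
    findings = []
    for name, settings in nodes.items():
        if settings["community"] == DEFAULT_COMMUNITY:
            findings.append((name, "default community name in use"))
        if settings["access"] != "read-only":
            findings.append((name, "access level is not read-only"))
    return findings
```

An empty result means every listed node has moved away from the default community name and allows read-only access.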
In the agent-based approach, the systems management software finds out the current status of important network nodes by communicating with purpose-built agent software installed on each node. The communication uses a proprietary method rather than the standardised SNMP, so hopefully it is more secure than SNMP, although it has a few shortcomings of its own.