Open source systems management tools

If your IT shop has the right skills, open source systems management tools may be a fit for your data center and save money over proprietary solutions. This slideshow covers the features of some of the top tools.

Large IT organizations turn to open source systems management tools

Top areas where open source systems management tools are used

Usenix, a systems administration user group, and Zenoss, an open source systems management vendor, recently completed a survey on open source systems management software use between 2006 and 2009. Respondents were attendees of the organization’s Large Installation System Administrators conference. Nearly all respondents use or plan to use open source systems management tools, with many shops turning to Nagios, Cacti, Zabbix, GroundWork and the OpenNMS project. When asked “What are the top areas where you plan to use open source systems management tools?” 90% answered monitoring, around 60% said configuration and around 50% said patch management.

The benefits of open source systems management

Top reason for using open source software

When asked “Why did you or would you be likely to try open source software?” responding shops said that they have turned to open source systems management tools to reduce costs and increase flexibility. Easy deployment was also a top reason for trying open source: in 2006, only 26% of survey respondents cited it, but by 2009 that share had risen to 71%. This finding may indicate that open source not only removes technical hurdles but also preempts some of the bureaucratic obstacles associated with the traditional technology procurement process.

“Open source offerings are newer and often written to be easier to deploy than older systems,” said Michael Coté, an analyst at RedMonk, an industry analyst firm. “An admin can download and install it without asking for funding, agreeing to any terms for a trial or filling out registration forms. Being able to download a piece of software by right-clicking is going to be easier than most other acquisition paths.”

The drawbacks to open source systems management

Top reasons for not using open source

So what are the primary reasons IT shops would not use open source tools? Lack of support was the main culprit, and users said proprietary tools had better support and product maturity as well as less risk.

“You get the support you pay for,” Coté said. “If you don’t want to pay anything, just download Nagios, OpenNMS or Zenoss Core and go at it alone. You’ll be paying in your time: time to ask questions in forums and wait for answers, time to look through existing write-ups on the Web, and, if you’re of the right kind of mind, time to look through the code yourself. Closed-source offerings can seem to have more support available because you’re required to buy support.”

Ed Bailey, a Unix team lead at a major credit reporting agency, uses the proprietary version Hyperic HQ Enterprise to manage Web applications that drive his company’s revenue. Bailey said he doesn’t have the time to cobble together — let alone develop and maintain — the automation, security and reporting features that ship with the enterprise version. “You can make a reporting system for the open source version of Hyperic HQ. If you have the time, you can make anything. But our company is more focused on things that generate revenue rather than me spending time working on this,” Bailey said. “I used to work at a university and we had time to build something like that, whereas now we have millions of transactions that are making money.”

Special skills to use open source systems management tools?

What skill set do sys admins need to have to deploy systems management software successfully in an IT organization? “Any scripting experience in general is helpful,” said Ryan Matte, a data center admin at Nova Networks Inc. “Basic Python knowledge is very helpful when using Zenoss. I often use Bash scripting as well. A decent understanding of SNMP [Simple Network Management Protocol] is definitely required (since the open source products don’t tend to be as automated as the enterprise products). I often find myself developing custom SNMP monitoring templates for devices, [but] … you should have an understanding of whatever protocols you are working with. An understanding of Linux/BSD [Berkeley Software Distribution] is helpful as well since most of the open source monitoring products that I’ve seen only run on Linux/BSD.”
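The scripting Matte describes often takes the form of small check scripts. As a minimal sketch (not taken from the survey or from any vendor’s documentation), the Python below follows the widely used Nagios plugin convention, in which a check prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The function names and the threshold values are assumptions chosen for illustration.

```python
import shutil

# Exit codes used by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_disk(percent_used, warn=80.0, crit=90.0):
    """Map a disk-usage percentage to an (exit_code, status_line) pair.

    The warn/crit thresholds are illustrative defaults, not recommendations.
    """
    if percent_used is None or not 0 <= percent_used <= 100:
        return UNKNOWN, "DISK UNKNOWN - no usable reading"
    if percent_used >= crit:
        return CRITICAL, f"DISK CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"DISK WARNING - {percent_used:.1f}% used"
    return OK, f"DISK OK - {percent_used:.1f}% used"


def disk_percent_used(path="/"):
    """Read real usage for a mount point with the standard library."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total
```

A wrapper script would call `evaluate_disk(disk_percent_used("/"))`, print the status line and pass the code to `sys.exit` so the monitoring tool can read the result.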

Virtualization driving proprietary management tool dominance

% of respondents who cite that product features have become the more important advantage of proprietary software

Starting in 2009, a much larger percentage of data center managers indicated proprietary systems management software has an advantage over open source tools in advanced product features. In 2009, 33% of all respondents indicated that product features played a bigger part in defining the advantages of commercial tools, versus 10% in the previous year. Though not explicitly spelled out in the survey, you can translate product features to “virtualization management features.” Matte is using Zenoss’ open source offering, Zenoss Core, and said he has evaluated Zenoss’ proprietary enterprise ZenPacks, which have virtual machine management features. “I have taken a look at the enterprise ZenPacks, and there is nothing like the VMware [Management] Pack in the open source community,” Matte said.

Open source systems management profile: Spacewalk

Spacewalk

Spacewalk is an open source Linux systems management tool and the upstream community project from which the Red Hat Network Satellite product is derived. Spacewalk provides provisioning and monitoring capabilities as well as software content management.

James Hogarth, a data center admin in the U.K., uses Spacewalk to manage 100 hosts in a CentOS-based environment for an entertainment website built on the Grails distribution. Hogarth said his company’s entire environment is focused on open source software, even migrating server virtualization from VMware to the Red Hat Kernel-based Virtual Machine (or KVM) hypervisor, and that open source focus was a major factor in the decision to use open source systems management tools.

Hogarth said he’s run into some gotchas and issues that needed a workaround, but overall Spacewalk has lightened his support workload. Most of the development is done by Red Hat personnel, and the developers are often available to answer questions and troubleshoot issues. “People are very responsive [on the support forum], and it’s relatively rare that you don’t get a response,” Hogarth said. “Over the last two years, the product has really matured.”

Open source data center automation and configuration tools

Puppet is one option

In the open source space, Cfengine and Puppet are leading data center automation and configuration tools. In 1993, Mark Burgess at Oslo University College wrote Cfengine, which can be used to build, deploy, manage and audit all the major operating systems. Cfengine boasts some large customers, including companies such as eBay and Google. Cfengine offers a proprietary commercial version called Cfengine Nova. As an open source-only product, Puppet takes a different approach, and its creators, Puppet Labs, make money through training and support.

Puppet founder Andrew Schafer, for example, wrote a column on Puppet and how it works. Also, James Turnbull recently wrote a book on using Puppet in the data center. Turnbull has also written tips on Puppet, including the recent article on using the Puppet dashboard. The Oregon State University Open Source Laboratory uses Cfengine for systems management but planned to move to Puppet. “From a technical point of view, Puppet offers more flexibility and an ability to actually use real code to deal with tasks. Cfengine has its own syntax language, but it’s not really suited for complex tasks,” said OSUOSL administrator Lance Albertson in an interview earlier this year.

Open core versus open source software

Some companies offer what’s considered “open core” systems management software. At the base level is a functional, free open source tool (like Zenoss Core or Hyperic HQ), and there is a separate proprietary enterprise version with special add-ons and features. This business model rankles some open source advocates, but it offers companies the chance to use a tool risk free, and oftentimes organizations can make the free version work.

Ryan Matte, a data center admin at Ottawa, Ontario-based Nova Networks Inc., uses Zenoss Core to manage more than 1,000 devices, monitoring Windows, Linux, Solaris and network devices. Matte considered Nagios, Zabbix, and OpenNMS. “In terms of ease of use and setup and having all the monitoring capabilities in the product, Zenoss was the best choice,” he said. “There’s an IRC channel chat room; I’m in there quite a bit. There are always people in there. The [community] support is pretty good, but you have to come in during business hours.”

Using Webmin for data center server management

Webmin

Webmin offers a browser-based interface to Unix and Linux operating systems. It can configure users, disk quotas, services or configuration files as well as modify and control open source apps.

Using Nagios in the data center to manage servers

Nagios

In many data center environments, Nagios has become the de facto standard for companies in need of an open source, fault-tolerant solution to monitor single points of failure, service-level agreement shortcomings, servers, redundant communication connections or environmental factors. But is this one-size-fits-all open source tool best suited to your data center?

A Guide to Physical Security for Data Centers

July 26, 2012

The aim of physical data center security is largely the same worldwide, barring any local regulatory restrictions: that is, to keep out the people you don’t want in your building, and if they do make it in, then identify them as soon as possible (ideally also keeping them contained to a section of the building). The old adage of network security specialists, that “security is like an onion” (it makes you cry!) because you need to have it in layers built up from the area you’re trying to protect, applies just as much for the physical security of a data center.

There are plenty of resources to guide you through the process of designing a highly secure data center that will focus on building a “gold standard” facility capable of hosting the most sensitive government data. For the majority of companies, however, this approach will be overkill and will end up costing millions to implement.

When looking at physical security for a new or existing data center, you first need to perform a basic risk assessment of the data and equipment that the facility will hold according to the usual impact-versus-likelihood scale (i.e., the impact of a breach of the data center versus the likelihood of that breach actually happening). This assessment should then serve as the basis of how far you go with the physical security. It is impossible to counter all potential threats you could face, and this is where identification of a breach, then containment, comes in. By the same token, you need to ask yourself if you are likely to face someone trying to blast their way in through the walls with explosives!
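To make the impact-versus-likelihood idea concrete, here is a minimal Python sketch. It assumes a 1-to-5 scale for each factor and multiplies them into a single score, which is one common way to rank risks but not the only one. The threats and the numbers attached to them are hypothetical and exist only to show the mechanics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    impact: int      # 1 (negligible) to 5 (severe)
    likelihood: int  # 1 (rare) to 5 (almost certain)

    @property
    def score(self) -> int:
        # Simple impact-times-likelihood product; other weightings are possible.
        return self.impact * self.likelihood


def rank_threats(threats):
    """Return threats sorted from highest to lowest risk score."""
    return sorted(threats, key=lambda t: t.score, reverse=True)


# Hypothetical threats and scores, for illustration only.
threats = [
    Threat("Tailgating through the main entrance", impact=4, likelihood=4),
    Threat("Ex-employee with an active access card", impact=4, likelihood=3),
    Threat("Explosives used to breach a wall", impact=5, likelihood=1),
]

for threat in rank_threats(threats):
    print(f"{threat.score:>2}  {threat.name}")
```

With these made-up numbers the explosives scenario lands at the bottom of the list despite its severe impact, which mirrors the point that spending should follow likely threats rather than dramatic ones.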

There are a few basic principles that I feel any data center build should follow, however:

  • Low-key appearance: Especially in a populated area, you don’t want to be advertising to everyone that you are running a data center. Avoid any signage that references “data center” and try to keep the exterior of the building as nondescript as possible so that it blends in with the other premises in the area.
  • Avoid windows: There shouldn’t be windows directly onto the data floor, and any glazing required should open onto common areas and offices. Use laminate glass where possible, but otherwise make sure windows are double-glazed and shatter resistant.
  • Limit entry points: Access to the building needs to be controlled. Having a single point of entry for visitors and contacts along with a loading bay for deliveries allows you to funnel all visitors through one location where they can be identified. Loading-bay access should be controlled from security or reception, ideally with the shutter motors completely powered down (so they can’t be opened manually either). Your security personnel should only open the doors when a pre-notified delivery is arriving (i.e., one where security has been informed of the time/date and the delivery is correctly labelled with any internal references). Of course all loading-bay activity should also be monitored by CCTV.
  • Anti-passback and man-traps: Tailgating (following someone through a door before it closes) is one of the main ways that an unauthorized visitor will gain access to your facility. By implementing man-traps that only allow one person through at a time, you force visitors to be identified before allowing access. And anti-passback means that if someone tailgates into a building, it’s much harder for them to leave.
  • Hinges on the inside: A common mistake when repurposing an older building is upgrading the locks on doors and windows but leaving the hinges on the outside of the building. This makes it really easy for someone to pop the pins out and just take the door off its hinges (negating the effect of that expensive lock you put on it!).
  • Plenty of cameras: CCTV cameras are a good deterrent for an opportunist and cover one of the main principles of security, which is identification (both of a security breach occurring and the perpetrator). At a minimum you should have full pan, tilt and zoom cameras on the perimeter of your building, along with fixed CCTV cameras covering building and data floor entrances/exits. All footage should be stored digitally and archived offsite, ideally in real time, so that you have a copy if the DVR is taken during a breach.
  • Make fire doors exit only (and install alarms on them): Fire doors are a requirement for health and safety, but you should make sure they only open outward and have active alarms at all times. Alarms need to sound if fire doors are opened at any time and should indicate, via the alarm panel, which door has been opened; it could just be someone going out for a cigarette, but it could also be someone trying to make a quick escape or loading up a van! On the subject of alarms, all doors need to have alarms and be set to go off if they are left open for too long, and your system should be linked to your local police force, who can respond when certain conditions are met.
  • Door control: You need granular control over which visitors can access certain parts of your facility. The easiest way to do this is through proximity access card readers (lately, biometrics have become more common) on the doors; these readers should trigger a maglock to open. This way you can specify through the access control software which doors can be opened by any individual card. It also provides an auditable log of visitors trying to access those doors (ideally tied in with CCTV footage), and by using maglocks, there are no tumblers to lock pick, or numerical keypads to copy.
  • Parking lot entry control: Access to the facility compound, usually a parking lot, needs to be strictly controlled either with gated entry that can be opened remotely by your reception/security once the driver has been identified, or with retractable bollards. The idea of this measure is to not only prevent unauthorized visitors from just driving into your parking lot and having a look around, but also to prevent anyone from coming straight into the lot with the intention of ramming the building for access. You can also make effective use of landscaping to assist with security by having your building set back from the road, and by using a winding route into the parking lot, you can limit the speed of any vehicles. And large boulders make effective barriers while also looking nice!
  • Permanent security staff: Many facilities are manned with contract staff from a security company. These personnel are suitable for the majority of situations, but if you have particularly sensitive data or equipment, you will want to consider hiring your security staff permanently. Contract staff can be changed on short notice (most often because of illness), which is both a plus and a minus: it creates the opportunity for someone to impersonate your contracted security to gain access. You are also at more risk by having a security guard who doesn’t know your site and probably isn’t familiar with your processes.
  • Test, test and test again: No matter how simple or complex your security system, it will be useless if you don’t test it regularly (both systems and staff) to make sure it works as expected. You need to make sure alarms are working, CCTV cameras are functioning, door controls work, staff understands how visitors are identified and, most importantly, no one has access privileges that they shouldn’t have. It is common for a disgruntled employee who has been fired to still have access to a building, or for a visitor to leave with a proximity access card that is never canceled; you need to make sure your HR and security policies cover removing access as soon as possible. It’s only by regular testing and auditing of your security systems that any gaps will be identified before someone can take advantage of them.
  • Don’t forget the layers: Last, all security systems should be layered on each other. This ensures that anyone trying to access your “core” (in most cases the data floor) has passed through multiple checks and controls; the idea is that if one check fails, the next will work.
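The door control and anti-passback principles in the list above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular access control product works: the class name, the card and door names and the single-zone treatment of anti-passback are all hypothetical simplifications.

```python
from datetime import datetime, timezone


class AccessController:
    """Toy model of card-based door control with an audit log."""

    def __init__(self, permissions):
        # permissions maps a card ID to the set of doors it may open.
        self.permissions = permissions
        self.inside = set()   # cards currently badged in (for anti-passback)
        self.audit_log = []   # every attempt is recorded, granted or not

    def _record(self, card, door, action, granted, reason):
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "card": card, "door": door, "action": action,
            "granted": granted, "reason": reason,
        })
        return granted

    def enter(self, card, door):
        if door not in self.permissions.get(card, set()):
            return self._record(card, door, "enter", False, "door not permitted")
        if card in self.inside:
            # Anti-passback: a card already inside cannot be used to enter again.
            return self._record(card, door, "enter", False, "anti-passback")
        self.inside.add(card)
        return self._record(card, door, "enter", True, "ok")

    def leave(self, card, door):
        if card not in self.inside:
            # A card that never badged in (e.g. after tailgating) cannot badge out.
            return self._record(card, door, "leave", False, "anti-passback")
        self.inside.discard(card)
        return self._record(card, door, "leave", True, "ok")
```

Because denied attempts are logged alongside granted ones, the audit log doubles as the record you would cross-check against CCTV footage.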

The general rule is that anyone entering the most secure part of the data center will have been authenticated at least four times:

1. At the outer door or parking entrance. Don’t forget you’ll need a way for visitors to contact the front desk.

2. At the inner door that separates the visitors from the general building staff. This will be where identification or biometrics are checked to issue a proximity card for building access.

3. At the entrance to the data floor. Usually, this is the layer that has the strongest “positive control,” meaning no tailgating is allowed through this check. Access should only be through a proximity access card and all access should be monitored by CCTV. So this will generally be one of the following:

  • A floor-to-ceiling turnstile. If someone tries to sneak in behind an authorized visitor, the door gently revolves in the reverse direction. (In case of a fire, the walls of the turnstile flatten to allow quick egress.)
  • A man-trap. Provides alternate access for equipment and for persons with disabilities. This consists of two separate doors with an airlock in between. Only one door can be opened at a time and authentication is needed for both doors.

4. At the door to an individual server cabinet. Racks should have lockable front and rear doors that use a three-digit combination lock as a minimum. This is a final check, once someone has access to the data floor, to ensure they only access authorized equipment.
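One way to audit the four layers is to check, for each person, which layers they have actually authenticated at, in order from the outside in. The short Python sketch below does that. The layer labels are taken from the list above, while the function names and the idea of feeding it a set of cleared layers are assumptions made for illustration.

```python
# The four layers described above, from outermost to innermost.
LAYERS = ["outer door", "inner door", "data floor entrance", "server cabinet"]


def first_failed_layer(passed_checks):
    """Return the first layer a person has not cleared, or None if all are cleared.

    passed_checks is the set of layer names at which the person authenticated.
    Layers are checked from the outside in, so clearing an inner layer does
    not count if an outer one was skipped.
    """
    for layer in LAYERS:
        if layer not in passed_checks:
            return layer
    return None


def may_open_cabinet(passed_checks):
    """Grant cabinet access only when every layer has been cleared."""
    return first_failed_layer(passed_checks) is None
```

A person recorded at the data floor entrance but not at the outer door is a sign of tailgating or a gap in the logs, and `first_failed_layer` reports the outer door in that case.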

The above isn’t an exhaustive list but should cover the basics of what you need to consider when building or retrofitting a data center. It’s also a useful checklist for auditing your colocation provider if you don’t run your own facility.

In the end, however, all physical security comes down to managing risks, along with the balance of “CIA” (confidentiality, integrity and access). It’s easy to create a highly secure building that is very confidential and has very high integrity of information stored within: you just encase the whole thing in a yard of concrete once it’s built! But this defeats the purpose of access, so you need a balance between the three to ensure that reasonable risks are mitigated and to work within your budget—everything comes down to how much money you have to spend.

About the Author

David Barker is technical director of 4D Data Centres. David (26) founded the company in 1999 at age 14. Since then he has masterminded 4D’s development into the full-fledged colocation and connectivity provider that it is today. As technical director, David is responsible for the ongoing strategic overview of 4D Data Centres’ IT and physical infrastructure. Working closely with the head of IT server administration and head of network infrastructure, David also leads any major technical change-management projects that the company undertakes.

About 10 “must haves” your data center needs to be successful.

The evolution of the data center may transform it into a very different environment thanks to the advent of new technologies such as cloud computing and virtualization. However, there will always be certain essential elements required by any data center to operate smoothly and successfully. These elements will apply whether your data center is the size of a walk-in closet or an airplane hangar – or perhaps even on a floating barge, which rumors indicate Google is building:

Figure A

floatingbarge_google.jpg
 Credit: Wikimedia Commons

1. Environmental controls

A standardized and predictable environment is the cornerstone of any quality data center.  It’s not just about keeping things cool and maintaining appropriate humidity levels (according to Wikipedia, the recommended temperature range is 61-75 degrees Fahrenheit/16-24 degrees Celsius and 40-55% humidity). You also have to factor in fire suppression, air flow and power distribution.  One company I worked at was so serious about ensuring their data center remained as pristine as possible that it mandated no cardboard boxes could be stored in that room. The theory behind this was that cardboard particles could enter the airstream and potentially pollute the servers thanks to the distribution mechanism which brought cooler air to the front of the racks. That might be extreme but it illustrates the importance of the concept.

2. Security

It goes without saying (but I’m going to say it anyhow) that physical security is a foundation of a reliable data center. Keeping your systems under lock and key and providing entry only to authorized personnel goes hand in hand with permitting only the necessary access to servers, applications and data over the network. It’s safe to say that the most valuable assets of any company (other than people, of course) reside in the data center. Small-time thieves will go after laptops or personal cell phones. Professionals will target the data center. Door locks can be overcome, so I recommend alarms as well. Of course, alarms can also be fallible so think about your next measure: locking the server racks? Backup power for your security system? Hiring security guards? It depends on your security needs, but keep in mind that “security is a journey, not a destination.”

3. Accountability

Speaking as a system administrator, I can attest that most IT people are professional and trustworthy.  However, that doesn’t negate the need for accountability in the data center to track the interactions people have with it. Data centers should log entry details via badge access (and I recommend that these logs are held by someone outside of IT such as the Security department, or that copies of the information are kept in multiple hands such as the IT Director and VP). Visitors should sign in and sign out and remain under supervision at all times. Auditing of network/application/file resources should be turned on. Last but not least, every system should have an identified owner, whether it is a server, a router, a data center chiller, or an alarm system.
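The “identified owner” rule is easy to check automatically if you keep an inventory. Here is a minimal Python sketch, assuming the inventory is a dictionary keyed by system name with an optional `owner` field. The structure and every entry in it are hypothetical.

```python
def systems_without_owner(inventory):
    """Return the names of inventory entries that have no owner recorded."""
    return sorted(
        name for name, record in inventory.items()
        if not (record.get("owner") or "").strip()
    )


# Hypothetical inventory; names and owners are made up for illustration.
inventory = {
    "mail-server-01": {"type": "server", "owner": "messaging team"},
    "core-router-01": {"type": "router", "owner": ""},
    "chiller-02": {"type": "chiller"},
    "door-alarm-panel": {"type": "alarm system", "owner": "security department"},
}

print(systems_without_owner(inventory))
```

Both a blank owner and a missing owner field are reported, so the router and the chiller show up here.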

4. Policies

Every process involved with the data center should have a policy behind it to help keep the environment maintained and managed. You need policies for system access and usage (for instance, only database administrators have full control of the SQL server). You should have policies for data retention – how long do you store backups? Do you keep them off-site and if so when do these expire? The same concept applies to installing new systems, checking for obsolete devices/services, and removal of old equipment – for instance, wiping server hard drives and donating or recycling the hardware.
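A retention policy becomes much easier to enforce once it is written down as data. The Python sketch below shows one way that might look. The retention periods, the backup names and the tuple layout are assumptions for illustration, and real values would come from your own policy.

```python
from datetime import date, timedelta

# Hypothetical retention periods in days; real values come from your policy.
RETENTION_DAYS = {"daily": 30, "weekly": 90, "monthly": 365}


def expired_backups(backups, today):
    """Return names of backups whose retention period has elapsed as of `today`.

    Each backup is a (name, kind, created_on) tuple, where kind is a key
    of RETENTION_DAYS and created_on is a datetime.date.
    """
    expired = []
    for name, kind, created_on in backups:
        keep_until = created_on + timedelta(days=RETENTION_DAYS[kind])
        if today > keep_until:
            expired.append(name)
    return expired


backups = [
    ("db-daily-2024-01-01", "daily", date(2024, 1, 1)),
    ("db-weekly-2024-01-07", "weekly", date(2024, 1, 7)),
    ("db-monthly-2024-01-31", "monthly", date(2024, 1, 31)),
]
print(expired_backups(backups, today=date(2024, 3, 1)))
```

Only the daily backup is past its 30-day window on the sample date, so it is the only one listed for removal.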

5. Redundancy

fordpinto.image002.jpg
 Credit: Wikimedia Commons

The first car I ever owned was a blue Ford Pinto. My parents paid $400 for it and at the time, gas was a buck a gallon, so I drove everywhere. It had a spare tire which came in handy quite often. I’m telling you this not to wax nostalgic but to make a point: even my old breakdown-prone car had redundancy. Your data center is probably much shinier, more expensive, and highly critical, so you need more than a spare tire to ensure it stays healthy. You need at least two of everything that your business requires to stay afloat, whether this applies to mail servers, ISPs, data fiber links, or voice over IP (VOIP) phone system VMs. Three or more wouldn’t hurt in many scenarios either!

It’s not just redundant components that are important but also the process to test and make sure they work reliably – such as scheduled failover drills and research into new methodologies.
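The “at least two of everything” rule can be checked mechanically. As a rough Python sketch, assuming you can list which systems provide each critical service, the function below flags anything with fewer than two distinct providers. The service and host names are hypothetical.

```python
def single_points_of_failure(services, minimum=2):
    """Return services that have fewer than `minimum` distinct instances."""
    return sorted(
        name for name, instances in services.items()
        if len(set(instances)) < minimum
    )


# Hypothetical inventory of what provides each critical service.
services = {
    "mail": ["mail-01", "mail-02"],
    "internet uplink": ["isp-a"],
    "fiber link": ["fiber-east", "fiber-west"],
    "voip": ["voip-vm-01", "voip-vm-01"],  # listed twice, but one instance
}

print(single_points_of_failure(services))
```

Counting distinct names rather than list entries catches the case where the same VM is written down twice. It does not prove the two instances are truly independent (two VMs on one host, for example), which is what the failover drills are for.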

6. Monitoring

Monitoring of all systems for uptime and health will bring tremendous proactive value but that’s just the beginning. You also need to monitor how much bandwidth is in use, as well as energy, storage, physical rack space, and anything else which is a “commodity” provided by your data center.

There are free tools such as Nagios for the nuts and bolts monitoring and more elaborate solutions such as Dranetz for power measurement. Alerts when outages occur or thresholds are crossed are part of the process, and you should arrange a failsafe for your alerts so they are independent of the data center (for instance, if your email server is on a VMware ESX host which is dead, another system should monitor for this and have the ability to send out notifications).
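The failsafe idea can be sketched as a small watchdog that runs somewhere outside the data center and tries several notification channels in turn. The Python below is an illustration under assumptions, not a replacement for a monitoring product: the function names are made up, and the notification channels are passed in as plain callables so that any delivery method can be plugged in.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def send_alert(message, notifiers):
    """Try each notifier in order; return the name of the first that succeeds.

    notifiers is a list of (name, callable) pairs. A callable should raise
    an exception if delivery fails. Returns None if every channel fails.
    """
    for name, notify in notifiers:
        try:
            notify(message)
            return name
        except Exception:
            continue  # fall through to the next channel
    return None


def watchdog(probe, notifiers, target="mail server"):
    """Run one probe and alert through the first working channel if it fails."""
    if probe():
        return None
    return send_alert(f"{target} is not responding", notifiers)
```

A scheduled job might call `watchdog(lambda: tcp_port_open("mail.example.com", 25), notifiers)`, where the hostname is a placeholder and `notifiers` lists email first and an out-of-band channel such as SMS second. If email itself is what went down, the alert still gets out.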

7. Scalability

So your company needs 25 servers today for an array of tasks including virtualization, redundancy, file services, email, databases, and analytics? What might you need next month, next year, or in the next decade? Make sure you have the appropriate sized data center with sufficient expansion capacity to increase power, network, physical space, and storage.  If your data center needs are going to grow – and if your company is profitable I can guarantee this is the case – today is the day to start planning.

Planning for scalability isn’t something you stop, either; it’s an ongoing process. Smart companies actively track and report on this concept. I’ve seen references in these reports to “the next rivet to pop” which identifies a gap in a critical area of scalability that must be met (e.g., lack of physical rack space) as soon as possible.
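Finding “the next rivet to pop” is simple arithmetic once you track capacity, current use and growth for each resource. The Python sketch below assumes constant (linear) monthly growth, which is a simplification, and every figure in it is hypothetical.

```python
import math


def months_until_full(capacity, used, growth_per_month):
    """Months until `used` reaches `capacity` at a constant monthly growth.

    Returns 0 if already at capacity and math.inf if usage is not growing.
    """
    if used >= capacity:
        return 0.0
    if growth_per_month <= 0:
        return math.inf
    return (capacity - used) / growth_per_month


def next_rivet_to_pop(resources):
    """Return (name, months) for the resource that runs out first."""
    return min(
        ((name, months_until_full(*figures)) for name, figures in resources.items()),
        key=lambda item: item[1],
    )


# Hypothetical figures: (capacity, currently used, growth per month).
resources = {
    "rack units": (420, 390, 6),
    "power (kW)": (200, 120, 4),
    "storage (TB)": (500, 350, 10),
}

print(next_rivet_to_pop(resources))
```

With these sample numbers, rack space runs out in five months, well before power (20 months) or storage (15 months), so that is where the planning effort should go first.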

8. Change management

You might argue that Change Management falls under the “Policies” section, a consideration which has some bearing. However, I would respond that it is both a policy and a philosophy. Proper guidelines for change management ensure that nothing occurs in your data center which hasn’t been planned, scheduled, discussed and agreed upon along with providing backout steps or a Plan “B.” Whether it’s bringing new systems to life or burying old ones, the lifecycle of all elements of your data center must fall in accordance with your change management outlook.

9. Organization

I’ve never known an IT pro who wasn’t pressed for time. Rollout of new systems can result in some corners being cut due to panic over missed deadlines – and these corners invariably seem to include making the environment nice and neat.

A successful system implementation doesn’t just mean plugging it in and turning it on; it also includes integrating devices into the data center via standardized and supportable methods. Your server racks should be clean and laid out in a logical fashion (production systems in one rack, test systems in another). Your cables should be the appropriate length and run through cabling guides rather than haphazardly draped. Which do you think is easier to troubleshoot and support: a data center that looks like this:

cablemess.jpg
 Credit: Wikimedia Commons

Or THIS:

cables.neat.jpg
 Credit: Wikimedia Commons

10. Documentation

The final piece of the puzzle is appropriate, helpful, and timely documentation – another ball which can easily be dropped during an implementation if you don’t follow strict procedures. It’s not enough to just throw together a diagram of your switch layout and which server is plugged in where; your change management guidelines should mandate that documentation is kept relevant and available to all appropriate personnel as the details evolve – which they always do.

Not to sound morbid, but I live by the “hit by a bus” rule. If I’m hit by a bus tomorrow, one less thing for everyone to worry about is whether my work or personal documentation is up to date, since I spend time each week making sure all changes and adjustments are logged accordingly. On a less melodramatic note, if I decide to switch jobs I don’t want to spend two weeks straight in a frantic braindump of everything my systems do.

The whole ball of wax

The great thing about these concepts is that they are completely hardware/software agnostic.  Whether your data center contains servers running Linux, Windows or other operating systems, or is just a collection of network switches and a mainframe, hopefully these will be of use to you and your organization.

To tie it all together, think of your IT environment as a wheel, with the data center as the hub and these ten concepts as the surrounding “tire”:

dc-diag.png
 Credit: Wikimedia Commons

Devoting time and energy to each component will ensure the wheels of your organization turn smoothly.  After all, that’s the goal of your data center, right?