Data Center Generators

Generators are key to data center reliability. Supplementing a battery-based uninterruptible power supply (UPS) with an emergency generator should be considered by every data center operator. The question has become increasingly important as superstorms such as Hurricane Sandy in the northeastern United States knocked out utility power stations and downed power lines, resulting in days or weeks of utility power loss.


Beyond disaster protection, a backup generator is important when utility providers plan summer rolling blackouts and brownouts and data center operators see reduced utility service reliability. In a rolling blackout, power to industrial facilities is often shut down first. New data center managers should check the utility contract to see whether the data center is subject to such disconnects.

Studies show generators played a role in between 45 and 65 percent of outages in data centers with an N+1 configuration (with one spare backup generator). According to Steve Fairfax, President of MTechnology, “Generators are the most critical systems in the data center.” Mr. Fairfax was the keynote speaker at the 2011 7×24 Exchange Fall Conference in Phoenix, Arizona.

What Should You Consider Before Generator Deployment?

  • Generator Classification / Type. A data center design engineer and the client should determine if the generator will be classified as an Optional Standby power source for the data center, a Code Required Standby power source for the data center, or an Emergency backup generator that also provides standby power to the data center.

  • Generator Size. When sizing a generator, it is critical to consider the total current IT power load as well as the expected growth of that load. Allowance must also be made for supporting infrastructure (e.g., UPS losses and mechanical loads). The generator should be sized by an engineer using specialized sizing software; a simplified sizing sketch follows this list.
  • Fuel Type. The most common generator types are diesel and gas. Each has pros and cons: diesel fuel deliveries can become an issue during a natural disaster, and gas line feeds can themselves be disrupted by one. The right choice for your data center generator depends on several factors: local environmental constraints (e.g., Long Island relies primarily on natural gas to protect the aquifer under the island), fuel availability, and the required size of the standby/emergency generator.
  • Deployment Location. Where will the generator be installed? Is it an interior installation or an exterior installation? An exterior installation requires the addition of an enclosure. The enclosure may be just a weather-proof type, or local building codes may require a sound attenuated enclosure. An interior installation will usually require some form of vibration isolation and sound attenuation between the generator and the building structure.
  • Exhaust and Emissions Requirements. Today, most generator installations must meet the Tier 4 exhaust emissions standards. The applicable requirements may depend upon the location of the installation (e.g., city, suburban, or rural).

  • Required Run-time. The run-time for the generator system needs to be determined so the fuel source can be sized (i.e. the volume of diesel or the natural gas delivery capacity to satisfy run time requirements).
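
    To make the sizing and run-time considerations above concrete, here is a minimal sketch of the arithmetic involved. It is illustrative only: the load figures, growth rate, infrastructure factor, design margin, and fuel-consumption rate are hypothetical assumptions, and real sizing should be done by an engineer using the manufacturer's sizing software.

    ```python
    # Illustrative generator and fuel sizing sketch (hypothetical numbers).
    # Real sizing must account for step loads, motor starting, altitude/temperature
    # derating, and harmonics, and should use the manufacturer's sizing tools.

    def required_generator_kw(it_load_kw, growth_factor, infrastructure_factor, design_margin):
        """Estimate the generator rating needed to carry IT plus supporting loads."""
        future_it_kw = it_load_kw * growth_factor            # expected IT load growth
        total_load_kw = future_it_kw * infrastructure_factor # UPS losses, cooling, lighting, etc.
        return total_load_kw * design_margin                 # headroom so the set never runs at 100%

    def required_fuel_liters(generator_kw, run_time_hours, liters_per_kwh=0.27):
        """Rough on-site diesel storage needed for the required run time.
        0.27 L/kWh is a typical full-load figure for large diesel sets (assumption)."""
        return generator_kw * run_time_hours * liters_per_kwh

    if __name__ == "__main__":
        gen_kw = required_generator_kw(
            it_load_kw=500,            # current IT load (hypothetical)
            growth_factor=1.3,         # 30% expected IT growth
            infrastructure_factor=1.8, # cooling, UPS and distribution losses
            design_margin=1.2,         # 20% headroom
        )
        fuel_l = required_fuel_liters(gen_kw, run_time_hours=48)
        print(f"Indicative generator rating: {gen_kw:.0f} kW")
        print(f"Indicative diesel storage for 48 h: {fuel_l:.0f} L")
    ```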

 

What Should You Consider During Generator Deployment?

  • Commissioning. Commissioning of the generator system is essentially the load testing of the installation plus the documentation trail: equipment selection, the shop drawing approval process, shipping documentation, and receiving and rigging the equipment into place. The process should also include the construction documents for the installation project.

  • Load Testing. Typically, a generator system is required to run at full load for at least four (4) hours and to demonstrate that it can handle step load changes from 25% to 100% of its rated kilowatt capacity. The best practice is to load test with a non-linear load bank whose power factor matches the generator specification; typically a load bank with a power factor between 0.75 and 0.85 is used (see the step-schedule sketch after this list).
  • Servicing. The generator(s) should be serviced after load testing and commissioning are completed, prior to release for use.
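
As an illustration of the step-load requirement described above, the sketch below builds a simple test schedule and applies one acceptance check to recorded readings. The step durations, nominal frequency, and tolerance band are assumptions for illustration only; the actual acceptance criteria come from the project specification.

```python
# Illustrative load-test schedule and acceptance check (hypothetical limits).

RATED_KW = 1400          # generator nameplate rating (assumed)
FULL_LOAD_HOURS = 4      # minimum continuous full-load run from the spec
STEP_SEQUENCE = [0.25, 0.50, 0.75, 1.00]   # step loads as a fraction of rating

def build_schedule(rated_kw, steps, step_minutes=30):
    """Return (kW, minutes) pairs for the step-load portion of the test."""
    return [(rated_kw * s, step_minutes) for s in steps]

def frequency_ok(readings_hz, nominal=60.0, tolerance=0.5):
    """Check recorded frequency stays within an assumed +/-0.5 Hz band."""
    return all(abs(r - nominal) <= tolerance for r in readings_hz)

if __name__ == "__main__":
    for kw, minutes in build_schedule(RATED_KW, STEP_SEQUENCE):
        print(f"Apply {kw:.0f} kW for {minutes} min and record voltage/frequency")
    print(f"Then run at {RATED_KW} kW for {FULL_LOAD_HOURS} hours")
    print("Frequency within band:", frequency_ok([60.1, 59.8, 60.0]))
```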

 

What Should You Consider After Generator Deployment?

  • Service Agreement. The generator owner should have a service agreement with the local generator manufacturer’s representative.
  • Preventive Maintenance. Preventive maintenance should be performed at least twice a year. Most generator owners who consider the installation critical to their business run a quarterly maintenance program.
  • Monitoring. A building monitoring system should be employed to provide immediate alerts if the generator or automatic transfer switch (ATS) fails, or becomes active because the normal power source has failed. The normal power source is typically the electric utility, but an internal feeder breaker inside the facility that has opened can also cause an ATS to start the generator(s) to provide standby power.
  • Regular Testing. The generator should be tested weekly for proper starting, and it should be load tested monthly or quarterly to determine that it will carry the critical load plus the required standby load and any emergency loads that it is intended to support.
  • Maintenance. The generator manufacturer or a third-party maintenance organization will notify the generator owner when important maintenance milestones are reached, such as minor rebuilds and major overhauls. Run hours generally determine when these milestones are reached, but other operational characteristics of the generator(s) also factor into what needs to be done and when.
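
Since run hours drive most of these milestones, a monitoring system can flag them automatically. Below is a minimal sketch of that idea; the milestone names and intervals are hypothetical placeholders, and the real schedule comes from the manufacturer or the service agreement.

```python
# Minimal run-hour milestone check (intervals are illustrative placeholders;
# use the manufacturer's published service schedule in practice).

MILESTONES = {
    "oil and filter service": 250,    # hours (assumed)
    "minor rebuild / top-end": 10_000,
    "major overhaul": 20_000,
}

def due_milestones(run_hours, completed):
    """Return milestones whose interval has elapsed since they were last done."""
    due = []
    for name, interval in MILESTONES.items():
        last_done = completed.get(name, 0)
        if run_hours - last_done >= interval:
            due.append(name)
    return due

if __name__ == "__main__":
    print(due_milestones(run_hours=10_400, completed={"oil and filter service": 10_250}))
    # -> ['minor rebuild / top-end']
```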

PTS Data Center Solutions provides generator sets for power ratings from 150 kW to 2 MW. We can develop the necessary calculations to properly size your requirement and help you with generator selection, procurement, site preparation, rigging, commissioning, and regular maintenance of your generator.

To learn more about PTS recommended data center generators, or about PTS Data Center Solutions available to support your Data Center Electrical Equipment & Systems needs, contact us.

Link Source: http://computer-room-design.com/strategic-data-center-solutions/electricalequipmentandsystems/data-center-generators/

Best Practices for Data Center Monitoring and Server Room Monitoring

1. Rack Level Monitoring
Based on a recent Gartner study, the annual cost of a Wintel rack averages around $70,000 USD per year, excluding the business cost of the rack. Risking business continuity or your infrastructure due to environmental issues is not an option. What are the environmental threats at the rack level?

A mistake often made is to rely only on monitoring conditions at the room level and not at the rack level. The American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) recommends no fewer than six temperature sensors per rack in order to safeguard the equipment (top, middle, and bottom at the front and back of the rack). When a heat issue arises, air conditioning units will initially try to compensate for the problem. This means that with room-level temperature monitoring, the issue will only be detected once the running air conditioning units are no longer capable of compensating for it, and by then it may be too late.

We recommend monitoring temperature per rack at a minimum of three points: at the bottom front of the rack, to verify the temperature of the cold air arriving at the rack (combined with airflow monitoring); at the top front of the rack, to verify that cold air reaches the top of the rack; and at the top back of the rack, which is typically the hottest point. Intake temperature should be between 18°-27°C / 64°-80°F. Outtake temperature should typically be no more than 20°C / 35°F above the intake temperature.
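As a concrete illustration of these thresholds, the sketch below evaluates a set of rack sensor readings against the intake range and intake-to-outtake delta recommended above. The sensor names and readings are hypothetical; the thresholds are the ones quoted in this section.

```python
# Check rack sensor readings against the thresholds quoted above
# (intake 18-27 C, outtake no more than ~20 C above intake).
# Sensor layout and readings are hypothetical examples.

INTAKE_RANGE_C = (18.0, 27.0)
MAX_DELTA_C = 20.0

def check_rack(intake_bottom_front_c, intake_top_front_c, outtake_top_back_c):
    """Return a list of human-readable alerts for one rack."""
    alerts = []
    for name, value in [("bottom front intake", intake_bottom_front_c),
                        ("top front intake", intake_top_front_c)]:
        low, high = INTAKE_RANGE_C
        if not (low <= value <= high):
            alerts.append(f"{name} at {value:.1f} C is outside {low}-{high} C")
    coldest_intake = min(intake_bottom_front_c, intake_top_front_c)
    if outtake_top_back_c - coldest_intake > MAX_DELTA_C:
        alerts.append(
            f"outtake {outtake_top_back_c:.1f} C exceeds intake by more than {MAX_DELTA_C} C")
    return alerts

if __name__ == "__main__":
    print(check_rack(21.0, 28.5, 44.0))
    # -> ['top front intake at 28.5 C is outside 18.0-27.0 C',
    #     'outtake 44.0 C exceeds intake by more than 20.0 C']
```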

What is the impact of temperature on your systems? High-end systems have auto-shutdown capabilities to safeguard themselves when temperature is too high. Before this happens, however, systems will experience computation errors at the CPU level, resulting in application errors, and the system cooling fans will be stressed, reducing equipment life expectancy (and, as such, availability and business continuity).

2. Ambient Room Monitoring
Ambient room monitoring is the environmental monitoring of the room for its humidity and temperature levels. Temperature and humidity sensors are typically deployed in:

  • potential “hot zones” inside the server room or data center
  • near air conditioning units, to detect failure of such systems. When multiple air conditioning systems are available in a room, a failure of one system will initially be compensated by the others before it leads to a total failure of the cooling system due to overload. Temperature/airflow sensors are therefore recommended near each unit to get early failure detection.

    Humidity in server rooms should be between 40% and 60% rH. Too dry, and static electricity will build up on the systems; too humid, and corrosion will slowly damage your equipment, resulting in permanent equipment failures.

    When using cold corridors inside the data center, ambient temperature outside the corridor may run at higher levels; temperatures of 37°C / 99°F are not uncommon in such setups. This significantly reduces energy cost. However, it also means that temperature monitoring is of the utmost importance: a failing air conditioning unit will affect system lifetime and availability much faster (fan stress, CPU overheating, etc.), and running a room at higher temperatures may also affect non-rack-mounted equipment.

    When using hot corridors, it is important to monitor temperature across the room to ensure that sufficient cold air gets to each rack. In this case one can also rely on rack-based temperature sensors in addition to temperature and humidity sensors close to each air conditioning unit.

3. Water & Flooding Monitoring

Water leakage is a lesser-known threat for server rooms and data centers. The fact that most data centers and server rooms have raised floors makes the risk even greater, as water seeks the lowest point.

Two types of water leakage sensors are commonly found: spot sensors and water snake (rope) cable sensors. Spot sensors trigger an alert when water touches the unit. Water rope or water snake cable sensors use a conductive cable whereby contact at any point along the cable triggers an alert. The latter type is recommended over the former due to its greater range and accuracy.

If using a raised floor, consider placing the sensor under the raised floor, as water seeks the lowest point.

The four main sources of water in a server room are:

  • leaking air conditioning systems: a water sensor should be placed under each AC unit
  • water leaks in floors or the roof above the data center or server room: water sensors should be placed around the perimeter of the room, roughly 50 cm / 20 in. from the outer walls
  • leaks of water pipes running through server rooms: a water sensor should be placed under the raised floors
  • traditional flooding: the same guidance as for water leaks from the roof or floors above applies

4. Sensors Deployment

All sensors connect to our Sensorgateway (base unit). A base unit supports up to 2 wired sensors, or up to 8 with the optional sensor hub.

| Application | Location | Setting | SKU | Sensor Package |
|---|---|---|---|---|
| Rack Level Monitoring | | | | |
| Sensors to monitor intake temperature | Front – bottom of rack for room or floor cooling, top of rack for top cooling | 18-27°C / 64-80°F | 182668 | Temperature probes* |
| Sensors to monitor outtake temperature | Back – top of rack (hot air rises) | Less than 20°C / 35°F difference from inlet temperature (typically <40°C / 105°F) | 182668 | Temperature probes* |
| Ambient Monitoring | | | | |
| Temperature & humidity monitoring in server room | Small server rooms: center of the room; data centers: potential hot zones, furthest from airco units | Temperature depends on type of room setup; humidity: 40-60% rH | 306166 | Temperature & Humidity Sensor Probe* |
| Airconditioning Monitoring | | | | |
| Early detection of failing air conditioning units | Next to airco units | Temperature depends on setting of airco; humidity: 40-60% rH | 306166 | Temperature & Humidity Sensor Probe* |
| Water Leaks / Flooding | | | | |
| Detecting water leaks coming from outside the room | Around outside walls of the server room / data center and under the raised floor; best is to keep 30-50cm / 10-20″ from the outer wall | | 180004 | Flooding Sensor Probe* with 6m/20ft water sensitive cable |
| Detecting water leaks from air conditioning units | Under each air conditioning unit | | 180004 | Flooding Sensor Probe* with 6m/20ft water sensitive cable |

* External probes need to be connected to a Sensorgateway (SKU 311323) in order to operate. One Sensorgateway has a built-in temperature probe and can support up to 2 external probes.
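
To plan the number of base units needed for a deployment like the one in the table, the arithmetic is straightforward. The sketch below assumes the capacities quoted above (2 external probes per Sensorgateway, or 8 with the optional sensor hub); the rack and air conditioning counts are hypothetical.

```python
import math

# Estimate how many Sensorgateway base units a deployment needs,
# using the capacities quoted above (assumptions: 2 external probes per
# base unit, or 8 when an optional sensor hub is added).

def gateways_needed(total_probes, probes_per_gateway=2, with_hub=False):
    capacity = 8 if with_hub else probes_per_gateway
    return math.ceil(total_probes / capacity)

if __name__ == "__main__":
    racks, ac_units = 20, 4
    probes = (racks * 2          # intake + outtake temperature probes per rack
              + ac_units * 2     # temp/humidity probe + flooding probe per AC unit
              + 2)               # perimeter flooding probes (hypothetical)
    print(probes, "probes ->", gateways_needed(probes), "gateways without hubs")
    print(probes, "probes ->", gateways_needed(probes, with_hub=True), "gateways with hubs")
```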

Source from: https://serverscheck.com/sensors/temperature_best_practices.asp

 

 

Proper Data Center Staffing is Key to Reliable Operations

The care and feeding of a data center
By Richard F. Van Loo

Managing and operating a data center comprises a wide variety of activities, including the maintenance of all the equipment and systems in the data center, housekeeping, training, and capacity management for space, power, and cooling. These functions have one requirement in common: the need for trained personnel. As a result, an ineffective staffing model can impair overall availability.

The Tier Standard: Operational Sustainability outlines behaviors and risks that reduce the ability of a data center to meet its business objectives over the long term. According to the Standard, the three elements of Operational Sustainability are Management and Operations, Building Characteristics, and Site Location (see Figure 1).

Figure 1. According to Tier Standard: Operational Sustainability, the three elements of Operational Sustainability are Management and Operations, Building Characteristics, and Site Location.

Management and Operations comprises behaviors associated with:

• Staffing and organization

• Maintenance

• Training

• Planning, coordination, and management

• Operating conditions

Building Characteristics examines behaviors associated with:

• Pre-Operations

• Building features

• Infrastructure

Site Location addresses site risks due to:

• Natural disasters

• Human disasters

Management and Operations includes the behaviors that are most easily changed and have the greatest effect on the day-to-day operations of data centers. All the Management and Operations behaviors are important to the successful and reliable operation of a data center, but staffing provides the foundation for all the others.

Staffing
Data center staffing encompasses the three main groups that support the data center: Facility, IT, and Security Operations. Facility operations staff addresses management, building operations, and engineering and administrative support. Shift presence, maintenance, and vendor support are the areas that support the daily activities that can affect data center availability.

The Tier Standard: Operational Sustainability breaks Staffing into three categories:

• Staffing. The number of personnel needed to meet the workload requirements for specific maintenance activities and shift presence.

• Qualifications. The licenses, experience, and technical training required to properly maintain and operate the installed infrastructure.

• Organization. The reporting chain for escalating issues or concerns, with roles and responsibilities defined for each group.

In order to be fully effective, an enterprise must have the proper number of qualified personnel, organized correctly. Uptime Institute Tier Certification of Operational Sustainability and Management & Operations Stamp of Approval assessments repeatedly show that many data centers are less than fully effective because their staffing plans do not address all three categories.

Headcount
The first step in developing a staffing plan is to determine the overall headcount. Figure 2 can assist in determining the number of personnel required.

Figure 2. Factors that go into calculating staffing requirements

The initial steps address how to determine the total number of hours required for maintenance activities and shift presence. Maintenance hours include activities such as:

• Preventive maintenance

• Corrective maintenance

• Vendor support

• Project support

• Tenant work orders

The number of hours for all these activities must be determined for the year and attributed to each trade.

Next, the data center must determine what level of shift presence is required to support its business objective. As uptime objectives increase, so do staffing presence requirements. Besides deciding whether personnel are needed on site 24 x 7 or at some lesser level, the data center operator must also decide what level of technical expertise or trade is needed. This may result in two or three people on site for each shift. These decisions make it possible to determine the number of people and hours required to support shift presence for the year. Activities performed on shift include conducting rounds, monitoring the building management system (BMS), operating equipment, and responding to alarms. These jobs do not typically require all the hours allotted to a shift, so other maintenance activities can be assigned during that shift, which reduces the overall number of staffing hours required.

Once the total number of hours required by trade for maintenance and shift presence has been determined, divide it by the number of productive hours (hours/person/year available to perform work) to get the required number of personnel for each trade. The resulting numbers will be fractional and can be addressed by overtime (less than 10% overtime is advised), contracting, or rounding up.
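
A minimal sketch of that calculation follows. The maintenance-hour and productive-hour figures are hypothetical; the point is only the division by productive hours per person and the treatment of the fractional remainder (overtime under 10%, contracting, or rounding up).

```python
import math

# Headcount sketch: annual hours required per trade divided by productive
# hours per person per year. All figures below are hypothetical examples.

PRODUCTIVE_HOURS_PER_PERSON = 1800   # hours/person/year available to do work (assumed)

def headcount(required_hours, productive_hours=PRODUCTIVE_HOURS_PER_PERSON,
              max_overtime=0.10):
    """Return (raw fractional headcount, recommended staffing decision)."""
    raw = required_hours / productive_hours
    base = math.floor(raw)
    fraction = raw - base
    if base > 0 and fraction <= max_overtime:
        return raw, f"{base} people with ~{fraction*100:.0f}% overtime"
    return raw, f"round up to {math.ceil(raw)} people (or contract out the remainder)"

if __name__ == "__main__":
    trades = {"electrical": 6200, "mechanical": 5400, "controls/BMS": 1500}
    for trade, hours in trades.items():
        raw, decision = headcount(hours)
        print(f"{trade}: {raw:.2f} -> {decision}")
```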

Qualification Levels
Data center personnel also need to be technically qualified to perform their assigned activities. As the Tier level or complexity of the data center increases, the qualification levels for the technicians also increase. They all need to have the required licenses for their trades and job description as well as the appropriate experience with data center operations. Lack of qualified personnel results in:

• Maintenance being performed incorrectly

• Poor quality of work

• Higher incidence of human error

• Inability to react and correct data center issues

Organized for Response
A properly organized data center staff understands the reporting chain of each organization, along with their individual roles and responsibilities. To aid that understanding, an organization chart showing the reporting chain and interfaces between Facilities, IT, and Security should be readily available and identify backups for key positions in case a primary contact is unavailable.

Impacts to Operations
The following examples from three actual operational data centers show how staffing inefficiencies may affect data center availability.

The first data center had two to three personnel per shift covering the data center 24 x 7, which is one of the larger staff counts that Uptime Institute typically sees. Further investigation revealed that only two individuals on the entire data center staff were qualified to operate and maintain equipment. All other staff had primary functions in other non-critical support areas. As a result, personnel unfamiliar with the critical data center systems were performing activities for shift presence. Although maintenance functions were being done, anything discovered during rounds required additional personnel to be called in, increasing the response time before the incident could be addressed.

The second data center had very qualified personnel; however, the overall head count was low. This resulted in overtime rates far exceeding the advised 10% limit. The personnel were showing signs of fatigue that could result in increased errors during maintenance activities and rounds.

The third data center relied solely on a call-in method to respond to any incidents or abnormalities. Qualified technicians performed maintenance two or three days a week. No personnel were assigned to perform shift rounds. On-site Security staff monitored alarms and had to call in maintenance technicians to respond to them. The data center was relying on the redundancy of systems and components to cover the time it took technicians to respond and return the data center to normal operations after an incident.

Assessment Findings
Although these examples show deficiencies in individual data centers, many data centers are less than optimally staffed. In order to be fully effective in a management and operations behavior, the organization must be Proactive, Practiced, and Informed. Data centers may have the right number of personnel (Proactive), but they may not be qualified to perform the required maintenance or shift presence functions (Practiced), or they may not have well-defined roles and responsibilities to identify which group is responsible for certain activities (Informed).

Figure 3 shows the percentage of data centers that were found to have ineffective behaviors in the areas of staffing, qualifications, and organization.

Figure 3. Ineffective behaviors in the areas of staffing, qualifications, and organization.

Staffing (appropriate number of personnel) is found to be inadequate in only 7% of data centers assessed. However, personnel qualifications are found to be inadequate in twice as many data centers, and the way the data center is organized is found to be ineffective even more often. Although these percentages are not very high, staffing affects all data center management. Staffing shortcomings are found to affect maintenance, planning, coordination, and load management activities.

The effects of staffing inadequacies show up most often in data center operations. According to the Uptime Institute Abnormal Incident Reports (AIRs) database, the root cause of 39% of data center incidents falls into the operational area (see Figure 4). The causes can be attributed to human error stemming from fatigue, lack of knowledge of a system, not following proper procedure, etc. The right, qualified staff could potentially prevent many of these types of incidents.

Figure 4. According to the Uptime Institute Abnormal Incident Reports (AIRs) database, the root cause of 39% of data center incidents falls into the operational area.

Adopting the proven Start with the End in Mind methodology provides the opportunity to justify the operations staff early in the planning cycle by clearly defining service levels and the required staff to support the business. Having those discussions with the business and correlating them to the cost of downtime should help management understand the returns on this investment.

Staffing 24 x 7
When developing an operations team to support a data center, the first and most crucial decision is how often personnel need to be available on site. Shift presence duties can include facility rounds and inspections, alarm response, vendor and guest escorts, and procedure development. This decision must be made by weighing a variety of factors, including the criticality of the facility to the business, the complexity of the systems supporting the data center, and, of course, cost.

For business objectives that are critical enough to require Tier III or IV facilities, Uptime Institute recommends a minimum of one to two qualified operators on site 24 hours per day, 7 days per week, 365 days per year (24 x 7). Some facilities feel that having operators on site only during normal business hours is adequate, but they are running at a higher risk the rest of the time. Even with outstanding on-call and escalation procedures, emergencies may intensify quickly in the time it takes an operator to get to the site.

Increased automation within critical facilities causes some to believe it appropriate to operate as a “Lights Out” facility. However, there is an increased risk to the facility any time there is not a qualified operator on site to react to an emergency. While a highly automated building may be able to make a correction autonomously from a single fault, those single faults often cascade and require a human operator to step in and make a correction.

The value of having qualified personnel on site is reflected in Figure 5, which shows the percentage of data center saves (incident avoidance) based on the AIRs database.

Figure 5. The percentage of data center saves (incident avoidance) based on the AIRs database

Equipment redundancy is the largest single category of saves at 38%. However, saves from staff performing proper maintenance and from technicians on site who detected problems before they became incidents totaled 42%.

Justifying Qualified Staff
The cost of having qualified staff operating and maintaining a data center is typically one of the largest, if not the largest, expense in a data center operating budget. Because of this, it is often a target for budget reduction. Communicating the risk to continuous operations may be the best way to fight off staffing cuts when budget cuts are proposed. Documenting the specific maintenance activities that will no longer be performed or the availability of personnel to monitor and respond to events can support the importance of maintaining staffing levels.

Cutting budget in this way will ultimately prove counterproductive, result in ineffective staffing, and waste initial efforts to design and plan for the operation of a highly available and reliable data center. Properly staffing, and maintaining the appropriate staffing, can reduce the number and severity of incidents. In addition, appropriate staffing helps the facility operate as designed, ensuring planned reliability and energy use levels.

Source link: https://journal.uptimeinstitute.com/data-center-staffing

6 steps to better data centers

Review existing data centers for improvement opportunities like power consumption and effective heating and cooling.

Management of data storage and processing is a part of every business, with a requirement for data centers and IT facilities common across nearly all business types. Data centers provide centralized IT systems, with power, cooling, and operational requirements above and beyond typical design parameters. This high density of power and cooling drives the need for continuous improvement; the goal for any system design or redesign should be to optimize the performance of existing equipment and prioritize the replacement and reorganization of outdated systems.

This article provides a number of steps to lead the evaluation of an existing facility and proposes targeted improvements for reducing energy use and CO2 emissions into our environment.

Why improve performance of an existing data center? There are several reasons.

Operational enhancement: Improving the performance of data center systems will offer great benefits to the bottom line and allow for greater flexibility in future expansion:

  • Decreased operating and energy costs
  • Decreased greenhouse gas emissions, critical in anticipation of a future carbon economy.
Increased reliability and resilience: Continuity of supply and zero downtime in IT services result in greater resilience for the business. Improving resilience and reliability results in increased accessibility and facility use, and provides for adaptability into the future.

Consider how critical the data center applications and services are to an operation: What will it cost if no one can send e-mail, access an electronic funds transfer system, or use Web applications? How will other aspects of the business be affected if the facility fails?

Greater system dynamics: Assessment of an existing facility will lead to increased integration of all system components. Increasing data processing potential cannot be considered without understanding the implications on cooling and power demand, and the management systems behind the processes. All aspects of the data center system must be looked at holistically to achieve the greatest results.

Review and improve

Compared to similar-sized office spaces, data center facilities typically consume 35 to 50 times the amount of energy in normal operation and contribute CO2 into our environment. Power demand for IT equipment greater than 100W/sq ft is not uncommon, and as we move into the future, the requirement for data storage and transfer capability is only going to rise.

Whether the driver for improvements is overloaded servers, programmed budget, or corporate energy-saving policy, an analysis of the energy use and system management will have benefits for the business. The assessment process should be to first understand where energy is being used and how the system currently operates; then to identify where supply systems, infrastructure, and management of the facility can be optimized.

1: Review computer rack use

Levels of data storage and frequency of application use fluctuate in a data center as users turn on computers and access e-mail, the Internet, and local servers. The supporting power and cooling systems for data storage and processing are typically sized with no diversity in IT demand, so their full capacity will rarely be required.

Figure 3 illustrates typical server activity across a normal office week. At different times of the day each server may encounter near-maximum use, but for the majority of time, utilization of racks may be only at 10% to 20%.

Low use of servers results in inefficient and redundant power consumption for the facility. For many server and rack combinations, the power consumed at 50% use is similar to that consumed at 100% use. For racks up to and exceeding 3 kW, this energy consumption can be quite large, when the subsequent cooling and other facility loads are also considered. To improve the system’s energy use, a higher level of use in fewer racks should be achieved.

Consolidation of the servers allows multiple applications and data to be stored on fewer racks, therefore consuming less “redundant” power. Physical consolidation of servers can be further improved when implemented with virtualization software. Virtualization allows a separation between the computer hardware (servers) and the software that they are operating, eliminating the physical bonds of certain applications to dedicated servers. Effective application leads to markedly improved use rates.
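
As a rough illustration of why consolidation saves power, the sketch below uses a simple linear power model in which an idle server still draws a large fraction of its peak power. All figures (host counts, idle and peak power, utilization levels) are hypothetical assumptions; real gains depend on the idle-versus-peak power curve of the specific hardware.

```python
# Rough estimate of power saved by consolidating lightly used servers onto
# fewer hosts. All numbers are hypothetical; real gains depend on the
# idle-vs-peak power curve of the specific hardware.

def server_power_w(utilization, idle_w=200.0, peak_w=400.0):
    """Simple linear power model: idle power plus a utilization-dependent part."""
    return idle_w + (peak_w - idle_w) * utilization

def fleet_power_w(n_servers, utilization):
    return n_servers * server_power_w(utilization)

if __name__ == "__main__":
    before = fleet_power_w(100, utilization=0.15)           # 100 hosts at 15% use
    after_hosts = 25                                         # consolidate 4:1 via virtualization
    after = fleet_power_w(after_hosts, utilization=0.60)     # same work at 60% use
    print(f"before: {before/1000:.1f} kW, after: {after/1000:.1f} kW "
          f"({(1 - after/before)*100:.0f}% reduction in server power)")
```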

2: Review power consumption, supply

Reducing power consumed by a facility requires an understanding of where and how the energy is being used and supplied. There are many possibilities for inefficiency, which must be identified in order to improve the data center's energy performance.

Power that enters a data center can be divided into two components:

  • IT equipment (servers for data storage, processing, and applications)
  • Supporting infrastructure like cooling, UPS and switchgear, power distribution units (PDU), lighting, and others.

Figure 6 provides an example of the split in power demand across a facility. In this example, 45% of total data center power is used by supporting infrastructure and therefore not by the core data processing applications. If a facility is operating at 100 W/sq ft IT power demand, the supporting infrastructure alone would add roughly 80 W/sq ft of power demand, with the associated energy costs and CO2 emissions.

To compare performance of one data center’s power usage to another’s, a useful metric is the power usage effectiveness, or PUE. This provides a ratio of total facility power to the IT equipment power:

PUE = total facility power / IT equipment power

The optimal use of power in a data center is achieved as the PUE approaches 1. Studies show that, on average, data centers have a PUE of 2.0 to 2.5, with goals of 1.5 and even 1.1 for state-of-the-art facilities. For example, at the same IT load, a facility with a PUE nearing 3.0 will consume more than twice the total power of a facility operating at a PUE of around 1.3.
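
A small worked example of the PUE ratio defined above, using hypothetical meter readings:

```python
# Worked PUE example using the definition above (hypothetical meter readings).

def pue(total_facility_kw, it_equipment_kw):
    return total_facility_kw / it_equipment_kw

if __name__ == "__main__":
    it_kw = 500.0                 # IT equipment load (assumed)
    infrastructure_kw = 400.0     # cooling, UPS losses, PDUs, lighting (assumed)
    total_kw = it_kw + infrastructure_kw
    print(f"PUE = {total_kw:.0f} / {it_kw:.0f} = {pue(total_kw, it_kw):.2f}")
    # At the same IT load, total facility power scales with PUE:
    for target in (2.5, 1.8, 1.3):
        print(f"PUE {target}: total facility power = {target * it_kw:.0f} kW")
```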

Effective metering of a data center should be implemented to accurately understand the inputs and outputs of the facility. Accurate data measurement will allow continuous monitoring of the PUE and also allow effective segregation of power used by the data center from other facilities in the building.

To improve the total system efficiency and PUE of a site, the first step is to reduce the demand for power. Table 1 highlights the strategies for demand reduction with a basic description of each.

Following the reduction of power consumption, the second step toward improving the facility's efficiency and performance is to improve the supply of power. Power supply for data center systems typically relies on many components, each with an associated efficiency of transmission (or generation). As power is transferred from the grid through the UPS and PDUs to the racks, the losses compound, so every component matters to overall system efficiency.
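
Because the losses compound, the overall supply-path efficiency is the product of the component efficiencies. A quick sketch with hypothetical component figures:

```python
# Supply-path efficiency compounds multiplicatively; figures are hypothetical.
from functools import reduce

components = {"transformer": 0.98, "UPS (double conversion)": 0.93,
              "PDU": 0.98, "rack power supplies": 0.92}

overall = reduce(lambda acc, e: acc * e, components.values(), 1.0)
print(f"overall supply efficiency ~ {overall:.1%}")   # ~82% with these assumptions
print(f"for every 100 kW drawn from the grid, ~{overall*100:.0f} kW reaches the IT loads")
```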

A review of the manufacturer’s operational data will highlight the supply equipment’s efficiency, but it is important to note that as the equipment increases in age, the efficiency will decrease.

An effective power supply system should ensure that supply is always available, even in the event of equipment failure. Resilience of the supply system is determined by the level of redundancy in place and by limiting single points of failure. The Uptime Institute's four-tier classification system should be consulted, with the most suitable level selected for the site.

In most locations, reduction of demand from grid supply will result in higher efficiency and reduced greenhouse gas emissions. The constant cooling and electrical load required for the site can provide an advantage in the implementation of a centralized energy hub, possibly using a cogeneration/trigeneration system, which can use the waste heat from the production of electrical power to provide cooling via an absorption chiller.

3: Review room heat gains

As is the case with power consumption, any heat gains that are not directly due to IT server equipment represent an additional energy cost that must be minimized. Often, reductions in unnecessary heat gain can be implemented at little cost, resulting in short payback periods from energy savings.

Computing systems use incoming energy and transform this into heat. For server racks, every 1 kW of power generally requires 1 kW of cooling; this equates to very large heat loads, typically in the range of 100 W/sq ft and larger. These average heat loads are rarely distributed evenly across the room, allowing excessive hot spots to form.

The layout of the racks in the data center must be investigated and any excessively hot zones identified. Isolated hot spots can result in over- or under-cooling and need to be managed. For a typical system with room-wide control, any excessively hot servers should be evenly spaced out by physically or virtually moving the servers (Figure 8). If the room’s control provides rack-level monitoring and targeted cooling, then isolated hot spots may be less of an issue.

Room construction

A data center does not have the same aesthetic and stimulating requirements as an office space. It should be constructed with materials offering the greatest insulation against transmission of heat, both internally and externally.

Solar heat gains through windows should be eliminated, and any gaps that allow unnecessary infiltration/exfiltration need to be sealed.

Switchgear, UPS, other heat gains

Associated electrical supply and distribution equipment in the space will add to the heat gain in the room, due to transmission losses and inefficiency in the units. Selection of any new equipment needs to take this heat gain into account, and placement should be managed to minimize infringement on core IT systems for cooling.

4: Review cooling system

Data center cooling is as critical to the facility's operation as the main power supply. The excessive heat loads produced by server racks will cause room and equipment temperatures to rise above critical levels within minutes of a cooling system failure.

Ventilation and cooling equipment

Ventilation is required in the data center space for the following reasons only:

  • Provide positive air pressure and replace exhausted air
  • Allow minimum outside airflow rates for maintenance personnel, as per ASHRAE 62.1
  • Smoke extraction in the event of fire.

Ventilation rates in the facility do not need to exceed the minimum requirement and should be revised if they are exceeding it, in order to reduce unnecessary treatment of the excess makeup air. The performance of the cooling system in a data center largely affects the total facility’s energy consumption and CO2 emissions. There are various configurations of ventilation and cooling equipment, with many different types of systems for room control. Improvements to an existing facility may be restricted by the greater building’s infrastructure and the room’s location. Recent changes to ASHRAE 90.1 now include minimum requirements for efficiency of all computer room air conditioning units, providing a baseline for equipment performance.

After reviewing the heat gains from the facility (step 3), the required cooling for the room will be evident.

  • Can the existing cooling system meet this demand or does it need to be upgraded?
  • Is cooling provided by chilled water or direct expansion (DX)? Chilled water will typically offer greater efficiency but is restricted by location and plant space for chillers.
  • Does the site’s climate allow energy savings from economizers and indirect cooling?
  • What type of heat removal system is used in the data center space? Can it effectively remove the heat from the servers and provide conditioned air to the front of the racks as required?

Effective cooling: removing server heat

Mixing of hot and cold air should be minimized as much as possible. There should be a clear path for cold air to flow to servers, with minimal intersection of the hot return air. The most effective separation of hot and cold air will depend on the type of air distribution system installed.

Underfloor supply systems rely on a raised floor with computer room air conditioning (CRAC) located around the perimeter of the room. Conditioned air is supplied to the racks via floor-mounted grilles; the air passes through the racks, then returns to the CRAC units at high level. To minimize interaction of the hot return air with cold supply air, a hot and cold aisle configuration will provide the most effective layout. The hot air should be drawn to CRAC units located in line with the hot aisles with minimal contact with the cold supply air at the rack front.

An in-row or in-rack cooling system provides localized supply and will also benefit from hot and cold aisle configuration. Airflow mixing is less likely because this type of system will supply conditioned air directly to the front of the rack and draw hot air from the rear of the rack. If the system does not use an enclosed rack, the implementation of hot aisle and cold aisle containment will ensure that airflows do not mix.

For data center facilities that are created in existing office buildings and fitouts, the cooling system may not be stand-alone but rather rely on central air handling unit systems, wall-mounted split air conditioners, or even exhaust fans. These nonspecific cooling systems typically will not offer the same efficiency as a dedicated CRAC system or in-row cooling, and will be limited in the potential for improvement.

To optimize this type of cooling system and ensure that conditioned air is delivered to the rack inlet, consider the following:

  • Racks should be arranged into hot and cold aisles.
  • Air distribution units should be placed in line with the cold aisles.

Reduce short circuiting

Improving airflow through racks and reducing the opportunities for “short circuiting” of conditioned air into hot aisles will enable better control of server temperature.

  • Provide blanking plates at any empty sections of server cabinets to prevent direct mixing of hot and cold air.
  • Use server racks with a large open area for cold air intake at the front, with a clear path for the hot air to draw through at the rear.
  • Cable penetrations should be positioned to minimize obstruction of supply air passing through the racks. Any penetrations in raised floor systems should be sealed with brushes or pillows.
  • Use cable trays and cable ties to manage cabling so that it does not impinge on effective airflow.

Associated equipment

Any pumps or fans for cooling in the data center should be as efficient as possible. Installation of variable speed drives (VSD) will reduce the power consumed by the electric motors when operating at part load. If large numbers of VSDs are selected, a harmonics analysis is recommended for the site’s power supply.
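
The saving from variable speed drives comes from the fact that fan and pump power falls roughly with the cube of speed (the standard affinity-law approximation, which ignores static pressure effects and motor/VSD losses). A quick illustration:

```python
# Fan affinity-law approximation: power scales roughly with the cube of speed.
# This ignores static pressure and motor/VSD losses, so treat it as indicative only.

def fan_power_fraction(speed_fraction):
    return speed_fraction ** 3

for speed in (1.0, 0.8, 0.6):
    print(f"{speed:.0%} speed -> ~{fan_power_fraction(speed):.0%} of full-load fan power")
# 80% speed -> ~51%, 60% speed -> ~22% of full-load power with these assumptions
```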

Temperature, humidity control

Without appropriate control, data center equipment performance will suffer if room conditions fall outside its tolerance. However, studies have shown that the tolerance of data communication equipment is wider than the ranges proposed for offices and human comfort.

According to ASHRAE, the ideal inlet conditions for IT equipment are:

  • Dry bulb temperature: 64.4 to 80.6 F
  • Dew point: 41.9 to 59 F.

Temperature and humidity sensors need to be placed effectively around the room to actively measure the real conditions and adjust the cooling supply accordingly. Optimal placement for measurement and monitoring points is at the front of the rack, to actively measure the inlet condition.

CRAC units and in-row coolers can be controlled with various sensor locations, setpoints, and strategies. Control strategies for regulation of cooling load and fan speed can be based on air conditions entering or leaving the unit.

  • Generally, supply air control systems will allow higher air temperatures in the room, resulting in improved efficiency for the cooling system.
  • Improvements in CRAC unit fans also allow for reductions in energy use as fans can be cycled up and down in response to underfloor static pressure monitoring, and therefore reduce power demand.
  • Ensure effective communication from all sensors and equipment with the building management system (BMS) for monitoring and analysis.
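
As an illustration of the fan-speed strategy mentioned above, here is a minimal proportional-control sketch that trims CRAC fan speed against an underfloor static pressure setpoint. The setpoint, gain, and speed limits are hypothetical; a real implementation would live in the BMS or the unit controller.

```python
# Minimal proportional control of CRAC fan speed on underfloor static pressure.
# Setpoint, gain, and speed limits are hypothetical illustration values.

SETPOINT_PA = 12.0        # target underfloor static pressure (assumed)
GAIN = 0.02               # fan-speed fraction per Pa of error (assumed)
MIN_SPEED, MAX_SPEED = 0.4, 1.0

def next_fan_speed(current_speed, measured_pa):
    """Raise fan speed when pressure is below setpoint, lower it when above."""
    error = SETPOINT_PA - measured_pa
    speed = current_speed + GAIN * error
    return max(MIN_SPEED, min(MAX_SPEED, speed))

if __name__ == "__main__":
    speed = 0.8
    for pa in (10.5, 11.2, 12.4, 13.0):   # simulated pressure readings
        speed = next_fan_speed(speed, pa)
        print(f"pressure {pa:.1f} Pa -> fan speed {speed:.2f}")
```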

5: Optimize monitoring and maintenance

Designing an energy-efficient data center is not the final step for ensuring an efficient system. A data center may have been designed to operate within tight boundaries and conditions. During construction, commissioning, and hand-over, inefficiencies in power distribution, heat gains, and cooling provision can easily arise from poor training and ineffective communication of design intent.

Studies of existing facilities have shown that many data centers have faults and alarms that the facility’s management are not aware of. Effective monitoring of the facility allows the system to optimize operation, providing ideal room conditions and IT equipment use.

Rack-centric monitoring

Traditionally, control of the data center’s heat loads (cooling requirement) and power supply has been largely centralized, with the BMS connecting the components (Figure 12).

This type of system allows an easy response to any center-wide faults and alarms but limits the control and management of individual equipment. To improve the efficiency of the data center, management of temperature and power distribution should move toward a rack-centric approach, with sensors and meters at each rack, maximizing the operator's ability to analyze how the system is performing on a micro scale.

Systematic checking and maintenance

The power usage, heat loads, and IT equipment should be reviewed on a regular basis to ensure the data center is operating as designed. The prevention of future failures in either the power or cooling systems will save the business large amounts of time and money.

6: Implement

This article has identified a range of performance measures and improvements to lead toward greater system efficiency.

The level of improvement that an existing facility is capable of depends on a number of factors, including its ability to finance the work, the total space available, expectations of future growth, and the age of the existing system. Therefore, the assessment and improvement process should consider which level of improvement to pursue. Table 2 highlights the options for each aspect of assessment and ranks each based on the ease and cost of implementation (low to high).

Hallett is a mechanical engineer with Arup. He has been involved in the design and review of a number of significant data center projects, with energy and environmental footprint reduction a major part of the design process. Hallett’s experience includes design of greenfield sites and refurbishments, using modeling and simulation to optimize energy use and cooling applications.


References

ASHRAE. 2007. Ventilation for Acceptable Indoor Air Quality. ANSI/ASHRAE/IESNA Standard 62.1-2007.

ASHRAE. 2009. Thermal Guidelines for Data Processing Environments, Second Edition. ASHRAE Technical Committee 9.9.

ASHRAE. 2010. Energy Standard for Buildings Except Low-Rise Residential Buildings. ANSI/ASHRAE/IESNA Standard 90.1-2010.

Dunlap, K. 2006. Cooling Audit for Identifying Potential Cooling Problems in Data Centers. APC White Paper #40.

Ebbers, M., A. Galea, M. Schaefer, and M.T.D. Khiem. 2008. The Green Data Center: Steps for the Journey. IBM Redpaper.

Emerson Network Power. 2009. Energy Logic: Reducing Data Center Energy Consumption by Creating Savings That Cascade Across Systems. White Paper.

Green Grid. 2008. Green Grid Data Center Efficiency Metrics: PUE and DCiE. White Paper #6.

Source: http://www.csemag.com/single-article/6-steps-to-better-data-centers/fedb0b6440e1c5673616a36a85051dd2.html