Preparing for worst-case scenarios

With data centres often regarded as mission-critical for business operation, PFM looks to industry experts for how to avoid disaster, including Tony Cohen, General Manager of FSI Cloud, sister company of FSI.

There are many variations on the saying "Assumption is the mother of all mistakes", all emphasising the danger of not making sufficient effort to confirm that a given situation is as we hope it to be.

Where data centres are concerned, relying on assumption is particularly risky when it comes to security, energy management and system continuity. There are, of course, numerous other aspects to consider, but the overall message is that mission-critical systems deserve high levels of attention and support to make sure they continue to operate at optimum effectiveness.

The majority of companies will have to outsource the specialist tasks associated with managing and maintaining data centre operations, as this will be more cost-effective than employing a specialist team. It is therefore essential that both FMs and service providers have a clear understanding of what the contract needs to deliver, including the actions needed to ensure a prompt reaction to worst-case scenarios and so avoid - or at least keep to a minimum - any disruption to business operations.

We asked several key industry providers for their thoughts on this topic and received a number of helpful tips for FMs to consider, to help avoid the type of issues that frequently appear as headline news in the national media.

FSI Cloud general manager Tony Cohen says it is essential for FMs to ask how their provider can deliver the required level of support, as more and more companies move their operations into the cloud. Among the questions to ask, he advises enquiring whether the data centre used will provide sufficient levels of security and robustness.

"Your due diligence comes into play. Take time to collate questions about the service you wish to buy. What would you need to ask as a minimum to satisfy your risk assessment?" he says.

Questions to be considered include: where is the data centre located? And is it close to any natural or human-made hazards?

"The data centre should ideally be located in the same country and easily accessible should you need to go there. Some companies are legally required to have their data located in the country in which they reside," says Mr Cohen.

"All data centres have a rating of one to four, with four being the highest. You should be looking for a Tier 3 or greater rating."

He also advises asking how the data centre can be accessed: "If access to a data centre is difficult from a locational standpoint, this could pose potential issues if access is required in an emergency."

A majority of data centres will resell co-location services through third parties, Mr Cohen continues. "Ask to look at the contracts or confirm that your SLAs mirror those of the data centre operator and the third-party services you are purchasing.

"Security is a hugely important aspect of data centre operations. Make sure you understand the physical and data security controls in place. A good starting point is to review the data centre's accreditation, such as ISO 27001:2013."

Access control is paramount to the security of a data centre and no one should be allowed to wander between data halls. Access should only be granted to the particular data halls used and, if there are no such restrictions, alarm bells should be ringing.

"Make sure that when you choose a data centre it has multiple network access points and power feeds. The data centre must be protected by UPS and have a good generator system which can provide power for at least 72 hours on full load," Mr Cohen concludes.

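Mr Cohen's questions lend themselves to a simple checklist. The following is a minimal sketch, in Python, of how an FM might record a provider's answers and flag shortfalls against the criteria he mentions. The class name, field names and gap descriptions are hypothetical and chosen for illustration only, and "multiple" is assumed here to mean at least two.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataCentreProfile:
    """Answers collected during due diligence (hypothetical field names)."""
    name: str
    tier: int                         # rating from 1 to 4
    same_country: bool                # hosted in the customer's own country
    network_access_points: int
    power_feeds: int
    has_ups: bool
    generator_hours_full_load: float
    hall_access_restricted: bool      # access limited to the halls you use


def due_diligence_gaps(dc: DataCentreProfile) -> List[str]:
    """Return the criteria quoted in the article that this data centre fails."""
    gaps = []
    if dc.tier < 3:
        gaps.append("tier rating below 3")
    if not dc.same_country:
        gaps.append("located outside home country")
    if dc.network_access_points < 2:   # assumption: "multiple" means two or more
        gaps.append("single network access point")
    if dc.power_feeds < 2:             # assumption: "multiple" means two or more
        gaps.append("single power feed")
    if not dc.has_ups:
        gaps.append("no UPS protection")
    if dc.generator_hours_full_load < 72:
        gaps.append("generator runtime under 72 hours at full load")
    if not dc.hall_access_restricted:
        gaps.append("unrestricted access between data halls")
    return gaps


# Illustrative data only, not a real facility.
candidate = DataCentreProfile(
    name="Example DC", tier=3, same_country=True,
    network_access_points=2, power_feeds=1, has_ups=True,
    generator_hours_full_load=48, hall_access_restricted=True,
)
for gap in due_diligence_gaps(candidate):
    print(gap)
```

A list of gaps, rather than a single pass or fail, gives the FM something concrete to raise with the provider during contract discussions.
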
With high levels of energy required to operate data centres, Carel CRAC application manager Enrico Boscaro says the continuous pursuit of energy savings in data centre cooling is leading to the adoption of a mix of HVAC technologies, coordinated to achieve the best efficiency.

"The increasing need to monitor different devices, receive alarm notifications and set parameters centrally makes monitoring solutions essential. This in turn raises a legitimate question about the best type of control: on the unit or centralised," says Mr Boscaro.

"The units must work independently for resilience reasons and that's why, rather than a central control, a distributed intelligence with different interconnected layers is better. Big improvements can be obtained through serial communication by changing just a few parameters (on/standby, set point, air flow), and this could be defined as the integration level," he says.

In a traditional CRAC cooling system there is communication between similar units for synergy and backup, he continues. Each device shares data with a system such as a BMS or DCIM, and often this level is not specific to cooling but is the only place where information is centralised.

"The extended communication capabilities of recent controllers allow data to be shared with both the DCIM and the integration level: the latter coordinates different units (CRACs, AHUs, chillers, humidifiers, etc) to optimise energy consumption and also shares information directly with the DCIM, which does not need to retrieve data from every single device, but can act as a coordinator of subsystems (cooling, power, alarms, access control, servers, etc).

"Controllers with integrated Ethernet allow simultaneous data sharing with different clients on the same network, making it easier to have multiple levels of coordination; with this architecture, security has to be taken into account," says Mr Boscaro.

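The layered arrangement Mr Boscaro describes can be pictured with a short sketch. It is an illustration under stated assumptions, not a representation of any Carel product: the class names, the duty/standby rule and the summary fields are all hypothetical. Each unit makes its own cooling decision, an integration level adjusts only the on/standby parameter, and a DCIM would read one summary rather than polling every device.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CoolingUnit:
    """A unit with its own local control (hypothetical model)."""
    name: str
    set_point_c: float
    running: bool = True
    healthy: bool = True

    def wants_cooling(self, return_air_c: float) -> bool:
        # Local decision: needs no central system to operate.
        return self.running and self.healthy and return_air_c > self.set_point_c


class IntegrationLevel:
    """Coordinates units by changing a single parameter: on/standby."""

    def __init__(self, units: List[CoolingUnit], duty_count: int):
        self.units = units
        self.duty_count = duty_count

    def assign_duty(self) -> None:
        # Keep the first `duty_count` healthy units on; the rest stand by.
        healthy = [u for u in self.units if u.healthy]
        for i, unit in enumerate(healthy):
            unit.running = i < self.duty_count
        for unit in self.units:
            if not unit.healthy:
                unit.running = False

    def summary(self) -> Dict[str, int]:
        # What a DCIM could read instead of querying every device.
        return {
            "running": sum(u.running for u in self.units),
            "standby": sum(u.healthy and not u.running for u in self.units),
            "faulted": sum(not u.healthy for u in self.units),
        }


units = [CoolingUnit(f"CRAC-{n}", set_point_c=24.0) for n in (1, 2, 3)]
cooling = IntegrationLevel(units, duty_count=2)
cooling.assign_duty()
print(cooling.summary())

units[0].healthy = False   # simulate a fault on the first unit
cooling.assign_duty()      # the standby unit takes over
print(cooling.summary())
```

If the integration level itself were lost, each unit's `wants_cooling` logic would still work on its own, which is the resilience argument for distributed rather than central control.
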
Additional thoughts on avoiding worst-case scenarios are provided by Siemens Building Technologies UK data centre account manager Chris Downing, who reminds us that data centres require the highest level of fire safety, due to the constant presence of electricity as an ignition source and a plentiful supply of combustible materials.

"Data centre downtime can mean losing thousands of pounds per minute and can negatively impact a company's image and reputation. Approximately 6% of infrastructure failures in data centres are related to fire," he says.

To mitigate the risk of fire, data centres require advanced detection that offers the earliest possible warning. Fires typically start slowly before erupting into flame, so detecting smoke early is essential. However, the high level of ventilation necessary to prevent equipment overheating also disperses smoke, making it more difficult to detect and increasing the risk of fire spreading quickly.

"Data centres are radically different from conventional environments in terms of airflow and cooling techniques," Mr Downing continues, "which means that traditional fire detectors are too slow and insensitive to respond to a fire event, as high air flow will cause heat and smoke to cool and disperse."

Unlike conventional smoke detection, aspirating smoke detectors actively draw smoke to the detector through holes within a piping system. This offers significantly higher levels of protection, as standard systems can only respond if smoke actually reaches the detection element, by which time it can be too late.

Fire risks in data centres are rated as extremely high. Causes include massive energy loads, extensive cabling and a plethora of potential ignition sources.

"The central management of all fire safety devices will improve operational performance, increase visibility, protect the integrity of the data, maintain maximum uptime and deliver business continuity," Mr Downing concludes.