Automating data centre operations – a step too far?

Many data centre sites face considerable operational challenges, which means they are not always ready for full automation. Dean Boyle, CEO, EkkoSense, explains an approach which will ensure automation can be beneficial for data centre teams.

With data centres continuing to experience unprecedented service demand, it’s hardly surprising that many organisations are thinking hard about the role that automation tools should play in simplifying their data centre management. However, before rushing down the automation route, it’s important for data centre teams to consider whether their operations are ready for such an approach.

Today’s reality is that many sites continue to face considerable operational challenges, still requiring significant optimisation before they can be confident that their processes are suitable for full automation. Thermal-related issues, for example, account for a third of unplanned outages – and we estimate that some 15% of data centre racks remain outside of ASHRAE guidelines for inlet temperatures. And, despite a massive over-provision of cooling across the industry, average cooling utilisation only sits at 38%. This excessive cooling remains a huge consumer of energy – representing around 35% of a data centre’s overall energy consumption and proving a barrier to carbon reduction initiatives.
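The two headline measures above can be sketched in a few lines of Python. This is a minimal illustration rather than any vendor's method: the 18–27°C band is the commonly cited ASHRAE recommended inlet envelope (check the edition that applies to your site), cooling utilisation is assumed here to mean cooling load divided by installed cooling capacity, and the names `RackReading`, `fraction_outside_range` and `cooling_utilisation` are hypothetical.

```python
from dataclasses import dataclass

# Assumed recommended inlet envelope in degrees C; confirm against your ASHRAE edition.
RECOMMENDED_MIN_C = 18.0
RECOMMENDED_MAX_C = 27.0


@dataclass
class RackReading:
    """One inlet temperature reading for a rack (hypothetical structure)."""
    rack_id: str
    inlet_temp_c: float


def fraction_outside_range(readings, low=RECOMMENDED_MIN_C, high=RECOMMENDED_MAX_C):
    """Return the share of racks whose inlet temperature falls outside [low, high]."""
    if not readings:
        return 0.0
    outside = [r for r in readings if not (low <= r.inlet_temp_c <= high)]
    return len(outside) / len(readings)


def cooling_utilisation(cooling_load_kw, installed_capacity_kw):
    """Return cooling load as a fraction of installed cooling capacity (assumed definition)."""
    if installed_capacity_kw <= 0:
        raise ValueError("installed_capacity_kw must be positive")
    return cooling_load_kw / installed_capacity_kw
```

Run against a full set of rack readings, the first function gives a site its own version of the out-of-range figure, and the second shows how much of the installed cooling is actually doing work.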

So, for many data centres, there is a lot of work still to be done on thermal optimisation before automation initiatives can really be put in place. Unfortunately, while there are many traditional software toolsets – each with solid use cases – that can help support the operational management of data centre environments, they tend to be narrowly focused on specific requirements. This also means that they aren’t typically used to tackle and support real time thermal optimisation.

Building Management Systems (BMS), for example, are a key platform designed to alert on hard faults or SLA breaches, but with no data analytics those alerts are often too late and very reactive. Electrical Power Management Systems take a similar approach but are focused on power distribution monitoring, while Computational Fluid Dynamics (CFD) systems can provide analytics but are primarily focused on new build or major design changes so have very little capability for real time optimisation. And while DCIM and Asset Management systems are definitely capable – at a cost – of showing more granular datasets, they’re predominantly IT-focused and are rarely driven by the M&E team.

I believe that data centre teams need a toolset that can blend the strengths of these different platforms so that they can directly address the very real requirement for real time thermal optimisation. Removing thermal risk is, of course, a fundamental requirement for any data centre operation. Unfortunately, many critical facilities teams still seem unaware of just how quickly thermal risks can place their data centre operations in danger. Cooling plant failure can easily escalate into a thermal runaway situation, transforming a room that’s operating normally into a site that’s got real problems!

A conventional approach here might be to use a BMS. Unfortunately, thermal issues such as cooling and airflow problems typically don’t trigger BMS alerts early enough as there is no hard SLA breach or fault. And when they do, it’s often too late to prevent an SLA breach from taking place.
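To illustrate the difference between a hard-threshold alert and an earlier, trend-based warning, here is a rough Python sketch. It assumes a simple straight-line fit over recent inlet temperature samples, which is far cruder than a Machine Learning model, and the function names and the sample format of (minute, temperature) pairs are hypothetical.

```python
def slope_per_minute(samples):
    """Least-squares slope of temperature against time for (minute, temp_c) pairs."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    numerator = sum((t - mean_t) * (y - mean_y) for t, y in samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    return numerator / denominator if denominator else 0.0


def minutes_to_breach(samples, limit_c):
    """Estimate minutes until the latest reading reaches limit_c at the current trend.

    Returns 0.0 if the limit is already reached, and None if the temperature
    is flat or falling, or if there are no samples.
    """
    if not samples:
        return None
    latest_temp = samples[-1][1]
    if latest_temp >= limit_c:
        return 0.0
    slope = slope_per_minute(samples)
    if slope <= 0:
        return None
    return (limit_c - latest_temp) / slope
```

A threshold-only alert stays silent until the limit is crossed. A projection like this one can flag a rack that is still within limits but heading towards a breach, giving the team time to act.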

That’s why at EkkoSense we’ve been working to help organisations take an AI and Machine Learning-led approach to their M&E software-based optimisation – one that takes advantage of the latest capabilities to enable true real time cooling optimisation and airflow management. The key here is in bringing together a mix of technologies – from SaaS systems and scalable cloud infrastructure to new low-cost sensor technologies and IoT-enabled comms – to facilitate the crunching of multiple complex data sets to support instant optimisation decisions. The result is a 3D visualisation and analysis toolset that’s particularly easy for operations staff to use and understand, helping them to visualise airflow management improvements, quickly highlight potentially worrying trends in cooling performance and effectively remove risk from their white space.

Creating a Digital Twin of your data centre layout

Combining the power of Artificial Intelligence with real time data from a ‘fully-sensed’ room enables the creation of a Digital Twin of your data centre layout – one that not only visually represents current thermal conditions, but also provides tangible recommendations for thermal, power and capacity optimisation. This level of decision support can help operations teams take things to the next level, as the software continually ‘learns’ the environmental changes in the room and provides ongoing optimisation recommendations via the Digital Twin.

So instead of simply automating systems and trusting AI to get on with managing the sensitive security and controls needed for critical data centre cooling duty performance, we believe in a more productive approach. Gather cooling, power and space data at a granular level, visualise that complex data to make it easier to compare changes, highlight trends and anomalies, and then use Machine Learning and AI to provide actionable insights to your data centre team.

At EkkoSense we have put this approach into practice with Cooling Advisor, the industry’s first advisory tool embedded within a thermal optimisation solution. By offering focused cooling performance recommendations and advisory actions we can help organisations unlock 10%+ cooling energy savings, just by acting on this advice. And these actionable changes – such as suggested optimum cooling unit set points, changes to floor grille layouts, checking that cooling units are running to specification, fan speed adjustments or advice on optimum rack locations – are all presented each time for human auditability. You make the suggested changes and then use the software tool to confirm that the changes are delivering the expected results.
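The advise, act and confirm loop described above can be modelled as a small data structure. The Python sketch below is purely illustrative and is not the Cooling Advisor API: the `Recommendation` class, its statuses and the 20% verification tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    REJECTED = "rejected"
    VERIFIED = "verified"
    NOT_DELIVERED = "not delivered"


@dataclass
class Recommendation:
    """A suggested change that a person must review before it is acted on."""
    action: str
    expected_saving_kw: float
    status: Status = Status.PROPOSED
    audit_log: list = field(default_factory=list)

    def review(self, reviewer, approve):
        """Record a human decision; nothing is applied automatically."""
        if self.status is not Status.PROPOSED:
            raise ValueError("only proposed recommendations can be reviewed")
        self.status = Status.APPROVED if approve else Status.REJECTED
        self.audit_log.append(f"{reviewer} set status to {self.status.value}")

    def verify(self, measured_saving_kw, tolerance=0.2):
        """Check the measured saving against the expected one (assumed 20% tolerance)."""
        if self.status is not Status.APPROVED:
            raise ValueError("only approved recommendations can be verified")
        delivered = measured_saving_kw >= self.expected_saving_kw * (1 - tolerance)
        self.status = Status.VERIFIED if delivered else Status.NOT_DELIVERED
        self.audit_log.append(f"measured {measured_saving_kw} kW: {self.status.value}")
        return delivered
```

The design choice that matters is that every state change passes through a named reviewer and leaves an entry in the audit log, so the software advises while the team stays in control.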

In adopting this approach we’re recognising that data centres never stay the same, so this kind of ongoing optimisation allows teams to keep up with their ever-changing environment. Additionally, by deploying cooling duty sensors to track cooling loads, we’re also now able to identify previously undetected cooling unit faults before they are even picked up by BMS alerts. By capturing entirely new levels of data centre cooling data – going beyond basic temperature measurements to also include energy usage and airflow distribution – you can map zone-by-zone cooling analytics within the data centre to help with resiliency and capacity planning decisions. This creates live cooling ‘zones of influence’ that can group racks into clusters along with the cooling unit that is providing the cooling. Additional information can also be overlaid such as floor vent airflows.
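As a simplified picture of what a ‘zone of influence’ looks like as data, the Python sketch below groups each rack with its nearest cooling unit on the floor plan. Distance is only a stand-in assumption here: a live system would derive zones from measured cooling and airflow data rather than geometry, and the function name and the (x, y) position format are hypothetical.

```python
import math
from collections import defaultdict


def zones_of_influence(racks, cooling_units):
    """Group racks by nearest cooling unit.

    racks and cooling_units each map an id to an (x, y) floor position in metres.
    Floor distance is used purely as a proxy for cooling influence.
    """
    if not cooling_units:
        raise ValueError("at least one cooling unit is required")
    zones = defaultdict(list)
    for rack_id, rack_pos in racks.items():
        nearest_unit = min(
            cooling_units,
            key=lambda unit_id: math.dist(rack_pos, cooling_units[unit_id]),
        )
        zones[nearest_unit].append(rack_id)
    return dict(zones)
```

Once racks are clustered by cooling unit in this way, it becomes straightforward to ask which racks are exposed if a given unit fails, which is the kind of resiliency question the zone view is meant to answer.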

So, while it’s certainly possible to automate many of these processes and feed performance updates back into the Machine Learning loop, I question whether most data centres are ready to take this leap. That’s why we remain focused on providing a software-based optimisation approach that delivers intelligent decision support based on auditable, actionable cooling, power and space recommendations – rather than simply automating what is always going to be an evolving process.
