The future of AI inference lies in striking the right balance between cost, speed and availability. From Edge versus Cloud benefits to regulatory challenges and sustainability practices, organisations must think carefully about their decisions when it comes to AI. We spoke to Robin Ferris, Enterprise Architect and AI Lead at Pulsant, about AI inference and how to create a digital infrastructure that’s right for you.

Which do you believe offers more advantages, Edge inference or Cloud inference – and can you outline the benefits of each?
It comes down to what each of them can deliver for those working with AI models. One of the biggest distinctions is local versus remote. Edge inference runs close to the model and to where the data is generated, whereas cloud inference happens elsewhere, so the data has to travel to it. That means giving thought to latency and privacy. That’s where the conversation begins: what is being ingested, how that data is being handled and what the outcome is.
Some of the models we’ve seen are ingesting real-time live imagery, compared with someone who’s ingesting a data feed that could just be numbers. You’d almost have competing strategies there. It’s about asking the right questions and weighing the various elements: factoring in latency, predicting the outcome, and whether it’s a system used to keep people safe, such as monitoring somebody’s health.
When carrying out AI inference, how do you create a digital infrastructure that helps strike the right balance between cost, speed and availability?
My experience has shown me that it depends where people are on their AI journey. A lot of the R&D and initial processes can now be carried out locally. However, as people scale up and their workloads move to production, they’ll often go through public clouds like Azure, AWS and GCP, where they see their costs begin to increase. The focus then shifts to reducing that cost, which is when a private cloud is considered.
There is no right answer and everyone is at a different stage in the development journey. It’s about sourcing the information required to make the right decisions at the right time.
What are the key regulations organisations should be aware of when practicing AI inference?
Currently, the UK has a principles-based approach to AI, which means the government has laid down guidelines for regulatory authorities to help them understand what needs to be taken into consideration as we progress.
AI is still relatively new when it comes to this industry, despite having been around for some time. If you were to look at it purely as data, there is nine times as much private data as there is publicly available data, and the majority of AI systems are now dealing with private data. Data, and the laws around it, have been in existence for a while, and it’s those fundamental principles of keeping things secure – such as encryption at rest and encryption in transit – that still apply.
It’s not necessarily about the regulations around AI, but instead it’s about how AI will be manipulated and used going forward.
What steps should organisations take when planning for scaling?
This is a really hot topic for us at Pulsant. When working with our existing clients who are on their AI journey, as well as with new clients, it’s key to understand that if you’re running AI workloads, you’re fundamentally running GPUs, which use a lot of electricity. Organisations must ask how far they want to scale, how much power they need to get there, and how long the journey will take.
We have AI and we have some amazing data scientists, but it’s about understanding what the impact is going to be. One of the fundamentals is how much power will be needed to drive your AI going forward – and when I say power, I mean electricity, because that’s essentially what we’re talking about.
What best practices can companies implement to ensure AI inference is as environmentally friendly as possible?
It all comes down to electricity. Take a country that predominantly generates electricity from coal or other heavily polluting fossil fuels: you could suggest that an AI model developed there is going to be less environmentally friendly than a model developed in a country with a cleaner electricity infrastructure. I’m not claiming the latter would have zero environmental impact, but it’s certainly possible to draw some conclusions there.
In the science industry in particular, scientists would generally have used a supercomputer – if they could access one – to run their data through and extract the information they need. What we’re seeing now is that if you shift that same process onto AI, you save days of time: what a supercomputer can do in three days, AI can do in three hours. The AI is still using a lot of power, but to get to the same answer we would traditionally have reached with a supercomputer, it’s far more efficient. As we transition to newer, faster and more efficient technologies, we’re saving power.
Once we can process AI faster, we’re going to give it bigger programs and bigger problems to solve. But for now, it’s definitely moving in the right direction.