Grant Powell MBCS speaks to Mats Thulin, Director AI and Analytics Solutions at Axis Communications, and uncovers how edge computing and AI techniques like small language models can reshape physical security.

Summary:

  • Network video is shifting to real‑time, on‑device intelligence, reducing cloud dependence and enabling richer scene interpretation directly on cameras
  • Small language models add lightweight semantic understanding at the edge, improving metadata, context, and responsiveness without heavy processing
  • Increasing model complexity heightens the need for explainability and responsible use, with clear limits, real‑world testing, and human‑impact awareness

Physical security has traditionally involved the use of standalone video camera technology. Through advancements in network connectivity, these technologies have become powerful digital tools, using data and advanced analytics to dramatically improve surveillance and detection capabilities. Now, as artificial intelligence and machine learning continue to evolve, their impact on network video is becoming increasingly significant, while also presenting many new ethical and technical challenges.

How has the integration of artificial intelligence and machine learning transformed what network cameras can do today, and what further developments can we expect in the near future?

The integration of AI and machine learning has been a game changer for network cameras. When deep learning technologies first entered the analytics space, we saw a significant leap in robustness and accuracy. These advancements allowed us to build much more reliable applications, particularly as edge devices became powerful enough to support these technologies directly.

Today, we’re seeing mature deep learning models deployed on edge devices, enabling real-time detection and analysis without relying heavily on centralised processing. But the next big shift is already on the horizon. We are beginning to explore the use of small language models (SLMs) on edge devices. While it’s currently possible to run SLMs at the edge, there’s still a trade-off between model size and overall capability. As hardware continues to evolve, we expect to see more powerful models running locally, enabling more flexible metadata generation and advanced analysis of video data.
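
As a rough, purely illustrative sketch of what this could look like, the Python example below turns raw detector output into richer, searchable metadata via an on-device SLM. The `run_slm` function is a hypothetical stand-in for whatever local model runtime a camera might expose; it is stubbed here so the example runs as-is, and nothing in it reflects actual Axis code.

```python
import json

def run_slm(prompt: str) -> str:
    """Placeholder for an on-device SLM call; a real camera would
    invoke a quantised local model here (assumption, not a real API)."""
    return "A delivery van is parked by the loading bay while two people unload boxes."

def describe_scene(detections: list[dict]) -> dict:
    """Build a prompt from lightweight detector output and ask the SLM
    for a semantic summary to attach as metadata."""
    objects = ", ".join(f"{d['label']} (conf {d['confidence']:.2f})" for d in detections)
    prompt = f"Summarise this scene for a security operator. Objects seen: {objects}."
    return {
        "detections": detections,
        "summary": run_slm(prompt),  # semantic layer on top of raw labels
    }

if __name__ == "__main__":
    frame_detections = [
        {"label": "van", "confidence": 0.93},
        {"label": "person", "confidence": 0.88},
        {"label": "person", "confidence": 0.81},
    ]
    print(json.dumps(describe_scene(frame_detections), indent=2))
```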

This evolution will open up new possibilities, not just in threat detection, but in how systems interpret and respond to complex environments. It’s an exciting time, and we’re just at the beginning of this next chapter.

Will the balance between cloud and edge-based processing shift over time?

The decision to process analytics at the edge or in the cloud depends on several factors, including computational requirements, bandwidth availability and cost. Initially, more advanced analytics were only feasible in server or cloud environments due to the heavy processing demands. But as edge devices have become more capable, we’ve seen a shift, with many deep learning applications now running directly at the edge.

Having said that, it’s not a one-size-fits-all solution. We’re increasingly seeing hybrid architectures where some processing, such as basic object detection and analysis, is done at the edge, while more complex tasks are handled in the cloud or on servers. This approach can be very beneficial, particularly from a financial perspective, because streaming full video feeds to the cloud can become expensive.
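
As a hedged sketch of that split, the illustrative Python example below keeps cheap per-frame detection on the device and ships only compact event records upstream, rather than streaming full video. Both `detect_on_edge` and `send_to_cloud` are invented stand-ins, not a real camera or cloud API.

```python
import time
from dataclasses import dataclass

@dataclass
class Event:
    timestamp: float
    label: str
    confidence: float

EDGE_THRESHOLD = 0.6  # below this confidence, nothing leaves the device

def detect_on_edge(frame_id: int) -> list[Event]:
    """Stand-in for an on-device deep learning detector."""
    # Pretend every fifth frame contains a person.
    if frame_id % 5 == 0:
        return [Event(time.time(), "person", 0.87)]
    return []

def send_to_cloud(event: Event) -> None:
    """Stand-in for forwarding a small event record for deeper analysis."""
    print(f"uploading event: {event.label} @ {event.confidence:.2f}")

def process_stream(num_frames: int) -> None:
    for frame_id in range(num_frames):
        for event in detect_on_edge(frame_id):
            if event.confidence >= EDGE_THRESHOLD:
                send_to_cloud(event)  # bytes of metadata, not megabits of video

if __name__ == "__main__":
    process_stream(20)
```
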
I believe the balance will continue to shift as edge devices become even more powerful, and the use of AI becomes more prevalent. However, hybrid systems will remain essential, allowing us to scale efficiently while maintaining performance and cost-effectiveness.

How can companies ensure that AI-driven analytics are both effective and ethically designed?

Ethical design has to be embedded into the DNA of any organisation working with AI. It needs to be part of the development process from the very beginning. At Axis, we focus heavily on education and awareness. We ensure that our engineers and developers are trained to think ethically. It’s about fostering a culture of responsibility.

We also take a holistic approach to responsible development and deployment. This means thinking through the entire lifecycle of a product, from initial design to its practical use in the real world. We test our systems in real environments and make sure that we document their limitations clearly. Understanding the use case is critical. Ethical considerations and effectiveness go hand in hand, and both depend on a deep understanding of how the technology will be used in practice.

How can more sustainable technologies help businesses reduce total cost of ownership?

Sustainability and cost-efficiency go hand in hand. One of the most effective ways to reduce total cost of ownership for customers is by leveraging hybrid architectures. By performing some processing on edge devices, we can significantly lower the computational load on central servers or cloud infrastructure.

Another area where AI can contribute is in system management. Using dedicated AI tools for tasks like configuration, health monitoring and lifecycle management can streamline operations and reduce the need for manual intervention. These processes aren’t compute-intensive — they don’t require constant video processing — so they’re a great way to optimise system performance without adding significant overheads.
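
To illustrate how lightweight such a task can be, here is a toy Python example (not a real Axis tool) that flags cameras whose temperature readings deviate sharply from the rest of the fleet; a simple z-score rule stands in for whatever model a production management tool might actually use.

```python
import statistics

def flag_unhealthy(temps: dict[str, float], z_limit: float = 2.0) -> list[str]:
    """Return camera IDs whose temperature deviates sharply from the fleet."""
    values = list(temps.values())
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [cam for cam, t in temps.items() if stdev and abs(t - mean) / stdev > z_limit]

if __name__ == "__main__":
    # Hypothetical fleet readings in degrees Celsius; cam-08 runs hot.
    fleet_temps = {
        "cam-01": 45.8, "cam-02": 46.2, "cam-03": 45.5, "cam-04": 46.9,
        "cam-05": 46.4, "cam-06": 45.9, "cam-07": 46.7, "cam-08": 68.3,
    }
    print("needs attention:", flag_unhealthy(fleet_temps))
```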

There’s also a misconception that training large models is always resource-heavy. In many cases, training is a one-time process, and the models can then be deployed in a lightweight, efficient manner. Ultimately, it’s about using technology to do more with less — more insight, more automation and more value, all while consuming fewer resources.

Finally, as AI models become more sophisticated, what do you see as the biggest technical and ethical challenges facing the network video industry in the next decade?

One of the biggest challenges we face is understanding both the capabilities and limitations of some of the increasingly complex AI models that are being developed. Foundational models are incredibly powerful — they can interpret scenes, detect activities and even anticipate events — but their complexity also makes them harder to test. Explainable AI is a critical area of focus. We need to be able to understand and communicate how these models make decisions, especially when they’re used in sensitive or high-stakes environments. That’s not easy. The only way to truly grasp a model’s limitations is to test it in real-world conditions, which can be resource-intensive.
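
One simple, widely used probe in this area is occlusion sensitivity, sketched below in a toy Python example (nothing Axis-specific): patches of an input are masked one at a time, and the drop in the model's score reveals which regions the model actually relies on. The `toy_model` here is a trivial stand-in, not a real detector.

```python
import numpy as np

def toy_model(image: np.ndarray) -> float:
    """Stand-in detector: responds to brightness in the image centre."""
    h, w = image.shape
    return float(image[h // 3 : 2 * h // 3, w // 3 : 2 * w // 3].mean())

def occlusion_map(image: np.ndarray, patch: int = 4) -> np.ndarray:
    """Score drop per patch when that patch is zeroed out."""
    base = toy_model(image)
    h, w = image.shape
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i : i + patch, j : j + patch] = 0.0
            heat[i // patch, j // patch] = base - toy_model(occluded)
    return heat

if __name__ == "__main__":
    img = np.random.rand(16, 16)
    print(np.round(occlusion_map(img), 3))  # large values mark influential regions
```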

From an ethical standpoint, there is also a significant challenge around ensuring that we don’t lose sight of the human impact. As capabilities grow, so do the risks. We must remain vigilant about how these technologies are used, who they affect and what unintended consequences might arise. Balancing innovation with responsibility will be key.