18 Apr 2024

Uncovering the Vulnerabilities of Object Detection Models: A Collaborative Effort by Advai and the NCSC

Object detectors can be manipulated: the car is no longer recognised as a car; the person is no longer there. As the use of these detection systems becomes increasingly widespread, their resilience to manipulation becomes increasingly important.

The purpose of this work is to both demonstrate vulnerabilities of these systems and to showcase how manipulations might be detected and ultimately prevented.

In this blog, we describe our technical examination of the vulnerabilities of five advanced object detectors, carried out with sponsorship and strategic oversight from the National Cyber Security Centre (NCSC).

Words by
Alex Carruthers

Introduction

Artificial Intelligence (AI) object detection models have become an indispensable tool across various industries. From autonomous vehicles to surveillance systems, these models are revolutionising the way we interact with and understand our environment. However, as the complexity and sophistication of these models grow, so do the potential security risks they face.

Advai led the technical examination of the vulnerabilities of five advanced object detectors, with sponsorship and strategic oversight from the National Cyber Security Centre (NCSC).

By subjecting these models to adversarial attacks using images from the COCO dataset, we aim to shed light on the challenges and opportunities that lie ahead in ensuring the robustness and reliability of these critical AI systems.


The Achilles' Heel of Deep Learning

Despite their impressive performance, deep learning models have been shown to exhibit significant vulnerabilities to adversarial attacks. These attacks involve carefully crafted perturbations to input images that can lead to drastic changes in the model's output while remaining imperceptible to the human eye (try to tell the difference in Figure 2).

We used this vulnerability to test a range of open-source models by subjecting five different object detectors (each with their own strengths and weaknesses) to adversarial attacks to evaluate their robustness.

The perturbations introduced to the images were minuscule, changing each pixel by just 1% of the pixel range (0-255). However, across a wide range of images, the results were striking.
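
To make the scale of that perturbation concrete, here is a minimal sketch (our own illustration, not code from the study) applying a worst-case ±1% shift to every pixel of a random image. In a real gradient-based attack such as FGSM, the sign pattern would come from the loss gradient of the target model rather than random noise.

```python
import numpy as np

# Hypothetical illustration of an L-infinity perturbation bounded at 1% of
# the 0-255 pixel range. The random sign pattern stands in for
# sign(gradient); computing the real gradient requires the target model.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3)).astype(np.float32)

eps = 0.01 * 255  # 1% of the pixel range
sign_pattern = np.sign(rng.standard_normal(image.shape))

adversarial = np.clip(image + eps * sign_pattern, 0, 255)

# No pixel moves by more than eps, so the change is imperceptible.
max_change = np.abs(adversarial - image).max()
print(max_change <= eps + 1e-6)  # True
```

A change of at most ~2.5 intensity levels per pixel is well below what the human eye notices, yet it is enough to derail a detector.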

Some models proved more robust than others, with DETR showing the greatest resilience and YOLOv5 the least; Faster R-CNN fell somewhere in the middle. We should highlight that results could vary in other situations, and users should assess model robustness in relation to their own use case.

Figure 3a

Figure 3 illustrates the contrast in performance between YOLOv5 and DETR when subjected to adversarial perturbations. The red bar, representing the performance on adversarially perturbed data, is significantly lower for YOLOv5 compared to DETR, highlighting the latter's superior robustness.

Fighting Fire with Fire

To combat these vulnerabilities, we explored the use of JPEG compression* as a defence mechanism.

*JPEG reduces the file size of an image by discarding some information while maintaining acceptable visual quality.

Applying JPEG compression to the input images effectively destroyed the subtle adversarial perturbations, providing a degree of protection against malicious inputs.

This approach is particularly promising as JPEG compression cannot be easily incorporated into the differentiable pipeline used to create adversarial attacks, unlike other image transformations such as rotation and translation.

We demonstrated that comparing the predictions from compressed and uncompressed images can reveal manipulated inputs, or simply act as a passive filter applied to all incoming images. This method proved highly effective in recovering the performance of object detectors under attack, even for models with vastly different vulnerabilities to adversarial attack, as shown in Figure 3.
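
As an illustration of that comparison, here is a sketch under our own assumptions: the study's actual pipeline, detector, and threshold are not shown here, so `score_fn` and `cutoff` are placeholders. An image is flagged when its score drops sharply after a JPEG round trip.

```python
import io
import numpy as np
from PIL import Image  # Pillow

def jpeg_round_trip(image: np.ndarray, quality: int = 75) -> np.ndarray:
    """Compress and decompress an HxWx3 uint8 image in memory, discarding
    the high-frequency content that adversarial noise lives in."""
    buf = io.BytesIO()
    Image.fromarray(image).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf))

def looks_manipulated(image: np.ndarray, score_fn, cutoff: float = 0.2) -> bool:
    """Flag an image when the detection score falls by more than `cutoff`
    after compression. `score_fn` stands in for a real detector's
    confidence/mAP-style score; `cutoff` would be tuned per model."""
    return score_fn(image) - score_fn(jpeg_round_trip(image)) > cutoff

# Example with a stand-in score (mean intensity), which JPEG barely affects,
# so a clean image is not flagged.
clean = np.full((64, 64, 3), 128, dtype=np.uint8)
flagged = looks_manipulated(clean, lambda im: float(im.mean()) / 255.0)
print(flagged)
```

In practice the same comparison can run as a live filter in front of the detector, or as a batch pass over a training set.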

However, it is important to note that this defence is only effective against gradient-based attacks, where the attacker relies on the powerful machinery developed for training deep learning models through backpropagation.

In a closed-box scenario, where the attacker does not rely on access to the model's gradients, this defence may be less effective.

The Road Ahead

Our findings highlight the importance of developing robust defence mechanisms against adversarial attacks.

As companies and governments seek to deploy AI systems in the coming months and years, the need for constant vigilance and ongoing research becomes paramount.

Whilst there are steps that can be taken to protect machine learning components from adversarial attacks, these offer only partial solutions.

The battle between attackers and defenders is an arms race, requiring continuous development of defence methods and a deep understanding of attack techniques to keep ML-based systems secure.

As we move forward, the ability to regularly red-team deployed AI systems will be an essential skill for organisations that seek to harness the power of AI responsibly. We can work towards a future where the benefits of AI can be realised without compromising security and trust by proactively identifying and addressing vulnerabilities.

Conclusion

The collaborative effort between Advai and the NCSC has provided valuable insights into the vulnerabilities of object detection models and the potential defences against adversarial attacks.

Ensuring robustness and reliability is of the utmost importance as these models become increasingly integrated into critical systems.

By exploring innovative defence mechanisms such as JPEG compression, and continuously advancing our understanding of adversarial techniques, we can pave the way for a future where AI systems are not only powerful but also secure. The road ahead is challenging but, through collaborative research and a commitment to responsible AI development, we can both unlock the full potential of these transformative technologies and safeguard against their inherent risks.

Figures

Figure 1a

Figure 1b

mAP – Mean Average Precision (mAP) scores are a metric for evaluating the accuracy of models in object detection, taking into account both precision (correct predictions) and recall (ability to find all relevant instances). It averages the precision at different thresholds, providing a single score that reflects performance across multiple classes or thresholds. mAP is crucial for understanding a model's overall effectiveness in correctly identifying and classifying objects.
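
For a concrete, simplified view of the calculation (a single-class sketch of our own, not the COCO evaluation code), AP can be computed from a confidence-ranked list of detections that have already been matched against the ground truth:

```python
# Simplified average precision for one class, assuming detections are
# already sorted by confidence and matched to ground truth. Real mAP
# (e.g. COCO) averages AP over classes and IoU thresholds.
def average_precision(is_true_positive, num_ground_truth):
    tp = fp = 0
    precisions = []
    for hit in is_true_positive:  # one entry per detection, best first
        tp += hit
        fp += not hit
        if hit:  # precision is sampled at each recall step
            precisions.append(tp / (tp + fp))
    return sum(precisions) / num_ground_truth if num_ground_truth else 0.0

# Three detections (hit, miss, hit) against two ground-truth objects.
print(average_precision([True, False, True], 2))  # ~0.83
```

Adversarial perturbations drive this score down by making the detector miss real objects or hallucinate spurious ones.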

Figure 2a (cars)

Figure 2b (bicycle)

Figure 3a (animated). 'Cutoff' is the threshold for the difference between performance on compressed and uncompressed adversarial images. We see that as we remove images where the difference in performance is greatest, the mAP score increases.

Figure 3b (DETR)

The purpose (demonstrated by the animation above) is to detect whether an image has been adversarially attacked and, if it has, to remove it. This is applicable both to cleaning a dataset used for training and to detecting when someone is trying to fool a live system.

Further reading

Read our two opinion pieces on the AI Act: