
Improving Automatic Image Cropping Models with Advanced Adversarial Techniques

Many commercial image cropping models use saliency maps (often derived from gaze estimation) to identify the most important areas of an image. In this study, the researchers developed novel techniques for introducing imperceptible noise perturbations into images to steer the output of cropping models. The approach aims to prevent important parts of images, such as copyright information or watermarks, from being accidentally cropped out, which promotes fairness in AI models. Credit: Masatomo Yoshida / Doshisha University

Cropping an image is a necessary task in many contexts, from social media and e-commerce to advanced computer vision applications. Cropping helps preserve image quality by avoiding unnecessary resizing, which can degrade an image and consume computing resources. It is also useful when an image must fit a fixed aspect ratio, as with thumbnails.

Over the past decade, engineers around the world have developed various machine learning (ML) models for automatic image cropping. These models aim to crop an input image in a way that preserves its most important parts.

However, these models can make mistakes and exhibit bias that, in the worst cases, can expose users to legal risks. For example, in 2020, a lawsuit was filed against X (formerly Twitter) because its auto-cropping feature hid copyright information from a retweeted image.

Therefore, it is crucial to understand the reasons why image cropping machine learning models fail so that they can be trained and used appropriately, thereby avoiding similar problems.

In view of these facts, a research team from Doshisha University in Japan set out to develop new techniques for generating adversarial examples for the image cropping task.

As explained in their article published in IEEE Access on June 17, 2024, their methods can introduce imperceptible noise perturbations into an image to steer cropping models toward areas that match the user's intent, even if the unperturbed model would not select them.

PhD student Masatomo Yoshida, the first author and principal investigator of the study, said, “To our knowledge, there is very little research on adversarial attacks on image cropping models, as most previous studies have focused on image classification and detection. These models need to be improved to ensure that they respect user intentions and eliminate biases as much as possible during image cropping.”

The study was conducted by Masatomo Yoshida and Haruto Namura of the Graduate School of Science and Engineering at Doshisha University in Kyoto, Japan, together with Masahiro Okuda of the university's Faculty of Science and Engineering.

The researchers developed and implemented two different approaches to generating adversarial examples—the white-box approach and the black-box approach.

The white-box method, which requires access to the internal workings of the target model, involves iteratively computing perturbations of the input images based on the model gradients.

By employing a gaze prediction model to identify salient points in an image, this approach manipulates gaze saliency maps to produce effective adversarial examples. It significantly reduces perturbation size, with the smallest perturbations being 62.5% smaller than those of baseline methods across the entire experimental image dataset.
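The following is a minimal sketch of this kind of gradient-based saliency attack in PyTorch, not the authors' exact algorithm: the `saliency_model`, the `target_map` encoding the regions the user wants preserved, and all hyperparameters are illustrative assumptions. The perturbation is kept inside an L-infinity ball so the noise stays imperceptible.

```python
# Hypothetical sketch of an iterative, gradient-based (white-box) attack on a
# differentiable gaze-saliency model. `saliency_model` maps an image tensor
# to a saliency map; `target_map` is the saliency the attacker wants.
import torch

def whitebox_saliency_attack(image, saliency_model, target_map,
                             eps=8/255, alpha=1/255, steps=50):
    """Iteratively perturb `image` so the predicted saliency map moves
    toward `target_map`, bounding the noise in an L-infinity ball."""
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        pred = saliency_model(adv)                        # predicted saliency
        loss = torch.nn.functional.mse_loss(pred, target_map)
        grad, = torch.autograd.grad(loss, adv)            # model gradients
        with torch.no_grad():
            adv = adv - alpha * grad.sign()               # step toward target
            adv = image + (adv - image).clamp(-eps, eps)  # imperceptibility bound
            adv = adv.clamp(0.0, 1.0)                     # valid pixel range
        adv = adv.detach()
    return adv
```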

The black-box approach uses Bayesian optimization to efficiently narrow the search space and select specific image regions. Similar to the white-box strategy, this approach involves iterative procedures based on visual saliency maps.

Instead of relying on internal gradients, it uses a tree-structured Parzen estimator to select and optimize the pixel coordinates that most strongly affect gaze saliency, ultimately producing the desired adversarial images. Notably, black-box techniques are more widely applicable in real-world scenarios and are therefore more relevant in the context of cybersecurity.
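As an illustration only, the sketch below approximates this black-box strategy with Optuna's TPESampler, one readily available tree-structured Parzen estimator implementation; the query-only `query_saliency` function, the `target_map`, and the pixel budget are hypothetical stand-ins for the paper's actual setup.

```python
# Hypothetical sketch of a black-box attack: a TPE sampler proposes pixel
# coordinates and perturbation values, scored only by querying the model.
import numpy as np
import optuna

def blackbox_saliency_attack(image, query_saliency, target_map,
                             n_pixels=16, n_trials=200, delta=16):
    """image: HxWx3 uint8 array; query_saliency: black-box function that
    returns an HxW saliency map; returns the best adversarial image found."""
    h, w, _ = image.shape

    def objective(trial):
        adv = image.astype(np.int16).copy()
        for i in range(n_pixels):
            y = trial.suggest_int(f"y{i}", 0, h - 1)      # pixel row
            x = trial.suggest_int(f"x{i}", 0, w - 1)      # pixel column
            d = trial.suggest_int(f"d{i}", -delta, delta) # perturbation value
            adv[y, x, :] += d
        adv = np.clip(adv, 0, 255).astype(np.uint8)
        sal = query_saliency(adv)                         # one black-box query
        return float(np.mean((sal - target_map) ** 2))    # distance to target

    study = optuna.create_study(sampler=optuna.samplers.TPESampler(),
                                direction="minimize")
    study.optimize(objective, n_trials=n_trials)

    # Rebuild the best adversarial image from the best trial's parameters.
    best = image.astype(np.int16).copy()
    p = study.best_params
    for i in range(n_pixels):
        best[p[f"y{i}"], p[f"x{i}"], :] += p[f"d{i}"]
    return np.clip(best, 0, 255).astype(np.uint8)
```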

Experimental results indicate that both approaches are promising. As graduate student Haruto Namura, a participant in the study, explains, “Our findings indicate that our methods not only outperform existing techniques, but also show potential as effective solutions for real-world applications, such as those on platforms like Twitter.”

Overall, this study represents significant progress toward more reliable AI systems, which are crucial to meeting public expectations and gaining their trust. Improving the efficiency of generating adversarial examples for image cropping will drive ML research and inspire solutions to its pressing challenges.

Professor Masahiro Okuda, advisor to Namura and Yoshida, concludes: “By identifying gaps in increasingly deployed AI models, our research contributes to the development of more equitable AI systems and addresses the growing need for AI governance.”

More information:
Masatomo Yoshida et al., Adversarial Examples in Image Cropping: Gradient-Based and Bayesian-Optimized Approaches for Effective Adversarial Attack, IEEE Access (2024). DOI: 10.1109/ACCESS.2024.3415356

Provided by Doshisha University

Citation: Improving Automatic Image Cropping Models with Advanced Adversarial Techniques (2024, August 1) retrieved August 1, 2024, from https://techxplore.com/news/2024-08-automatic-image-cropping-advanced-adversarial.html

This document is subject to copyright. Apart from any fair use for private study or research, no part may be reproduced without written permission. The content is provided for informational purposes only.