Publications

Discover the research findings of IndustoAI in various applications of AI/ML and automation.

Topics

deep anomaly detection and segmentation

large language models (LLM) in industries

deep anomaly detection and segmentation

Deep Local Feature Matching Image Anomaly Detection with Patch Adaptive Average Pooling Technique

Author(s): Afshin Dini, Esa Rahtu

Year: 2025

Link: Paper

Abstract: We present a new visual defect detection approach based on a deep feature-matching model and a patch-adaptive technique. The main idea is to utilize a pre-trained feature-matching model to identify the training sample(s) most similar to each test sample. By applying patch-adaptive average pooling to the extracted features and defining an anomaly map using a pixel-wise Mahalanobis distance between the normal and test features, anomalies can be detected properly. By evaluating our method on the MVTec dataset, we find that it has several advantages over similar techniques: (1) it skips the training phase and the difficulties of fine-tuning model parameters that may vary from one dataset to another, (2) it performs quite well on datasets with only a few training samples, reducing the cost of collecting large training datasets in real-world applications, (3) it can automatically adjust itself without compromising performance under shifts in the data domain, and (4) the model’s performance is better than that of similar state-of-the-art methods.

Keyword(s): Anomaly Detection, Deep Local Feature Matching, MVTec-AD Dataset.

Detecting Anomalies in Textured Images Using Modified Transformer Masked Autoencoder

Author(s): Afshin Dini, Esa Rahtu

Year: 2024

Link: Paper

Abstract: We present a new method for detecting and locating anomalies in textured-type images using transformer-based autoencoders. In this approach, a rectangular patch of an image is masked by setting its value to gray and then fed into a pre-trained autoencoder with several blocks of transformer encoders and decoders in order to reconstruct the unknown part. It is shown that the pre-trained model is not able to reconstruct the defective parts properly when they are inside the masked patch. In this regard, the combination of the Structural Similarity Index Measure and absolute error between the reconstructed image and the original one can be used to define a new anomaly map to find and locate anomalies. In the experiment with the textured images of the MVTec dataset, we discover that not only can this approach find anomalous samples properly, but the anomaly map itself can also specify the exact locations of defects correctly at the same time. Moreover, not only is our method computationally efficient, as it utilizes a pre-trained model and does not require any training, but it also performs better than previous autoencoders and other reconstruction-based methods. For these reasons, one can use this method as a base approach to find and locate irregularities in real-world applications.

Keyword(s): Anomaly Detection, Anomaly Localization, Masked Autoencoders.
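
The abstract above describes an anomaly map built from the Structural Similarity Index Measure (SSIM) and the absolute error between an image and its reconstruction. The following is a minimal sketch of that idea, not the authors' implementation: the uniform 7x7 window, the weighting factor alpha, and the function names local_ssim and anomaly_map are all assumptions made for illustration, and the masking and transformer reconstruction steps are left out (a reconstruction is simply passed in).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_ssim(x, y, win=7, data_range=1.0):
    """Per-pixel SSIM of two 2D float images, using a uniform win x win window.

    Assumption: a plain box window stands in for whatever window the paper uses.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    pad = win // 2
    # Reflect-pad so the output has the same shape as the input (win must be odd).
    xw = sliding_window_view(np.pad(x, pad, mode="reflect"), (win, win))
    yw = sliding_window_view(np.pad(y, pad, mode="reflect"), (win, win))
    mu_x = xw.mean(axis=(-1, -2))
    mu_y = yw.mean(axis=(-1, -2))
    var_x = xw.var(axis=(-1, -2))
    var_y = yw.var(axis=(-1, -2))
    cov_xy = (xw * yw).mean(axis=(-1, -2)) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den


def anomaly_map(original, reconstructed, alpha=0.5):
    """Blend SSIM dissimilarity and absolute error into one per-pixel map.

    alpha is a hypothetical weighting; the abstract does not state how the
    two terms are combined.
    """
    ssim_term = np.clip((1.0 - local_ssim(original, reconstructed)) / 2.0, 0.0, 1.0)
    abs_term = np.abs(original - reconstructed)
    return alpha * ssim_term + (1.0 - alpha) * abs_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.uniform(0.4, 0.6, size=(32, 32))
    recon = image.copy()
    recon[10:14, 10:14] = 1.0  # simulate a badly reconstructed (defective) patch
    amap = anomaly_map(image, recon)
    print("image-level score:", float(amap.max()))
```

An image-level anomaly score can then be taken as the maximum (or another statistic) of the map, and a threshold on the map gives a rough defect localization.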

Anomaly Detection and Localization for Images of Running Paper Web in Paper Manufacturing

Author(s): Afshin Dini, Marja Mettänen, Esa Rahtu

Year: 2024

Link: Paper

Abstract: We introduce a new method based on convolutional autoencoders to detect and locate paper web anomalies that can cause web breaks during the paper production process. In this approach, the images, captured by two high-speed cameras located on opposite sides of the running paper web at a paper machine, are pre-processed in several steps to remove noise and separate the paper web areas from the background. After designing and training a convolutional autoencoder with non-anomalous samples, a novel anomaly score and map are defined to find and locate web irregularities in each test sample. They are based on an edge detector and a reconstruction error, the latter defined by the combination of absolute error and Structural Similarity Index Measure between the reconstructed and the original images. By assessing the proposed approach on images taken from a real paper machine, we discover that this method can detect paper defects properly and therefore has the potential to improve machine functionality and even to prevent certain types of web breaks. This reduces machine downtime, paper losses, maintenance costs, and energy consumption, i.e., it increases the performance and efficiency of paper machinery.

Keyword(s): Anomaly Detection, Anomaly Localization, Running Paper Web Defects, Paper Manufacturing.

Visual Anomaly Detection and Localization with a Patch-Wise Transformer and Convolutional Model

Author(s): Afshin Dini, Esa Rahtu

Year: 2023

Link: Paper

Abstract: We present a one-class classification approach for detecting and locating anomalies in vision applications based on the combination of convolutional networks and transformers. This method utilizes a pre-trained model with four blocks of patch-wise transformer encoders and convolutional layers to extract patch embeddings from normal samples. The patch features from the third and fourth blocks of the model are then combined to form the final representations, and several multivariate Gaussian distributions are fitted to these normal embeddings. At the testing phase, irregularities are detected and located by setting a threshold on the anomaly score and map, which are defined by calculating the Mahalanobis distances between the patch embeddings of test samples and the related normal distributions. By evaluating the proposed method on the MVTec dataset, we find that not only can this method detect anomalies properly, due to the ability of the convolutional and transformer layers to represent local and overall properties of an image, respectively, but it is also computationally efficient, as it skips the training phase by using a pre-trained network as the feature extractor. These properties make our method a good candidate for detecting and locating irregularities in real-world industrial applications.

Keyword(s): Anomaly Detection, Anomaly Localization, Combined Transformer and Convolutional Networks.
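
The scoring step in the abstract above (Gaussian distributions over normal patch embeddings, then Mahalanobis distances for test patches) can be sketched roughly as follows. This is an illustration under stated assumptions rather than the paper's code: the embeddings are taken as already extracted, one Gaussian is fitted per patch location, the regularization constant eps is a hypothetical choice to keep covariances invertible, and the function names are invented for this example.

```python
import numpy as np


def fit_patch_gaussians(normal_feats, eps=0.01):
    """Fit one multivariate Gaussian per patch location.

    normal_feats: array of shape (N, H, W, C) with patch embeddings of
    N normal images. Returns the mean (H, W, C) and the inverse
    covariance (H, W, C, C).
    """
    n, h, w, c = normal_feats.shape
    mean = normal_feats.mean(axis=0)
    centered = normal_feats - mean
    # Per-location sample covariance from outer products over the N samples.
    cov = np.einsum("nhwi,nhwj->hwij", centered, centered) / max(n - 1, 1)
    cov = cov + eps * np.eye(c)  # assumed regularization, not from the paper
    return mean, np.linalg.inv(cov)


def mahalanobis_map(test_feats, mean, cov_inv):
    """Mahalanobis distance of each test patch (H, W, C) to its Gaussian."""
    diff = test_feats - mean
    sq = np.einsum("hwi,hwij,hwj->hw", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normal = rng.normal(size=(50, 4, 4, 3))   # stand-in for real embeddings
    mean, cov_inv = fit_patch_gaussians(normal)
    test = rng.normal(size=(4, 4, 3))
    test[1, 2] += 10.0                         # simulate an anomalous patch
    amap = mahalanobis_map(test, mean, cov_inv)
    print("most anomalous patch:", np.unravel_index(amap.argmax(), amap.shape))
```

In practice the patch-level map would be upsampled to image resolution, and a threshold chosen on validation data would separate normal from anomalous regions.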

TPSAD: Learning to Detect and Localize Anomalies With Thin Plate Spline Transformation

Author(s): Afshin Dini, Esa Rahtu

Year: 2022

Link: Paper

Abstract: We present a self-supervised learning approach with a novel proxy task, based on thin-plate spline transformation, for detecting and localizing anomalies in images. The self-supervised model, referred to as TPSAD, is first optimized to distinguish normal examples from artificially anomalous ones, which are created by a new data augmentation technique that applies a random thin-plate spline transformation to a patch of an image selected by the Canny edge detector. Then, the last-layer representations of the model are utilized for detecting anomalies with the Gaussian density estimator technique, while the middle-layer representations are used for localizing anomalies. By assessing the proposed method on the MVTec dataset, we discover that not only can it detect anomalous images and localize irregularities properly, but it is also computationally efficient in both training and testing stages compared to previous methods. Moreover, the method is robust to images containing unaligned objects due to the usage of the Canny edge algorithm in proxy task learning. Lastly, high performance in addition to low computational cost makes our method a good candidate for image anomaly detection in industrial applications.

Keyword(s): Anomaly detection, Anomaly localization, Self-supervised learning, Thin plate spline transformation, Canny edge detector.

Unsupervised Detection of Anomalous Sound for Machine Monitoring Under Domain Shifted Condition Based on GANs and Autoencoders

Author(s): Amirhossein Hassankhani, Afshin Dini, Konstantinos Drossos

Year: 2021

Link: Paper

Abstract: This report presents an unsupervised method for detecting anomalous industrial machine sounds, taken under two different conditions and shifted domains, and submitted to DCASE 2021 Task 2. The method tries to map the distribution of data into a learned latent space, using a reconstructive autoencoder followed by an additional second encoder. Furthermore, the method employs a discriminator trying to differentiate between the input and the reconstructed audio to and from the autoencoder. All components are jointly optimized, using a sum of weighted losses and utilizing an adversarial setting between the autoencoder and the discriminator. Anomaly is detected through the distance between the output of the two encoders. Obtained results show that the method performs better than the provided baseline in some cases.

Keyword(s): Anomaly detection, generative adversarial network, domain adaptation, GAN, autoencoder.

large language models (LLM) in industries

ChatGPT or A Silent Everywhere Helper: A Survey of Large Language Models

Author(s): Azim Akhtarshenas, Afshin Dini, Navid Ayoobi

Year: 2025

Link: Paper

Abstract: Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP), with Chat Generative Pre-trained Transformer (ChatGPT) standing out as a notable example due to its advanced capabilities and widespread applications. This survey provides a comprehensive analysis of ChatGPT, exploring its architecture, training processes, and functionalities. We examine its integration into various domains across industries such as customer service, education, healthcare, and entertainment. A comparative analysis with other LLMs highlights ChatGPT’s unique features and performance metrics. Regarding benchmarks, the paper examines ChatGPT’s comparative performance against other LLMs and discusses potential risks such as misinformation, bias, and data privacy concerns. Additionally, we offer a number of figures and tables that outline the backdrop of the discussion, the main ideas of the article, the numerous LLM models, a thorough list of datasets used for pre-training, fine-tuning, and evaluation, as well as particular LLM applications with pertinent references. Finally, we identify future research directions and technological advancements, underscoring the evolving landscape of LLMs and their profound impact on Artificial Intelligence (AI) and society.