Citation: Sharma, M.; Lim, J.; Lee, H. The Amalgamation of the Object Detection and Semantic Segmentation for Steel Surface Defect Detection. Appl. Sci. 2022, 12, 6004. https://doi.org/10.3390/app12126004
Academic Editors: José Salvador Sánchez Garreta, Kelvin K.L. Wong, Dhanjoo N. Ghista, Andrew W.H. Ip and Wenjun (Chris) Zhang
Received: 8 April 2022
Accepted: 10 June 2022
Published: 13 June 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
The Amalgamation of the Object Detection and Semantic Segmentation for Steel Surface Defect Detection
Mansi Sharma 1, Jongtae Lim 1 and Hansung Lee 2,*
1 Department of Computer Science and Engineering, Kongju National University, 1223-24 Cheonan-daero, 275 Budae-dong, Seobuk-gu, Cheonan-si 31080, Chungcheongnam-do, Korea; mansisharma1245@gmail.com (M.S.); jtlim@kongju.ac.kr (J.L.)
2 School of Creative Convergence, Andong National University, 1375 Gyeongdong-ro, Andong 36729, Gyeongsangbuk-do, Korea
* Correspondence: mohan@anu.ac.kr
Abstract: Steel surface defect detection is challenging because steel surfaces exhibit a wide variety of atypical defects. Many studies have applied deep learning to metal surface defect detection with some success, yet the problem remains difficult. To address atypical defects, we introduce a hierarchical approach for classifying and detecting defects on steel surfaces. The approach uses a binary classifier at the first stage and object detection and semantic segmentation algorithms at the second stage. On the Northeastern University (NEU) surface defect detection dataset, it achieves 98.6% accuracy in classifying scratches versus other defect types and 77.12% mean average precision (mAP) in defect detection. A comparative analysis with previous studies shows that the proposed approach achieves excellent results on the NEU dataset.
Keywords: defect detection; deep learning; steel defect detection; RetinaNet model; UNet
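The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the toy models, and in particular the routing (scratch images to a segmentation branch, all others to a detection branch) are assumptions made only for illustration; the abstract states only that a binary classifier precedes the detection and segmentation algorithms.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]  # placeholder type: grayscale pixels in [0, 1]


@dataclass
class StageTwoResult:
    branch: str     # name of the second-stage model that was run
    output: object  # a mask, a list of boxes, or whatever that model returns


def hierarchical_inspect(
    image: Image,
    is_scratch: Callable[[Image], bool],
    branches: Dict[bool, Tuple[str, Callable[[Image], object]]],
) -> StageTwoResult:
    """Stage 1 selects a branch; stage 2 runs only that branch's model."""
    name, model = branches[is_scratch(image)]
    return StageTwoResult(branch=name, output=model(image))


# Toy stand-ins (hypothetical): real models would be trained networks.
def toy_classifier(image: Image) -> bool:
    # Pretend that dark images are scratches.
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels) < 0.5


def toy_segmenter(image: Image) -> List[List[int]]:
    # Per-pixel mask: 1 where the pixel is dark, 0 elsewhere.
    return [[1 if p < 0.5 else 0 for p in row] for row in image]


def toy_detector(image: Image) -> List[Tuple[int, int, int, int]]:
    # A single (x, y, width, height) box covering the whole image.
    return [(0, 0, len(image[0]), len(image))]


# Assumed routing, not confirmed by the abstract.
BRANCHES = {
    True: ("segmentation", toy_segmenter),
    False: ("detection", toy_detector),
}

scratch_result = hierarchical_inspect(
    [[0.1, 0.2], [0.9, 0.1]], toy_classifier, BRANCHES
)
other_result = hierarchical_inspect([[0.9, 0.9]], toy_classifier, BRANCHES)
```

Because the routing table is passed in as data, the branch assignment can be changed without touching the dispatch logic, and each second-stage model runs only on the images sent to it.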
1. Introduction
In recent years, digital transformation has spread rapidly through the manufacturing industry, and its scope is expanding from factory and process automation to the smart factory. Among the related technologies, automatic vision inspection, which combines machine vision with artificial intelligence, is widely applicable and is used in every manufacturing process that requires inspection [1–5]. Automatic vision inspection automatically detects defects in product parts on a manufacturing line. The system captures an image of the finished part (or end product) with a dedicated camera installed on the production line and compares it with an image of a normal product to check for defects. Detecting defective parts early, during the manufacturing stage, can lower the final defect rate and raise productivity, thereby improving the company's reliability and profit. Traditional manufacturing defect inspection relies on human eyesight. Depending on the inspector's skill level, it can detect various types of defects very quickly and accurately. However, training skilled inspectors takes a large amount of time and money, and inspectors miss defects as fatigue accumulates over long working hours. Machine vision, by contrast, replaces the human inspector with a computer for the visual inspection of product defects [6–19]. However, full automation is difficult to achieve because of the many variables involved. Although machine vision is a necessary technology for factory and process automation, it is still maturing. Conventional machine vision technology was developed in a rule-based way: good products are defined first, and any product that does not match that definition is classified as defective. At the time of the initial