Recognizing whether construction workers are wearing safety helmets is a common object detection task in deep learning-based image processing. This paper studies an enhanced YOLOv5-based method that addresses three challenges: complicated construction-site backgrounds, dense targets, and the irregular shape of safety helmets. In the backbone network, conventional convolution is replaced by the Deformable Convolution Net so that feature extraction adapts more closely to the target shape. In the Neck, a Convolutional Block Attention Module is introduced; it assigns weights that suppress features from complex backgrounds and strengthen the representation of target features. The original network's Generalized Intersection over Union Loss is replaced by Distance Intersection over Union Loss to reduce localization errors when targets are densely packed. To evaluate the effectiveness of the algorithm, the training dataset was built by combining open-source datasets with self-collected images. The improved model achieves a detection accuracy of 91.6%, up 2.3% over the original network model, and a detection speed of 29 frames per second, which matches the capture frame rate of most security cameras.
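The loss replacement described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes axis-aligned boxes given as `(x1, y1, x2, y2)` tuples and uses the standard published definitions, GIoU loss = 1 - IoU + (|C| - |U|) / |C| and DIoU loss = 1 - IoU + d² / c², where C is the smallest enclosing box, U the union, d the distance between box centres, and c the diagonal of C. The function names and the `eps` guard are illustrative choices.

```python
def _overlap_terms(a, b):
    """Return IoU, union area, and the enclosing box for two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Intersection width/height are clamped at zero for disjoint boxes.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    enclosing = (min(ax1, bx1), min(ay1, by1), max(ax2, bx2), max(ay2, by2))
    iou = inter / union if union > 0 else 0.0
    return iou, union, enclosing


def giou_loss(a, b, eps=1e-9):
    """GIoU loss: penalises the empty area inside the enclosing box."""
    iou, union, (cx1, cy1, cx2, cy2) = _overlap_terms(a, b)
    c_area = (cx2 - cx1) * (cy2 - cy1)
    return 1.0 - iou + (c_area - union) / (c_area + eps)


def diou_loss(a, b, eps=1e-9):
    """DIoU loss: penalises the normalised distance between box centres."""
    iou, _, (cx1, cy1, cx2, cy2) = _overlap_terms(a, b)
    # Squared distance between the two box centres.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
       + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest enclosing box.
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    return 1.0 - iou + d2 / (c2 + eps)


if __name__ == "__main__":
    pred, target = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    print("GIoU loss:", round(giou_loss(pred, target), 4))
    print("DIoU loss:", round(diou_loss(pred, target), 4))
```

Because the DIoU penalty depends directly on the centre distance, it still provides a useful gradient when one box lies entirely inside another, a case in which the GIoU penalty term vanishes and the loss reduces to plain IoU loss.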
|Number of pages||60632|
|Publication status||Published - 8 Jun 2022|