Lightweight focus quality assessment network for pathological image with amplified receptive field
Objective Histopathology is the gold standard for tumor diagnosis. With the development of digital pathology slide scanners, digital pathology has brought revolutionary changes to clinical pathological diagnosis. Pathologists examine tissues on digital images and make diagnoses based on the characteristics they observe. These digital images can also be fed into a computer-aided diagnostic system for automated diagnosis, thereby speeding up diagnosis. However, focusing errors introduced during scanning can blur digital pathology images locally or globally. For pathologists, these blurred areas prevent accurate observation of tissue and cellular structures and can lead to misdiagnosis. Therefore, studying focus quality assessment for pathological images is crucial. Existing methods are based on either machine learning or deep learning. In machine learning-based methods, features are hand-crafted with the help of a priori knowledge, such as optical or microscopic imaging, and fed into a classifier to obtain focus predictions automatically. However, these methods do not automatically learn the focus features in pathological images, resulting in low evaluation accuracy. Deep learning-based methods, by contrast, automatically learn complex features, substantially improving evaluation accuracy. Current learning-based work enhances the capability to process global focus information from pathological images by introducing attention mechanisms. However, the receptive field of these attention mechanisms is limited, so the global focus information they capture is inadequate. Moreover, the existing networks with better performance require a larger number of parameters and computations, which makes them difficult to apply in practice. In this paper, a focus quality assessment network with amplified receptive field (ARF-FQANet) is proposed to address challenges such as poor global information extraction and excessive
computations.

Method In ARF-FQANet, a large convolution kernel is used to amplify the receptive field of the network, and the dual-stream large kernel attention (DsLKA) mechanism is then integrated. In DsLKA, large kernel channel attention and large kernel spatial attention are proposed to capture the global focus information across channels and spatial locations, respectively. The proposed large kernel channel attention improves on the classical channel attention mechanism: the introduced large kernel retransmit squeeze (LKRS) method redistributes the weights in the spatial domain, thus avoiding the loss of saliency weights that occurs in classical channel attention. However, the local cellular semantic information gradually becomes salient as the input features are downsampled, which may affect the capability of the network to represent focus information. A local stable downsampling block (LSDSB) is designed to address this problem. By integrating LSDSB, extraneous information is minimized during the upsampling and downsampling processes, thus ensuring the local stability of the features. A short branch is introduced to create a residual attention block (RAB) based on the DsLKAB and LSDSB modules. In this short branch, the noise is extracted using a minimum pooling operation, which effectively suppresses the learning of noisy information during backpropagation, thus improving the capability of the network to represent focus information. In addition, an initial feature enhancement block (IFEB) is introduced at the initial stage of the network to enhance the capability of the initial layer to represent the focus information. The features obtained by IFEB provide highly comprehensive information for the subsequent network. A strategy to decompose large convolutional kernels is introduced to obtain a lightweight network, which substantially reduces the number of parameters and computational requirements. In addition, the network parameters are reduced to achieve further compression. The network is then optimized into three
sizes: large, medium, and small, each with a reduced number of parameters.

Result Comparative experiments are performed on a publicly available dataset for focus quality assessment of pathology images. The compared networks are categorized as small, medium, and large according to their number of parameters. Among the large networks, the proposed large network performs best, with 0.7658, 0.9578, 0.9562, and 0.8523 for RMSE, SRCC, PLCC, and KRCC, respectively. These results show that the predicted focus scores are highly consistent with the actual focus scores. The performance of the proposed small and medium networks is slightly degraded, but their parameters and computational complexity are notably reduced. Compared with the self-defined convolutional neural network (SDCNN), the small network (ARF-FQANet-S) reduces the parameters, floating-point operations, and CPU inference time (CPU-Time) by 39.06%, 95.11%, and 51.91%, respectively. The small network may not outperform the FocusLiteNN network in terms of speed; however, it still provides performance comparable to that of larger networks. This paper also visualizes the receptive fields of several networks at different stages. The results indicate that the proposed ARF-FQANet obtains larger receptive fields, especially in the initial layer of the network. Thus, additional global focus information is obtained at the initial layer, which contributes to the stable performance of the small ARF-FQANet.

Conclusion Compared with similar methods, the proposed network efficiently extracts global focus information from pathological images. In this network, a large convolutional kernel is used to expand the receptive field of the network, and DsLKA is introduced to enhance the learning of global information across spatial locations and channels. This strategy ensures that the network maintains competitive performance even after notable parameter reductions. The small
network (ARF-FQANet-S) offers remarkable advantages in terms of CPU inference time and is ideal for lightweight deployment on edge devices. Overall, the results provide a technical reference for lightweight models.
Keywords: digital pathological images; focus quality assessment; amplified receptive field; attention mechanism; lightweight
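The abstract states that large convolutional kernels are decomposed to cut parameters, but it does not say how. As a rough illustration only, the sketch below assumes a decomposition in the style commonly used for large kernel attention: a small depthwise convolution, a depthwise dilated convolution, and a 1 x 1 pointwise convolution. The function names, the channel count, the kernel size, and the dilation are all hypothetical and are not taken from the paper. Bias terms are ignored.

```python
import math


def dense_conv_params(channels: int, kernel: int) -> int:
    """Weights of a standard kernel x kernel convolution
    with `channels` inputs and `channels` outputs (no bias)."""
    return channels * channels * kernel * kernel


def decomposed_conv_params(channels: int, kernel: int, dilation: int) -> int:
    """Weights after replacing the large kernel with three cheaper layers (no bias).

    Assumed scheme (not confirmed by the source):
      1. depthwise conv of size (2 * dilation - 1) for local context,
      2. depthwise dilated conv of size ceil(kernel / dilation) for long range,
      3. 1 x 1 pointwise conv to mix channels.
    """
    local = 2 * dilation - 1
    dilated = math.ceil(kernel / dilation)
    depthwise_local = channels * local * local
    depthwise_dilated = channels * dilated * dilated
    pointwise = channels * channels
    return depthwise_local + depthwise_dilated + pointwise


if __name__ == "__main__":
    # Hypothetical configuration chosen only for illustration.
    channels, kernel, dilation = 64, 21, 3
    dense = dense_conv_params(channels, kernel)
    small = decomposed_conv_params(channels, kernel, dilation)
    print(f"dense weights:      {dense}")
    print(f"decomposed weights: {small}")
    print(f"reduction factor:   {dense / small:.1f}x")
```

Under these assumed values, the dense 21 x 21 layer needs 1,806,336 weights, while the three decomposed layers together need 8,832. This shows why a decomposition of this kind can keep a large receptive field at a small fraction of the parameter cost. The scheme actually used in ARF-FQANet may differ.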