Fish behavior recognition based on Mel spectrogram and improved SEResNet
In aquaculture environments, stimulus sources such as feed release and changes in water flow make fish sounds difficult to discriminate and keep behavior recognition accuracy low. To address this problem, a fish behavior recognition model based on the Mel spectrogram and an improved SEResNet is proposed. First, because fish behavior sounds show large frequency fluctuations and small feature differences, which makes feature extraction difficult, a high-resolution Mel spectrogram with good feature representation is adopted to capture the spectral features of fish sounds and to enhance the recognition of fine-grained sound information. Second, because key information in fish sound features is easily lost, a Temporal Aggregated Pooling (TAP) layer is integrated into the SEResNet model. The layer extracts both the maximum value and the average value of the pooled region, which retains more fine-grained sound features of fish behaviors and improves recognition accuracy. To verify the effectiveness of the proposed model, an ablation experiment and a model performance comparison experiment were designed. The test results showed that TAP-SEResNet improved accuracy by 3.23% compared with SEResNet without reducing detection speed. Compared with advanced audio recognition models such as PANNS-CNN14, ECAPA-TDNN, and MFCC+ResNet, TAP-SEResNet improved accuracy by 5.32%, 2.80%, and 1.64%, respectively. The results show that the proposed model can effectively address the low accuracy of fish behavior recognition in aquaculture environments, help realize accurate monitoring of fish behavior during the aquaculture process, and play an important role in promoting precision aquaculture.
fish behavior recognition; passive underwater acoustic signal; Mel spectrogram; SEResNet
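
The pooling idea described in the abstract, keeping both the maximum and the average of the pooled region, can be illustrated with a minimal sketch. The abstract does not specify the exact form of the Temporal Aggregated Pooling layer, so the sketch assumes pooling over the whole time axis of each channel and concatenation of the two statistics. The function name `temporal_aggregated_pool` and the toy feature map are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def temporal_aggregated_pool(features):
    """Pool each channel over time, keeping both max and mean.

    Assumption (not confirmed by the abstract): the pooled region is the
    full time axis, and the two statistics are concatenated.

    features: list of channels, each a list of per-frame activations.
    Returns: [max of each channel..., mean of each channel...].
    """
    if not features or any(len(channel) == 0 for channel in features):
        raise ValueError("each channel needs at least one time frame")
    maxima = [max(channel) for channel in features]   # strongest response
    means = [fmean(channel) for channel in features]  # overall energy
    return maxima + means


# Hypothetical feature map: 2 channels x 4 time frames.
feature_map = [
    [0.0, 1.0, 0.5, 0.5],
    [2.0, 2.0, 2.0, 2.0],
]
pooled = temporal_aggregated_pool(feature_map)
print(pooled)  # [1.0, 2.0, 0.5, 2.0]
```

Keeping the maximum preserves short, strong events that an average alone would smooth away, while the mean summarizes the sustained level of the signal. The output has twice as many values as there are input channels.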