Ultrasound examination has become the preferred method for diagnosing non-alcoholic fatty liver disease (NAFLD) because it is non-invasive. Computer-aided diagnosis can help doctors avoid deviations in the detection and classification of NAFLD. This study therefore proposes a hybrid model that combines a pre-trained VGG16 network with an attention mechanism and a Stacking ensemble learning model. The model aggregates multi-scale features through self-attention and fuses multiple classifiers (logistic regression, random forest, and support vector machine) through Stacking ensemble learning. The proposed hybrid method classifies ultrasound images into four classes (normal, mild, moderate, and severe fatty liver) and reaches an accuracy of 91.34%, slightly better than traditional neural network algorithms (≤89.41%). The results show that, compared with the pre-trained VGG16 network alone, adding the self-attention mechanism improves accuracy by 3.02%. Using the Stacking ensemble learning model as the classifier further increases accuracy to 91.34%, exceeding each single classifier: logistic regression (89.86%), support vector machine (90.34%), and random forest (90.73%). The proposed hybrid method can effectively improve the efficiency and accuracy of NAFLD detection from ultrasound images.
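
The Stacking step can be illustrated with a minimal sketch, assuming scikit-learn. This is not the study's implementation: the feature matrix here is synthetic and only stands in for the features produced by the VGG16 network with self-attention, and the logistic-regression meta-learner, the 5-fold setting, and all hyperparameters are assumptions made for illustration. Only the three base learners (logistic regression, random forest, support vector machine) and the four-class setup come from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for deep features (assumption): 400 samples, 32 features,
# 4 classes mimicking normal / mild / moderate / severe.
X, y = make_classification(
    n_samples=400, n_features=32, n_informative=8, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three base learners named in the text; scaling is added for the
# linear and kernel models (an illustrative choice).
base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
]

# Stacking: out-of-fold outputs of the base learners become the inputs of a
# meta-learner. The meta-learner below is an assumption, not from the source.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

accuracy = stack.score(X_test, y_test)
print(f"stacked accuracy on synthetic data: {accuracy:.4f}")
```

The accuracy printed by this sketch describes the synthetic data only and is unrelated to the figures reported above.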