Traditional video scene classification methods extract image scene features from the visual modality and combine them with supervised learning methods, such as support vector machines, to classify scenes into specific categories. With the rapid emergence of micro-videos on major platforms, scene feature representation tailored to the characteristics of micro-videos has attracted increasing attention from researchers. Micro-video data suffer from noise, data loss, and inconsistent semantic strength across modalities, so traditional video scene representation methods cannot learn semantically rich micro-video scene representations. In recent years, some studies on micro-video scene classification have taken these challenges into account and proposed corresponding methods. This paper reviews the research status of micro-video scene classification, introduces feature representation and classification methods for micro-video scenes, and analyzes scene classification methods on different datasets. In view of the shortcomings of existing methods, the open problems to be solved in future micro-video scene classification are analyzed.
video scene; feature representation; micro-video scene classification; multi-modality fusion; deep learning