Abstract
Deep learning has revolutionized the field of artificial intelligence. Building on the statistical correlations uncovered by deep learning-based methods, computer vision applications such as autonomous driving and robotics are advancing rapidly. Although such correlations are the basis of deep learning, they depend strongly on the distribution of the original data and are susceptible to uncontrolled factors. Without the guidance of prior knowledge, statistical correlations alone cannot correctly reflect the underlying causal relations and may even introduce spurious correlations. As a result, researchers are now trying to enhance deep learning-based methods with causal theory. Causal theory can model the intrinsic causal structure, which is unaffected by data bias, and thereby effectively avoids spurious correlations. This paper aims to comprehensively review the existing causal methods in typical vision and vision-language tasks such as semantic segmentation, object detection, and image captioning. The advantages of causality and the approaches for building causal paradigms are summarized. Future roadmaps are also proposed, including advancing causal theory itself and applying it to other complex scenarios and systems.
Funding
National Natural Science Foundation of China (62233005)
National Natural Science Foundation of China (62293502)
Programme of Introducing Talents of Discipline to Universities (the 111 Project) (B17017)
Fundamental Research Funds for the Central Universities (222202317006)
Shanghai AI Lab