The Temporal Structure of the Text-to-Video Model Sora: A Phenomenological Reflection on Generative Artificial Intelligence
Recently, OpenAI launched Sora, a model that represents the current pinnacle of text-to-video technology and marks a milestone in the evolution of generative artificial intelligence. However, Sora still has technical flaws and shortcomings. From a phenomenological perspective, Sora's external temporal structure is incomplete: it features only objective time and lacks subjective time and inner time consciousness, which prevents it from depicting human psychological time, explaining causal relationships, and constructing complex, meaningful events and plots. Moreover, the absence of retention and protention hinders its ability to link actions with outcomes. Without the intervention of an internal, dynamically generative temporal structure, Sora also struggles to show events that unfold over time. Therefore, from a technical standpoint, addressing the model's intentional design issues and enhancing both its internal and external temporal structures are key to improving Sora's performance in practice.
text-to-video; Sora; temporal structure; generative artificial intelligence; phenomenology; retention and protention