Currently, digital virtual human interaction technology faces issues such as language understanding errors and limited emotional expression, which result in a poor user experience. In this study, the current status and challenges of the technology were analyzed, a Unity3D-based interaction technology was introduced, and a technique for generating emotional speech directly from text was proposed. The approach combined ChatGPT text comprehension and generation, text emotion analysis, and improved VITS speech synthesis. A digital virtual human interaction application capable of accurately understanding and modelling emotional responses was developed by simulating holographic interaction effects using a Kinect 2.0 device. The experimental results demonstrated that the technology improves both the interaction and emotional expression abilities of digital virtual humans, providing significant value for application and development.
Digital media; Artificial intelligence; Media interaction; Speech synthesis