{"id":649992,"date":"2025-05-03T17:40:22","date_gmt":"2025-05-03T21:40:22","guid":{"rendered":"https:\/\/www.rochester.edu\/newscenter\/?p=649992"},"modified":"2025-05-06T09:15:58","modified_gmt":"2025-05-06T13:15:58","slug":"ai-text-to-video-ai-metamorphic-capabilities-649992","status":"publish","type":"post","link":"https:\/\/www.rochester.edu\/newscenter\/ai-text-to-video-ai-metamorphic-capabilities-649992\/","title":{"rendered":"Text-to-video AI blossoms with new metamorphic video capabilities"},"content":{"rendered":"
While text-to-video artificial intelligence models like OpenAI\u2019s Sora are rapidly metamorphosing in front of our eyes, they have struggled to produce metamorphic videos. Simulating a tree sprouting or a flower blooming is harder for AI systems than generating other types of videos because such processes require knowledge of the physical world and can vary widely.<\/p>\n
But now, these models have taken an evolutionary step.<\/p>\n
Computer scientists at the University of Rochester<\/a>, Peking University, University of California, Santa Cruz, and National University of Singapore developed a new AI text-to-video model that learns real-world physics knowledge from time-lapse videos. The team outlines their model, MagicTime, in a paper<\/a> published in IEEE Transactions on Pattern Analysis and Machine Intelligence<\/em>.<\/p>\n \u201cArtificial intelligence has been developed to try to understand the real world and to simulate the activities and events that take place,\u201d says Jinfa Huang<\/a>, a PhD student supervised by Professor Jiebo\u00a0Luo<\/a>\u00a0from Rochester\u2019s\u00a0Department of Computer Science<\/a>, both of whom are among the paper\u2019s authors. \u201cMagicTime is a step toward AI that can better simulate the physical, chemical, biological, or social properties of the world around us.\u201d<\/p>\n Previous models typically generated videos with limited motion and little variation. To train AI models to more effectively mimic metamorphic processes, the researchers developed a high-quality dataset of more than 2,000 time-lapse videos with detailed captions.<\/p>\n