Chinese AI startup takes aim at OpenAI's Sora with image-to-video tool launch

Pictured here is an AI-generated clip from Vidu’s website. The tool can create videos from text or image prompts.
Evelyn Cheng | CNBC
  • Beijing-based Shengshu Technology on Wednesday said that its artificial intelligence-powered text-to-video tool Vidu will now be able to generate videos from multiple images.
  • When prompted with text and images, Vidu's new AI feature can combine three pictures — such as a shirt, person and moped — into a video of the person wearing the shirt and driving the moped through a scene, Shengshu said.
  • The AI video generator is already making money from advertisers, animators and other businesses, Shengshu co-founder and CEO Jiayu Tang said in Mandarin, according to a CNBC translation.

BEIJING — Beijing-based Shengshu Technology on Wednesday said that its artificial intelligence-powered text-to-video tool Vidu will now be able to generate videos by combining multiple images.

Vidu already allows users worldwide to create 8-second clips based on written prompts. While OpenAI, the maker of ChatGPT, revealed in February that its AI model Sora could generate one-minute videos from text, it has yet to release that tool publicly.

Vidu's new AI feature can combine three pictures — such as a shirt, person and moped — into a video of the person wearing the shirt and driving the moped through a scene, Shengshu said.

Other platforms claim they can turn text or images into videos using AI, but the quality of output varies. The breakthrough that Shengshu claims is the ability to take three unique images and integrate them with visual consistency into an AI-generated video.

"Very early on we pinpointed [visual consistency] as the problem, and wanted to solve it well," Fan Bao, chief technology officer at Shengshu, said in Mandarin, translated by CNBC.

Vidu launched in April and its ability to turn two profile photos into lifelike videos of people hugging went viral on TikTok.

The AI video generator is already making money from advertisers, animators and other businesses, Shengshu co-founder and CEO Jiayu Tang said in Mandarin, according to a CNBC translation. He said monthly usage rates per customer can range from 100,000 yuan to 1 million yuan ($13,871 to $138,711).

To address copyright issues, Tang said a company might sign a deal with an artist that allows the AI to mimic the artist's style of painting for an advertisement. He said he hadn't seen significant legal cases around consumers' use of images.

Tang added that Vidu doesn't allow the public to generate content using images of celebrities or "sensitive" individuals. He said the AI tool also bans nudes and violent images. As for personal photos, Tang said Vidu destroys the data in accordance with the European Union's General Data Protection Regulation, a global benchmark.

Shengshu was founded last year with backers including Baidu Ventures, Alibaba-affiliate Ant Group, Chinese startup Zhipu AI, Qiming Venture Partners and Beijing city, according to PitchBook.

Tang said Vidu's AI runs off rented cloud servers in China and abroad.

Copyright CNBC