QuickVid uses artificial intelligence to generate short-form videos, complete with voiceovers • TechCrunch

Generative AI is coming for videos. A new website, QuickVid, combines multiple generative AI systems into a single tool to automatically create short YouTube, Instagram, TikTok and Snapchat videos.

Given as little as a single word, QuickVid selects a background video from a library, writes a script and keywords, overlays images generated by DALL-E 2, and adds a synthetic voiceover and background music from YouTube’s royalty-free music library. QuickVid’s creator, Daniel Habib, says he’s building the service to help creators meet the “ever-growing” demand from their fans.

“By giving creators tools to quickly and easily produce quality content, QuickVid helps creators increase their content output, reducing the risk of burnout,” Habib told TechCrunch in an email interview. “Our goal is to empower your favorite creator to keep up with their audience’s demands by leveraging advances in artificial intelligence.”

However, depending on how they are used, tools like QuickVid threaten to flood already crowded channels with spam and duplicative content. They also face potential backlash from creators who choose not to use the tools, whether because of cost ($10 per month) or on principle, but still have to compete with a slew of new AI-generated videos.

Looking for video

QuickVid, which Habib, a self-taught developer who previously worked at Meta on Facebook Live and video infrastructure, built in a matter of weeks, launched on December 27. It’s relatively bare bones at the moment — Habib says more personalization options will be coming in January — but QuickVid can pull together the components that make up a typical informative YouTube Short or TikTok video, including captions and even avatars.

It is easy to use. First, a user enters a prompt describing the topic of the video they want to create. QuickVid uses the prompt to generate a script, drawing on the generative text powers of GPT-3. From keywords either extracted automatically from the script or entered manually, QuickVid selects a background video from the royalty-free stock media library Pexels and generates overlay images using DALL-E 2. It then outputs a voiceover via Google Cloud’s text-to-speech API (Habib says users will soon be able to clone their voice) before combining all those elements into a video.
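For readers who think in code, the flow described above can be sketched roughly as follows. This is an illustrative outline only, not QuickVid’s actual implementation: every name here (`VideoPlan`, `build_plan`, `extract_keywords`) is hypothetical, the keyword extraction is a deliberately naive stand-in, and the script, stock-video, image and voiceover steps are passed in as plain functions rather than real calls to GPT-3, Pexels, DALL-E 2 or Google Cloud.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class VideoPlan:
    """Hypothetical container for the pieces a final assembly step would combine."""
    prompt: str
    script: str
    keywords: List[str]
    background_video: str
    overlay_images: List[str]
    voiceover: str


def extract_keywords(script: str, limit: int = 3) -> List[str]:
    """Naive stand-in for automatic keyword extraction: longest distinct words first."""
    words = {w.strip(".,!?").lower() for w in script.split()}
    words.discard("")
    return sorted(words, key=lambda w: (-len(w), w))[:limit]


def build_plan(
    prompt: str,
    write_script: Callable[[str], str],            # assumed: wraps a text model
    find_background: Callable[[List[str]], str],   # assumed: wraps a stock-video search
    make_image: Callable[[str], str],              # assumed: wraps an image generator
    synthesize_voice: Callable[[str], str],        # assumed: wraps a text-to-speech service
    keywords: Optional[List[str]] = None,
) -> VideoPlan:
    """Run the stages in the order the article describes and collect the results."""
    script = write_script(prompt)
    # Keywords are either entered manually or extracted from the script.
    chosen = list(keywords) if keywords else extract_keywords(script)
    return VideoPlan(
        prompt=prompt,
        script=script,
        keywords=chosen,
        background_video=find_background(chosen),
        overlay_images=[make_image(k) for k in chosen],
        voiceover=synthesize_voice(script),
    )


# Example run with placeholder stages in place of real services.
plan = build_plan(
    "Cats",
    write_script=lambda p: f"{p} are popular pets.",
    find_background=lambda kws: f"stock/{kws[0]}.mp4",
    make_image=lambda k: f"img/{k}.png",
    synthesize_voice=lambda s: "voice.mp3",
    keywords=["cats"],
)
print(plan)
```

Passing each stage in as a function keeps the sketch runnable without credentials and mirrors the point Habib makes later in the piece: any one provider could be swapped for an alternative without changing the overall flow.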


Watch this video made with the prompt “Cats”:

Or this one:

QuickVid certainly doesn’t push the boundaries of what’s possible with generative AI. Both Meta and Google have demonstrated AI systems that can generate completely original clips from a text prompt. But QuickVid combines existing AI systems to take advantage of the repetitive, templated format of b-roll-heavy short-form videos, bypassing the problem of having to generate the footage itself.

“Successful creators have an extremely high quality bar and are not interested in putting out content that they don’t feel is in their own voice,” said Habib. “This is the use case we’re focusing on.”

That being said, in terms of quality, QuickVid’s videos are generally a mixed bag. The background videos tend to be a bit random or only tangentially related to the topic, which isn’t surprising considering QuickVid’s currently limited to the Pexels catalog. The DALL-E 2-generated images, meanwhile, exhibit the limitations of today’s text-to-image technology, such as garbled text and uneven proportions.

In response to my feedback, Habib said that QuickVid “is being tested and tinkered with daily.”

Copyright issues

According to Habib, QuickVid users retain the right to use the content they create commercially and are allowed to monetize it on platforms like YouTube. But the copyright status surrounding AI-generated content is… murky, at least for now. The U.S. Patent and Trademark Office (USPTO) recently moved to revoke copyright protection for an AI-generated cartoon, for example, saying that copyrightable works require human authorship.

When asked how the USPTO decision could affect QuickVid, Habib said he believes it only concerns the “patentability” of AI-generated products and not creators’ rights to use and monetize their content. Creators, he pointed out, don’t often file patents on videos and usually lean into the creator economy, letting other creators reuse their clips to increase their own reach.

“Creators are interested in broadcasting high-quality content in their voice that will help grow their channel,” said Habib.

Another legal challenge on the horizon could affect QuickVid’s DALL-E 2 integration and, by extension, the site’s ability to generate image overlays. Microsoft, GitHub and OpenAI are being sued in a class-action lawsuit that accuses them of violating copyright law by allowing Copilot, a code-generating system, to reproduce portions of licensed code without giving credit. (Copilot was jointly developed by OpenAI and GitHub, which Microsoft owns.) The case has implications for generative art AI like DALL-E 2, which has also been shown to copy and paste from the datasets on which it was trained (i.e., images).

Habib isn’t worried and claims the generative AI genie is out of the bottle. “If another lawsuit came up and OpenAI disappeared tomorrow, there are several alternatives that could power QuickVid,” he said, referring to the open source DALL-E 2-like system Stable Diffusion. QuickVid is already testing Stable Diffusion to generate avatar images.

Moderation and spam

Aside from the legal dilemmas, QuickVid may soon have a moderation problem. While OpenAI has implemented filters and techniques to prevent them, generative AI has well-known toxicity and factual accuracy issues. GPT-3 spouts misinformation, especially about recent events that fall outside the bounds of its knowledge base. And ChatGPT, a fine-tuned offspring of GPT-3, has been shown to use sexist and racist language.

This is worrying, especially for people who would use QuickVid to make informational videos. In a quick test, I had my partner, who is far more creative than me, especially in this area, enter a few offensive prompts to see what QuickVid would generate. To QuickVid’s credit, obviously problematic prompts like “Jewish New World Order” and “9/11 Conspiracy Theory” didn’t yield toxic scripts. But for “Critical Race Theory Indoctrinating Students,” QuickVid generated a video suggesting that critical race theory could be used to brainwash schoolchildren.



Habib says he relies on OpenAI’s filters to do most of the moderation work, and claims it’s the users’ duty to manually review every video created by QuickVid to make sure “everything is within the bounds of the law.”

“As a general rule, I think people should be able to express themselves and create the content they want,” Habib said.

That apparently includes spam content. Habib claims that the video platforms’ algorithms, not QuickVid, are best positioned to determine the quality of a video, and that people who produce low-quality content are “only hurting their own reputations.” The reputational damage will naturally deter people from running mass spam campaigns with QuickVid, he says.

“If people don’t want to watch your video, then you won’t get distribution on platforms like YouTube,” he added. “Producing low-quality content will also make people view your channel in a negative light.”

But it’s instructive to look at ad agencies like Fractl, which in 2019 used an artificial intelligence system called Grover to generate an entire website of marketing material, reputation be damned. In an interview with The Verge, Fractl partner Kristin Tynski said she envisions generative AI enabling “a massive tsunami of computer-generated content across every imaginable niche.”

In any case, video-sharing platforms like TikTok and YouTube haven’t had to contend with moderating AI-generated content on a massive scale. Deepfakes – synthetic videos that replace an existing person with someone else’s likeness – began populating platforms like YouTube several years ago, powered by tools that made deepfaked recordings easier to produce. But unlike even the most convincing deepfakes today, the types of videos QuickVid creates are not overtly AI-generated in any way.

Google Search’s policy on AI-generated text may be a preview of what’s to come in the video domain. Google doesn’t treat synthetic text any differently than human-written text when it comes to search rankings, but takes action on content that “intends to manipulate search rankings and not help users.” That includes content compiled or combined from different web pages that “[doesn’t] add sufficient value” as well as content generated through purely automated processes, both of which may apply to QuickVid.

In other words, AI-generated videos may not be banned outright from platforms if they take off in a major way, but rather simply become the cost of doing business. That’s unlikely to allay the fears of experts who believe platforms like TikTok are becoming a new home for misleading videos, but – as Habib said during the interview – “there is no stopping the generative AI revolution.”
