Date: 2024-04-11
Welcome to AI Decoded, Fast Company’s weekly newsletter that breaks down the most important news in the world of AI. You can sign up to receive this newsletter every week here.
AI video market is hungry for breakthroughs, and talent
There are only 20 to 30 companies currently at work developing video generation models. And 18 months ago, these companies had trouble getting funding because, at a time when chatbots were still in their nascent stage, the idea of producing life-like video seemed far away. But video generation tools like Runway and OpenAI’s Sora have totally changed the narrative—and Silicon Valley has taken notice.
“I’m seeing an inflection point happening right now,” says Robert Nishihara, cofounder and CEO of Anyscale, whose platform helps developers efficiently train, run, and scale AI models and apps. “It’s mostly a race to build the best model.”
In this new generative-AI race, we’ve got AI heavyweights (namely OpenAI) alongside companies including veed.io, Pika Labs, Loom, Captions, and Descript. Companies in the video space, like the large language model developers before them, are seeing improvements in their results as they feed their models more training data and more computing power.
But there’s still a lot of work to do. “Right now, we’re at the stage of having some really impressive demos,” Nishihara says. “Sora’s amazing, but the field is nascent, and none of the models have reached the level of being broadly useful.”
Indeed, the existing use cases are relatively narrow. For example, HeyGen lets you upload a video of yourself talking, then generates a video of you saying the same thing in a number of different languages, complete with appropriate and realistic mouth movements.
In general, video generation models haven’t advanced to the point where they can faithfully recreate a scene from the user’s imagination as a real video. “It’s going to generate something different right now,” Nishihara says. “Maybe the thing will be cool, but it’s not what I had in mind exactly.” And today’s tools lack the editing capabilities needed to modify the video. “It’s missing controllability,” he adds. “There’s still a lot of work that needs to go into that.”
Developing video generation models is an expensive business. As the size of the models increases, so do the costs of training them, which requires progressively larger blocks of cloud-computing power. The startups in the space must also attract the talent needed to advance video models and build infrastructure. “There’s a huge premium on people who have done it before,” Nishihara says. “There just aren’t that many of those people.” Acquiring and experimenting with the right kinds of training data, which usually means large amounts of labeled video, is also a costly process.
Nishihara says that as competition to build the best models mounts, and the attendant costs rise, consolidation in the space is likely. Some companies may fall behind in the research and sell out. And, as we’ve seen, larger companies such as Microsoft and Google are always watching closely, ready to pay top dollar for premium talent or promising research.
Chipmakers increase their push to break Nvidia’s chokehold on AI
Google announced on Tuesday a new ARM-based chip at its cloud conference in Las Vegas, and later that same day, Intel announced a new version of its Gaudi accelerator chips for AI. Both tech giants are trying to take on Nvidia, which controls about 80% of the AI chip market.
Google’s ARM-based chip, Axion, is a CPU that’s expected to run general-purpose cloud-computing jobs, at least in the short term. More important for AI is the company’s announcement of a new and faster version of its cloud-based Tensor Processing Unit (TPU), the Cloud TPU v5p, which the company says is the most powerful and scalable AI accelerator to date. TPUs are well-suited for training large AI models, image and video generation models in particular. One video generation company, Lightricks, says it has more than doubled the speed of training its text-to-image and text-to-video models, compared to Google’s earlier TPU v4 accelerator. Note that Google Cloud is still a major buyer of Nvidia GPUs.
Intel, for its part, says its new AI accelerator chip, the Gaudi 3, can perform inference jobs (that is, day-to-day data analysis by trained models) 50% better than Nvidia H100 GPUs, while being 40% more power-efficient, all at “a fraction of the cost.” But the real market impact of Gaudi 3 might be somewhat more muted. Analyst Patrick Moorhead predicted on CNBC that Gaudi 3 “will take a certain percentage, probably a single-digit market share in this, but I don’t think this is going to be a sea change out of the gate.”
Moorhead’s skepticism is based on the fact that Nvidia has already been working hard to get AI developers using the layers of proprietary software it’s built on top of its chips. One of these layers is Nvidia’s CUDA, which lets developers access all the performance and features of the Nvidia chips. Enterprise and data center customers are less likely to migrate to different chips that may be incompatible with the software. Advantage Nvidia. But Intel says it’s been working with industry partners, such as Google, Qualcomm, and Arm, to build open software that would make it easier for AI developers to utilize other, non-Nvidia chips.
Enterprises are still pointing to LLM accuracy issues
How are businesses and consumers feeling about the advent of AI? Judging by the number of surveys coming out each week, it seems a lot of people want to know. Here are some selected findings from recent surveys:
- A survey of executives and AI pros by Writer AI finds that generative-AI solutions are often found lacking, with 61% of companies experiencing accuracy issues, and only 17% rating their in-house solutions as excellent in overall performance. It also found that 78% of businesses surveyed are either already using or planning to use private, in-house generative-AI solutions because of concerns over “security, data protection, and the need for robust controls.”
- An Ipsos survey asked Americans how they’re using AI models. The largest share, 43%, use them to “search for information,” which may bode well for companies like Microsoft, Google, and Perplexity that are trying to use AI to help people search the web. The next most popular use was “comparing things” (26%), followed by “entertainment” (25%).
- The staffing provider Adecco Group surveyed 2,000 executives at large companies around the world and found that 41% of them believe AI will result in smaller workforces in the next five years.
- Data management company AvePoint surveyed “digital workplace leaders” in the healthcare, financial services, IT, and government sectors and found that only 44% of organizations have confidence that they can use AI safely, and nearly half have encountered unintended data exposure when implementing AI.
