AI video generating hardware: Hands-on with the 1stAI Machine




Is this what AI hardware should look like?

That’s been one of the many questions percolating in my mind since the beginning of this month, when I saw Cristóbal Valenzuela, the CEO of well-funded generative AI video startup Runway ML, post a video clip of something called the “1stAI Machine” to his X account.

Valenzuela called it “the first physical device for video editing generated by AI,” and included the following quote:

“We anticipate that the quality of videos will soon match that of photos. At that point, anyone will be able to create movies without the need for a camera, lights, or actors; they will simply interact with the AIs. A tool like 1stAI Machine anticipates that moment by exploring tangible interfaces that enhance creativity.”


The video showed “the first AI editing board,” a chunky, angular, matte silver device that resembled a sound mixing board and appeared at least two or three times as large as an average modern laptop, with physical dials and knobs for controlling different input styles and treatments.

I was immediately intrigued. As a journalist covering AI tools for creativity and media production for VentureBeat, I wanted to learn more about the machine and its goals: was Runway, heretofore a software startup focused on its Gen-1 and Gen-2 web-based programs, getting into the hardware game?

And if so, how much did the machine cost, when would it ship, and who was the intended userbase?

AI hardware emerges

Another AI hardware device, the Ai Pin from Humane, a startup formed by ex-Apple engineers, debuted last week to mixed reactions, chiefly over its $699 upfront price plus a $24 monthly subscription and its unusual form factor: a magnetic pin with a battery pack and built-in laser projector that clips onto your clothing. The device is powered by OpenAI’s GPT-4 AI model and is meant to act as a kind of life assistant and potential smartphone replacement. It has already earned a place on Time Magazine’s 200 Best Inventions of 2023.

Clearly, AI-powered hardware is emerging fast. So where does the 1stAI Machine fit in, who built it, and what inspired it?

The man behind the machine

Valenzuela credited “SpecialGuestX for 1stAveMachine” in his post on X for creating the machine, which is powered by Runway’s software. I emailed Valenzuela, SpecialGuestX (SGX) and 1stAveMachine last week and received a response from Miguel Espada, co-founder of SGX, which is described on its website as a “creative agency exploring new narratives of data, automation and artificial intelligence.”

Miguel Espada, co-founder of SGX and lead creative behind the 1stAI Machine, pictured holding the device. Credit: VentureBeat

Espada confirmed the device had been created by his small team in Madrid, Spain, which he calls home. He was kind enough to answer my questions about it and to give me a hands-on demo at the Brooklyn offices of his collaborators, 1stAveMachine, a “collective” of artists, designers, scientists and other creatives who create commercials and other advertising materials for major brands.

“Creative agency” is a fancier term for an advertising agency, so SGX and 1stAveMachine are in some ways modern-day, real-life equivalents of Sterling Cooper Draper Pryce (SCDP), the fictional, innovative ad agency at the heart of one of my favorite TV series, Mad Men, but with a hipster, transatlantic bent, as if later-season Stan Rizzo had taken over the agency.

Espada has long experience using AI for artistic pursuits in this role, having been an early member of the Disco Diffusion community that later morphed into the Stable Diffusion image generation AI model. For a prior client, Carvana, his agency tweaked Stable Diffusion code to create on-demand AI-generated video for 1.3 million customers of the no-hassle auto purchasing and delivery service. Each customer was emailed vignettes from the imagined point of view of their car as it was delivered to them, with all the excitement the vehicle would have if personified.

Can you buy it?

First things first: don’t get your hopes up about getting your hands on a 1stAI Machine anytime soon. Espada confirmed the device was a one-of-a-kind prototype.

“Currently there aren’t plans for selling it but we’ve got some hardware products on the roadmap…” Espada wrote prior to our meeting in an email to VentureBeat.

Fittingly for a creative agency, Espada said the 1stAI Machine was born from the remnants of a pitch to a client in the automotive space around the idea of turning storyboards and concept sketches of a new car model into generative video using Runway’s software, Gen-2. Gen-2 accepts uploads of still images and applies realistic (sometimes surrealistic) motion to them.

The client didn’t go for the idea to turn their auto sketches and storyboards into AI generated video, but the pitch stuck in Espada’s head and he and his team decided to go ahead and build a generative AI video editing board as a proof-of-concept. They did so on their own, without seeking the assistance of Runway.

“It’s powered by Runway, but it’s not a Runway product,” Espada clarified, writing, “Its CEO, Cristóbal Valenzuela re-shared it because he thought it was an interesting product.”

How it works

In 1stAveMachine’s offices in the DUMBO (Down Under Manhattan Bridge Overpass) neighborhood of Brooklyn overlooking the East River, Espada showed me the 1stAI Machine set up on a table.

It’s an elegant and refined piece of equipment, not nearly as janky looking as some prototypes I’ve seen, with a smooth, matte aluminum chassis and black and silver knobs and dials that are as satisfying as the vintage midcentury modern stereos depicted in Mad Men and now coveted by audiophile collectors. The chassis was designed in 3D modeling software by the human creatives at SGX and laser cut into several pieces that were fitted neatly together with screws, aligned like a professional grade studio product.

Photo of the 1stAI Machine. Credit: VentureBeat.

Its defining feature, though, as one might expect for a video-focused product, is its screens: there are eight separate displays on the device, including a full-color LCD for playing the final video product and six smaller black-and-white screens that show the storyboards from which the final video is built. There’s also a narrow strip that displays the device’s status in a text bar, such as “playing” or “generating.”

Espada took me through how to operate it. The device is helpfully divided into numbered sections for the steps of its workflow: 1. story (storyboards), 2. style and 3. music. The fourth section is simply a speaker grille that plays the music.

For now, the device is limited to drawing from a set of about a dozen storyboards and still frames sourced from iconic films. Pulp Fiction, E.T. the Extra-Terrestrial, Titanic, The Godfather and Star Wars are among the films whose storyboards have been preloaded onto it.

Using the six small LCD screens, the user selects six storyboards to use as source material, with the topmost screen corresponding to the first frame in the final video. (Because this is a one-off research prototype designed only to be used in private, Espada and his collaborators are unconcerned about copyright.)

These storyboards serve only as the basis to which Runway’s Gen-2 AI model applies transformations. The transformed storyboards are then linked together into a 30-second video with figures and scenes that resemble the originals, but only barely. The demo video Espada created for me on the spot transformed the iconic balcony scene in Titanic into a hallucinogenic fever dream of two masculine-presenting figures with short blonde hair leaning out from a mass of sticky pink substance over neon blue water.

Titanic storyboard remixed by Runway’s Gen-2 AI model on the 1stAI Machine. Credit: VentureBeat.

But before we get to the results, there are two other important parts of the 1stAI Machine workflow we should mention: the style tuner and the music selector.

Let’s start with the music selector, since it is a bit more intuitive and obvious: the machine allows you to select a soundtrack of AI-generated music in different genres, from country to pop to reggaeton to rave/EDM and k-pop. These music pieces form the soundtrack to the generated video, and are themselves generated by SunoAI models. The music selector control is a slider, so you can actually produce hybrid sounds between two genres, say a fusion of pop and reggaeton. As with many AI-generated videos, there is no dialogue in these films. Instead, the result is more like a film from the silent era, albeit in color and created with machine learning algorithms rather than human performers or camera operators.
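For readers curious how a single slider could yield a fusion of two genres, here is a minimal sketch in Python. It is not SGX’s implementation, which wasn’t shown to me in code form. The order of genres along the slider, the 0-to-1 position range and the linear crossfade are all assumptions, and `blend_weights` is a hypothetical helper name.

```python
# Hypothetical sketch: the genre order along the slider is an assumption.
GENRES = ["country", "pop", "reggaeton", "rave/EDM", "k-pop"]


def blend_weights(position, genres=GENRES):
    """Map a slider position in [0, 1] to mix weights for the two nearest genres."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0 and 1")
    # Stretch the slider across the gaps between neighboring genres.
    scaled = position * (len(genres) - 1)
    # Index of the genre to the left of the slider (clamped at the far end).
    lower = min(int(scaled), len(genres) - 2)
    # How far the slider sits between that genre and the next one.
    frac = scaled - lower
    return {genres[lower]: 1.0 - frac, genres[lower + 1]: frac}
```

Under these assumptions, a slider parked exactly between the second and third stops, `blend_weights(0.375)`, would give equal weight to pop and reggaeton, which is the kind of hybrid described above.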

In addition, before rendering the video, the user must select the style using a knob: corporate ladder, barbie obsession, childish regression, nordic noir, modest polycount, and unexpected future are all unique generative video aesthetics devised by Espada and his collaborators at SGX/1stAveMachine using Runway Gen-2, which allows you to control different parameters through its software interface. These styles have different qualities and characteristics that appear in the final rendered video. Barbie obsession, for example, produces the kind of bright, neon pink, tropical scenery seen in the Titanic remix above.
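To make the three-step workflow concrete, the sketch below models the panel’s settings as plain data in Python. It is an illustration only: the style names, the six storyboard slots and the 30-second length come from the demo, but `PanelState`, `build_job` and the even split of the running time across storyboards are assumptions, and no real Runway API call is shown.

```python
from dataclasses import dataclass

# Style names as given in the demo; everything else here is hypothetical.
STYLES = (
    "corporate ladder", "barbie obsession", "childish regression",
    "nordic noir", "modest polycount", "unexpected future",
)
SLOTS = 6            # six small storyboard screens
VIDEO_SECONDS = 30   # length of the finished video


@dataclass(frozen=True)
class PanelState:
    storyboards: tuple     # storyboard names, topmost screen first
    style: str             # position of the style knob
    music_position: float  # music slider position, 0.0 to 1.0


def build_job(state):
    """Turn the panel settings into an ordered list of clip requests."""
    if len(state.storyboards) != SLOTS:
        raise ValueError(f"expected {SLOTS} storyboards")
    if state.style not in STYLES:
        raise ValueError(f"unknown style: {state.style!r}")
    if not 0.0 <= state.music_position <= 1.0:
        raise ValueError("music_position must be between 0 and 1")
    # Assumption: the 30 seconds are split evenly across the six storyboards.
    seconds_each = VIDEO_SECONDS / SLOTS
    clips = [
        {"storyboard": name, "style": state.style,
         "start": index * seconds_each, "duration": seconds_each}
        for index, name in enumerate(state.storyboards)
    ]
    return {"clips": clips, "music_position": state.music_position}
```

Calling `build_job` with six storyboard names, one of the six styles and a slider position returns six clip requests in screen order, each tagged with the chosen style. A real device would then hand each request to the video model and stitch the results together.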

Espada and team have taken Runway’s software interface and rendered it in physical form, albeit with the constraints of a range of pre-determined styles they made.

But in the future, Espada himself sees the potential to have the user’s custom styles inputted into a hypothetical future 1stAI Machine (2ndAI Machine), perhaps shown on another LCD display.

“You will own your unique style and get to decide who can use it,” Espada told me during the demo, noting that the bootstrapped AI startup Midjourney had just unveiled a unique style generator for still images.

Inside the machine is a Mac mini computer running the Ubuntu Linux operating system, with the software built in Python and openFrameworks. There’s also a router inside, allowing finished video to be ported over wirelessly to a computer.

What’s next for the 1stAI Machine and AI hardware?

Espada said that while the 1stAI Machine was only ever designed to be a standalone prototype, the interest it has generated from Valenzuela and others in the online AI video editing community has suggested to him that there should be a second, more advanced model, one that could run on even lighter and cheaper computing resources, say one or a few Raspberry Pi microcomputers.

A future version might have the ability for the user to upload their own storyboards or source imagery as well.

Espada envisions a future version of the 1stAI Machine being used at music festivals or large events such as conventions, where attendees could come up and “vee-jay (VJ)” by creating their own AI-generated videos through Runway software and projecting them from the device to a larger display, one the size of a jumbotron like those at a Taylor Swift Eras Tour concert.

Ever the creative advertiser, Espada thought this would make a good experience to be sponsored by a large brand, a hypothetical Coca-Cola or PepsiCo or similar.

However, he was adamant that he was not interested in pursuing a stand-alone hardware business.

“Hardware requires years and years to make it a mass consumption device,” Espada told VentureBeat during our hands-on. “I want to stay focused on creating stories using AI and other tools for brands and our clients.”

That said, he was willing to turn the design over to Valenzuela or others at Runway to pursue if they should want it, for fair and reasonable compensation.

Overall, Espada and his collaborators believe that there is value in having dedicated hardware for AI programs in certain contexts, as it focuses the user on the AI production process, freeing them from the myriad other distractions and pings they’d get on a laptop or desktop setup.

And as Espada pointed out to VentureBeat, professional creatives in visual arts, motion graphics, special effects, and music often adopt such dedicated hardware setups — be they mixing boards or other peripherals like electronic drawing pads and styluses — even though their work could theoretically all be completed on a standard PC.

After viewing the 1stAI Machine up close, I can say I solidly agree: this is probably what AI hardware should look like.
