
Google DeepMind unveiled the successor to Genie, its artificial intelligence (AI) model that can generate an unlimited variety of 2D game worlds, on Wednesday. Dubbed Genie 2, the new AI model can generate unique, action-controllable, playable 3D environments from a single image prompt. Calling Genie 2 an AI “world model”, the company said that it can generate environments up to a minute long with consistent objects. The company said these generated worlds can be played by humans or used to train AI agents.
Google DeepMind Unveils Genie 2 AI Model
In a blog post, the company detailed the new AI model and its capabilities. While its predecessor could only generate game worlds for 2D platformer games, Genie 2 can generate 3D worlds complete with consistent models that can be interacted with. This means humans or AI agents can walk, run, swim, climb, and perform other actions in these environments.
Genie 2’s generative capabilities allow it to create routes, buildings, and objects that cannot be seen in the input image. These elements are designed and rendered by the model from scratch. The foundation model is also capable of maintaining consistency in these environments: even when a player moves away from an area and later returns, the environment remains the same.
Apart from this, Genie 2 can generate different perspectives, such as first-person, isometric, or third-person views. Users can also interact with objects in the generated worlds and perform actions such as opening a door, bursting a balloon, or climbing a ladder. The model can also be prompted to generate physics-related effects such as water ripples, smoke, gravity, directional lighting, reflections, and more.
Coming to the technical details, DeepMind explained that Genie 2 is an autoregressive latent diffusion model that has been trained on a large video dataset. The transformer architecture also includes an autoencoder, which enables frame-by-frame generation of these worlds.
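To make the idea of frame-by-frame, action-conditioned generation more concrete, here is a minimal toy sketch in Python. It is not DeepMind's implementation: the functions `encode`, `decode`, `denoise`, and `rollout` are hypothetical stand-ins, the "frames" are short lists of numbers, and the denoiser is a hand-written rule rather than a trained network. It only illustrates the general loop that an autoregressive latent diffusion model is usually described as following: compress the frame into a latent, start each new step from noise, denoise it using past latents and the current action, then decode the result back into a frame.

```python
import random


def encode(frame):
    """Hypothetical encoder: average neighbouring values to get a smaller latent.
    Assumes the frame has an even number of values."""
    return [sum(frame[i:i + 2]) / 2 for i in range(0, len(frame), 2)]


def decode(latent):
    """Hypothetical decoder: repeat each latent value to restore the frame size."""
    return [value for value in latent for _ in range(2)]


def denoise(noisy, context, action, steps=4):
    """Stand-in for a learned denoiser. It repeatedly pulls the noisy latent
    toward a target built from the most recent latent and the action."""
    target = [c + action for c in context[-1]]
    x = noisy
    for _ in range(steps):
        x = [xi + 0.5 * (ti - xi) for xi, ti in zip(x, target)]
    return x


def rollout(first_frame, actions, seed=0):
    """Autoregressive loop: each new latent depends on earlier latents
    and on the action supplied for that step."""
    rng = random.Random(seed)
    latents = [encode(first_frame)]
    frames = [first_frame]
    for action in actions:
        noise = [rng.gauss(0, 1) for _ in latents[-1]]  # start from pure noise
        next_latent = denoise(noise, latents, action)
        latents.append(next_latent)
        frames.append(decode(next_latent))
    return frames


if __name__ == "__main__":
    # One "image prompt" followed by three hypothetical control inputs.
    for frame in rollout([0.0, 0.0, 1.0, 1.0], [0.1, 0.1, -0.1]):
        print(frame)
```

In this sketch the action is a single number; in a real system it would be a keyboard or controller input, and the denoiser would be a large trained network operating on video latents.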
Notably, DeepMind also released an AI model dubbed Scalable Instructable Multiworld Agent, or SIMA, earlier this year, which is essentially capable of agentic AI functions in 3D worlds. The company says Genie 2 can provide unique environments for similar AI agents and train them for various real-life scenarios.
Since the world model can generate unique environments, Google says this can eliminate the risk of data contamination and allow developers to accurately assess an AI agent’s capabilities.