- cross-posted to:
- technology@beehaw.org
twitter comment:
they made something that actually does not run doom, which is a first at least
But the researchers then dive head-first into wild claims:
GameNGen answers one of the important questions on the road towards a new paradigm for game engines, one where games are automatically generated, similarly to how images and videos are generated by neural models in recent years.
To which the obvious reply is: no it doesn’t, where did you get any of this? You’ve generated three seconds of fake gameplay video where your player shoots something and it shoots back. None of the mechanics of the game work. Nothing other than what’s on-screen can be known to the engine.
Yeah, this was apparent immediately.
Diffusion models are just matrices of positional and temporal probabilities. That is absolutely incompatible with even the simplest demands of a game, since any player will reject a game if it lacks reliable and input-deterministic outcomes. The only way to get that reliability is to create a huge amount of training data, and spend exorbitant resources training on it to the point of harshly over-fitting the model to the data, all of which requires that the team first make the game they’re emulating before they start training. It’s nonsense.
If someone is going to use AI to make a game, they would get exponentially higher ROI using AI to generate code that encodes the relationships in the data once, versus inferring the raw data of every individual pixel.
The demo was always effectively a clickbait novelty for likes.
they would get exponentially higher ROI
They’d get infinitely higher ROI by not using genAI in the first place.
Stephanie Sterling of the Jimquisition outlines the thinking involved here. Well, she swears at everyone involved for twenty minutes. So, Steph.
She seems to think the AI generates .WAD files.
I guess they fell victim to one of the classic blunders: never assume that it can’t be that stupid, and someone must be explaining it wrong.
The Doom community has had random level generators for over a decade without any modern AI garbage involved.
This is conceptually different: it just generates a few seconds of Doom-like video that you can slightly influence by sending inputs, and pretends that In The Future™ entire games could be generated from scratch and playable on Sufficiently Advanced™ autocomplete machines.
Skimmed the paper, but i don’t see the part where the game engine was being played. They trained an “agent” to play doom using vizdoom, and trained the diffusion model on the agent’s “trajectories”. But i didn’t see anything about giving the agent the output of the diffusion model for its gameplay, or the diffusion model reacting to input.
It seems like it was able to generate the doom video based on a given trajectory, and assume that trajectory could be real time human input? That’s the best i can come up with. And the experiment was just some people watching video clips, which doesn’t track with the claims at all.
yeah, i think they say a human played it but it’s not clear if an actual human did.
Can “ai” make a good game, or just a thing that generates video and mostly accepts inputs (and it's hardly even doing that)?
Carmack quaking (ahaha) in his boots
The second, and yes, only barely.
I didn’t seek out the video before, I read about all the glaring problems, but one thing that no one pointed out was… why is the entire thing slow motion?
Like, you know Doom? The game where the brisk pace and constant movement are a core part of its DNA? Witness it running at 0.5x speed and like 5FPS in the year of our acausal robot lord 2024.
It’s running slow because it’s running at such a low framerate. The speed and the framerate are tied. Old console games used to work that way, which was a problem because games would run at different speeds in different countries (PAL vs NTSC). This is a solved problem in modern games. Just separate the game logic from the display logic. But this AI can’t do that because there is nothing but the video.
Add to that that the AI was probably trained on high framerate footage but is only capable of generating low framerate footage and you get (gestures wildly) this
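For anyone wondering what "separate the game logic from the display logic" looks like in practice, here is a minimal sketch of a fixed-timestep loop in Python. The function name `simulate`, the `frame_times` input, and the placeholder comments for updating and rendering are all made up for illustration; the 35 Hz default matches Doom's tic rate, but nothing here is taken from the actual Doom source.

```python
def simulate(frame_times, tick_rate=35):
    """Run game logic in fixed ticks, however long each frame took to draw.

    frame_times: seconds spent on each rendered frame (hypothetical input).
    tick_rate:   logic updates per second (assumed default of 35).
    Returns (ticks_run, frames_drawn).
    """
    dt = 1.0 / tick_rate      # length of one logic tick in seconds
    accumulator = 0.0         # real time not yet consumed by logic ticks
    ticks = 0
    frames = 0

    for elapsed in frame_times:
        accumulator += elapsed
        # Catch up on logic: a slow frame runs several ticks,
        # a fast frame may run none at all.
        while accumulator >= dt:
            ticks += 1        # a real game would call its update step here
            accumulator -= dt
        frames += 1           # a real game would draw the current state here

    return ticks, frames


if __name__ == "__main__":
    # Two seconds of play at a slow and a fast display rate.
    print(simulate([0.5] * 4, tick_rate=4))
    print(simulate([0.125] * 16, tick_rate=4))
```

Feed it two seconds of slow frames or two seconds of fast frames and the tick count stays the same; only the number of frames drawn changes. A model that only produces video has no separate tick counter to hold steady, which is why the game speed drops along with the framerate.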