Procedural Artificial Narrative using Gen AI for Turn-Based Video Games
May 18, 2024 5:32 AM

"This research introduces Procedural Artificial Narrative using Generative AI (PANGeA), a structured approach for leveraging large language models (LLMs), guided by a game designer's high-level criteria, to generate narrative content for turn-based role-playing video games (RPGs)."

Full abstract: "This research introduces Procedural Artificial Narrative using Generative AI (PANGeA), a structured approach for leveraging large language models (LLMs), guided by a game designer's high-level criteria, to generate narrative content for turn-based role-playing video games (RPGs). Distinct from prior applications of LLMs used for video game design, PANGeA innovates by not only generating game level data (which includes, but is not limited to, setting, key items, and non-playable characters (NPCs)), but by also fostering dynamic, free-form interactions between the player and the environment that align with the procedural game narrative. The NPCs generated by PANGeA are personality-biased and express traits from the Big 5 Personality Model in their generated responses. PANGeA addresses challenges behind ingesting free-form text input, which can prompt LLM responses beyond the scope of the game narrative. A novel validation system that uses the LLM's intelligence evaluates text input and aligns generated responses with the unfolding narrative. Making these interactions possible, PANGeA is supported by a server that hosts a custom memory system that supplies context for augmenting generated responses thus aligning them with the procedural narrative. For its broad application, the server has a REST interface enabling any game engine to integrate directly with PANGeA, as well as an LLM interface adaptable with local or private LLMs. PANGeA's ability to foster dynamic narrative generation by aligning responses with the procedural narrative is demonstrated through an empirical study and ablation test of two versions of a demo game. These are, a custom, browser-based GPT and a Unity demo. As the results show, PANGeA holds potential to assist game designers in using LLMs to generate narrative-consistent content even when provided varied and unpredictable, free-form text input."

Buongiorno, S., Klinkert, L. J., Chawla, T., Zhuang, Z., & Clark, C. (2024). PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games. arXiv preprint arXiv:2404.19721.
posted by cupcakeninja (25 comments total) 10 users marked this as a favorite
 
I have not been keeping track of research along these lines, but I'm unsurprised to see this. Seems like every RPG space I frequent has regular or semi-regular discussions about "What if a DM, but it's an AI?" or similar.
posted by cupcakeninja at 5:34 AM on May 18 [2 favorites]


Yeah, board games are more fun.
posted by HearHere at 6:30 AM on May 18 [3 favorites]


When I've tried to use ChatGPT for solo adventuring - I've tried both Traveller and 5E D&D - it pulled in the needed rules from the game systems it had been trained on, creating the illusion of running a solo game and tracking stats. The D&D game included not just my own PC, but also a sidekick NPC (with long conversations and background stories between the two of us) and later someone we rescued who joined the party.
But while it was nominally set in the Forgotten Realms - inside and near Waterdeep, to be precise - it slowly drifted from the original narrative, losing little pieces of my character stats, past interactions, and such - and was very willing to pull in elements that were not necessarily consistent with the story or world. I eventually dropped the endeavor.

It sounds like PANGeA is trying to fix some of those issues.

That said, the ethical implications of generative AI use are very problematic, so regardless, I don't know that I want them to work it out - and I usually hesitate to engage with ChatGPT beyond a few test runs.
posted by Flight Hardware, do not touch at 6:59 AM on May 18 [7 favorites]


There is definitely a space for "GM emulators", but I don't think LLMs have much of a role to play. The nature of the model is to remix the most common in-genre elements. That is not a lot of help for building an interesting scenario.

I don't think you can cut human interpretation out of the loop and get anything interesting. But there are plenty of systems that provide prompts or outcomes that are easy to interpret in the context of the current game. Things like inspiration decks, or subgame systems like the Mythic GM Emulator.
posted by The Manwich Horror at 7:03 AM on May 18 [4 favorites]


I have a couple RPG buddies who've tried ChatGPT as GM for kicks. They've said parts of it work pretty well--the text descriptions are actually OK.

One problem is the limited memory chatbots have: They said it would seem to be working sometimes, but more often if they re-trod any ground, nothing previously established would have persisted. The other, funnier one was that it was very suggestible, which (being gamers) they started exploiting for loot. "Is the sword magic?" or "Is there a treasure chest under the bed?" would both tend to get "yes" answers.
posted by mark k at 8:16 AM on May 18


I also think there’s a difference between a full GM and filling in scenes. One of the weird things in cRPGs is that they’re so empty: you’re walking through a huge city like Baldur’s Gate and only a handful of people have any unique dialog. Obviously you don’t just want to spam LLM emissions, but it seems like you could make something interesting by replacing crowd noises with NPCs actually talking to each other, and especially by incorporating specific game events. Anything with hand-written dialog tends to be limited to major events, because nobody wants to write, translate, and voice every possible permutation of choices. But it’d be neat if more people reacted to noticeable things: even if it’s just a minor side quest, it’d be cool if, walking out, you overheard the sailors by the dock talking about the noise and light they saw and wondering who brought the combat magic, or people in the market gossiping about someone bringing in three dungeons’ worth of old loot, etc., and it’s all different rather than one of a couple of lines.
posted by adamsc at 8:18 AM on May 18 [1 favorite]


Before I dive into this, can someone warn me if this requires chatgpt to work? I don't see it in the abstract, but the way people are reacting makes me wonder if it does. If the whole model doesn't execute on the user's computer without requiring OpenAI as a remote host, it's not interesting technology. Can someone confirm?
posted by I-Write-Essays at 9:06 AM on May 18 [1 favorite]


Yes, you can use local LLMs as well as OpenAI via the same REST interface AFAIK.
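As a rough illustration of what "any game engine can talk to it over REST" means in practice: many local LLM servers speak an OpenAI-compatible chat API, so the game-side code is just a JSON POST. The endpoint path, payload shape, and names below are my own guesses at a generic OpenAI-compatible local server, not PANGeA's actual API.

```python
import json

def build_chat_request(base_url: str, npc_name: str, player_text: str):
    """Build the URL and JSON body for an OpenAI-compatible /chat/completions call.

    Works the same whether base_url points at OpenAI or at a local server
    (llama.cpp, Ollama, etc.) -- that's the whole point of the shared interface.
    """
    url = f"{base_url.rstrip('/')}/v1/chat/completions"
    body = {
        "model": "local-model",  # many local servers ignore or alias this
        "messages": [
            {"role": "system",
             "content": f"You are {npc_name}, an NPC. Stay in character."},
            {"role": "user", "content": player_text},
        ],
    }
    return url, json.dumps(body).encode("utf-8")

url, payload = build_chat_request("http://localhost:8080", "Innkeeper", "Any rumors?")
# To actually send it:
# urllib.request.urlopen(urllib.request.Request(
#     url, payload, {"Content-Type": "application/json"}))
```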

I tried hacking on something like this once, and I had a hard time even getting a simple PbtA-style game working. Besides the "forgetting/ignoring what's going on in the world" problem, it had a pacing problem. It would tend to make giant leaps in the story, like "you run up the mountain, kill the dragon, and win the game".

It'd be interesting to try with a fine-tuned LLM, but there's still an impedance mismatch between whatever algorithmic thing that imposes constraints on the world and the unconstrained world of the LLM.
posted by credulous at 9:24 AM on May 18 [2 favorites]


> It'd be interesting to try with a fine-tuned LLM, but there's still an impedance mismatch between whatever algorithmic thing that imposes constraints on the world and the unconstrained world of the LLM.

So, it's like... LLMs have yet to develop a prefrontal cortex? The unconstrained world of the LLM is all dream, and the thing that imposes constraints on the world is the prefrontal cortex.
posted by I-Write-Essays at 10:58 AM on May 18 [1 favorite]


I think I finally understood cortex's username.
posted by I-Write-Essays at 11:02 AM on May 18


It's an LLM. It doesn't analytically understand any of the text it's giving you; it is built to predict what text would look like based on the prompt you gave it. Of course it drifts and forgets what you did and what your stats are and what genre you're in- it never knew any of those things to begin with.
posted by Pope Guilty at 12:31 PM on May 18 [5 favorites]


Right, I think people keep forgetting that most RPGs require a lot of writing things down and referring to character stats or tables or scenario docs or whatever. They're an interactive exploration of a compelling story laid atop a complex system that dictates how the world works. LLMs can give you the "interactive" part, for some values of "interaction", but that's it. They're not good -- will never be good -- at compelling story, and they're absolute shit at maintaining the details of a complex system. Because they're not designed for that, at all.

The last time I explored this sort of thing with ChatGPT, late last year, I tried to get it to model a game of poker. It got as far as dealing five cards and then forgetting them immediately. It was unable to maintain the state of a deck of cards and a couple of hands for more than a turn or two without losing track. I tried it again just now and it looks like OpenAI has given this particular task some attention, it's better at keeping track of a couple of hands from turn to turn. But it didn't model the state of the entire deck until I explicitly asked it to (because as far as it was concerned up to that point, our poker hands were entirely conversational and not part of a more complex system). It appeared to manage discards and draws appropriately, and even managed a switch from five-card draw to Texas hold'em, but at the showdown it confidently told me that its Queen-high hand beat my King-high hand and proceeded to rake in the virtual pot.

If I can't trust it to play poker without cheating or hallucinating, I certainly don't want to trust it to GM any game I'm a part of.
posted by Two unicycles and some duct tape at 1:59 PM on May 18 [1 favorite]


(To be fair, TFA does mention these folks are implementing a "custom memory system that supplies context for augmenting generated responses". Presumably that includes a way for it to store and recall variables to keep track of the things an LLM would otherwise happily ignore or make up on the fly, like the state of the game world or a character's stats or inventory. If their project gains any traction, it'll be the integrated memory system that makes it possible, i.e. the bit that allows the LLM to behave like something other than an LLM.)
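(A purely illustrative sketch of that kind of memory layer, with names I made up rather than anything from the paper: game state lives in ordinary data structures outside the model, and a snapshot gets prepended to every prompt, so the LLM only "remembers" what the game re-tells it each turn.)

```python
from dataclasses import dataclass, field

@dataclass
class GameState:
    """External store for the state an LLM would otherwise forget or invent."""
    hp: int = 20
    inventory: list = field(default_factory=list)
    facts: list = field(default_factory=list)  # established narrative events

    def remember(self, fact: str):
        self.facts.append(fact)

    def as_context(self) -> str:
        """Serialize the state into text the LLM can condition on each turn."""
        return (f"HP: {self.hp}. "
                f"Inventory: {', '.join(self.inventory) or 'empty'}. "
                f"Established facts: {'; '.join(self.facts) or 'none'}")

state = GameState()
state.inventory.append("rusty sword")
state.remember("The innkeeper mentioned a cave in the woods")
prompt = state.as_context() + "\nPlayer: I search under the bed."
```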
posted by Two unicycles and some duct tape at 2:23 PM on May 18 [1 favorite]


The linked paper says:
The custom GPT version of Dark Shadows can be accessed here: https://chat.openai.com/g/g-RhmfY1KJR-dark-shadows-gpt. The Unity version can be downloaded from GitLab.

And this other paper by some of the same authors says:
A Dark Shadows Unity demo, code, prompts, knowledge graphs, and examples can be found on our GitHub repository.

But I can't find any link to this Unity demo repo on either GitLab or GitHub.
Has anyone else found the code for this?
posted by Nossidge at 3:22 PM on May 18


> (To be fair, TFA does mention these folks are implementing a "custom memory system that supplies context for augmenting generated responses". Presumably that includes a way for it to store and recall variables to keep track of the things an LLM would otherwise happily ignore or make up on the fly, like the state of the game world or a character's stats or inventory. If their project gains any traction, it'll be the integrated memory system that makes it possible, i.e. the bit that allows the LLM to behave like something other than an LLM.)

Again, the LLM doesn't know anything and can't do things that aren't predicting what the response to a given prompt would probably look like, so this would involve coming up with something that can somehow do its own narrative thing and integrate LLM output into it- at which point the LLM is probably the least interesting part of the technology, since that's what LLM marketers are fervently pretending LLMs do.
posted by Pope Guilty at 4:01 PM on May 18 [1 favorite]


I mean, to be clear, it doesn't matter if you store a variable for how many HP you have because the LLM doesn't, and can't, know what HP are. It can't call a variable from an external system to find out how much HP you have because there's nothing there to understand what the number the variable is storing means. "This other system is storing all the stuff my character did in the past!" okay great, fine, but none of that means anything to the LLM except as prompts to determine what its next output should be.
posted by Pope Guilty at 4:08 PM on May 18 [2 favorites]


My son, who is home now from his second year at college studying computer science and game design, just shook his head when I showed him this. People who can write good narratives aren’t the ones who will be using this.
posted by pmbuko at 4:42 PM on May 18 [3 favorites]


This is neat. I've done experiments along the same lines with okay results (Big 5 for personas, long/short term memory that changes over time, content generation based on input parameters with a bounds validator on the output, etc.). An LLM is not a game engine, but it can be integrated into one.
posted by ryoshu at 4:42 PM on May 18


The entire purpose of a game NPC is to provide information to the player. By preference, that information has to be relevant to the player (because despite all this AI-NPC stuff always being prefaced with "wouldn't it be great if you could have a whole conversation with any random NPC", virtually nobody actually plays games that way, and in fact most players get annoyed real quick if they interact with several characters and find themselves hammering through irrelevant dialogue).

But, more importantly, that information has to be accurate. If an NPC mentions a cave in the woods, there has to be a cave in the woods, because the player is going to go looking for it. If an NPC mentions anything interesting at all -- a ghost, a cave, pirates, treasure, any kind of interesting fantasy-game thing -- the player is going to read that as a quest hook. And as Pope Guilty says above, an LLM does not and cannot know the truth-value of the things it says.
posted by rifflesby at 7:26 PM on May 18 [4 favorites]


It's a valid design decision to have NPCs lie, or not know what they're talking about. Dragon Quest XI has a preference setting that lets NPCs do just that. And it makes sense that the people in the villages might not even have a clue what's happening in the game world. Why would they know the evil wizard's evil plans for evil?

But that's a design choice. That's different from what "AIs" do. What an AI says has a null truth value: it just bullshits. NPCs provide useful information because they're a conduit between the DM and the players. Remove the DM and replace it with an automatic talking machine, and what is even being communicated? It's just going to meander. Eventually what it previously said will scroll out of the AI's context buffer, and it'll make up something else. No rising action, no denouement, no meaningful conclusion or resolution, just endless tepid crap.
posted by JHarris at 7:58 PM on May 18 [3 favorites]


For those with an interest, app versions exist for Mythic GME 2e, both iOS and Android. I like the paper version, but I also have the app. I don't much use the journaling features of it, but the prompts and random tables are enough to generate a narrative. I also like the One Page Solo Engine, which also has app versions. I use none of them as often as I might like because solo RP has such a strong overlap with creative writing (and some argue that it is creative writing with randomized elements) that it can detract from my actual writing, both in terms of amount of time I have in the day and in terms of using up my "creativity quotient" for the day. They're lovely for a long weekend, though, or futzing around when traveling, on the bus, etc.
posted by cupcakeninja at 6:05 AM on May 19


I feel bad for having very little to add to this one since unlike most AI threads this touches on what I do professionally, but it’s all been covered upthread. Game systems have a very restricted grammar and writing natural-sounding language without invoking unsupported verbs or systems remains tricky for human designers with decades of experience.

Eventually what it previously said will scroll out of the AI's context buffer, and it'll make up something else. No rising action, no denouement, no meaningful conclusion or resolution, just endless tepid crap.

Worth noting Google’s Gemini 1.5 Pro has a context window of up to 1 million tokens (10-15 novels) and they recently released a paper on Infinite Context Transformers. With a lot of work it may be possible to address the character arc/narrative arc/Chekhov’s gun memory issues, but the interplay between competing abstract models of human language <-> game system <-> player intentionality <-> technical implementation remains a worst-case scenario for LLMs.
posted by Ryvar at 7:25 AM on May 19 [2 favorites]


I read LLM & only hear stochastic parrots.
posted by HearHere at 8:33 AM on May 19 [1 favorite]


Anyone remember the D&D satanism moral panic? We'll see how this one goes...

Tourist makes accidental bomb threat when app garbles translation
posted by jeffburdges at 2:14 AM on May 21 [1 favorite]





