Definitely! However, my intuition is that correctly interpreting the rules pulled into context will require some basic understanding of the game system, which pretraining should help with. Ultimately, after instruction-tuning this base model and training it for tool use (so it has a search tool), I'll compare it against https://huggingface.co/Qwen/Qwen3-0.6B, which has had no domain-specific pretraining, and see how each performs at rule adjudication. I expect the shadowdark-trained model will have a better understanding of the rules, but there's only one way to find out.

palmfacehn · 15h ago

It is an interesting problem to solve. When reading, I noticed the model's ambiguity around terms like 4d6. At first I thought you might try editing your markup to describe the concept of dice more thoroughly. Ultimately I wonder if you might try having the model fill in data to be used by a hard-coded combat system. Are you going to rely on the LLM for pseudorandom numbers? Concepts like turns and dice rolls could be abstractly defined in code and instantiated by the model.
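A minimal sketch of what "dice defined in code" could look like, so the model only has to emit a notation string such as `4d6` and never produces the random numbers itself. The names `parse_dice` and `roll` are hypothetical, and only the plain `NdM` form is handled; modifiers like `+2` are left out.

```python
import random
import re

# Matches plain "NdM" notation only, e.g. "4d6". Modifiers are not supported.
DICE_RE = re.compile(r"^(\d+)d(\d+)$")


def parse_dice(notation):
    """Parse a string like '4d6' into (count, sides)."""
    match = DICE_RE.match(notation.strip().lower())
    if match is None:
        raise ValueError(f"not dice notation: {notation!r}")
    count, sides = int(match.group(1)), int(match.group(2))
    if count < 1 or sides < 2:
        raise ValueError(f"invalid dice: {notation!r}")
    return count, sides


def roll(notation, rng=None):
    """Roll the dice in code and return the individual results."""
    if rng is None:
        rng = random.Random()
    count, sides = parse_dice(notation)
    return [rng.randint(1, sides) for _ in range(count)]


if __name__ == "__main__":
    results = roll("4d6")
    print(results, "total:", sum(results))
```

Passing a seeded `random.Random` as `rng` makes rolls reproducible, which helps when replaying a combat log.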

The model might excel at creating character sheets, after you define a schema. From there you can validate the generated sheets against known lore. You could combine the story telling from the LLM with the formalized character schema to create campaigns. I'm not an expert here, but I suspect you might try asking the model to translate an existing fantasy story dataset into a series of narration/dialogue blocks and character sheets.
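As a rough illustration of the schema idea, here is one way generated sheets could be checked before they are used. Everything in it is an assumption for the example: the field names, the six stat abbreviations, and the 3 to 18 range are placeholders, not taken from the Shadowdark rules or from the original post.

```python
import json
from dataclasses import dataclass

# Hypothetical schema: stat names and the 3..18 range are illustrative only.
STAT_NAMES = ("STR", "DEX", "CON", "INT", "WIS", "CHA")


@dataclass
class CharacterSheet:
    name: str
    level: int
    stats: dict


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_sheet(raw):
    """Return (sheet, errors); sheet is None when any check fails."""
    errors = []
    name = raw.get("name")
    if not isinstance(name, str) or not name.strip():
        errors.append("name must be a non-empty string")
    level = raw.get("level")
    if not _is_int(level) or level < 1:
        errors.append("level must be a positive integer")
    stats = raw.get("stats")
    if not isinstance(stats, dict):
        errors.append("stats must be an object")
    else:
        for stat in STAT_NAMES:
            value = stats.get(stat)
            if not _is_int(value) or not 3 <= value <= 18:
                errors.append(f"{stat} must be an integer from 3 to 18")
        for extra in sorted(set(stats) - set(STAT_NAMES)):
            errors.append(f"unknown stat: {extra}")
    if errors:
        return None, errors
    return CharacterSheet(name=name, level=level, stats=dict(stats)), []


if __name__ == "__main__":
    generated = '{"name": "Mira", "level": 1, "stats": {"STR": 9, "DEX": 14, "CON": 11, "INT": 12, "WIS": 10, "CHA": 21}}'
    sheet, problems = validate_sheet(json.loads(generated))
    print(sheet, problems)
```

Returning a list of errors rather than raising on the first one makes it easier to feed all the problems back to the model in a single retry prompt.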

Without training, I've experimented with similar approaches for item generation using EBNF.

pact_inference · 15h ago

> Are you going to rely on the LLM for pseudorandom numbers?

Definitely! I'm going to start with instruction tuning it for basic question answering, and then add tools to allow it to search the markdown source to cite answers to rules questions. I think adding some dice tooling for proper character sheet creation would be an awesome task to test as well. I'm actually thinking a lot about what tasks I could try that are "trivially" programmatically verifiable in their correctness for stuff like GRPO, so I'm definitely going to use that idea.
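To make "trivially programmatically verifiable" concrete, here is a sketch of one possible task: the prompt lists some dice results and asks for the total, optionally dropping the lowest dice. The reward function below is a hypothetical example, not the setup from the post, and it does not show how it would be wired into a GRPO trainer. It assumes the model's answer is the last integer in the completion.

```python
import re


def expected_total(rolls, drop_lowest=0):
    """Sum the rolls after discarding the `drop_lowest` smallest values."""
    kept = sorted(rolls)[drop_lowest:]
    return sum(kept)


def reward(completion, rolls, drop_lowest=0):
    """Return 1.0 if the last integer in the completion is the right total."""
    numbers = re.findall(r"-?\d+", completion)
    if not numbers:
        # No numeric answer at all counts as incorrect.
        return 0.0
    answer = int(numbers[-1])
    return 1.0 if answer == expected_total(rolls, drop_lowest) else 0.0


if __name__ == "__main__":
    rolls = [6, 4, 3, 1]
    print(reward("Dropping the 1, the total is 13", rolls, drop_lowest=1))
    print(reward("The total is 14", rolls, drop_lowest=1))
```

Taking the last integer is a crude answer-extraction rule; a stricter output format (for example a tagged final answer) would make the check harder to game.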

> You could combine the story telling from the LLM with the formalized character schema to create campaigns. I'm not an expert here, but I suspect you might try asking the model to translate an existing fantasy story dataset into a series of narration/dialogue blocks and character sheets.

I think I'll probably be able to work on that sort of thing late this year. There's a really interesting approach to story generation here: https://arxiv.org/abs/2503.22828. Figuring out how to translate it into campaign-relevant structured objects, and how to "reward" that, will take some experimentation.

jasonjmcghee · 15h ago

> I used the AdamW optimizer and selected a learning rate of 5e-5. I’ve seen learning rates of 5e-6 for pretraining and 5e-5 for finetuning. I would consider this closer to the latter - I don’t want to totally destroy the knowledge Qwen already had, I just want to add to it a bit.

Is this a typo? Maybe 5e-4 for pretraining?

Otherwise this goes against all the intuition I have around learning rates and catastrophic forgetting (a smaller learning rate causing knowledge degradation).

pact_inference · 15h ago

whoops, definitely a typo! It should be 5e-4 for the base "pretraining" LR, you're absolutely correct.

your intuition is sound, but my fingers are not.