Don't rely on AI coding.
Mar. 11th, 2026 05:35 pm
My audience probably isn't much into having AI help code software, but I ran an experiment just out of curiosity. Basically, I asked 3 different AIs (ChatGPT, Claude, and Llama) to code a Stardew Valley Content Patcher mod. I am very familiar with coding these things and can easily spot errors. None of them got the coding right.
The mod request was simple: create the code for a machine that would output 1 random item per day, acting similar to the Worm Bin and/or the Statue of Endless Fortune. (These actually work slightly differently in their coding despite essentially doing the same output-at-start-of-day thing.) Most of the time, the player should get common farm items - stone, wood, fiber. But upon occasion, the player should be able to get either the Dish of the Day, a random seasonal vanilla item, or a random object in one of a couple of categories.
(It was fun to play with, but I initially put the percentages up too high for the random items and so I was getting stuff that was way out of balance with what I really wanted. Not that I really needed this; I just like random generators!)
All of them were able to write a good manifest.json, which is kind of the introductory file that all Stardew Valley mods use.
ChatGPT 5.2: Mostly decent code, including finding DISH_OF_THE_DAY and RANDOM_BASE_SEASON_ITEM (the saloon's Dish of the Day and the pool for when you break mine barrels/fish in garbage cans, respectively), but it refused to recognize that you can put percentages in concisely; it insisted on multiple entries for wood/stone/fiber to weight them higher.
Claude Sonnet 4.5: Pulled a similar odd way of weighting choices, though when I gave it an example from one of my other mods, it did fix the code. However, it failed to recognize RANDOM_BASE_SEASON_ITEM when I gave it the example of "trash/barrels"; it created a list of the trash items, which works but isn't really what I wanted. It did write a nice recipe entry to make the machine though. (Mine I just set to buy from Robin.)
Llama 4 Maverick: I have no idea what in the heck this was doing. It got the structure right, but not much else.
ETA: I forgot Gemini!
Google Gemini 3: Gemini, like Claude, failed to find RANDOM_BASE_SEASON_ITEM; it recognized the idea of random chances and mostly got the idea correct but had the code for them wrong (see below on how to write that in Content Patcher). Even when I linked it to the wiki page that included that, it insisted I was wrong.
Here's the meat of the mod:
"OutputItem": [
  {
    "CustomData": null,
    "ItemId": "(O)388",
    "MaxItems": null,
    "MinStack": 1,
    "MaxStack": 3,
    "Quality": 0,
    "ModData": null
  },
The section above is my default. Apparently I decided that if nothing else applied, I would get 1-3 wood. (O)388 is the Item ID for wood.
  {
    "CustomData": null,
    "ItemId": "DISH_OF_THE_DAY",
    "MaxItems": null,
    "MinStack": 1,
    "MaxStack": 1,
    "Quality": 0,
    "ModData": null,
    "Condition": "RANDOM 0.1"
  },
This section tells the game that there's a 10% chance that the machine will give me the Dish of the Day instead.
  {
    "CustomData": null,
    "ItemId": "RANDOM_BASE_SEASON_ITEM",
    "MaxItems": null,
    "MinStack": 1,
    "MaxStack": 1,
    "Quality": 0,
    "ModData": null,
    "Condition": "RANDOM 0.1"
  },
This section tells the game that there's a 10% chance that the machine will give me a random item from the list of base season items (that is, the ones you get from breaking barrels and digging through trash cans).
  {
    "CustomData": null,
    "ItemId": "RANDOM_ITEMS (O)",
    "PerItemCondition": "ITEM_CATEGORY Target -2 -5 -6 -12 -15 -16 -20 -21",
    "MaxItems": null,
    "MinStack": 1,
    "MaxStack": 1,
    "Quality": 0,
    "ModData": null,
    "Condition": "RANDOM 0.1"
  }
]
This is another 10% chance. This one tells the game to choose an item out of one of 8 categories: gems, eggs, milk, minerals, metal resources, building resources, junk, or bait.
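As a small aid for keeping the category numbers straight, here is a Python sketch that pairs each ID from the entry above with the name given in the description and rebuilds the same condition string. The pairing assumes the names are listed in the same order as the IDs; the helper name per_item_condition is made up for illustration and is not part of Content Patcher or the game.

```python
# Category IDs from the PerItemCondition above, paired with the names from the
# post. ASSUMPTION: the names are listed in the same order as the IDs.
CATEGORIES = {
    -2: "gems",
    -5: "eggs",
    -6: "milk",
    -12: "minerals",
    -15: "metal resources",
    -16: "building resources",
    -20: "junk",
    -21: "bait",
}


def per_item_condition(category_ids):
    """Build an ITEM_CATEGORY query string for the given category IDs.

    Hypothetical helper for illustration only.
    """
    return "ITEM_CATEGORY Target " + " ".join(str(c) for c in category_ids)


# Iterating a dict yields its keys in insertion order.
print(per_item_condition(CATEGORIES))
# prints: ITEM_CATEGORY Target -2 -5 -6 -12 -15 -16 -20 -21
```

Dropping a key from the dictionary and re-running is a quick way to regenerate the string without miscounting minus signs.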
Assuming I've coded this correctly (I'm still testing it and wouldn't put it in a serious game), the game would roll an item for each entry that is random (that is, everything except the Dish of the Day and the wood), then check whether each of the conditional entries rolled under 10%. It then chooses one of the entries that passed at random and spawns its item. The order might be slightly different (the game might check whether the percentage applies first, then roll the random item). I'm not sure, but it doesn't change the result.
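One consequence is worth checking with arithmetic. If the paragraph above is read as "every entry whose Condition passes is eligible, and the game picks one eligible entry uniformly at random", then the wood entry is always in the pool and competes with whatever else passed. This Python sketch computes the resulting odds exactly. The selection rule is an assumption taken from the description above, not something verified against the game's code, and the names ENTRIES and effective_odds are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Each entry: (label, probability that its Condition passes).
# The wood entry has no Condition, so it is always eligible.
ENTRIES = [
    ("wood", Fraction(1)),
    ("DISH_OF_THE_DAY", Fraction(1, 10)),
    ("RANDOM_BASE_SEASON_ITEM", Fraction(1, 10)),
    ("RANDOM_ITEMS (O)", Fraction(1, 10)),
]


def effective_odds(entries):
    """Exact chance that each entry is the one produced.

    ASSUMPTION: the game picks uniformly among all entries whose
    Condition passed on that day.
    """
    odds = {label: Fraction(0) for label, _ in entries}
    # Enumerate every pass/fail combination of the conditions.
    for outcome in product([True, False], repeat=len(entries)):
        weight = Fraction(1)
        for (_, p), passed in zip(entries, outcome):
            weight *= p if passed else 1 - p
        eligible = [label for (label, _), ok in zip(entries, outcome) if ok]
        if weight == 0 or not eligible:
            continue
        # Split this combination's probability evenly among eligible entries.
        for label in eligible:
            odds[label] += weight / len(eligible)
    return odds


for label, chance in effective_odds(ENTRIES).items():
    print(f"{label}: {float(chance):.1%}")
# wood: 86.0%
# DISH_OF_THE_DAY: 4.7%
# RANDOM_BASE_SEASON_ITEM: 4.7%
# RANDOM_ITEMS (O): 4.7%
```

Under that assumption, "RANDOM 0.1" is a 10% chance of being eligible rather than a 10% chance of being the output, so each special entry lands a bit under 5% of the time. If the game instead takes the first entry that passes, the numbers would differ.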
no subject
Date: 2026-03-12 12:14 pm (UTC)
Not sure that makes any sense, but I hope it does :D
no subject
Date: 2026-03-12 11:38 pm (UTC)
no subject
Date: 2026-03-12 03:12 pm (UTC)
I work on related stuff... may I share an opinion?
FWIW, the paid versions are more effective than the free versions right now, and I think this may be an important thing to be aware of. I expect that advances in the paid tiers will eventually filter down into the free tiers.
A key is to run everything (whether coding or report analysis or content generation) through multiple times (as you tried to do!!!) and/or get it to run tests/comparisons so it finds its own errors and corrects itself. Feeding examples of what one wants helps a lot ("few-shot prompting" v. "zero-shot prompting"). I have access at my workplace to paid tiers of Gemini and they can give much better output for the same guidance, and much much better for better guidance. I'm not a SWE, myself, but the engineers as a group seem to have widely varied feelings about integrating the tools into their workflows. They don't seem, overall, worried about being replaced by the tools, as such, so much as about being pressured to let the tools do the fun stuff for them instead of the tedious stuff. Management doesn't always understand that humans really must do the fun (thoughtful, puzzling, innovative) stuff: it gives better results, and not every human wants to be a Program Manager in her heart of hearts. ;-)
no subject
Date: 2026-03-12 11:55 pm (UTC)
When I did the first query, I acted as if I'd never done a smidgen of coding for this particular type of mod in my life. I wanted to put myself in the shoes of a new modder using Content Patcher to see how helpful it was by default. I think most of them actually put out code that might work, but maybe not what I wanted. (Llama was an exception in the worst way.) After I wrote the post, I nudged Claude with the RANDOM_BASE_SEASON_ITEM bit and, while it and I write code differently, I could almost drop its code into a JSON file and have a working mod. Still faster to write/modify my own code, though, instead of asking an AI to do it.
To be honest, I'm not really interested in vibe coding myself. It's been coming up a lot on some of the subreddits I visit (selfhosted being the hotspot), so I wanted to see how well AIs did with a very uncommon bit of coding and how much work it would involve compared to coding on my own. This was a rather short mod in the scheme of things but complex in its own way, so it was very interesting, but I don't think I'll do it again.
no subject
Date: 2026-03-13 02:38 pm (UTC)
I have to learn (and prove I'm using) vibe coding by mandate of my work leadership.
FWIW, you might want to dip your toe in again on a very small scale in 6 or 9 months? I understand that the models are getting exponentially better at the high end, though it takes a while to filter down. ~shrug~