estirose

My audience probably isn't much into having AI help code software, but I ran an experiment just out of curiosity. Basically, I asked 3 different AIs (ChatGPT, Claude, and Llama) to code a Stardew Valley Content Patcher mod. I am very familiar with coding these things and can easily spot errors. None of them got the coding right.

The mod request was simple: create the code for a machine that would output 1 random item per day, acting similarly to the Worm Bin and/or the Statue of Endless Fortune. (These actually work slightly differently in their coding despite essentially doing the same output-at-start-of-day thing.) Most of the time, the player should get common farm items - stone, wood, fiber. But on occasion, the player should be able to get either the Dish of the Day, a random seasonal vanilla item, or a random object in one of a couple of categories.

(It was fun to play with, but I initially put the percentages up too high for the random items and so I was getting stuff that was way out of balance with what I really wanted. Not that I really needed this; I just like random generators!)

All of them were able to write a good manifest.json, which is kind of the introductory file that all Stardew Valley mods use.

ChatGPT 5.2: Mostly decent code, including finding DISH_OF_THE_DAY and RANDOM_BASE_SEASON_ITEM (the saloon's Dish of the Day and the pool for when you break mine barrels/fish in garbage cans, respectively), but it refused to recognize that you could concisely put percentages in; it insisted on multiple entries for wood/stone/fiber to weight them higher.

Claude Sonnet 4.5: Pulled a similar odd way of weighting choices, though when I gave it an example from one of my other mods, it did fix the code. However, it failed to recognize RANDOM_BASE_SEASON_ITEM when I gave it the example of "trash/barrels"; it created a list of the trash items, which works but isn't really what I wanted. It did write a nice recipe entry to make the machine, though. (Mine I just set to buy from Robin.)

Llama 4 Maverick: I have no idea what in the heck this was doing. It got the structure right, but not much else.

ETA: I forgot Gemini!
Google Gemini 3: Gemini, like Claude, failed to find RANDOM_BASE_SEASON_ITEM. It recognized the idea of random chances and mostly got the concept right, but had the code for them wrong (see below on how to write that in Content Patcher). Even when I linked it to the wiki page that covered this, it insisted I was wrong.
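The duplicate-entry style of weighting mentioned above can be sketched in a few lines of Python. This is only an illustration, not anything Content Patcher or the game runs: the list `DUPLICATED_ENTRIES` and the helper `implied_odds` are made-up names, and the sketch assumes one entry is picked uniformly at random from the list.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical output list in the duplicate-entry style: the same item is
# listed several times so that a uniform pick lands on it more often.
DUPLICATED_ENTRIES = ["wood", "wood", "wood", "stone", "stone", "fiber"]


def implied_odds(entries):
    """Odds of each item if one entry is picked uniformly at random (assumed)."""
    counts = Counter(entries)
    return {item: Fraction(n, len(entries)) for item, n in counts.items()}


for item, odds in implied_odds(DUPLICATED_ENTRIES).items():
    print(f"{item}: {odds} ({float(odds):.1%})")
```

Six entries are needed here just to express 1/2, 1/3, and 1/6. Under the same assumption, a 10% chance would need a list of at least ten entries, which is why a single percentage per entry is the more concise way to write it.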

Here's the meat of the mod:

"OutputItem": [
{
"CustomData": null,
"ItemId": "(O)388",
"MaxItems": null,
"MinStack": 1,
"MaxStack": 3,
"Quality": 0,
"ModData": null
},
The section above is my default. Apparently I decided that if nothing else applied, I would get 1-3 wood. (O)388 is the Item ID for wood.
{
"CustomData": null,
"ItemId": "DISH_OF_THE_DAY",
"MaxItems": null,
"MinStack": 1,
"MaxStack": 1,
"Quality": 0,
"ModData": null,
"Condition": "RANDOM 0.1"
},
This section tells the game that there's a 10% chance that the machine will give me the Dish of the Day instead.
{
"CustomData": null,
"ItemId": "RANDOM_BASE_SEASON_ITEM",
"MaxItems": null,
"MinStack": 1,
"MaxStack": 1,
"Quality": 0,
"ModData": null,
"Condition": "RANDOM 0.1"
},
This section tells the game that there's a 10% chance that the machine will give me a random item from the list of base season items (that is, the ones you get from breaking barrels and digging through trash cans).
{
"CustomData": null,
"ItemId": "RANDOM_ITEMS (O)",
"PerItemCondition": "ITEM_CATEGORY Target -2 -5 -6 -12 -15 -16 -20 -21",
"MaxItems": null,
"MinStack": 1,
"MaxStack": 1,
"Quality": 0,
"ModData": null,
"Condition": "RANDOM 0.1",
}
This is another 10% chance. This one tells the game to choose an item out of one of 8 categories: gems, eggs, milk, minerals, metal resources, building resources, junk, or bait.

Assuming I've coded this correctly - still testing it, wouldn't put it in a serious game - the game would pick an item for each entry where the item is random (that is, not the Dish of the Day or the wood), then check whether any of the entries rolled under 10%. It then chooses one of the matching entries at random and spawns its item. The order might be slightly different (the game might check whether the percentage applies first, then roll the random item). Not sure, doesn't change things.
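Since the selection step is the uncertain part, here is a rough Python sketch of what the odds would be under two guessed rules. It is not the game's code: the names `ENTRIES`, `pick_uniformly_among_matches`, and `pick_first_match` are made up for illustration, and both rules are assumptions about how matching entries might be resolved. The entry without a Condition (the wood) is treated as always matching.

```python
from fractions import Fraction
from itertools import product

# Hypothetical model of the four entries: (name, chance its Condition passes).
# The wood entry has no Condition, so it is assumed to always match.
ENTRIES = [
    ("wood", Fraction(1)),
    ("dish_of_the_day", Fraction(1, 10)),
    ("random_base_season_item", Fraction(1, 10)),
    ("random_category_item", Fraction(1, 10)),
]


def pick_uniformly_among_matches(entries):
    """Assumed rule A: every entry whose condition passes is a candidate,
    and one candidate is chosen uniformly at random."""
    odds = {name: Fraction(0) for name, _ in entries}
    # Enumerate every pass/fail combination exactly instead of sampling.
    for outcome in product([True, False], repeat=len(entries)):
        weight = Fraction(1)
        for (_, chance), passed in zip(entries, outcome):
            weight *= chance if passed else 1 - chance
        matches = [name for (name, _), passed in zip(entries, outcome) if passed]
        if weight == 0 or not matches:
            continue
        for name in matches:
            odds[name] += weight / len(matches)
    return odds


def pick_first_match(entries):
    """Assumed rule B: entries are checked in order and the first pass wins."""
    odds = {}
    still_undecided = Fraction(1)
    for name, chance in entries:
        odds[name] = still_undecided * chance
        still_undecided *= 1 - chance
    return odds


uniform_odds = pick_uniformly_among_matches(ENTRIES)
first_match_odds = pick_first_match(ENTRIES)
wood_last_odds = pick_first_match(ENTRIES[1:] + ENTRIES[:1])

for label, odds in [("uniform among matches", uniform_odds),
                    ("first match, wood first", first_match_odds),
                    ("first match, wood last", wood_last_odds)]:
    print(label, {name: float(value) for name, value in odds.items()})
```

Under the first assumed rule, wood comes out about 86% of the time and each "10%" entry about 4.7%, because the always-matching wood entry competes in every draw. Under the second assumed rule with the wood entry listed first, nothing else would ever be produced; moving it last gives 10%, 9%, and 8.1% for the three special entries and 72.9% wood.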


mara

I 100% wouldn't go to AI for coding! Since you tried this, though, I'm curious...do you feel like any of these would have saved you any time if you'd done this seriously? I'm hearing *some* people say "oh, I just edit it and it's fine" but it seems to me like for complicated coding it's the same problem as letting genAI write your paper for you. Like, if you didn't write it yourself, can you really know how to fix it?

Not sure that makes any sense, but I hope it does :D

estirose

I really don't think it would have saved me any time. As brightknightie mentions in the comment below yours, AIs are essentially eager interns on their first day of work - they try very hard, but they need a lot of guidance from you. For these kinds of mods, I'm much better off grabbing an existing set of code - mine or someone else's - and modifying it to fit my needs manually. (I will state that every single one of the ones I tested at least had the structure correct, just the actual code needed tweaking.)

brightknightie

Definitely, for sure, don't let any LLM or AI agent run around unsupervised! :-) The best widely available at this time are like an eager intern on her first day, very useful if you explain absolutely everything and check the work, but cannot even find the restroom without directions. ;-)

I work on related stuff... may I share an opinion?

FWIW, the paid versions are more effective than the free versions right now, and I think this may be an important thing to be aware of. I expect that advances in the paid tiers will eventually filter down into the free tiers.

A key is to run everything (whether coding or report analysis or content generation) through multiple times (as you tried to do!!!) and/or get it to run tests/comparisons to get it to find its own errors and correct itself. Feeding examples of what one wants helps a lot ("few-shot prompting" v. "zero-shot prompting"). I have access at my workplace to paid tiers of Gemini and they can give much better output for the same guidance, and much much better for better guidance. I'm not a SWE, myself, but the engineers as a group seem to have widely varied feelings about integrating the tools into their workflows. They don't seem, overall, worried about being replaced by the tools, as such, as much as by being pressured to let the tools do the fun stuff for them instead of the tedious stuff, because management doesn't always understand that humans really must do the fun (thoughtful, puzzling, innovative) stuff, that it both gives better results and that not every human wants to be a Program Manager in her heart of hearts. ;-)

Very much appreciated! My boss is very pro-AI so I've had courses on how to use it ethically for my job (their favorite AI is Grok, which tells you something about them). For all the ones I tested, I used duck.ai's models (the paid versions, since I think duckduckgo is awesome and therefore I give them money) except for Gemini, where I'm on the free plan.

When I did the first query, I acted as if I'd never done a smidgen of coding for this particular type of mod in my life. I wanted to put myself in the shoes of a new modder using Content Patcher to see how helpful it was by default. I think most of them actually put out code that might work, but maybe not what I wanted. (Llama was an exception in the worst way.) After I wrote the post, I nudged Claude with the RANDOM_BASE_SEASON_ITEM bit and while it and I write code differently, I could almost drop its code into a json file and have a working mod. Still faster to write/modify my own code, though, instead of asking an AI to do it.

To be honest, I'm not really interested in vibe coding myself. It's been coming up a lot on some of the subreddits I visit (selfhosted being the hotspot) so I wanted to see how well AIs did with a very much uncommon bit of coding and how much work it would involve as compared to coding on my own. This was a rather short mod in the scheme of things but complex in its own way, so very interesting but I don't think I'll do it again.

Understood!

I have to learn (and prove I'm using) vibe coding by mandate of my work leadership.

FWIW, you might want to dip your toe in again on a very small scale in 6 or 9 months? I understand that the models are getting exponentially better at the high end, though it takes a while to filter down. ~shrug~


Don't rely on AI coding.
