It's mostly argued around or against the application of fair use. I suggest consulting a lawyer if you're truly interested, as it quickly gets into legalese around what constitutes ownership, distribution, etc. Throw in a lack of extensive case law and you quickly get into opinions rather than legal bases.
I get the sense that these disassembly/decompilation projects believe that some types of IP-laden asset data can be shipped embedded into the project — not necessarily "legally", but in that they'll likely get away with doing so indefinitely — as long as:
1. those assets are stored in proprietary formats that only the game code itself understands, and
2. no tool exists in the project to extract the assets from these proprietary formats into open formats, unless that tool itself exists only in source-code form in the codebase, and requires the ROM as an input to compile it (even if in the case of such a tool, the ROM is doing nothing but serving as a "key" to unlock compilation.)
Basically, if you have to prove you have your own copy of the IP in order to make their embedded copy of the IP "legible", then it's very hard to construct an evidence-based DMCA takedown order that actually makes any coherent point about the project "distributing" said IP.
That being said, shipping assets like this at all, even if you "can get away with it", is ultimately just a kind of laziness / shortcut-taking. They do it because there's either no clear/simple/obvious way to automatically extract the given asset data from the ROM (e.g. because the relevant data is split up into various data planes + metadata bits that are stored "exploded" all over the ROM), so they just did it once by hand, committing the results; or because there's no clear/simple/obvious way to store the extracted asset data such that a regular compiler/assembler natively understands how to embed it into the binary in the particular form it was found in the original ROM. (Remember, re-assembling/compiling to the original ROM is always the test these projects use to ensure their disassembly/decompilation is preserving semantics. So they need to replicate every weird layout quirk the original dev tooling imposed upon the original ROM. And sometimes the original dev tooling included special-purpose domain-specific asset-codegen tools that aren't part of regular compiler toolchains.)
What these projects should actually be doing is taking on the schlep: writing the extract tooling anyway, even if it's just "copy these bytes from here and these bytes from there, and spit them out as hex in an .asm file with this header"; and/or writing asset-codegen tooling that matches what likely existed in the platform SDK, to run before compile/assemble time, converting the extracted ROM asset files into a form (probably a bunch of little assembly files) that will land in the right places when linked back together to form the original ROM.
And, to be clear, they mostly do do this! These projects are very good at doing this!
But sometimes — especially on a larger project with many contributors — one or two things like this aren't audited properly, and fall through the cracks. Or they start out as temporary "bootstrap" approaches made during a private phase of development to get things working + compiling to a correct image; and then not all of those get cleaned up before the repo gets made public.
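To make the "copy these bytes from here and these bytes from there, and spit them out as hex in an .asm file" idea concrete, here is a minimal sketch of what such an extract step could look like. Everything in it is hypothetical: the function name, the label, the offsets, and the `.byte` / `$`-prefixed hex syntax (roughly what ca65-style assemblers accept) are assumptions for illustration, not taken from any particular project.

```python
from typing import Iterable, Tuple


def extract_asset_as_asm(
    rom: bytes,
    spans: Iterable[Tuple[int, int]],
    label: str,
    per_line: int = 8,
) -> str:
    """Gather (offset, length) spans from a ROM image and emit them as
    assembler data directives under a single label.

    The spans are concatenated in the order given, which is how an asset
    stored "exploded" across the ROM can be collected into one file.
    """
    data = bytearray()
    for offset, length in spans:
        # Refuse spans that fall outside the ROM instead of silently truncating.
        if offset < 0 or length < 0 or offset + length > len(rom):
            raise ValueError(f"span ({offset}, {length}) is outside the ROM")
        data += rom[offset:offset + length]

    # Header comment plus label; the syntax here is an assumed assembler dialect.
    lines = ["; generated from a local ROM copy; not meant to be committed", f"{label}:"]
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("    .byte " + ", ".join(f"${b:02X}" for b in chunk))
    return "\n".join(lines) + "\n"
```

A build script would run something like this against the user's own ROM before assembling, so the repository only carries offsets and lengths rather than the asset bytes themselves.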
Perhaps I'm mistaken, but the project doesn't need a copy of the original ROM at all, right?
To be clear, I don't really understand the law around this. My own country's legal system is based on case law, which means that even if I wanted to open source some of my reverse engineered games (I have a few private partial implementations of some old defunct game engines in progress), the distinct lack of prior cases means, sadly, it's prudent not to release them at all while the companies are still active.
I am anti-vibe coding if that meets your criteria?
Reviewing vibe-coded PRs and features has been utterly exhausting over the past few months.
I work on critical, mature software - a small change in behaviour can mean data loss or non-compliance with regulations for our customers. The biggest problem with AI PRs is the sheer amount of churn, extra code and lack of intent with the PRs it generates.
The only way I can describe the latter is that an AI-only PR feels to me like a painting where everything is high detail - and you have to comb over each part before you understand why it's there because so much is superfluous. A well written human PR on the other hand, is painted such that your eye naturally follows the thought process of the author so you can just nod along during the review, as if the solution was obvious.
Also when I'm _using_ the agent; at least 50 percent of my time is spent telling it to stop with its approach so it doesn't go down a useless rabbit hole and waste tokens.
> The biggest problem with AI PRs is the sheer amount of churn, extra code and lack of intent with the PRs it generates.
But this isn't an LLM problem; this is a problem of undisciplined engineers who feel they need to cram extra stuff in a PR. If an engineer doesn't look at the output of the LLM and generates extra work, then it's still on them, right?
> The only way I can describe the latter is that an AI-only PR feels to me like a painting where everything is high detail - and you have to comb over each part before you understand why it's there because so much is superfluous
This just indicates that the engineer doesn't know how to use the tool. Hell, they can ask the LLM to split the work into focused PRs and Claude will be happy to do it, and the results might not even be half bad.
> Also when I'm _using_ the agent; at least 50 percent of my time is spent telling it to stop with its approach so it doesn't go down a useless rabbit hole and waste tokens.
If this is happening often then the tool is probably not fit for the job.
I don't believe so - it's not as though the original prompts asked for extra code churn (note that as soon as you look at and edit the LLM code output extensively it ceases to be vibe coding, which I was talking about in my OP).
I'm not talking about extra features; I'm talking about how, for the same single feature, the code is convoluted, either because the algorithm is overly complicated or because the abstractions are just wrong for the domain.
The PRs typically are already focused in that they address a single feature; or at least a single "usable" feature in a complex system which necessarily has a lot of connected parts and behaviors.
> then the tool is probably not fit for the job.
Perhaps; but with an LLM I haven't found which jobs it _does_ work for and which it doesn't. I already use planning mode extensively; and capture the major points, but then it makes a stupid decision mid implementation and just starts churning.
I started similarly with it. I'm of the opinion that it's a tool that behaves like a tool - how well it works depends on who is using it and how.
I don't have a good analogy but the immediate one that comes to mind is treating AI like a junior developer that you're mentoring. If you know what you're doing you can iterate quickly; if you don't then it's a whole other story.
Claude built me a Markdown editor - I designed it, set coding standards, etc. It coded it to my spec. The output is in my opinion not bad and is very usable (for me - I use it daily now). Probably would have cost me north of $50k to get a team of seasoned devs to build it to the current level of polish. https://github.com/emrul/md
Happened to me 3 days ago - it deleted some tests and flipped assertions after I had outlined that it wasn't to change any assertions.
Our team was doing a similar task to move between test frameworks, and I had to do a git diff of hundreds of thousands of lines to try and work out where a test had disappeared to.
You are the one ruining this discussion; it's worrying that you don't even realize it. I pointed out that models change quite a bit over time (I said more than that) and you ridicule my reply. "Your fault. You should have used a model from 0.000005 seconds ago!"
A change in the sys calls that are used. That's pretty sensitive in general I think; I can see if it were introduced by an LLM why people would be upset if they experienced data loss from it.
Pratchett himself spent years as a journalist for a local newspaper before Colour of Magic.
These writing jobs in print media have mostly disappeared in the UK. It's certainly harder to make a living as a writer today than it was in the '70s and '80s.
Then I think you're lucky; I live in a major city (London) and can attest that there are parking spaces where the kiosk and booths are gone and the app is the only way to pay.
LLMs aren't going to remove the "moat" that comes from owning specialized tools (sewer and drain cleaning machines, pro-quality welders, etc), and having a procurement and service infrastructure.
Individual property owners who want to dabble already have that option from the myriad YouTube videos available to them (and arguably they're more trustworthy than LLM slop), just as they've had with books and other media in the past. I don't see LLM-based trade "knowledge" as somehow fundamentally different.
Commercial service and construction isn't going to get put out of business any time soon by "dabblers" learning from LLMs.
I'm not sure where you're based, but going by friends of mine who are tradies, most of the procurement and service infrastructure isn't owned by them at all.
Putting in a new kitchen or rewiring a house isn't beyond the physical abilities of most people, and their customers tend to be the same middle class knowledge workers which AI is expected to cannibalize.
As to your point about the knowledge being freely available; just as it's easier to ask an LLM about software questions, the same is true for other fields. It might not be accurate, but it doesn't really need to be - it just needs to lower the barrier for people to try.
Basically what I'm saying is that I absolutely expect secondary side effects for the trades if it has a big impact on knowledge workers as well.