How many games did you have to throw away because stockfish wanted to castle? Or did you force stockfish to not castle? Castling seems like such a frequent move it is hard to draw any conclusions about the strength of an engine that does not support it.
zero games were thrown away for castling, because i forced stockfish not to castle (and not to play en passant/promotion) by filtering legal moves and only giving those filtered moves via root_moves
so every game stayed in the same no castling variant
and you're right, this rating is for that constrained variant, not full chess.
I'm not quite clear on the how of it, but Stockfish works pretty well outside the normal bounds of chess. There are toy chess variants on chess.com with "dragons" (knight + bishop) and stockfish can use those very effectively
Wouldn't that just stop it from considering castling, en passant, and promotion on the first move of the position you are analyzing? It's still going to consider those in the tree search, and the static evaluation neural net was trained on positions where they are allowed.
For castling you should be able to fix this by specifying that castling rights have been lost in the positions you give it.
For the others I think you would need to filter not just when giving it the list of allowed first moves. You'd also have to make it not consider en passant and promotion in the search, by modifying its move generation.
The static evaluation would still be off a bit, but that probably would not have much effect most of the time.
From what I've read it is feasible to train a new neural net on a decent home computer in maybe around a week, but that's probably overkill for your use of figuring out how strong your engine is at no castling/no promotion/no en passant chess.