the answer is not quite 30%. i used Golomb coding here, but Huffman yields a very similar result.
also this code is kinda bleh (i even just spotted a var called 'asdf' and an off-by-one) but i don't feel like working on it anymore tonight.
quick and *very* non-exhaustive test shows MM only(?) has ~25% more code and compresses a couple percent better.
also neither of these games use more than 128 unique instructions, so you could just drop a bit and get 12.5% for free*.
* not including the LUT cost, but this is all theoretical anyway.
would you have paid the price of dedicated decoding hardware to fit 30% more code in your NES game? i wonder…
if my interest in this persists, i might try this with megaman 1. it's a bigger cart to begin with, so it might take a lot more work to trace all of its subroutines, or maybe i'd just source the disassembly going around.