Roger is happy. He bought a cheap PC-Engine CD-ROM game called “Nishimura Kyotaro Mystery: Hokutosei No Onna”. It seems to be a detective game like J.B. Harold Murder Club, Jake Hunter, Ace Attorney, Le Manoir de Mortevielle or Maupiti Island. It was published by Naxat (now Kaga Create) in 1990.
He remembers that some games have a language option. For example, the Japanese version of J.B. Harold Murder Club can be played in English. Unfortunately no such option is to be found in Hokutosei No Onna… Never mind! Roger Quartermain grabs his boots and his hat and jumps into the game’s dark bowels!
The introduction seems to be about some dude stabbed in his room, 2 girls going to a train station and another guy who is apparently stalking them.
As it’s a CD-ROM game, it must use the BIOS function for displaying text. Roger takes his old notes and searches for any text-related routine. Here it is! ex_fnt, located at $e060. He draws his favorite emulator and sets a breakpoint on it.
It fires just after the introduction when what seems to be the savegame selection screen appears.
Roger is not really interested in the ex_fnt routine itself but in its calling environment. So he sets breakpoints on the potential rts instructions. A gentle press of the R key sends him to $f1e2. Another push of the S key brings him to the caller.
Here it is at $566b.
According to the notes, the Shift-JIS code is copied from $41 and $42 to $f8 and $f9. The logical next step is to put a write breakpoint on $41 and $42.
The write breakpoint is triggered at $5b20. The code is pretty simple.
    5b1c: ldy #$00
          lda ($3c), Y
    5b20: sta $41
          iny
          lda ($3c), Y
          iny
          sta $42
The next step is to find where the $3c pointer is set. Once again Roger has to set a new write breakpoint, this time on $3c and $3d.
    5a18: lda #$90
          bra $5a22
          lda #$40
          bra $5a22
          lda #$80
    5a22: sta $c0
    5a24: sty $3d
          cpy #$00
          beq $5a2f
          stx $3c
          smb5 $c0
          rts
Unfortunately the caller is not that obvious to discover. But the good news is that the text is not compressed. By keeping track of the values stored at $41 and $42, Roger manages to extract the string:
40 81 40 81 40 81 C7 82 CC 82 54 8B E4 88 59 8C 96 8E C5 82 6E 8E DF 82 DC 82 B7 82 A9 82 48 81 0D 00 FF 00
The endianness must be swapped. Luckily he has a handy Perl script at hand, swap_endian.pl. The last 4 bytes look like control codes. His long experience lets him deduce that 00 0D is some kind of newline code and that 00 FF may mean end of text. Here’s the string.
どの亀井刑事で始めますか?
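The swap and the decoding can be sketched in a few lines of Python. This is only an illustration, not the contents of swap_endian.pl: the name decode_words is made up, and treating 00 0D as a newline and 00 FF as the end of text simply follows Roger's guess above, nothing more.

```python
# Raw bytes as captured from $41/$42, in memory order (low byte first).
RAW = bytes.fromhex(
    "40 81 40 81 40 81 C7 82 CC 82 54 8B E4 88 59 8C 96 8E "
    "C5 82 6E 8E DF 82 DC 82 B7 82 A9 82 48 81 0D 00 FF 00"
)

def decode_words(raw: bytes) -> str:
    """Swap each byte pair and decode it as Shift-JIS (hypothetical helper)."""
    out = []
    for i in range(0, len(raw) - 1, 2):
        low, high = raw[i], raw[i + 1]
        if high == 0x00:
            # Assumption: a word with a zero high byte is a control code.
            if low == 0xFF:      # guessed "end of text"
                break
            if low == 0x0D:      # guessed "newline"
                out.append("\n")
            continue             # any other control code is skipped
        # Shift-JIS wants the lead byte first, hence the swap.
        out.append(bytes([high, low]).decode("shift_jis"))
    return "".join(out)

print(decode_words(RAW))
```

The three leading 81 40 words decode to full-width spaces, so the printed sentence is indented, and the final question mark comes out as a full-width one.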
But this doesn’t tell Roger where the $3c pointer is set. Roger knows that on CD-ROM systems, the data is first transferred to RAM before being executed. This means that the code can be self-modifying. Unluckily this is the case here. Roger has to set a breakpoint where the jump is done and run the game until he finally reaches the pointer initialization code. Time passes and he ends up at $724d:
    724d: ldx #$fc
          ldy #$74
          jmp $5a18
He restarts the game with only the write breakpoint on $41 and $42. As expected he ends up at the same code as before for the first sentence. The next run lands at a different location, $acb4.
    acae: lda $4f
          asl A
          tay
          lda ($50), Y
    acb4: sta $41
          iny
          lda ($50), Y
          beq $acd8
          sta $42
          lda $2922
          sta $3f
          lda $2923
          sta $40
          lda #$01
          jsr $5655
The pointer at $50 is initialized at $ac92.
    ac8a: tay
          lda $312a, Y
          asl
    ac8e: tay
    ac90: lda ($9f), Y
          sta $50
          iny
          lda ($9f), Y
          sta $51
So the pointer is set up from a table. He gets the values:
- b1b6
- bdb6
- cdb6
- d9b6
- e7b6
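The lookup done by the code at $ac8a boils down to "double the index, then read two bytes". A minimal Python sketch of that idea follows. The function name read_pointer and the demo table are made up for illustration: entries 0 and 1 are placeholders, and only the first two values from the list above are filled in.

```python
def read_pointer(table: bytes, index: int) -> int:
    """Return the 16-bit little-endian pointer stored at entry `index`."""
    offset = index * 2  # same job as the `asl` / `tay` pair
    return int.from_bytes(table[offset:offset + 2], "little")

# Hypothetical table: two placeholder entries, then b1b6 and bdb6
# exactly as they appear byte by byte in memory.
demo_table = bytes.fromhex("0000 0000 b1b6 bdb6")

print(hex(read_pointer(demo_table, 2)))  # 0xb6b1
print(hex(read_pointer(demo_table, 3)))  # 0xb6bd
```

Note that the values in the list are written in memory order, so b1b6 is really the address $b6b1 once the two bytes are put back in the usual order.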
Fortunately the indices for these pointers are 2, 3, 4, 5 and 6. Roger searches the ISO for b1 b6 bd b6 cb b6 d9 b6 e7 b6. He ends up at offset $11f49. The place is filled with what looks like pointers. A few bytes after the match he recognises byte-swapped Shift-JIS values. They start at $11f53, so the pointer soup must contain some value of the form $53Xf. He scrolls up and finds $53af at $11d45. Ta-da, he has his first table and string block. As all the pointers are in increasing order, Roger deduces that the option strings are stored from $11f53 to $126f4. But just after this, he notices some other Shift-JIS symbols. They go from $126f5 to $132f5.
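The match between $53af and $11f53 gives away how addresses map to ISO offsets for this block of data. Here is a small Python sketch of the arithmetic. The function names are made up, and it assumes the mapping is a plain constant offset, which is only known to hold for the one pair of values found above.

```python
def bank_base(iso_offset: int, cpu_address: int) -> int:
    """Constant to add to a CPU address to get an ISO offset (assumed linear)."""
    return iso_offset - cpu_address

def to_iso_offset(cpu_address: int, base: int) -> int:
    """Translate a CPU address into an ISO offset using that constant."""
    return cpu_address + base

# $53af in the dump is the byte-swapped pointer $af53,
# and its string sits at ISO offset $11f53.
base = bank_base(0x11F53, 0xAF53)
print(hex(base))                          # 0x7000

# Sanity check with the first pointer of the second table (b1 b6 -> $b6b1).
print(hex(to_iso_offset(0xB6B1, base)))   # 0x126b1
```

The second result falls inside the $11f53 to $126f4 range, which is at least consistent with both tables pointing into the same string block.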
Once the options are displayed, he moves on to another screen where some grumpy old cop is talking to you.
Once again, he sets his breakpoints on $41 and $42. But then he remembers $5a18. It starts with:
    5a18: lda #$90
          bra $5a22
    5a1c: lda #$40
          bra $5a22
    5a20: lda #$80
    5a22: sta $c0
So he can get here from $5a18, $5a1c or $5a20. This gives him 3 more breakpoints. $5a20 is the one and, as expected, the jump comes from RAM-initialized code ($59ab). He prepares for a painful ride. And painful it is. Nevertheless, he finds out that $5a20 is reached from $6c5e.
    6c5e: ldy $a5
          lda ($a8), Y
          sta $00
          iny
          lda ($a8), Y
          sta $01
          iny
          sty $a5
          and $00
          cmp #$ff
          beq $6c79
          ldx $00
          ldy $01
          jmp $5a20
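Read as a loop, the routine at $6c5e takes the next 16-bit pointer from the table at ($a8), remembers its position in $a5 and stops when both bytes are $ff. A rough Python model is given below. The name walk_script and the demo table are invented, and what the game really does at $6c79 is unknown, so the sketch simply stops there.

```python
def walk_script(table: bytes, start: int = 0):
    """Yield 16-bit pointers from `table` until a $ffff entry is met."""
    y = start                       # plays the role of $a5
    while y + 1 < len(table):
        low, high = table[y], table[y + 1]
        y += 2
        # `and $00` / `cmp #$ff`: true only when both bytes are $ff.
        if (low & high) == 0xFF:
            break
        # The game jumps to $5a20 with X = low byte, Y = high byte.
        yield (high << 8) | low

# Hypothetical table: $89c9, a made-up $3412, the terminator, then junk.
demo = bytes.fromhex("c989 1234 ffff 5678")
print([hex(p) for p in walk_script(demo)])  # ['0x89c9', '0x3412']
```

The AND trick works because two bytes can only AND to $ff when each of them is $ff, so one compare is enough to spot the $ffff marker.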
The first sentence of the grumpy cop comes from $89c9 (ISO offset $1339c9). This address comes from a table pointed to by $a8. It’s set up at $707e.
    707e: lda #$00
          sta $00
          lda #$80
          sta $01
          lda ($00), Y
          sta $a8
          iny
          lda ($00), Y
          sta $a9
          stz $a5
          stz $a6
          lda #$03
          jmp $619c
The pointer is stored at $13308c. Roger is perplexed. It’s not just a simple pointer table. There is some extra data. He needs to figure out how it works but he is tired and there’s Night of the Creeps on TV.
He stretches, closes the emulator and his laptop and prepares himself for 90 minutes of pure 80’s badassery.