Join the discord

Predicting numbers - Visionaire 3.x XOR encryption

23 Feb, 2013 15:29
I wanted to write this one, as a hint to everyone interested in cryptography and breaking into algorithms.
It might not cover your current task, but at least you may get an idea for a exploit technique that will do the job.

My target will be a Visionaire 3.x Engine based game, that use a key-XOR encryption. There are few games using this engine, but unfortunately all of them are using different key for the encryption. Even different language versions of the same game may use a different key, so i would like to have an universal tool or method, that can "predict" the key.
Since prediction is basically a shoot in the dark, I can't just count on that...

But first, some explanations. What is Visionaire engine?
Visionaire 3.x engine or in short VIS3 is a adventure game engine that is technically a freeware.
You can use it as long as your game is licensed as a freeware. The disadvantage is that the game resources aren't going to be compresses in VIS files, so everyone can freely edit them.

If you aren't a cheap ass (no offense given), you can buy the engine for only 35€ and then sell your game, but for no more than 15€ per copy. Also, the game resources will be protected by a XOR encryption of the resource files.

Finally, if you are some Richie Rich (still no offense given) and have to spent 1000€, you can buy the engine with unlimited license and do whatever you want. Your game resources will be "securely" protected by encryption and everyone will be happy.

However, there's some insane people out there ready to translate a game in their native language for free. For example me, as a Bulgarian guy living in a country with less than 8 million people, no one will ever bother to translate the game in my language. BUT, there's a people that are ready to translate it in Bulgarian language FOR FREE. So what if they want to translate the game? The answer is - they cant. Or at least they will have to contact the developers, discuss the idea and finally, for a small market like the Bulgarian one, the answer will most probably be "No, thanks".

For that case, I'm about to write this article for these open minded people, that are ready to localize games in their native languages - for free.

The following games are using Visionaire 3.x engine:
A New Beginning
Deponia
Edna & Harvey: Harvey's New Eyes
The Dark Eye: Chains of Satinav
The whispered world
AR-K Episode 1
Zak McKracken between time & space
The Second Guest
Captain Delta
...
many more


Most of these are pretty good games, and if you are a quest/adventure game person, you should definitely buy and try them.

How does XOR encryption works?
XOR or also known as exclusive OR is just a bitwise operation, that's sometimes used as a simple encryption method. I have some "super secret" data, and I also have a key, that will "lock" it.
Lets say my data is this ANSI/ASCII string:
example data : plainOffset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000  53 54 52 49 4E 47 2D 54 48 41 54 2D 49 53 2D 47  STRING-THAT-IS-G
00000010  4F 49 4E 47 2D 54 4F 2D 42 45 2D 58 4F 52 2D 45  OING-TO-BE-XOR-E
00000020  4E 43 52 59 50 54 45 44                          NCRYPTED

and I chose "a" or 0x61 in HEX, for my super secret key.

So, take 0x53 ("S" - the first byte of the data, and XOR it with 0x61 ("a" - the key byte) to get the result 0x32.
Then, take 0x54 ("T" - the second byte of the data, and XOR it with 0x61 ("a" - the key byte) to get the result 0x35, and so on, to finally get the result:
example data : encryptedOffset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000  32 35 33 28 2F 26 4C 35 29 20 35 4C 28 32 4C 26  253(/&L5) 5L(2L&
00000010  2E 28 2F 26 4C 35 2E 4C 23 24 4C 39 2E 33 4C 24  .(/&L5.L#$L9.3L$
00000020  2F 22 33 38 31 35 24 25                          /"3815$%

As you can see, the result is not readable anymore. If I want to decrypt it back, I need the key but if I don't know it, I can simply brute force the string with every possible byte (0x00 to 0xFF which is not much effort) until the right key is found.
But in my case, the key is a single byte. So what if the key is composed of two or more bytes? The simple rule here is - the longer the key is, the harder is to brute force it.

In such cases, a deduction method can be used, based on the already decrypted data - e.g., some bytes in the data may be constants, other overwrap with known parts of the key, etc.

I'm targeting the game "A New Beginning", so this article is specially build for it, but should work for the other games too.
There are few resource files like VIS, VC* and VS*. All of them are technically a VIS3 files with identical structure.

Let's take the data.vis - the main game resource file. Its structure goes like this:
VIS file formattypedef struct _VIS {
   char[4]                   header; // "VIS3"
   DWORD                     num_entries;
   DWORD[num_entries*0x10]   structure;
   byte[n]                   entries_data;
}

The header is always set to "VIS3".
The num_entries depends on the specific game and it is a numeric value.
The file_structure is encrypted and that's what I'm targeting in this article.
Finally, the entries_data is the data of the files containing in this resource file, but they are not targeted in this article so I'm skipping them.

I already know that the encryption algorithm is a XOR encryption with a key, but i also know that the structure itself has its own formatting:
VIS file format > structuretypedef struct structure {
   char[3]   structure_start; // "HDR"
   DWORD[4]  entry {
      DWORD   entry_offset
      DWORD   entry_size_packed
      DWORD   entry_size_unpacked
      DWORD   entry_type
   }
   ...
   DWORD[4]  entry { ... }
   char[3]   structure_end; // "END"
}

There's the structure start, then the resource entries (the exact amount of entries is defined by num_entries), and finally a structure end.

I'll take some of the smaller files from the game, in this case "character.vc039", and here's its encrypted structure:
character.vc039Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000  71 7C 36 31 61 31 66 31 32 30 E9 38 63 65 B8 35  q|61a1f1208ce5
00000010  39 38 6C 31 61 31 EB 31 32 30 E9 38 63 65 B8 35  98l1a11208ce5
00000020  39 38 6C 31 61 30 7C 31 32 30 E9 38 63 65 B8 35  98l1a0|1208ce5
00000030  39 38 6C 31 61 30 C1 31 32 30 E9 38 63 65 B8 35  98l1a01208ce5
00000040  39 38 6C 31 61 33 52 31 32 30 E9 38 63 65 B8 35  98l1a3R1208ce5
00000050  39 38 6C 31 61 33 A7 31 32 30 E9 38 63 65 B8 35  98l1a31208ce5
00000060  39 38 6C 31 61 32 28 31 32 30 E9 38 63 65 B8 35  98l1a2(1208ce5
00000070  39 38 6C 31 61 32 BD 31 32 30 E9 38 63 65 B8 35  98l1a21208ce5
00000080  39 38 6C 31 61 35 0E 31 32 30 E9 38 63 65 B8 35  98l1a5.1208ce5
00000090  39 38 6C 31 61 35 93 31 32 30 E9 38 63 65 B8 35  98l1a51208ce5
000000A0  39 38 6C 31 61 34 E4 31 32 30 E9 38 63 65 B8 35  98l1a41208ce5
000000B0  39 38 6C 74 2F 75                                98lt/u

Looks like gibberish, but by some badass (not really, but I'm just trying to look cool) reverse engineering I already find out, that the structure starts with "HDR" and ends with "END".

If I take the first three bytes - 0x717C36 and xor them with the hex representation of "HDR" - 0x484452, the result is 0x393864 or "98d" into ASCII.
So, the first three bytes from the key are "98d", and now I need to find the rest 13 bytes.

Since the key is 16 bytes long, I can easily find the next three bytes, by decrypting the structure end at offset 0x000000B3.
No matter how many entries you have in the resource archive, the "END" string will always point to the next three bytes of the key.

Now, I'll xor 0x742F75 (the last three bytes of the structure) with 0x454E44 ("END" into HEX), that produces 0x316131 in HEX or "1a1" in ASCII, and these are the next three bytes of the key.

The engine structure is quite straight forward. There's only PNG's and OGG's at the beginning of the resource files, and at its very end, there is a pretty heavy XML file (most of the time around 30-40 MB).
The XML, as a plain text file format, is compressed by DEFLATE (zLib), to reduce his size.
However, the PNG and OGG files are not compressed by the engine, since both file formats are using compression by default, and double compression will increase the final VIS archive size.

All in all, the first file should be an uncompressed OGG or PNG file.
Another thing is the entries offsets. They are assumed to be located on base address 0x00000000 (header's data and structure sizes are ignored), so the offset of the first entry will also be 0x00000000.

Then, if the first file offset is 0, I can actually decode the 4th to 8th key bytes, instead 4th to 7th.
character.vc03900000000  71 7C 36 [31 61 31 66] 31 32 30 E9 38 63 65 B8 35  q|61a1f1208ce5

Because the four bytes that I need to xor with 0x31613166 are all zeros, the key bytes will equal to the original value 0x31613166, or "1a1f" in ASCII.

The key so far looks like "98d1a1f*********".

I already know that the first entry will be and uncompressed OGG or PNG file. I also know the starting offset is pointing at 0x00000000.
So if I refer to the structure entry definition :
character.vc039 > structure > entrytypedef struct entry {
   DWORD   entry_offset;
   DWORD   entry_size_packed;
   DWORD   entry_size_unpacked;
   DWORD   entry_type;
}

- entry_offset for the first entry is known to be always 0x00000000;
- entry_size_packed is an unknown number;
- entry_size_unpacked is unknown number too;
- entry_type is again, something I don't know.

Let's take a look at the file dump and how its spreads over the structure:
character.vc039          header    offset       p.size       u.size       type
         [ H  D  R][ 0  0  0  0][ ?  ?  ?  ?][ ?  ?  ?  ?][ ?
00000000  71 7C 36  31 61 31 66  31 32 30 E9  38 63 65 B8  35
00000010  39 38 6C  31 61 31 EB  31 32 30 E9  38 63 65 B8  35

Like i mention before, if the first entry is not compressed, so its entry_size_packed and entry_size_unpacked values should be identical. However, that means, the second file offset will equal the first file entry_size_packed/entry_size_unpacked!

Let's get back to the dump once again, but this time with some highlighting:
character.vc039          header    offset       p.size       u.size       type
         [ H  D  R][ 0  0  0  0][ ?  ?  ?  ?][ ?  ?  ?  ?][ ?
00000000  71 7C 36  31 61 31 66 [31 32 30 E9][38 63 65 B8] 35
00000010  39 38 6C [31 61 31 EB] 31 32 30 E9  38 63 65 B8  35

all these three dwords should be equal.

I've already guessed the key values for bytes 4th to 8th, so I can use it over the second resource entry offset like so - 0x316131EB ^ 0x31613166, to get the value 0x0000008D.
This value should be the the first resource entry entry_size_packed and unpacked_size and also the second resource entry offset. So now, to find the next four bytes from the key, I can XOR 0x313230E9 with 0x0000008D, to get 0x31323064, or "120d".
And to get the next four bytes I should again use 0x0000008D for XOR, but this time with 0x386365B8, to get the result "8ce5" (0x38636535).

See how easy i was able to obtain not four, but eight bytes from the key, by just using the data overlapping and some drunk logic?

The key now is "98d1a1f120d8ce5*", and only one byte - the final one is unknown.

Playing some more with the file format, I've found that the entry_type value may be one of those:
entry types0x00000000   .OGG
0x00000008   .PNG
0x00000003   .XML
0x00000012   chunked .XML (.CXML)

Obviously, the entry_type is a small integer value from 0 to 0x12, so basically, the first three bytes from the entry_type DWORD should be all zeros.
This means, i can simply take the 16th byte from the data as the last byte from the key - in this case "5" - (0x35).

Finally the key is fully reconstructed to "98d1a1f120d8ce55". But is that the right key?
To find out first I can decode the structure using that key:
character.vc039 : decoded structureOffset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000  48 44 52 00 00 00 00 00 00 00 8D 00 00 00 8D 00  HDR...........
00000010  00 00 08 00 00 00 8D 00 00 00 8D 00 00 00 8D 00  .............
00000020  00 00 08 00 00 01 1A 00 00 00 8D 00 00 00 8D 00  ..............
00000030  00 00 08 00 00 01 A7 00 00 00 8D 00 00 00 8D 00  .............
00000040  00 00 08 00 00 02 34 00 00 00 8D 00 00 00 8D 00  ......4.......
00000050  00 00 08 00 00 02 C1 00 00 00 8D 00 00 00 8D 00  .............
00000060  00 00 08 00 00 03 4E 00 00 00 8D 00 00 00 8D 00  ......N.......
00000070  00 00 08 00 00 03 DB 00 00 00 8D 00 00 00 8D 00  .............
00000080  00 00 08 00 00 04 68 00 00 00 8D 00 00 00 8D 00  ......h.......
00000090  00 00 08 00 00 04 F5 00 00 00 8D 00 00 00 8D 00  .............
000000A0  00 00 08 00 00 05 82 00 00 00 8D 00 00 00 8D 00  .............
000000B0  00 00 08 45 4E 44                                ...END

and second, verify the data by these points:
- the entries offset values should not exceed the overall size of the VIS resource file;
- the entry offset value should equal the sum of the previous entry offset plus packed size;
- the entry type value should be a 0x00, 0x08, 0x03 or 0x12, unless new types are introduced in future.

Overall, analyzing more than one of the VIS/VC*/VS* files is a good way to verify the key validity.

Finally, to give some hints to the engine developers *HINT-HINT*, here's some scenarios when this method as-is, wont work.

- If you remove the "HDR" and "END" structure marks
- If you compress the OGG's and PNG's too, so both size_packed and size_unpacked are different;
- If you add a padding between the structure and the entries data, so the first entry offset is no longer pointed to 0x00000000;
- If you add an entry padding, lets say of 0x16 bytes, so a 10 bytes long entry takes 16 bytes, a 20 bytes entry takes 32, and so on;
- Using a longer KEY. Technically the keys now are 32 bytes long MD5 strings, but only the first 16 bytes of them are actually used.

I've already publish a simple tool that makes most of the above logic, so you can obtain it from the "Game tools" section - "VIS3 key hound".

Comments

* You have an opinion? Let us all hear it!

Guest 01 Sep, 2022 22:44
I also want to use vs to translate the game language, but is there an easy way to do it?
Guest 30 May, 2020 01:27
This is a very nice breakdown of the VIS3 format! It's worth reiterating (because I sort of missed it on first read) that only the _VIS.structure block is XOR encrypted as described here. The _VIS.entries_data content (i.e. the actual data) is not XOR encrypted and should be read directly. Gave me a bit of confusion when at first I tried to XOR decrypt everything after the first 8 bytes of the file.

Also, a random observation: when digging into Scene files in Deponia 2, I observed that struct_entry.entry_type value 0x00000008 corresponded to WebP (RIFF) content, rather than PNG content. The takeaway, I suppose, is that the entry_type values may correlate to different content types from one game to another. (Deponia 2 did, however, use 0x00000000 to represent OGG files, I observed.)

Thanks again!
XpoZed 22 Apr, 2014 20:36
Maybe the XML you are looking at has a BOM (0xFE 0xFF bytes at the beginning)? If you drop me a email, i can send you the file structure the way i was able to reverse engineer it.
Guest 22 Apr, 2014 11:14
Hi,

just tinkering around with this myself, I stumbled over your very useful article. Thanks for that. My question is: you are saying, that the XML is deflated. This would make sense, but how do you know for sure? Because the header seems to be corrupted, just like with PNGs. At least I have nonsense in the first two bytes if the XML chunk, which makes identifying and inflating impossible.

So maybe you solved that too already? Any hints would be highly appreciated!
© nullsecurity.org 2011-2024 | legal | terms & rules | contacts