I wanted to write this one, as a hint to everyone interested in cryptography and breaking into algorithms.
It might not cover your current task, but at least you may get an idea for a exploit technique that will do the job.
My target will be a
Visionaire 3.x Engine based game, that use a key-XOR encryption. There are few games using this engine, but unfortunately all of them are using different key for the encryption. Even different language versions of the same game may use a different key, so i would like to have an universal tool or method, that can "predict" the key.
Since prediction is basically a shoot in the dark, I can't just count on that...
But first, some explanations. What is Visionaire engine?
Visionaire 3.x engine or in short VIS3 is a adventure game engine that is technically a freeware.
You can use it as long as your game is licensed as a freeware. The disadvantage is that the game resources aren't going to be compresses in VIS files, so everyone can freely edit them.
If you aren't a cheap ass (no offense given), you can buy the engine for only 35€ and then sell your game, but for no more than 15€ per copy. Also, the game resources will be protected by a XOR encryption of the resource files.
Finally, if you are some Richie Rich (still no offense given) and have to spent 1000€, you can buy the engine with unlimited license and do whatever you want. Your game resources will be "securely" protected by encryption and everyone will be happy.
However, there's some insane people out there ready to translate a game in their native language for free. For example me, as a Bulgarian guy living in a country with less than 8 million people, no one will ever bother to translate the game in my language. BUT, there's a people that are ready to translate it in Bulgarian language FOR FREE. So what if they want to translate the game? The answer is - they cant. Or at least they will have to contact the developers, discuss the idea and finally, for a small market like the Bulgarian one, the answer will most probably be
"No, thanks".
For that case, I'm about to write this article for these open minded people, that are ready to localize games in their native languages - for free.
The following games are using Visionaire 3.x engine:
A New Beginning
Deponia
Edna & Harvey: Harvey's New Eyes
The Dark Eye: Chains of Satinav
The whispered world
AR-K Episode 1
Zak McKracken between time & space
The Second Guest
Captain Delta
...
many more
Most of these are pretty good games, and if you are a quest/adventure game person, you should definitely buy and try them.
How does XOR encryption works?
XOR or also known as exclusive OR is just a bitwise operation, that's sometimes used as a simple encryption method. I have some "super secret" data, and I also have a key, that will "lock" it.
Lets say my data is this ANSI/ASCII string:
example data : plainOffset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 53 54 52 49 4E 47 2D 54 48 41 54 2D 49 53 2D 47 STRING-THAT-IS-G
00000010 4F 49 4E 47 2D 54 4F 2D 42 45 2D 58 4F 52 2D 45 OING-TO-BE-XOR-E
00000020 4E 43 52 59 50 54 45 44 NCRYPTED
and I chose "a" or 0x61 in HEX, for my super secret key.
So, take 0x53 (
"S" - the first byte of the data, and XOR it with 0x61 (
"a" - the key byte) to get the result 0x32.
Then, take 0x54 (
"T" - the second byte of the data, and XOR it with 0x61 (
"a" - the key byte) to get the result 0x35, and so on, to finally get the result:
example data : encryptedOffset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 32 35 33 28 2F 26 4C 35 29 20 35 4C 28 32 4C 26 253(/&L5) 5L(2L&
00000010 2E 28 2F 26 4C 35 2E 4C 23 24 4C 39 2E 33 4C 24 .(/&L5.L#$L9.3L$
00000020 2F 22 33 38 31 35 24 25 /"3815$%
As you can see, the result is not readable anymore. If I want to decrypt it back, I need the key but if I don't know it, I can simply brute force the string with every possible byte (0x00 to 0xFF which is not much effort) until the right key is found.
But in my case, the key is a single byte. So what if the key is composed of two or more bytes? The simple rule here is - the longer the key is, the harder is to brute force it.
In such cases, a deduction method can be used, based on the already decrypted data - e.g., some bytes in the data may be constants, other overwrap with known parts of the key, etc.
I'm targeting the game "
A New Beginning", so this article is specially build for it, but should work for the other games too.
There are few resource files like
VIS,
VC* and
VS*. All of them are technically a VIS3 files with identical structure.
Let's take the data.vis - the main game resource file. Its structure goes like this:
VIS file formattypedef struct _VIS {
char[4] header; // "VIS3"
DWORD num_entries;
DWORD[num_entries*0x10] structure;
byte[n] entries_data;
}
The header is always set to
"VIS3".
The num_entries depends on the specific game and it is a numeric value.
The file_structure is encrypted and that's what I'm targeting in this article.
Finally, the entries_data is the data of the files containing in this resource file, but they are not targeted in this article so I'm skipping them.
I already know that the encryption algorithm is a XOR encryption with a key, but i also know that the structure itself has its own formatting:
VIS file format > structuretypedef struct structure {
char[3] structure_start; // "HDR"
DWORD[4] entry {
DWORD entry_offset
DWORD entry_size_packed
DWORD entry_size_unpacked
DWORD entry_type
}
...
DWORD[4] entry { ... }
char[3] structure_end; // "END"
}
There's the structure start, then the resource entries (the exact amount of entries is defined by num_entries), and finally a structure end.
I'll take some of the smaller files from the game, in this case
"character.vc039", and here's its encrypted structure:
character.vc039Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 71 7C 36 31 61 31 66 31 32 30 E9 38 63 65 B8 35 q|61a1f1208ce5
00000010 39 38 6C 31 61 31 EB 31 32 30 E9 38 63 65 B8 35 98l1a11208ce5
00000020 39 38 6C 31 61 30 7C 31 32 30 E9 38 63 65 B8 35 98l1a0|1208ce5
00000030 39 38 6C 31 61 30 C1 31 32 30 E9 38 63 65 B8 35 98l1a01208ce5
00000040 39 38 6C 31 61 33 52 31 32 30 E9 38 63 65 B8 35 98l1a3R1208ce5
00000050 39 38 6C 31 61 33 A7 31 32 30 E9 38 63 65 B8 35 98l1a31208ce5
00000060 39 38 6C 31 61 32 28 31 32 30 E9 38 63 65 B8 35 98l1a2(1208ce5
00000070 39 38 6C 31 61 32 BD 31 32 30 E9 38 63 65 B8 35 98l1a21208ce5
00000080 39 38 6C 31 61 35 0E 31 32 30 E9 38 63 65 B8 35 98l1a5.1208ce5
00000090 39 38 6C 31 61 35 93 31 32 30 E9 38 63 65 B8 35 98l1a51208ce5
000000A0 39 38 6C 31 61 34 E4 31 32 30 E9 38 63 65 B8 35 98l1a41208ce5
000000B0 39 38 6C 74 2F 75 98lt/u
Looks like gibberish, but by some badass (not really, but I'm just trying to look cool) reverse engineering I already find out, that the structure starts with
"HDR" and ends with
"END".
If I take the first three bytes - 0x717C36 and xor them with the hex representation of
"HDR" - 0x484452, the result is 0x393864 or "
98d" into ASCII.
So, the first three bytes from the key are "
98d", and now I need to find the rest 13 bytes.
Since the key is 16 bytes long, I can easily find the next three bytes, by decrypting the structure end at offset 0x000000B3.
No matter how many entries you have in the resource archive, the
"END" string will always point to the next three bytes of the key.
Now, I'll xor 0x742F75 (the last three bytes of the structure) with 0x454E44 ("END" into HEX), that produces 0x316131 in HEX or "
1a1" in ASCII, and these are the next three bytes of the key.
The engine structure is quite straight forward. There's only PNG's and OGG's at the beginning of the resource files, and at its very end, there is a pretty heavy XML file (most of the time around 30-40 MB).
The XML, as a plain text file format, is compressed by
DEFLATE (
zLib), to reduce his size.
However, the PNG and OGG files are not compressed by the engine, since both file formats are using compression by default, and double compression will increase the final VIS archive size.
All in all, the first file should be an uncompressed OGG or PNG file.
Another thing is the entries offsets. They are assumed to be located on base address 0x00000000 (header's data and structure sizes are ignored), so the offset of the first entry will also be 0x00000000.
Then, if the first file offset is 0, I can actually decode the 4th to 8th key bytes, instead 4th to 7th.
character.vc03900000000 71 7C 36 [31 61 31 66] 31 32 30 E9 38 63 65 B8 35 q|61a1f1208ce5
Because the four bytes that I need to xor with 0x31613166 are all zeros, the key bytes will equal to the original value 0x31613166, or "
1a1f" in ASCII.
The key so far looks like "
98d1a1f*********".
I already know that the first entry will be and uncompressed OGG or PNG file. I also know the starting offset is pointing at 0x00000000.
So if I refer to the structure entry definition :
character.vc039 > structure > entrytypedef struct entry {
DWORD entry_offset;
DWORD entry_size_packed;
DWORD entry_size_unpacked;
DWORD entry_type;
}
- entry_offset for the first entry is known to be always 0x00000000;
- entry_size_packed is an unknown number;
- entry_size_unpacked is unknown number too;
- entry_type is again, something I don't know.
Let's take a look at the file dump and how its spreads over the structure:
character.vc039 header offset p.size u.size type
[ H D R][ 0 0 0 0][ ? ? ? ?][ ? ? ? ?][ ?
00000000 71 7C 36 31 61 31 66 31 32 30 E9 38 63 65 B8 35
00000010 39 38 6C 31 61 31 EB 31 32 30 E9 38 63 65 B8 35
Like i mention before, if the first entry is not compressed, so its entry_size_packed and entry_size_unpacked values should be identical. However, that means, the second file offset will equal the first file entry_size_packed/entry_size_unpacked!
Let's get back to the dump once again, but this time with some highlighting:
character.vc039 header offset p.size u.size type
[ H D R][ 0 0 0 0][ ? ? ? ?][ ? ? ? ?][ ?
00000000 71 7C 36 31 61 31 66 [31 32 30 E9][38 63 65 B8] 35
00000010 39 38 6C [31 61 31 EB] 31 32 30 E9 38 63 65 B8 35
all these three dwords should be equal.
I've already guessed the key values for bytes 4th to 8th, so I can use it over the second resource entry offset like so - 0x316131EB ^ 0x31613166, to get the value 0x0000008D.
This value should be the the first resource entry entry_size_packed and unpacked_size and also the second resource entry offset. So now, to find the next four bytes from the key, I can XOR 0x313230E9 with 0x0000008D, to get 0x31323064, or "
120d".
And to get the next four bytes I should again use 0x0000008D for XOR, but this time with 0x386365B8, to get the result "
8ce5" (0x38636535).
See how easy i was able to obtain not four, but eight bytes from the key, by just using the data overlapping and some drunk logic?
The key now is "
98d1a1f120d8ce5*", and only one byte - the final one is unknown.
Playing some more with the file format, I've found that the entry_type value may be one of those:
entry types0x00000000 .OGG
0x00000008 .PNG
0x00000003 .XML
0x00000012 chunked .XML (.CXML)
Obviously, the entry_type is a small integer value from 0 to 0x12, so basically, the first three bytes from the entry_type DWORD should be all zeros.
This means, i can simply take the 16th byte from the data as the last byte from the key - in this case "
5" - (0x35).
Finally the key is fully reconstructed to "
98d1a1f120d8ce55". But is that the right key?
To find out first I can decode the structure using that key:
character.vc039 : decoded structureOffset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 48 44 52 00 00 00 00 00 00 00 8D 00 00 00 8D 00 HDR...........
00000010 00 00 08 00 00 00 8D 00 00 00 8D 00 00 00 8D 00 .............
00000020 00 00 08 00 00 01 1A 00 00 00 8D 00 00 00 8D 00 ..............
00000030 00 00 08 00 00 01 A7 00 00 00 8D 00 00 00 8D 00 .............
00000040 00 00 08 00 00 02 34 00 00 00 8D 00 00 00 8D 00 ......4.......
00000050 00 00 08 00 00 02 C1 00 00 00 8D 00 00 00 8D 00 .............
00000060 00 00 08 00 00 03 4E 00 00 00 8D 00 00 00 8D 00 ......N.......
00000070 00 00 08 00 00 03 DB 00 00 00 8D 00 00 00 8D 00 .............
00000080 00 00 08 00 00 04 68 00 00 00 8D 00 00 00 8D 00 ......h.......
00000090 00 00 08 00 00 04 F5 00 00 00 8D 00 00 00 8D 00 .............
000000A0 00 00 08 00 00 05 82 00 00 00 8D 00 00 00 8D 00 .............
000000B0 00 00 08 45 4E 44 ...END
and second, verify the data by these points:
- the entries offset values should not exceed the overall size of the VIS resource file;
- the entry offset value should equal the sum of the previous entry offset plus packed size;
- the entry type value should be a 0x00, 0x08, 0x03 or 0x12, unless new types are introduced in future.
Overall, analyzing more than one of the VIS/VC*/VS* files is a good way to verify the key validity.
Finally, to give some hints to the engine developers *HINT-HINT*, here's some scenarios when this method as-is, wont work.
- If you remove the
"HDR" and
"END" structure marks
- If you compress the OGG's and PNG's too, so both size_packed and size_unpacked are different;
- If you add a padding between the structure and the entries data, so the first entry offset is no longer pointed to 0x00000000;
- If you add an entry padding, lets say of 0x16 bytes, so a 10 bytes long entry takes 16 bytes, a 20 bytes entry takes 32, and so on;
- Using a longer KEY. Technically the keys now are 32 bytes long MD5 strings, but only the first 16 bytes of them are actually used.
I've already publish a simple tool that makes most of the above logic, so you can obtain it from the "Game tools" section - "VIS3 key hound".
Comments
* You have an opinion? Let us all hear it!Also, a random observation: when digging into Scene files in Deponia 2, I observed that struct_entry.entry_type value 0x00000008 corresponded to WebP (RIFF) content, rather than PNG content. The takeaway, I suppose, is that the entry_type values may correlate to different content types from one game to another. (Deponia 2 did, however, use 0x00000000 to represent OGG files, I observed.)
Thanks again!
just tinkering around with this myself, I stumbled over your very useful article. Thanks for that. My question is: you are saying, that the XML is deflated. This would make sense, but how do you know for sure? Because the header seems to be corrupted, just like with PNGs. At least I have nonsense in the first two bytes if the XML chunk, which makes identifying and inflating impossible.
So maybe you solved that too already? Any hints would be highly appreciated!