The PAD Texture Extractor is a Python script that extracts texture images from the binary data of the popular mobile game "Puzzle & Dragons". It was created as an exercise in reverse-engineering binary data formats. I used the freely available hex editor HxD to help me.
HOW DOES IT WORK?
The PAD Texture Extractor extracts textures from the Puzzle & Dragons for Android application file (an .apk file) – either the English or Japanese version of the application can be used. Treating the .apk file as a .zip file, the PAD Texture Extractor reads the contents of the assets/DATA001.BIN file, where the texture data for the game is stored as a binary blob.
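As a rough illustration of this first step (a sketch, not necessarily how the extractor itself is written), Python's standard zipfile module can open the .apk directly. The function name read_texture_blob is made up for this example; only the assets/DATA001.BIN path comes from the description above.

```python
import zipfile

# Path of the texture blob inside the .apk, as described above.
ASSET_PATH = "assets/DATA001.BIN"


def read_texture_blob(apk_source):
    """Return the raw bytes of the texture blob stored inside the .apk.

    `apk_source` may be a file path or a binary file-like object.
    (Hypothetical helper name, used only for this sketch.)
    """
    # An .apk is an ordinary zip archive, so zipfile can read it as-is.
    with zipfile.ZipFile(apk_source) as apk:
        return apk.read(ASSET_PATH)
```

Calling read_texture_blob with the path to the English or Japanese .apk should return the whole blob as a bytes object, which can then be searched for texture blocks.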
While I won't go into all the details, Puzzle & Dragons stores textures in what I call "blocks", with each block containing one or more textures. Each texture block starts with a header/manifest which looks something like this:
The texture block header is only 16 bytes long, and it always starts with the three bytes 0x54, 0x45 and 0x58 – the ASCII letters "TEX". This is followed by one byte whose purpose is unknown (typically it is 0x31 or 0x32) and then a single byte which stores the number of textures contained within this block (in the example above, its value is 0x1). The remaining 11 bytes are unused.
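The header layout just described maps onto a single struct format string. The following is a minimal sketch under that description; the names BlockHeader and parse_block_header are invented for this example and are not taken from the extractor.

```python
import struct
from typing import NamedTuple

BLOCK_HEADER_SIZE = 16
MAGIC = b"TEX"  # 0x54, 0x45, 0x58


class BlockHeader(NamedTuple):
    unknown: int        # purpose not known; typically 0x31 or 0x32
    texture_count: int  # number of textures in this block


def parse_block_header(data, offset=0):
    """Parse the 16-byte texture block header found at `offset`."""
    header = data[offset:offset + BLOCK_HEADER_SIZE]
    if len(header) < BLOCK_HEADER_SIZE:
        raise ValueError("truncated texture block header")
    # 3s = magic, B = unknown byte, B = texture count, 11x = skip unused bytes.
    magic, unknown, count = struct.unpack("<3sBB11x", header)
    if magic != MAGIC:
        raise ValueError("not a texture block: expected b'TEX'")
    return BlockHeader(unknown, count)
```

Checking the magic bytes before trusting the count is a cheap guard against misreading unrelated data, since the three letters "TEX" could plausibly occur elsewhere in the blob.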
For each texture stored in a given block, a 32-byte manifest follows. The first four bytes represent the offset from the start of the texture block header at which the texture data begins. The next two bytes contain the width and encoding of the texture (the encoding is the high four bits, with the width taking up the remaining 12 bits) and the next two bytes contain the height of the texture. The final 24 bytes contain the name of the image, unless the encoding of the texture is equal to 0xD. In that case only 20 bytes are dedicated to the name, and the last four bytes hold the total size of the texture data (in bytes).
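A manifest entry could be unpacked along these lines. This is a sketch with two assumptions that the description above does not settle: multi-byte fields are taken to be little-endian (pass byte_order=">" to try the alternative), and names are taken to be NUL-padded ASCII. TextureEntry and parse_manifest_entry are names invented for the example.

```python
import struct
from typing import NamedTuple, Optional

MANIFEST_SIZE = 32
JPEG_ENCODING = 0xD


class TextureEntry(NamedTuple):
    data_offset: int          # relative to the start of the block header
    encoding: int             # high 4 bits of the packed width field
    width: int                # low 12 bits of the packed width field
    height: int
    name: str
    data_size: Optional[int]  # only present when encoding == 0xD


def parse_manifest_entry(entry, byte_order="<"):
    """Parse one 32-byte texture manifest entry.

    ASSUMPTION: little-endian fields by default; not confirmed by the text.
    """
    if len(entry) != MANIFEST_SIZE:
        raise ValueError("manifest entry must be exactly 32 bytes")
    data_offset, packed, height = struct.unpack(byte_order + "IHH", entry[:8])
    encoding = packed >> 12
    width = packed & 0x0FFF
    if encoding == JPEG_ENCODING:
        # JPEG entries give up the last 4 name bytes to store the data size.
        name_bytes = entry[8:28]
        (data_size,) = struct.unpack(byte_order + "I", entry[28:32])
    else:
        name_bytes = entry[8:32]
        data_size = None
    # ASSUMPTION: the name is ASCII, padded with NUL bytes.
    name = name_bytes.split(b"\x00", 1)[0].decode("ascii", errors="replace")
    return TextureEntry(data_offset, encoding, width, height, name, data_size)
```

Because the width only gets 12 bits, the largest width this layout can describe is 4095 pixels.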
Puzzle & Dragons stores its texture data using seven distinct encodings:
- Encoding 0x0 stores four bytes per pixel, one byte each for the red, green, blue, and alpha channels.
- Encoding 0x2 stores two bytes per pixel. The high 5 bits represent the red channel, the next 6 bits represent the green channel, and the final 5 bits represent the blue channel.
- Encoding 0x3 stores two bytes per pixel, four bits each for the red, green, blue, and alpha channels.
- Encoding 0x4 stores two bytes per pixel, five bits each for the red, green, and blue channels, and then one final bit to represent alpha.
- Encodings 0x8 and 0x9 store one byte per pixel; they're grayscale images. I cannot find any difference between encodings 0x8 and 0x9, but both are used.
- Encoding 0xD is a raw JPEG image.
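To make the three two-byte encodings concrete, here is a sketch that expands packed pixels into 8-bit (R, G, B, A) tuples. Several details are assumptions rather than facts from the list above: the 16-bit values are read as little-endian, the channels of encoding 0x3 are taken in the order listed from high bits to low bits, and each channel is scaled linearly to 0..255. The function names are invented for this example.

```python
import struct

# Bytes per pixel for the uncompressed encodings (0xD is JPEG, so it is omitted).
BYTES_PER_PIXEL = {0x0: 4, 0x2: 2, 0x3: 2, 0x4: 2, 0x8: 1, 0x9: 1}


def _scale(value, bits):
    """Expand a `bits`-wide channel value to the 0..255 range."""
    return (value * 255) // ((1 << bits) - 1)


def decode_pixel(encoding, value):
    """Decode one packed 16-bit pixel into an (r, g, b, a) tuple."""
    if encoding == 0x2:  # 5 bits red, 6 bits green, 5 bits blue, no alpha
        return (_scale(value >> 11, 5), _scale((value >> 5) & 0x3F, 6),
                _scale(value & 0x1F, 5), 255)
    if encoding == 0x3:  # 4 bits per channel (ASSUMED order: R, G, B, A)
        return (_scale(value >> 12, 4), _scale((value >> 8) & 0xF, 4),
                _scale((value >> 4) & 0xF, 4), _scale(value & 0xF, 4))
    if encoding == 0x4:  # 5 bits each for R, G, B, then 1 alpha bit
        return (_scale(value >> 11, 5), _scale((value >> 6) & 0x1F, 5),
                _scale((value >> 1) & 0x1F, 5), 255 * (value & 1))
    raise ValueError("not a two-byte encoding: %#x" % encoding)


def decode_16bit_texture(encoding, data, width, height, byte_order="<"):
    """Decode width*height packed pixels into a flat list of RGBA tuples.

    ASSUMPTION: little-endian 16-bit values unless byte_order says otherwise.
    """
    count = width * height
    values = struct.unpack("%s%dH" % (byte_order, count), data[:count * 2])
    return [decode_pixel(encoding, v) for v in values]
```

If decoded images come out with swapped or garbled colours, the byte order and channel order assumptions are the first things to revisit. The BYTES_PER_PIXEL table also gives the expected data size of an uncompressed texture as width * height * bytes per pixel.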
All of this information was obtained through experimentation and a healthy dose of deductive reasoning. Until I wrote the PAD Texture Extractor I had never dabbled in extracting textures from games; now I know it's actually a lot of fun!
ACKNOWLEDGEMENTS
Special thanks to Johann C. Rocholl, who wrote the open-source PyPNG library that the PAD Texture Extractor uses to output PNG files.