It was late one night and I couldn’t sleep, which is nothing abnormal for me, and I got thinking: is there any way I can improve the compression ratio of my collection of PS1 games, or is 7zip with LZMA the de facto best option? The truth, it turns out, is a little more complicated.
Updated 2022-08-13: Added information about chd files, as a more user friendly “better than 7z compression” method.
Updated 2021-07-15: Article finished, downloads and final conclusions added!
An update from the future
This article was written to find the highest compression (lowest space usage) for PS1 games, at the cost of extra effort when extracting them. If you’re looking to save space but keep your games directly playable, you can compress them to CHD files using chdman from the MAME emulator.
A couple of emulators can now use these files without extracting them, but the compression is somewhat worse: the whole set I compressed into CHD uses 388 GB, compared to the 319 GB I was able to achieve with PSXMinimise. This article was still a very interesting journey, though; its method remains the most efficient in terms of space saved, and it shows my entire process 🙂
Finding The ‘Best’ Compression
I’ve tried numerous compression formats in the search for the best one for PlayStation games. I needed a test game to find the best archive solution for the game data. I would tackle the audio content of the discs later. I grabbed my backup copy of Oddworld: Abe’s Oddysee (one of my favourite games) and tried all the regular formats that PeaZip could handle in their “Best” modes.
|Compression Format||Options||Compression Time||Resulting Filesize|
|7Zip||Ultra||2 minutes||403,753 KB|
|ARC (FreeArc)||9||4 minutes||289,820 KB|
|Brotli||9||35 minutes||417,106 KB|
|BZip2||Ultra||16 minutes||512,282 KB|
|GZip||Ultra||7 minutes||514,581 KB|
|XZ||Ultra||3 minutes||403,755 KB|
|ZIP||Ultra||6 minutes||514,582 KB|
|Zstd||19||3 minutes||447,022 KB|
|ZPAQ||Ultra||34 minutes||409,185 KB|
Until I saw the FreeArc result, it seemed that 7zip Ultra compression was the best for PS1 data. At first I thought there had been some kind of error and the archive couldn’t really be that small. After some checking, however, I confirmed that the result was accurate: FreeArc gives an excellent compression ratio on PlayStation 1 game data.
A few years ago, I used to visit the EmuParadise website, and I recall that they compressed their bin files with a tool called ECM. Armed with this vague piece of information, I started searching the web for more. I managed to find the ECM Tools package (on archive.org), written in 2002 by Neill Corlett, which is used to encode and decode files. ECM stands for “Error Code Modeler”: the tools strip the error detection and correction codes (EDC/ECC) stored in the sectors of a CD image wherever they can be recalculated from the rest of the sector, and regenerate them on decoding. Removing this redundant information shrinks the image file, and because it can be rebuilt exactly, the decoded file is an identical binary.
|Filename||Original Size||Size (after ECM Encoding)|
|Destruction Derby 2 (Europe) (Track 01).bin||71,603 KB||64,141 KB|
|Dino Crisis (Europe) (Track 1).bin||378,702 KB||340,398 KB|
|Driver (Europe).bin||729,961 KB||651,308 KB|
|Grand Theft Auto (Europe) (Track 01).bin||91,673 KB||81,870 KB|
|MediEvil (Europe) (Track 1).bin||516,356 KB||478,157 KB|
|Oddworld – Abe’s Oddysee (Europe).bin||685,461 KB||612,203 KB|
|Tomb Raider II (Europe) (Track 01).bin||224,530 KB||218,150 KB|
So processing the ECM data is well worth it: every game I’ve tested so far gets some saving, and some get a substantial one.
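To put numbers on that, here is a small Python sketch (my own illustration, not part of the ECM Tools) that works out the percentage saved for each track in the table above. The sizes are copied straight from the table; the helper name `percent_saved` is just something I made up for the example. On these figures the saving ranges from just under 3% (Tomb Raider II) to nearly 11% (Driver).

```python
# ECM results copied from the table above: (track, original KB, ECM-encoded KB).
ECM_RESULTS = [
    ("Destruction Derby 2 (Track 01)", 71_603, 64_141),
    ("Dino Crisis (Track 1)", 378_702, 340_398),
    ("Driver", 729_961, 651_308),
    ("Grand Theft Auto (Track 01)", 91_673, 81_870),
    ("MediEvil (Track 1)", 516_356, 478_157),
    ("Oddworld - Abe's Oddysee", 685_461, 612_203),
    ("Tomb Raider II (Track 01)", 224_530, 218_150),
]


def percent_saved(original_kb: int, encoded_kb: int) -> float:
    """Return the share of the original size removed by encoding, in percent."""
    return 100.0 * (original_kb - encoded_kb) / original_kb


if __name__ == "__main__":
    for name, original_kb, encoded_kb in ECM_RESULTS:
        saved = percent_saved(original_kb, encoded_kb)
        print(f"{name:<32} {saved:5.2f} % smaller")
```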
To compress the audio portions of the PS1 games, I decided to use FLAC, the Free Lossless Audio Codec. I chose it because it is lossless: the files you get back when decoding are exactly the same as the original input. In a typical PS1 game, Track 01.bin is usually the game data, and any other tracks (Track 02.bin, etc.) are CDDA audio, which FLAC can compress if you tell it how to interpret these raw files.
flac -8 -V --force-raw-format --endian=little --channels=2 --bps=16 --sample-rate=44100 --sign=signed *.bin
To explain these options: -8 sets the maximum level of FLAC compression, and -V makes FLAC verify the encoding by decoding its output and comparing it with the original input. The --force-raw-format argument instructs the command line version of FLAC to treat the input files as raw data. --endian=little tells FLAC the byte order of the samples, --channels=2 indicates that the audio stream is stereo, --bps=16 says that the samples are 16-bit, --sample-rate=44100 is the standard audio sampling rate for a compact disc, and finally --sign=signed tells FLAC that the samples are signed. I am indebted to Cacovsky for the vital information on how raw CD audio can be converted to FLAC.
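If you would rather drive FLAC from a script than type the command by hand, something like the following Python sketch could assemble the same argument list. The flags are exactly the ones from the command above; the names `RAW_CDDA_FLAGS`, `build_flac_command` and `cdda_seconds` are hypothetical helpers for this example, and the sketch only builds the command rather than running it.

```python
import shlex

# Flags copied from the flac command above; they describe raw CD audio
# (16-bit signed little-endian stereo at 44.1 kHz).
RAW_CDDA_FLAGS = [
    "-8",
    "-V",
    "--force-raw-format",
    "--endian=little",
    "--channels=2",
    "--bps=16",
    "--sample-rate=44100",
    "--sign=signed",
]


def build_flac_command(track_paths):
    """Return the argument list for encoding the given raw audio tracks."""
    return ["flac", *RAW_CDDA_FLAGS, *track_paths]


def cdda_seconds(size_bytes):
    """Playing time of a raw CDDA track: 44,100 samples/s * 2 channels * 2 bytes."""
    return size_bytes / (44_100 * 2 * 2)


if __name__ == "__main__":
    command = build_flac_command(["Game (Track 02).bin"])
    print(shlex.join(command))
```

Passing the returned list to `subprocess.run(command, check=True)` would actually run the encoder, assuming `flac` is on your path. The `cdda_seconds` helper is only a sanity check: a raw audio track that does not work out to a plausible playing time is probably not CDDA.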
Putting It All Together
I had a few design goals for anything I created. It needed to rely mostly on open tools, so that the techniques are operating system agnostic and can fairly easily be ported to other systems, and it should use nothing with encumbering license terms.
Compression speed isn’t much of a factor, as these files will really only be compressed once; decompression speed, however, shouldn’t be prohibitively slow. The methods needed to be lossless at the binary level, so that the games are exactly the same after decompression as they were before compression, and I also wanted a way to verify that the files were indeed identical to the originals.
From the earlier compression results, FreeArc was the obvious choice: it gave a great improvement in compression ratio over the other archive formats, and it didn’t take significantly longer to compress the data.
The overall method I came up with is to remove the ECM data from the first track, compress the audio data in the other tracks with FLAC to reduce its filesize, and then package the entire game up in a .arc file.
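The decision of which tool handles which track can be sketched in a few lines of Python. This is only an illustration of the idea, not the logic of my batch scripts: the function names are made up, and it assumes the lowest-numbered track is the data track and every other track is CD audio, which is usually but not always the case.

```python
import re
from pathlib import Path

# Matches the "(Track 01)" / "(Track 1)" part of a typical track filename.
TRACK_RE = re.compile(r"\(Track (\d+)\)", re.IGNORECASE)


def track_number(path):
    """Return the track number in the filename, or 1 for single-file games."""
    match = TRACK_RE.search(Path(path).name)
    return int(match.group(1)) if match else 1


def plan_compression(bin_files):
    """Split a game's .bin files into the ECM track and the FLAC tracks."""
    ordered = sorted(bin_files, key=track_number)
    if not ordered:
        raise ValueError("no .bin files given")
    # Assumption: the first track is game data, the rest are CDDA audio.
    return {"ecm": ordered[0], "flac": ordered[1:]}


if __name__ == "__main__":
    print(plan_compression([
        "Game (Track 10).bin",
        "Game (Track 2).bin",
        "Game (Track 01).bin",
    ]))
```

Sorting by the parsed number rather than by name matters because some dumps use “Track 1” and others “Track 01”, so a plain alphabetical sort can put track 10 ahead of track 2.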
Before doing this, however, I needed a way to make sure that the decompressed files would be the same as the originals. On Linux there is a utility called sha256sum, which calculates a hash for each file; the hashes can be saved and compared against the files later to ensure that nothing has changed. Windows has a built-in hash checker, however it doesn’t seem to be able to read a list of checksums from a single file and verify that the data is intact. I am again indebted, this time to Dave Benham, who maintains HASHSUM.BAT at the DOSTips Forum: a script that is compatible with sha256sum but works on Windows and is written (mostly) in batch script.
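For anyone who wants the same check without sha256sum or HASHSUM.BAT, the idea can be sketched with Python’s standard hashlib module. This is not what my scripts use; the function names and the `checksums.sha256` filename are assumptions for the example. Each manifest line is written as the hex digest, two spaces, then the filename, which is the layout sha256sum uses for its checksum lists.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large disc images never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(folder, manifest_name="checksums.sha256"):
    """Record a checksum line for every .bin file in the folder."""
    folder = Path(folder)
    lines = [f"{sha256_of(p)}  {p.name}\n" for p in sorted(folder.glob("*.bin"))]
    manifest = folder / manifest_name
    # Written as bytes so the line endings stay "\n" on every platform.
    manifest.write_bytes("".join(lines).encode("utf-8"))
    return manifest


def verify_manifest(manifest):
    """Return the names of files that are missing or no longer match."""
    manifest = Path(manifest)
    failed = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # 64 hex characters, a two-character separator, then the filename.
        expected, name = line[:64], line[66:]
        target = manifest.parent / name
        if not target.is_file() or sha256_of(target) != expected.lower():
            failed.append(name)
    return failed
```

You would call `write_manifest` on the game folder before compressing and `verify_manifest` after decompressing; an empty list means every file came back identical.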
So after a few sleepless nights, I’ve put together some utilities for automatically compressing and decompressing the games.
My compression script is a bit uglier than the unpacker, since I imagine people will unpack the games more often than they will pack them.
|Game||Original Filesize||7zip Filesize||7zip Percentage||PSXMinimise Filesize||PSXMinimise Percentage|
|Destruction Derby 2||620 MB||533 MB||85 %||393 MB||63 %|
|Dino Crisis||405 MB||202 MB||49 %||170 MB||41 %|
|Driver||712 MB||357 MB||50 %||298 MB||41 %|
|Grand Theft Auto||714 MB||612 MB||85 %||452 MB||63 %|
|MediEvil||535 MB||322 MB||60 %||283 MB||52 %|
|Oddworld: Abe’s Oddysee||669 MB||387 MB||57 %||241 MB||36 %|
|Tomb Raider II||708 MB||460 MB||64 %||317 MB||44 %|
|WipEout 2097||681 MB||566 MB||83 %||413 MB||60 %|
|Average||630.5 MB||429 MB||67.22 %||320 MB||50.60 %|
|Average Saving||–||200 MB||32.77 %||309 MB||49.40 %|
After a little while of having “writer’s block”, I have finally released a download of my scripts for working with PS1 files. These are batch scripts written for Windows, bundled with the executables they require; however, the techniques employed here can be used on Linux based operating systems too.
You can compile ECM Tools with GCC. One problem you might have is that FreeArc only seems to like running on 32-bit builds of Linux, and it needs a library that doesn’t seem to exist on newer operating systems; however, you can link to a newer version of it and it seems to run okay. If anyone has any good ideas for working around these problems, I could then package up a build that works on Linux with a shell script.
This package includes everything you need to compress and decompress PS1 games on Windows, including ECM Tools, FLAC, FreeArc and HASHSUM.BAT.
SCED-00816 – Demo One (Version 5) (Europe).arc 227M
This is the original demo CD that I got with my PlayStation back in 1997, and I include it as an example of the power of PSXMinimise. I’ve included a demo as there aren’t any commercial games that I could provide which are free of copyright, and I haven’t found a PS1 homebrew that is big enough to demonstrate PSXMinimise.
Using the files is relatively easy. If you just want to uncompress one file, place the downloaded .arc file into a separate folder, extract the contents of the PSXMinimise tool into that folder, then run the psxminimise-decompress.bat file.
If you want to compress a file, place all the individual .bin files into a folder named after the game (which will become the name of the archive) with the track numbers after the game name (if it contains multiple tracks). Then run the psxminimise-compress.bat file.
If you want to work with multiple files, or use the decompress-all or compress-all tool, you need to add the folder to your path variable. Included is a script to add the current directory to your User path. To enable verbose output, simply place a number 1 after the command, which makes the scripts output all their information as they run. The scripts will not change any colours if run in this mode.
Using psxminimise-compress-all.bat will go into each subfolder and run psxminimise-compress. Running psxminimise-decompress-all.bat will extract all archives to individual folders.
As an example, the SCED-00816 “Demo One” PlayStation disc (from above) is 498,163 K uncompressed. Compressing it with 7z results in a size of 304,084 K, and with my PSXMinimise method it’s further reduced to 232,940 K.
You could get smaller game files if you removed some of the game’s content (a process known as “Ripping”), using something like PocketISO to remove audio tracks and FMV sequences from games; however, I wanted the games to be intact to their original release, and I think this has done quite a good job.
In total, running PSXMinimise on a dataset of 1,492 games improved the compression of the entire folder from 393 GB in 7z format to 319 GB, a saving of over 70 GB on files already compressed with the “de facto best” archive format, and an overall compression ratio of around 50%. Very nice.
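For the curious, the arithmetic behind those totals is easy to reproduce. The Python sketch below uses only the whole-set figures quoted in this article (393 GB for 7z, 388 GB for CHD, 319 GB for PSXMinimise); the helper name `saving_versus` is made up for the example. It puts PSXMinimise 74 GB, or just under 19%, below the 7z set.

```python
# Whole-collection sizes in GB, as quoted in this article.
DATASET_GB = {"7z": 393, "chd": 388, "PSXMinimise": 319}


def saving_versus(baseline_gb, result_gb):
    """Return (GB saved, percent saved) relative to the baseline size."""
    saved = baseline_gb - result_gb
    return saved, 100.0 * saved / baseline_gb


if __name__ == "__main__":
    for fmt in ("7z", "chd"):
        saved, percent = saving_versus(DATASET_GB[fmt], DATASET_GB["PSXMinimise"])
        print(f"PSXMinimise vs {fmt}: {saved} GB saved ({percent:.1f} % smaller)")
```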