Creating a PAK File Format
by Raymond Wilson

Introduction

In this article I intend to demonstrate one way of organising all your game-related media files into a single "PAK" file. Two good reasons to do this are easier distribution of your application data and a degree of protection for your individual files against misuse. This article is aimed at beginner to intermediate level game programmers, but you should be au fait with C++ and concepts such as linked lists, which are used in the code.

The Format

The format of this PAK file will be quite simple. It will consist of an unencrypted file header, a file-table containing information on each file added to the PAK, and then the actual files concatenated together at the end. The file-table and the included files' data will be encrypted. For the purposes of this article I will keep it simple and stick to using a Caesar cipher.

[Figure: layout of the PAK file, from start to end: PAK File Header, File Table, Concatenated File Data]

The PAK File Class

I think now would be a good time to introduce the PAK class I am going to use. It is shown below:

class CPakFile
{
private:
  // Private Variables
  char             m_szFolderPath[300];
  char             m_szPakName[300];				
  sPakHeader       m_Header;
  sFileTableEntry* m_FileTable;				

  // Private Functions
  BOOL  GenerateHFT();
  BOOL  WorkOutOffsets();	

public:
  CPakFile();
  ~CPakFile();		

  BOOL  CreatePak( char* Path, char* Output);
};

The first two variables, m_szFolderPath and m_szPakName, hold the absolute path of a compilation folder and the location and filename of the PAK to be output. The compilation folder is simply a folder into which every file you want to add to the PAK has been collected. These two values are read from edit boxes in the accompanying compile tool and passed to the CreatePak() method, which the rest of this article refers to simply as Create(). There is also a file header variable of type sPakHeader and the linked list file-table of type sFileTableEntry, two structures I will look at below. On instantiation of this class, the constructor sets the variables to default values and then adds a blank dummy node to the head of the linked list file-table. To save space, I will just let you look at this in the accompanying source code. The two private functions, respectively, generate the header and the file table, and work out the individual file offsets inside the PAK. They are both called by the Create() method, which is currently the only public method.

The File Header

The file header for this PAK can be relatively simple. I am just going to use the structure outlined below:

struct sPakHeader									
{
  char   szSignature[6];
  float  fVersion;
  DWORD  dwNumFTEntries;
  BOOL   bCipherAddition;
  BYTE   iCipherValue;
  char   szUniqueID[10];
  DWORD  dwReserved;
};

I feel that most of its contents are self-explanatory. The bCipherAddition variable just indicates whether the cipher value is added to or subtracted from each BYTE sized element that is encrypted. For those of you who don't know, a Caesar cipher is where you shift each value that is to be encrypted "left or right" by a certain, consistent amount. As a quick example, the letter 'A' encrypted using a +3 Caesar cipher would become the letter 'D'. The dwNumFTEntries variable is the number of file table entries (the number of files) in the PAK.
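As a rough illustration of the shift described above, the sketch below encrypts and decrypts a single byte. The function names CaesarEncrypt and CaesarDecrypt are hypothetical and not taken from the accompanying source, and std::uint8_t is used in place of the Windows BYTE type so that the snippet compiles anywhere.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: shift one byte by `value`.
// The result wraps around modulo 256 when it is narrowed back to a byte.
inline std::uint8_t CaesarEncrypt(std::uint8_t byte, std::uint8_t value, bool addition)
{
    return static_cast<std::uint8_t>(addition ? byte + value : byte - value);
}

// Hypothetical helper: decryption is the same shift in the opposite direction.
inline std::uint8_t CaesarDecrypt(std::uint8_t byte, std::uint8_t value, bool addition)
{
    return CaesarEncrypt(byte, value, !addition);
}
```

With these helpers, CaesarEncrypt('A', 3, true) gives 'D', matching the example in the text, and CaesarDecrypt with the same arguments undoes it.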

The File Table

The file table will be a linked list of the following data structure, sFileTableEntry, which is outlined below. Each entry will be descriptive of any one file in the PAK.

struct sFileTableEntry								
{
  char   szFileName[30];
  DWORD  dwFileSize;
  DWORD  dwOffset;
  sFileTableEntry* Next;
	
  // Constructor
  sFileTableEntry()
  {
    ZeroMemory( szFileName, sizeof(szFileName) );
    dwFileSize	= 0;
    dwOffset	= 0;
    Next		= NULL;
  }

  // Destructor (deleting Next frees the rest of the list recursively)
  ~sFileTableEntry()
  {
    ZeroMemory( szFileName, sizeof(szFileName) );
    dwFileSize	= 0;
    dwOffset	= 0;
    delete Next;
  }
};

The offset value will be the position of the first byte of the particular file within the PAK archive. The file name, size and link to the next entry are also included.

The Create Method - Part 1

The functions (including the private ones) are quite large, so I'll leave them in the source for you to look at and just describe them here. I feel that they are well commented, though, and that you should have no problems keeping up with them.

The Create() method really starts when it calls the function GenerateHFT(). GenerateHFT() starts by filling in the header structure. It adds the signature, version number and so on, along with randomly generated values for the cipher value, unique ID and cipher direction (add or subtract). It then looks at the specified compilation directory (a parameter of Create()) and parses it file by file. For each file that is found in the directory, it creates a new sFileTableEntry node, fills in the filename and file size variables (with a default offset value) and adds it on to the linked list file-table. With each file found a counter is incremented. When this process is finished, the dwNumFTEntries variable of the header structure is set to the value of this counter.
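The directory pass described above can be sketched roughly as follows. This is not the accompanying source, which targets Windows. It is a minimal portable version that assumes a C++17 compiler with std::filesystem, and the names ScannedFile and ScanFolder are made up for illustration. A std::vector stands in for the linked list, so the number of files is simply the size of the returned vector.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <string>
#include <vector>

// Hypothetical stand-in for one file-table entry (no Next pointer needed).
struct ScannedFile
{
    std::string   name;
    std::uint32_t size = 0;
    std::uint32_t offset = 0;  // default value until the offsets are worked out
};

// Collect every regular file directly inside `folder` (no recursion).
// Sub-folders and other non-file items are skipped.
inline std::vector<ScannedFile> ScanFolder(const std::filesystem::path& folder)
{
    std::vector<ScannedFile> files;
    for (const auto& item : std::filesystem::directory_iterator(folder))
    {
        if (!item.is_regular_file())
            continue;

        ScannedFile f;
        f.name = item.path().filename().string();
        // Assumption: files are small enough for a 32-bit size field.
        f.size = static_cast<std::uint32_t>(item.file_size());
        files.push_back(f);
    }
    return files;
}
```

Note that the order in which a directory is enumerated is unspecified, so a real tool might sort the result by name to get reproducible PAK files.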

The next stage of Create() is the calling of the WorkOutOffsets function. The very first file offset is calculated like this:

dwOffset = sizeof(sPakHeader) + (m_Header.dwNumFTEntries *
                                 sizeof(sFileTableEntry));

The first file's entry in the file-table takes this value as its dwOffset member variable. The size of that file (already calculated for each entry by GenerateHFT()) is then added to the running offset. It is then a case of iterating through the remaining file table entries, giving each one the current running offset and then adding on that entry's file size. It is much easier to see in the code than it is to describe here!
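The running-offset idea can be sketched as below. The Entry structure is a simplified, hypothetical stand-in for sFileTableEntry that uses fixed-width integers instead of DWORD, and the header and entry sizes are passed in as parameters rather than taken from sizeof, purely so the example is easy to follow with round numbers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for sFileTableEntry.
struct Entry
{
    char          szFileName[30] = {};
    std::uint32_t dwFileSize = 0;
    std::uint32_t dwOffset = 0;
    Entry*        Next = nullptr;
};

// Walk the list from the first real (non-dummy) entry. Each file receives
// the running offset, which is then advanced by that file's size.
// headerSize and entrySize play the roles of sizeof(sPakHeader) and
// sizeof(sFileTableEntry) in the formula from the article.
inline void WorkOutOffsets(Entry* first, std::uint32_t numEntries,
                           std::uint32_t headerSize, std::uint32_t entrySize)
{
    std::uint32_t offset = headerSize + numEntries * entrySize;
    for (Entry* e = first; e != nullptr; e = e->Next)
    {
        e->dwOffset = offset;
        offset += e->dwFileSize;
    }
}
```

For example, with a 32-byte header, three 40-byte entries and files of 10, 20 and 5 bytes, the three offsets come out as 152, 162 and 182.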

The Create Method - Part 2

At this stage the header and file table for the PAK are completely filled in. Now we open a file stream, using the second supplied parameter for Create(), and write an unencrypted header.

To write the encrypted file table we need to iterate through the linked list file-table one entry at a time (using a local copy of a file table entry called Current). Once we have checked that we are not on the dummy entry, we create a BYTE array the same size as sFileTableEntry like this:

BYTE* Ptr = NULL;
Ptr = new BYTE [sizeof(sFileTableEntry)];

We then copy the current file table entry into this BYTE array as follows:

memcpy( Ptr, Current, sizeof(sFileTableEntry) );

We then iterate through each BYTE in this array, encrypt it and write it out to the PAK file. The code for this is:

for( size_t i = 0; i < sizeof(sFileTableEntry); i++ )
{
  // Temporary BYTE variable
  BYTE Temp = 0;

  // Make equal to the relevant byte of the FT entry
  Temp = Ptr[i];
  
  // Encrypt BYTE according to the Caesar cipher
  if( m_Header.bCipherAddition == TRUE )
    Temp += m_Header.iCipherValue;
  else
    Temp -= m_Header.iCipherValue;
  
  // Write the FT encrypted BYTE value
  fwrite( &Temp, sizeof(BYTE), 1, PAKStream );
}
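One thing to be aware of with the approach above is that copying the whole structure also writes the Next pointer to disk, where its value is meaningless, and the size of a pointer differs between 32-bit and 64-bit builds. The following is a sketch of one possible variation, not the method used in the accompanying source: it packs only the name, size and offset into a fixed 38-byte layout before applying the cipher. The names PackEntry, AppendU32, kNameLen and kDiskEntrySize, and the little-endian byte order, are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical on-disk layout: 30 name bytes, then size and offset as
// 4-byte little-endian integers. The Next pointer is deliberately left out.
constexpr std::size_t kNameLen = 30;
constexpr std::size_t kDiskEntrySize = kNameLen + 4 + 4;

// Append a 32-bit value, least significant byte first.
inline void AppendU32(std::vector<std::uint8_t>& out, std::uint32_t v)
{
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>(v >> shift));
}

// Build the encrypted bytes for one file-table entry.
inline std::vector<std::uint8_t> PackEntry(const char* name, std::uint32_t size,
                                           std::uint32_t offset,
                                           std::uint8_t cipherValue, bool addition)
{
    // Name field: zero-filled, at most 29 characters plus a terminator.
    std::vector<std::uint8_t> bytes(kNameLen, 0);
    std::strncpy(reinterpret_cast<char*>(bytes.data()), name, kNameLen - 1);

    AppendU32(bytes, size);
    AppendU32(bytes, offset);

    // Apply the Caesar shift to every byte, padding included.
    for (std::uint8_t& b : bytes)
        b = static_cast<std::uint8_t>(addition ? b + cipherValue : b - cipherValue);
    return bytes;
}
```

If a layout like this were used, the first file offset would have to be computed with the packed entry size (kDiskEntrySize here) instead of sizeof(sFileTableEntry).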

Once this is done the file stream is closed and the Current variable is set to the head of the linked list file-table again. What we do now is open two file streams: one for writing to the PAK file and one for reading in each file to be added. These will be used in conjunction with each other.

We set the position in the PAK file (for writing) according to the dwOffset value stored in the current file entry's member variable. We then read in a BYTE at a time from the input stream (which was opened using the szFileName variable of the current file entry), encrypt it using the Caesar cipher as before and output it into the PAK file using the write stream. This is, again, demonstrated in the attached source code.
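The byte-by-byte copy described above might look something like the sketch below. It uses C++ iostreams instead of the C file streams in the article so that it can be tried with in-memory streams, and the function name AppendEncrypted is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Copy everything from `in` into `pak`, starting at byte `offset` of the
// PAK stream and shifting each byte by the cipher value on the way.
// Returns the number of bytes written.
inline std::uint32_t AppendEncrypted(std::istream& in, std::ostream& pak,
                                     std::uint32_t offset,
                                     std::uint8_t cipherValue, bool addition)
{
    pak.seekp(static_cast<std::streamoff>(offset), std::ios::beg);

    std::uint32_t written = 0;
    char c;
    while (in.get(c))
    {
        auto b = static_cast<std::uint8_t>(c);
        b = static_cast<std::uint8_t>(addition ? b + cipherValue : b - cipherValue);
        pak.put(static_cast<char>(b));
        ++written;
    }
    return written;
}
```

With real files, both streams should be opened in binary mode, and the PAK should be opened as a std::fstream with std::ios::in | std::ios::out | std::ios::binary so that the header and file table already written are not truncated.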

Conclusion

That's it for just now. A public method could easily be added that loads the header from an existing PAK file and, using that data, decrypts and loads the contained file-table. With that information you could easily, say, decrypt and extract files to a specified directory or load them dynamically. The way I envision using it would be to dynamically create BYTE arrays within my program the same size as the file I want to utilise and copy the decrypted data from the PAK into this array. I could then use "load from memory" functions (like D3DXLoadMeshFromXInMemory() or D3DXCreateTextureFromFileInMemory()) to work with the data. Lua scripts would be perfect here to tell me which files I need to load and, as an example, you could load all the files you need using this method "between levels". I could always write this load function later on if anybody desires it. I hope you find this useful and that it can be a source of inspiration for your own projects.
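To give a flavour of the load-into-memory idea, here is a minimal sketch that reads one file's bytes out of an open PAK stream and reverses the cipher. It assumes the offset, size and cipher settings have already been recovered from the header and file-table. The function name LoadFromPak is hypothetical, and a std::vector is used in place of a raw BYTE array.

```cpp
#include <cassert>
#include <cstdint>
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read `size` bytes starting at `offset` and undo the Caesar shift.
// Returns an empty vector if fewer than `size` bytes could be read.
inline std::vector<std::uint8_t> LoadFromPak(std::istream& pak,
                                             std::uint32_t offset, std::uint32_t size,
                                             std::uint8_t cipherValue, bool addition)
{
    std::vector<std::uint8_t> data(size);
    pak.seekg(static_cast<std::streamoff>(offset), std::ios::beg);
    pak.read(reinterpret_cast<char*>(data.data()), static_cast<std::streamsize>(size));
    if (pak.gcount() != static_cast<std::streamsize>(size))
        return {};

    // Decryption shifts in the opposite direction to encryption.
    for (std::uint8_t& b : data)
        b = static_cast<std::uint8_t>(addition ? b - cipherValue : b + cipherValue);
    return data;
}
```

The resulting data() pointer and size() could then be handed to a "load from memory" function as its source buffer and length.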


Date this article was posted to GameDev.net: 9/6/2003
(Note that this date does not necessarily correspond to the date the article was written)
