This book is a work in progress, comments are welcome to: johno(at)johno(dot)se

Ground Control 2's Persistence Scheme

Introduction

Persistence schemes (i.e. being able to "save games") are something that we have historically shied away from at Massive, because we have for some reason always considered persistence to be a difficult problem to solve. Ground Control did not allow the user to save games during a mission at all. Josephine did support saving games, but at the cost of all persistent data being in a custom and hard to manage format at run-time. For Exodus (Ground Control II), one of the main requirements was to remedy this flaw of Ground Control, so that users could save the game whenever they wanted to.

The first attempts at implementing persistence for Exodus involved having two distinct "application startup" pipelines through the code. This in effect meant that every single object that was to be "saveable" could be created in two ways, one way that created the object in a "new game" state, and one that created the object in a "load from disk" state. In addition, of course, specific "save to disk" code was required for each persistent object.

Needless to say, this approach was both tedious and prone to errors, and also "felt" redundant. Although we knew that this was an important problem to solve, early implementation attempts were abandoned due to prioritization issues. We postponed the solution of this important problem until... later.

When the time came to actually solve the persistence problem, the application was already for the most part completed. For this reason, adding "save to disk" code to all objects that needed to be persistent didn't seem like a very difficult job, seeing as all representations of objects were pretty much finalized. Adding this code would be tedious, yes, but not exceedingly complex. Adding a "load from disk" creation pipeline to all objects initially looked to be something in the same vein; not complex, but very tedious. Of course all of this was prone to errors, since it involved a great deal of redundancy.

So, an attempt was made to unify all cases of "application startup". There should be no client-visible difference (clients here being persistent classes) between the "new game" and the "load game" cases.

New Game / Load Game unification

The basic idea of mapping class member data to logical names and saving this data centrally came to mind early, but it seemed that the main problem was not mapping data pointers to logical names; it was rather being able to easily enforce that these names would be globally unique. Within a class this was no problem, as it would be easy to simply map the member data to string names (i.e. the address of myData would be mapped to "myData"), but what about when several instances of a class were instantiated at once? Looking to how relational databases organize data, the idea came to mind that each instance can be uniquely identified via class name and some kind of unique per-instance number (this maps into relational theory as table (class) and row (object/instance)).

So to begin with, each persistent class inherits from a common template baseclass called EXCO_Persistent, parameterized with the class itself, as below:

//code from exg_unit.h...
#include "exco_persistent.h"

class EXG_Unit : public EXCO_Persistent<EXG_Unit>
{
    //class specifics here...
};

The main reason for this is that a mechanism is required to keep track of all instances of a class, in order to make sure that all instances indeed have a unique identifier within the context of the class. To implement this, each EXCO_Persistent subclass has a static instance manager object (EXCO_PersistentInstanceManager), which stores a pointer to the first instance of the class. Next, EXCO_Persistent includes a pointer to the next instance of the class, and the EXCO_Persistent baseclass constructor and destructor make sure that each instance is linked into and out of the class-wide instance list automatically. This linked list is used to make sure that all instances receive a unique instance key (we use an unsigned int for this key), regardless of whether the client code specifies a specific key or the system should auto-assign a key (both cases are supported, more on this below). Incidentally, this means that automatic garbage collection can also be supported, but we didn't use this feature in Exodus. However, debug code was added so we would know when instances were indeed leaking.
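
To make the mechanism concrete, here is a minimal sketch of a self-registering template baseclass with a class-wide intrusive list and key assignment. It is not the Exodus source: the names Persistent, Find(), Count() and NextFreeKey() are made up for illustration, and the separate instance manager object is collapsed into a single static pointer.

```cpp
#include <cassert>

typedef unsigned int PKEY;   // 0 is reserved as "no key / please auto-assign"

// Hypothetical stand-in for the persistent baseclass (illustration only).
template <class T>
class Persistent
{
public:
    explicit Persistent(PKEY aKey = 0)
        : myKey(aKey ? aKey : NextFreeKey()), myNext(ourFirst)
    {
        // a client-supplied key must not collide with a live instance
        assert(aKey == 0 || Find(aKey) == 0);
        ourFirst = this;                     // link into the class-wide list
    }

    ~Persistent()
    {
        // unlink from the class-wide list
        Persistent** link = &ourFirst;
        while (*link != this)
            link = &(*link)->myNext;
        *link = myNext;
    }

    PKEY GetKey() const { return myKey; }

    // Linear search of the instance list; returns 0 if the key is unused.
    static Persistent* Find(PKEY aKey)
    {
        for (Persistent* p = ourFirst; p; p = p->myNext)
            if (p->myKey == aKey)
                return p;
        return 0;
    }

    // Number of live instances (handy for leak checks in debug builds).
    static unsigned int Count()
    {
        unsigned int n = 0;
        for (Persistent* p = ourFirst; p; p = p->myNext)
            ++n;
        return n;
    }

private:
    Persistent(const Persistent&);            // copying would duplicate a key
    Persistent& operator=(const Persistent&);

    // Lowest key (starting at 1) not used by any live instance.
    static PKEY NextFreeKey()
    {
        PKEY key = 1;
        while (Find(key))
            ++key;
        return key;
    }

    PKEY myKey;
    Persistent* myNext;
    static Persistent* ourFirst;              // one list head per class T
};

template <class T>
Persistent<T>* Persistent<T>::ourFirst = 0;

// Each persistent class gets its own list simply by naming itself.
class Unit : public Persistent<Unit>
{
public:
    explicit Unit(PKEY aKey = 0) : Persistent<Unit>(aKey) {}
};
```

Because the list head is a static member of the template, every class that inherits from Persistent<T> gets its own independent list and key space. The linear search keeps the sketch short; a real implementation might well use a faster lookup.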

Memory-mapped members

With this system in place, we can automatically identify each persistent instance between program executions based solely on class name and instance key, and all member data mapping is only required to be unique within the context of the instance itself, exactly the constraints already placed on member variables by C++. Continuing with the above example of EXG_Unit, the extra code required to support persistence in the typical case is as follows:

//code from exg_unit.cpp...

//setup for the static instance manager...
PERSISTENT(EXG_Unit, "EXG_Unit");

//constructor...
EXG_Unit::EXG_Unit(const PKEY aPersistenceKey, /*more parameters here...*/)
:EXCO_Persistent<EXG_Unit>(aPersistenceKey)
//more member initialization here, including everything that is persistent...
{
    //map memory to names...
    PersistPointer("myOwner", (void**)&myOwner);
    PersistMemory("myPosition", &myPosition, sizeof(myPosition));
    PersistMemory("myBodyHeading", &myBodyHeading, sizeof(myBodyHeading));
    PersistMemory("myLookHeading", &myLookHeading, sizeof(myLookHeading));
    PersistMemory("myLookPitch", &myLookPitch, sizeof(myLookPitch));
    PersistMemory("myHealth", &myHealth, sizeof(myHealth));
    PersistMemory("myState", &myState, sizeof(myState));
    PersistMemory("myEnabledFlag", &myEnabledFlag, sizeof(myEnabledFlag));
    PersistMemory("myCurrentMode", &myCurrentMode, sizeof(myCurrentMode));
    PersistMemory("myFireBehaviour", &myFireBehaviour, sizeof(myFireBehaviour));
    PersistMemory("myExperience", &myExperience, sizeof(myExperience));
    PersistMemory("myEnergy", &myEnergy, sizeof(myEnergy));

    //load data from persistent storage to mapped memory...
    LoadMembers();
}

The above code begins with the macro PERSISTENT, which simply declares the class specific instance manager, parameterizing it with both the specific class EXG_Unit as well as a string name for the class ("EXG_Unit"), which must be globally unique. The logical thing seemed to us to use the class name itself in string form, as C++ already requires globally unique classnames (in truth only within namespaces, but our persistence scheme does not support such concepts).

The constructor follows, and in the case of EXG_Unit there were a number of application specific reasons for us to ensure that EXG_Unit instances used the same instance key as they had at the time of persistence. For this reason, we passed a client supplied instance key (PKEY, which is a define for unsigned int) to the baseclass constructor. Code in the constructor makes sure that this key does not collide with that of any existing instance (asserting if so, since this is a design error if it occurs). It is also legal to pass 0 to the baseclass constructor (which is the default parameter and also used to signify an illegal instance key), in which case the baseclass constructor will generate a unique key. The default parameter is also very useful for cases when you want to store persistent objects in an array, in which case C++ does not allow for constructor parameterization.

In the constructor, member variables are mapped to logical string names (via baseclass methods), only required to be unique within the context of the instance itself. For this reason, it seemed logical to use the string versions of the member names themselves, again because C++ already requires that such names are unique within the same scope. The most commonly used variant (PersistMemory()) is the mapping of statically allocated memory, which requires a string name, a void* pointer, and an unsigned int size of the memory itself. Additionally, PersistString(), PersistLocString(), PersistPointer(), and PersistTime() are supported, which enable persistence of const char* / MC_String, MC_LocString, generic C++ pointers, and floating point timestamps, respectively.

PersistPointer() is interesting as it allows for pointers to survive between application executions. This is accomplished by requiring that all memory pointed to is indeed an object, and also an instance of a subclass of EXCO_Persistent. This is because all EXCO_Persistent subclass instances are already uniquely identifiable between application executions through the class name / instance key combination, and thus resolving these pointers can be managed behind the scenes.

Another interesting aspect of the system is that no "post-load" resolution of pointers is required; every time a PersistPointer() call is made, the backend attempts to resolve the pointer, and if the target object is already instantiated, the resolution is completed immediately. In addition, the EXCO_Persistent constructor performs an inverted lookup (i.e. "resolve pointers to this instance"). Together, these two mechanisms alleviate the need for a "post-load" pointer resolution pass, keeping the client isolated from such mundane bookkeeping.
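
The two-sided resolution can be sketched roughly as follows. This is an assumption about how such a backend might be organized, not the actual Exodus code: PointerResolver, ObjectId, RegisterInstance() and ResolvePointer() are hypothetical names, and the sketch ignores the case where the object holding a pending pointer is destroyed before its target appears.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical identity of a persistent object: class name + instance key.
typedef std::pair<std::string, unsigned int> ObjectId;

class PointerResolver
{
public:
    // Called when a persistent instance is constructed: record it, then
    // patch every pointer that was waiting for it (the "inverted lookup").
    void RegisterInstance(const ObjectId& anId, void* anObject)
    {
        myLive[anId] = anObject;
        std::vector<void**>& waiting = myPending[anId];
        for (std::size_t i = 0; i < waiting.size(); ++i)
            *waiting[i] = anObject;
        myPending.erase(anId);
    }

    // Called when a pointer member is mapped during load: resolve at once
    // if the target already exists, otherwise remember the slot for later.
    void ResolvePointer(const ObjectId& aTarget, void** aSlot)
    {
        std::map<ObjectId, void*>::iterator it = myLive.find(aTarget);
        if (it != myLive.end())
            *aSlot = it->second;
        else
            myPending[aTarget].push_back(aSlot);
    }

    // Number of targets that still have unresolved pointers waiting.
    std::size_t PendingCount() const { return myPending.size(); }

private:
    std::map<ObjectId, void*> myLive;                     // instantiated objects
    std::map<ObjectId, std::vector<void**> > myPending;   // slots awaiting a target
};
```

Whichever of the two objects happens to be constructed first, the pointer ends up resolved as soon as both exist, so construction order does not matter to the client.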

PersistTime() is special in that it subtracts MI_Time::ourCurrentTime from the value of the mapped memory when persisting, and later adds MI_Time::ourCurrentTime to the persisted value when restoring. This way there is no need to persist internal MI_Time state in addition to application specific state.
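
As a small worked example of the arithmetic, the sketch below uses a plain global in place of MI_Time::ourCurrentTime (whose real definition is not shown here); the function names are hypothetical.

```cpp
// Hypothetical stand-in for MI_Time::ourCurrentTime.
static float gCurrentTime = 0.f;

// On save: store the timestamp relative to "now".
float TimeToPersisted(float aTimestamp)
{
    return aTimestamp - gCurrentTime;
}

// On load: rebase the stored offset onto the new "now".
float TimeFromPersisted(float aStored)
{
    return aStored + gCurrentTime;
}
```

If a weapon becomes ready at time 105 and the game is saved at time 100, the value stored is 5. If the save is later loaded when the clock reads 2, the restored timestamp is 7, which is still 5 time units in the future.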

Lastly, the call to LoadMembers() actually moves memory from the persistence storage backend into the mapped memory. Important to note about this call is that it first checks if the backend is in "loading state", doing nothing if this is not so; this effectively makes the difference between "new game" and "load game" transparent to clients.

Also, since LoadMembers() effectively is a "pull" from the backend, based on the memory maps that are in memory, the system automatically supports dynamic changes to the data scheme of a given class (i.e. what members exist, what types they are, etc). If the member memory being loaded is "new", i.e. it was added to the class definition after the data persistence event that created the data being loaded, then LoadMembers() will silently fail to find data in the persistence backend for this persistent member, and thus no data will be loaded. The member variable will remain as it was default initialized (which you always make sure to do with all members, right?). The same goes for the case when a class member changes size (e.g. from unsigned char to unsigned int), or type, as all memory maps include a type, a name, a pointer, and a size. Note that the use of "type" in this context corresponds to the various EXCO_Persistent::PersistXXX() methods, i.e. MEMORY, STRING, LOCSTRING, POINTER, and TIME.
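
The "pull" behaviour and its tolerance for schema changes might look something like the sketch below. It is a simplification under stated assumptions: MemberMap and SavedMembers are hypothetical names, the backend holds the members of a single instance only, and the per-entry type tag is left out so that only name and size are checked.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical backend: member name -> saved bytes for ONE instance.
typedef std::map<std::string, std::vector<unsigned char> > SavedMembers;

class MemberMap
{
public:
    // Map a block of memory to a logical name (cf. PersistMemory()).
    void PersistMemory(const char* aName, void* aPointer, std::size_t aSize)
    {
        Entry e = { aName, aPointer, aSize };
        myEntries.push_back(e);
    }

    // "Push": copy every mapped member into the backend.
    void SaveMembers(SavedMembers& aStore) const
    {
        for (std::size_t i = 0; i < myEntries.size(); ++i)
        {
            const Entry& e = myEntries[i];
            const unsigned char* bytes =
                static_cast<const unsigned char*>(e.myPointer);
            aStore[e.myName].assign(bytes, bytes + e.mySize);
        }
    }

    // "Pull": copy back only what the backend holds with a matching size.
    // New or resized members are skipped and keep their constructor values.
    // Returns the number of members actually loaded.
    std::size_t LoadMembers(const SavedMembers& aStore, bool aIsLoading)
    {
        if (!aIsLoading)
            return 0;                         // "new game": nothing to pull
        std::size_t loaded = 0;
        for (std::size_t i = 0; i < myEntries.size(); ++i)
        {
            const Entry& e = myEntries[i];
            SavedMembers::const_iterator it = aStore.find(e.myName);
            if (it == aStore.end() || it->second.size() != e.mySize)
                continue;                     // silently keep the default
            if (e.mySize > 0)
                std::memcpy(e.myPointer, &it->second[0], e.mySize);
            ++loaded;
        }
        return loaded;
    }

private:
    struct Entry
    {
        std::string myName;
        void*       myPointer;
        std::size_t mySize;
    };
    std::vector<Entry> myEntries;
};
```

Since the loop is driven by the maps registered in the running program rather than by the contents of the save, stale entries in an old save are simply never asked for.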

Dynamically allocated objects

With the above described system in place, it seemed that things would work out fine. However, it soon became evident that handling objects that were dynamically allocated would pose a problem. This was mainly due to the fact that the way that the application data was moved from the persistence backend to actual run-time memory was designed as a "pull" system. Statically allocated objects are automatically created by the C++ run-time when program execution begins, and thus their constructors execute, and the necessary "pulling" of data is performed as desired. Indeed, this is a really nice aspect of the system as a whole; all that is required for an object to "load itself" is that it is instantiated, the rest is automatic.

But for dynamically allocated objects, the problem became that of knowing when to instantiate these objects, and furthermore, which objects to instantiate (in the class / instance sense). We didn't want clients to have to do a huge amount of custom bookkeeping in order to be able to determine what to restore at "load time". Neither did we want to obfuscate C++ by marking classes or objects as "dynamic" versus "static", as C++ requires no such distinction; indeed, C++ objects do not know and do not need to know if they are statically or dynamically allocated.

What we ended up doing was indeed less elegant than the general scheme, but worked out fairly well in retrospect. We reasoned, as is common, that the container objects for dynamically allocated objects should be responsible for all aspects of creation and destruction for these objects. This also rhymed well with what we were already doing throughout the application. We added overridable methods to EXCO_Persistent in the following manner:

virtual bool PersistDynamicMembers(EXCO_PersistenceDatabaseWriteLump& aLump);
virtual bool CreateDynamicMember(EXCO_PersistentDynamicMember& aDynamicMember);

PersistDynamicMembers() is called back by the backend during save, and CreateDynamicMember() is called back by the backend during load. The baseclass implementations of these methods do nothing at all, but persistent containers can override them in order to store information about the existence of dynamically allocated objects. Consider the following implementations:

//EXCO_Persistent override
bool EXG_UnitContainer::PersistDynamicMembers(EXCO_PersistenceDatabaseWriteLump& aLump)
{
    unsigned short u;
    EXG_Unit* unit;
    EXCO_PersistentDynamicMember* dm;

    for(u = 0; u < EX_MAX_UNITS; u++)
    {
        unit = myUnits[u];
        if(unit)
        {
            dm = aLump.PersistDynamicMember(unit->GetPersistenceTypeName(), unit->myKey);
            if(!dm)
                return false;
            dm->AddParameter("id", unit->myId);
            dm->AddParameter("type", unit->GetType().myName);
            dm->AddParameter("name", unit->GetAgentScriptName());
        }
    }

    return true;
}

The above is the "save" code. The parameter aLump represents the current instance of EXG_UnitContainer from the persistence backend's point of view, simply a lump of member data, hence the name. As can be seen, the implementation requests that an EXCO_PersistentDynamicMember be created for each EXG_Unit instance that the EXG_UnitContainer owns and manages internally. Each such "dynamic member" is parameterised by class name and instance key (GetPersistenceTypeName() and myKey are both available via EXG_Unit's baseclass EXCO_Persistent), in order for the container to know what instance to instantiate when the system is restored.

Additionally, construction parameters are stored in the EXCO_PersistentDynamicMember via a key / value interface (AddParameter()). This was made available because we used constants and references extensively as constructor parameters, and these thus needed to be resolved externally before the objects could be constructed. See below for an example of how this was used.

//EXCO_Persistent override
bool EXG_UnitContainer::CreateDynamicMember(EXCO_PersistentDynamicMember& aDynamicMember)
{
    unsigned short id;
    int typeId;
    EXCO_UnitType* type;
    int name;
    EXG_Unit* u;

    if(aDynamicMember.myType == EXG_Unit::GetPersistenceTypeName())
    {
        //id
        id = aDynamicMember.GetParameterValueInt("id");
        if(id >= EX_MAX_UNITS)
            return false;

        //resolve type
        typeId = aDynamicMember.GetParameterValueInt("type");
        type = myUnitTypes.GetType(typeId);
        if(!type)
            return false;

        //resolve name
        name = aDynamicMember.GetParameterValueInt("name");

        //create unit
        u = Priv_CreateUnit(aDynamicMember.myKey,        //use supplied persistence key
                            NULL,                        //owner will be loaded
                            *type,
                            MC_Vector3f(0.f, 0.f, 0.f),  //position will be loaded
                            0.f,                         //heading will be loaded
                            name,
                            id);                         //use supplied id
        if(!u)
            return false;

        return true;
    }

    return false;
}

Here one can see the "load" code. Firstly, we make sure that the type of aDynamicMember is of a type that this container (EXG_UnitContainer) knows how to / wants to instantiate (this is simply an application specific precaution). Next, we extract various constructor parameters of constant or reference nature, convert them to something meaningful to our implementation, and make sure that they are all valid. Lastly we use the regular pipeline for creating EXG_Unit objects, using the supplied instance key to make sure we load the correct instance from the persistence backend. Important to note is that at this point, the new instance of EXG_Unit will itself take care of the rest of restoration, as it is an EXCO_Persistent. Note that those constructor parameters that are internally persistent within EXG_Unit have dummy data passed to them, this to avoid having to write and support an additional constructor for EXG_Unit, all in the interest of unifying as much of the code as possible.
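
Stripped of application detail, the save / re-create round trip for dynamically allocated objects can be sketched as below. Everything here is a hypothetical reduction of the listings above: the descriptor only stores int parameters, the "lump" is a plain std::vector, and Unit is a two-field struct rather than a real persistent class.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical descriptor for one dynamically allocated object.
struct DynamicMember
{
    std::string myType;                        // persistent class name
    unsigned int myKey;                        // instance key
    std::map<std::string, int> myParameters;   // constructor parameters

    void AddParameter(const std::string& aName, int aValue)
    {
        myParameters[aName] = aValue;
    }

    // Returns -1 when the parameter is missing.
    int GetParameterValueInt(const std::string& aName) const
    {
        std::map<std::string, int>::const_iterator it = myParameters.find(aName);
        return it != myParameters.end() ? it->second : -1;
    }
};

struct Unit
{
    Unit(unsigned int aKey, int anId) : myKey(aKey), myId(anId) {}
    unsigned int myKey;
    int myId;
};

class UnitContainer
{
public:
    enum { MAX_UNITS = 4 };

    UnitContainer()  { for (int i = 0; i < MAX_UNITS; ++i) myUnits[i] = 0; }
    ~UnitContainer() { for (int i = 0; i < MAX_UNITS; ++i) delete myUnits[i]; }

    // The regular creation path, shared by "new game" and "load game".
    Unit* CreateUnit(unsigned int aKey, int anId)
    {
        if (anId < 0 || anId >= MAX_UNITS || myUnits[anId])
            return 0;
        myUnits[anId] = new Unit(aKey, anId);
        return myUnits[anId];
    }

    // Save: describe each owned unit so that it can be re-created later.
    void PersistDynamicMembers(std::vector<DynamicMember>& someMembers) const
    {
        for (int i = 0; i < MAX_UNITS; ++i)
        {
            if (!myUnits[i])
                continue;
            DynamicMember dm;
            dm.myType = "Unit";
            dm.myKey = myUnits[i]->myKey;
            dm.AddParameter("id", myUnits[i]->myId);
            someMembers.push_back(dm);
        }
    }

    // Load: validate the descriptor, then use the regular creation path.
    bool CreateDynamicMember(const DynamicMember& aMember)
    {
        if (aMember.myType != "Unit")
            return false;
        return CreateUnit(aMember.myKey, aMember.GetParameterValueInt("id")) != 0;
    }

    const Unit* GetUnit(int anId) const { return myUnits[anId]; }

private:
    UnitContainer(const UnitContainer&);             // owns raw pointers
    UnitContainer& operator=(const UnitContainer&);
    Unit* myUnits[MAX_UNITS];
};
```

The container only records what is needed to call its own creation function again; everything else about a unit would be restored by the unit itself once it has been constructed with the right key.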

This aspect of the persistence scheme is indeed a little more work to write and maintain, and indeed is very similar to what one would need to write were this a more traditional "save to file handle" type method. One small improvement is the general key / value interface as compared to a simple file system interface, which allows for the actual file system persistence to be isolated.

High-level Control

Since the goal of this system was a "save game" system, we needed to be able to control the persistence of the entire system from a single point, and allow the user to save at any time. This means that we needed access to all EXCO_Persistent objects and save them all at the same time. This was accomplished by creating a "persistence manager" (the class utility EXCO_PersistenceManager) which had access to all EXCO_PersistentInstanceManager instances, which in turn had access to all EXCO_Persistent instances (via the class-wide instance lists that were already in place). EXCO_PersistentInstanceManager was made to inherit from EXCO_PersistenceManager::Database, instances of which automatically put themselves into a global list via EXCO_PersistenceManager::AddDatabase() in the baseclass constructor. Thus everything was globally connected, and EXCO_PersistenceManager::Save() was written to save all instances of all "databases" via the baseclass method EXCO_Persistent::SaveMembers(). Again, the inspiration from the data organization of relational databases is clear in this design.
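
A rough sketch of this kind of self-registering "database" list follows. The class names mirror the text, but the bodies are assumptions made for illustration; in particular SaveInstances(), RemoveDatabase() and the CountingDatabase example class are hypothetical, and the output is reduced to a list of strings.

```cpp
#include <string>
#include <vector>

class PersistenceManager
{
public:
    // Base for the per-class instance managers ("tables").
    class Database
    {
    public:
        explicit Database(const char* aName) : myName(aName), myNext(0)
        {
            PersistenceManager::AddDatabase(this);      // self-register
        }
        virtual ~Database()
        {
            PersistenceManager::RemoveDatabase(this);
        }
        // Save every instance of this class; returns how many were saved.
        virtual unsigned int SaveInstances(std::vector<std::string>& anOutput) const = 0;

        const char* myName;

    private:
        friend class PersistenceManager;
        Database* myNext;
    };

    static void AddDatabase(Database* aDatabase)
    {
        aDatabase->myNext = ourFirst;
        ourFirst = aDatabase;
    }

    static void RemoveDatabase(Database* aDatabase)
    {
        Database** link = &ourFirst;
        while (*link && *link != aDatabase)
            link = &(*link)->myNext;
        if (*link)
            *link = aDatabase->myNext;
    }

    // Single entry point: walk all databases and save everything.
    static unsigned int Save(std::vector<std::string>& anOutput)
    {
        unsigned int total = 0;
        for (Database* d = ourFirst; d; d = d->myNext)
            total += d->SaveInstances(anOutput);
        return total;
    }

private:
    static Database* ourFirst;   // zero-initialized before any constructor runs
};

PersistenceManager::Database* PersistenceManager::ourFirst = 0;

// Example "table" that writes one record per instance it pretends to hold.
class CountingDatabase : public PersistenceManager::Database
{
public:
    CountingDatabase(const char* aName, unsigned int aCount)
        : PersistenceManager::Database(aName), myCount(aCount) {}

    virtual unsigned int SaveInstances(std::vector<std::string>& anOutput) const
    {
        for (unsigned int i = 0; i < myCount; ++i)
            anOutput.push_back(std::string(myName));
        return myCount;
    }

private:
    unsigned int myCount;
};
```

Using a plain static pointer as the list head sidesteps static initialization order problems: the pointer is zero-initialized before any database constructor can run, no matter which translation unit that constructor lives in.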

The first implementation of this saved each instance of each class into a unique file, with filenames that were simply built from the instance key and the class name (e.g. 1.EXG_Unit). While this was simple to implement, it soon proved unwieldy, as each "save game" was physically a file system directory which contained thousands of files consisting of only a few bytes each. As operating systems aren't very friendly towards this type of thing, the code was changed to save all databases into a single file.

Restoring / loading was not as simple a matter, as objects need to be instantiated before they can be loaded. As mentioned earlier, objects that were statically allocated worked well enough, as they knew to load themselves once they were instantiated. However, once the physical persistence format was changed from one file per instance to a single file for all instances, the problem became one of locating the data that the given instance wanted to load. When it was one file per instance, the backend could simply open the file and read the member data directly. With a single file, this wasn't quite as simple, considering the fact that there was no way of knowing in what order the objects would be instantiated, and we didn't want to have to open the single file and do some seeking for each instance that was instantiated. Clearly, some other scheme was required.

What we ended up doing was writing an in-memory version of the physical "save file", and loading the file into memory as the initial startup phase of loading a saved game (EXCO_PersistenceManager::StartLoad()), putting the backend into a "loading" state. Once this is complete, the regular application specific "new game" code is run, creating and initializing the various subsystems of the run-time system in the usual order. As constructors are executed, calls to LoadMembers() find the backend in a "loading" state, and data is pulled as needed from the backend format into the run-time data structures. Finally, when the run-time system is completely initialized, a call to EXCO_PersistenceManager::FinishLoad() cleans up the in-memory version of the "save file", changing the backend to a "not-loading" state, and things are ready to go. This way, when new persistent objects are created during run-time, they find the backend in a "not-loading" state and simply use default / constructor values for their members.
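
The load sequence can be illustrated with a deliberately tiny model. LoadSession, its Load() function and the flat name-to-int "save file" are all hypothetical simplifications; only the StartLoad() / FinishLoad() bracketing is taken from the description above.

```cpp
#include <map>
#include <string>

// Hypothetical backend; the in-memory "save file" is reduced to name -> int.
class LoadSession
{
public:
    // Read the whole save into memory and enter the "loading" state.
    static void StartLoad(const std::map<std::string, int>& aSaveFile)
    {
        ourData = aSaveFile;
        ourIsLoading = true;
    }

    // Drop the in-memory copy and return to the "not-loading" state.
    static void FinishLoad()
    {
        ourData.clear();
        ourIsLoading = false;
    }

    // Pull one value; aValue is left untouched when not loading or not found.
    static void Load(const std::string& aName, int& aValue)
    {
        if (!ourIsLoading)
            return;
        std::map<std::string, int>::const_iterator it = ourData.find(aName);
        if (it != ourData.end())
            aValue = it->second;
    }

private:
    static std::map<std::string, int> ourData;
    static bool ourIsLoading;
};

std::map<std::string, int> LoadSession::ourData;
bool LoadSession::ourIsLoading = false;

// The same constructor serves both "new game" and "load game".
struct Unit
{
    Unit() : myHealth(100)
    {
        LoadSession::Load("Unit.myHealth", myHealth);
    }
    int myHealth;
};
```

A Unit constructed between StartLoad() and FinishLoad() picks up its saved health, while one constructed before or after keeps the constructor default of 100, without the constructor containing any branch of its own.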

Summary

In general, the system is quite nice as it is very "external" to the client application code, and can, as we proved, be added quite late in development without overly disrupting the existing codebase. The way that dynamic members are handled is however a bit of a pain, but can be seen as the only "manual implementation" aspect of the system as a whole. There should be no problems involved in re-using the persistence system from Exodus as-is.

However, depending on the complexity of the application in question, "sleeper" issues may occur on a wholly application specific level. As there is nothing in the design of the persistence scheme that "requires" anything to be saved, we found that we began by marking the most obvious class members for persistence and then handing the application over to the QA department. This was a very nice and "lazy" job for us as engineers, but as time progressed we found that QA were reporting bugs related to load / save in the form of "such and such isn't restored correctly...". The engineers would look at the problem and find that, indeed, the state that controlled that aspect of functionality was not persistent, and the fix would be quite painless; just add a single line of PersistMemory() or equivalent to some constructor, or in the worst case have a class inherit from EXCO_Persistent where it hadn't before. Again, not very painful for engineers, but tedious from a QA standpoint.

These problems aren't really "bugs" in the persistence scheme, but they do directly arise from the way that it is designed, and manifest themselves in a wholly application specific manner. To be fair, these kinds of things might pop up given other implementations of persistence, but maybe the common sentiment would be for developers to just err on the side of excess and simply make all memory persistent just for the hell of it, so as not to have to commit too much thought to the problem. Trade some disk space for thought, so to speak.

In our specific case (Exodus), we also realized that some of our assumptions about object initialization weren't holding once we started loading saved games. For example, we had initially set unit health to the maximum dictated by their type when they were created, since when starting a new game all units are "new" when they are created. However, when loading a saved game, a unit may be "old" when instantiated, in that it shouldn't have maximum health when created since it was somewhat damaged when the game was saved. This required that we go back and change some interfaces related to unit creation events, which in turn affected our network protocol, etc.

These kinds of problems are indeed a clear conceptual / design error on the application side, not in the persistence scheme itself, but they are a good example of why core functionality such as persistence really should be added up front and closely monitored by QA throughout the entire development process.
