Starting from Familiar Ground

This first example is designed with simplicity in mind, so as not to distract from getting the system up and running. To use the example code as provided, create a console application. The example is object-based but not necessarily object-oriented, so the classes can easily be replaced by structures if you prefer a pure C style.

Let's say you have a very basic desire to see your computer speak on command. You may request that it talk a specified number of times each time a particular script executes. In its simplest form, you would write such a script unrolled. For example, a script that talks twice and then knows it has finished its execution:

talk
talk
end

Pretty basic for now, but it's enough to see some results and know you're on track. We will enumerate these operations:

enum opcode
{
  op_talk,
  op_end
};

We may choose to pair opcodes with data to make them more useful later on. It would be in our interest to make an abstraction now, so that we don't have to change a lot of code later on when we decide to encapsulate the pairing as an instruction:

// the basic instruction, currently just encapsulating an opcode
class Instruction
{
public:
  Instruction(opcode code) : _code(code)	{}
  opcode Code() const         { return _code; }
private:
  opcode	_code;
  //char*	_data;  // additional data, currently not used
};
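For readers who prefer the structure-based style mentioned at the start, a minimal sketch of the same abstraction might look like the following. The names InstructionData, MakeInstruction, and CodeOf are hypothetical and only illustrate the idea; they are not part of the article's source.

```cpp
// Plain-structure variant of the instruction abstraction (a sketch only;
// InstructionData, MakeInstruction and CodeOf are hypothetical names).
enum opcode
{
  op_talk,
  op_end
};

struct InstructionData
{
  opcode code;
  // char* data;  // additional data, currently not used
};

// Builds an instruction from an opcode, mirroring the class constructor.
inline InstructionData MakeInstruction(opcode code)
{
  InstructionData instr;
  instr.code = code;
  return instr;
}

// Reads the opcode back, mirroring Instruction::Code().
inline opcode CodeOf(const InstructionData& instr)
{
  return instr.code;
}
```

Because callers only go through the two helper functions, adding a data field later should not force changes outside this small block, which is the same benefit the class version aims for.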

Reasonably, a script is then a collection of these instructions. Because the list of instructions generally will be formed during an initialization process, it's ok to use an arrayed form for implementation, such as a vector. The arrayed form is also useful in later optimizations, and for random access:

#include <vector>

// the basic script, currently just encapsulating an arrayed list of instructions
class Script
{
public:
  Script(const std::vector<Instruction>& instrList)
    : _instrList(instrList) {}
  // returns 0 for an empty script; &_instrList[0] is undefined on an empty vector
  const Instruction* InstrPtr() const
  { return _instrList.empty() ? 0 : &_instrList[0]; }
private:
  std::vector<Instruction>	_instrList;
};

Given a pointer to the beginning of a list of instructions, all that remains necessary is a procedure for iterating through the list and executing each instruction:

// note that _instrPtr must point to a valid list of instructions
const Instruction* _instr = _instrPtr;	// set our iterator to the beginning (const, to match Script::InstrPtr())
while (_instr)	// the end operation will set _instr to 0
{
  switch(_instr->Code())
  {
  case op_talk:
    std::cout << "I am talking." << std::endl;
    ++_instr;    // iterate
    break;
  case op_end:
    _instr = 0;  // discontinue the loop
    break;
  }
}

For the sake of convenience, you will probably want to encapsulate this functionality into its own class, and allow it to internally manage the instruction lists (as scripts). This would be the virtual machine, provided with useful management utilities for loading and selecting scripts:

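#include <cassert>   // assert
#include <cstddef>   // size_t
#include <iostream>  // std::cout, used by Execute()
#include <vector>

// rudimentary virtual machine with methods inlined for convenience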
class VirtualMachine
{
public:
  VirtualMachine()
    : _scriptPtr(0), _instrPtr(0), _instr(0), _scriptCount(0) {}
  // a very basic interface
  inline void Execute(size_t scriptId);
  size_t Load(const Script& script)   { return AddScript(script); }
private:  // useful abstractions
  // pointers used as non-modifying dynamic references
  typedef const Script*       ScriptRef;
  typedef const Instruction*  InstrRef;
private:  // utilities
  // add script to list and retrieve id
  size_t AddScript(const Script& script)
  {
    _scriptList.push_back(script);
    return _scriptCount++;
  }
  // set current script by id
  void SelectScript(size_t index)
  {
    assert(index < _scriptCount);  // make sure the id is valid
    _scriptPtr = &_scriptList[index];
    _instrPtr = _scriptPtr->InstrPtr();
  }
private:  // data members
  std::vector<Script> _scriptList;
  ScriptRef           _scriptPtr;    // current script
  InstrRef            _instrPtr;     // root instruction
  InstrRef            _instr;        // current instruction
  size_t              _scriptCount;  // track the loaded scripts
};

The virtual machine keeps the scripts that have been loaded in a vector. It also maintains a count of the loaded scripts, so that loading a script can return its offset (id) into the vector for the caller to store. This makes it very easy to execute a pre-loaded script by simply passing that offset back to the machine.
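To make the id-as-offset idea concrete, here is a small stand-alone sketch. ScriptTable is a hypothetical stand-in for the machine's script list, and it stores strings instead of Script objects purely to keep the bookkeeping visible.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the VM's script list. Strings take the place of
// Script objects so that only the id bookkeeping is on display.
class ScriptTable
{
public:
  // Appends an entry and returns its offset, which serves as the id.
  std::size_t Add(const std::string& name)
  {
    _entries.push_back(name);
    return _entries.size() - 1;
  }
  // Looks an entry up by the id returned from Add().
  // at() throws std::out_of_range if the id is invalid.
  const std::string& Get(std::size_t id) const { return _entries.at(id); }
private:
  std::vector<std::string> _entries;
};
```

The first entry added gets id 0, the second id 1, and so on. This sketch derives the id from the vector's own size rather than keeping a separate counter such as _scriptCount; either approach yields the same ids.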

Although currently unnecessary, it also keeps track of the currently executing script. This becomes useful once a script contains more than just a list of instructions, as it will in a future article.

Its Execute() method uses the procedure previously described:

void VirtualMachine::Execute(size_t scriptId)
{
  SelectScript(scriptId);  // select our _instrPtr by script ID
  _instr = _instrPtr;      // set our iterator to the beginning
  while (_instr)
  {
    switch(_instr->Code())
    {
    case op_talk:
      std::cout << "I am talking." << std::endl;
      ++_instr;  // iterate
      break;
    case op_end:
      _instr = 0;  // discontinue the loop
      break;
    }
  }
}

A side note about OOP:

Using an object-oriented approach, you could eliminate this switch statement by deriving specific instruction types from a base instruction type with some kind of virtual Process() method. To add support for a new instruction, you would simply inherit from the base instruction class and isolate its specific processing in that class. Lists of these instructions would of course have to support polymorphism, for example as a vector of pointers to instructions or some equivalent.

This extensible approach can be very convenient, and is worthy of some investigation. In my own toy experiment, however, it ran at roughly one third the speed of my non-OO VM system, which is a pretty significant performance hit. Later on in development, you will probably want to optimize the heck out of your VM's processing loop. The overhead introduced with the OO version, at least in my own experience, is not worth it. I encourage the curious to explore this some more, however, as I probably did not perform the best test possible. And please give me some feedback if you do! That's it for the side note.

Now, let's see how we would use these components to create and execute a script which talks twice, and then ends:

VirtualMachine vm;

// build the script
std::vector<Instruction> InstrList;
InstrList.push_back(Instruction(op_talk)); // talk twice
InstrList.push_back(Instruction(op_talk));
InstrList.push_back(Instruction(op_end));  // then end
Script script(InstrList);

// load the script and save the id
size_t scriptID = vm.Load(script);

// execute the script by its id
vm.Execute(scriptID);
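Since the original goal was to talk a specified number of times, the unrolled list can also be generated rather than typed out. The helper below is a hypothetical convenience, not part of the article's source; it repeats the opcode and Instruction definitions only so that it compiles on its own.

```cpp
#include <cstddef>
#include <vector>

enum opcode { op_talk, op_end };

class Instruction
{
public:
  Instruction(opcode code) : _code(code) {}
  opcode Code() const { return _code; }
private:
  opcode _code;
};

// Hypothetical helper: builds the unrolled list for "talk N times, then end".
inline std::vector<Instruction> MakeTalkList(std::size_t times)
{
  std::vector<Instruction> instrList;
  instrList.reserve(times + 1);                 // N talks plus the final end
  for (std::size_t i = 0; i < times; ++i)
    instrList.push_back(Instruction(op_talk));
  instrList.push_back(Instruction(op_end));     // always terminate the list
  return instrList;
}
```

With this in place, the script from the example above could be built with Script script(MakeTalkList(2)); and then loaded and executed exactly as before.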

Conclusion

In the next article, I will probably get into some more interesting topics regarding instruction data, and some form of registered data (variables). This will lead into some simple mathematical functionality at the very least.

I didn't want to get too buried in example code this time around as the introduction was quite lengthy. Hopefully this was at least enough to be of some inspirational value until a more comprehensive second article. The main thing to keep in mind is that you want an efficient implementation, or else you'll end up with a system that drains all the processing power needed for your game. You will want algorithms that allow for optimizations later on for this purpose, but of course don't sacrifice the clarity of your code too early.

I'd like to hear any questions, criticism, preferences, advice, mistakes I made, or scolding (if deserved) you'd like to express. I'll keep an eye on the forum discussion, but you can always email me: glr9940@rit.edu