Scanning

Given the text-based format created earlier and a little thought, it is not difficult to design a process that scans a line of text and breaks it into its constituent tokens. It might look something like this:

if the line contains no content, stop early
scan an opcode
until the end of the line is reached:
    1. skip over white space
    2. scan in the next numerical value
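
The outline above can also be tried out on its own, before any scanner class exists. The sketch below is not the approach developed in this article: it swaps in std::istringstream for hand-written offset tracking, and SplitLine() is a hypothetical name chosen for illustration. It is also looser than the scanner we are about to write, since it treats any run of non-space characters as the opcode and accepts signed numbers.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a line into an opcode name and its numeric
// arguments. Returns false for a blank line or a malformed argument.
bool SplitLine(const std::string& line, std::string& opcode, std::vector<int>& args)
{
    opcode.clear();
    args.clear();

    std::istringstream in(line);
    if (!(in >> opcode))        // operator>> skips leading white space
        return false;           // nothing but white space on this line

    int value;
    while (in >> value)         // skip white space, then read the next number
        args.push_back(value);

    return in.eof();            // true only if the loop stopped at end of line
}
```

With this sketch, a line such as "push 42 7" yields the opcode "push" and the arguments 42 and 7, while "push x" is rejected because the loop stops at "x" rather than at the end of the line.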

To handle the subsets of this process we will create a few private utilities for the scanner:

private:
    bool SkipSpacing(const std::string& line);
    bool ScanCode(const std::string& line);
    bool ScanNum(const std::string& line);

Each of these functions will make use of private data:

private:
    CharArray   _tokBuffer;
    size_t      _offset;

ScanLine() resets both members on entry: _offset is the current position within the string "line", and _tokBuffer collects the tokens scanned so far. The utility functions just described read and modify both as they work through the line.

Now we can put some code into ScanLine():

bool Scanner::ScanLine(const std::string& line)
{
    // reset offset and token buffer
    _offset = 0;
    _tokBuffer.clear();

    // check for an empty line
    if (line.empty())
        return false;

    // check for valid line content
    if (!SkipSpacing(line))
        return false;

    // check for a valid opcode
    if (!ScanCode(line))
        return false;

    size_t len = line.length();
    while (_offset < len)   // scan args until the end of line
    {
        if (!SkipSpacing(line)) // get to next arg
            return true;        // unless we're done
        if (!ScanNum(line))
            return false;
    }

    return true;
}

In theory, this design should work once we implement the private functions. Note that so far we have not had to manipulate the string directly: the low-level implementation has been kept separate from the design of the process itself. The time has come to deal with those details, so here is an implementation:

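#include <cctype>   // isspace, isalpha, isdigit
#include <cstdlib>  // atoi

// Advance past white space; returns false if the end of the line is reached.
bool Scanner::SkipSpacing(const std::string& line)
{
    // cast to unsigned char: the <cctype> functions are undefined for negative values
    while (isspace(static_cast<unsigned char>(line.c_str()[_offset])))
        ++_offset;
    if (line.c_str()[_offset] == 0)
        return false;
    return true;
}

// Scan a run of letters and hand it to MatchCode() for identification.
bool Scanner::ScanCode(const std::string& line)
{
    size_t begin = _offset;
    while (isalpha(static_cast<unsigned char>(line.c_str()[_offset])))
        ++_offset;
    return MatchCode(std::string(line, begin, _offset-begin));
}

// Scan a run of digits and append its value to the token buffer.
bool Scanner::ScanNum(const std::string& line)
{
    size_t begin = _offset;
    while (isdigit(static_cast<unsigned char>(line.c_str()[_offset])))
        ++_offset;
    if (_offset == begin)   // were any digits scanned?
        return false;
    std::string number(line, begin, _offset-begin);
    // note: values that do not fit in a char are not preserved by this cast
    _tokBuffer.push_back(static_cast<char>(atoi(number.c_str())));
    return true;
}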
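
To try the scanner before the matching code exists, the pieces can be assembled into a small compilable sketch. Several parts are assumptions made only for this illustration: CharArray is taken to be a std::vector<char>, Tokens() is a hypothetical accessor, and MatchCode() is a throwaway placeholder that recognizes two made-up opcode names ("push" and "pop") and stores made-up codes for them. The real MatchCode() may look quite different.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Assumption: CharArray is a vector of char, as clear() and push_back() suggest.
typedef std::vector<char> CharArray;

class Scanner
{
public:
    Scanner() : _offset(0) {}
    bool ScanLine(const std::string& line);
    // Hypothetical accessor, added only so the result can be inspected.
    const CharArray& Tokens() const { return _tokBuffer; }

private:
    bool SkipSpacing(const std::string& line);
    bool ScanCode(const std::string& line);
    bool ScanNum(const std::string& line);
    bool MatchCode(const std::string& name);    // placeholder, see below

    CharArray   _tokBuffer;
    std::size_t _offset;
};

// Placeholder only: two made-up opcode names mapped to made-up codes.
bool Scanner::MatchCode(const std::string& name)
{
    if (name == "push") { _tokBuffer.push_back(1); return true; }
    if (name == "pop")  { _tokBuffer.push_back(2); return true; }
    return false;
}

bool Scanner::SkipSpacing(const std::string& line)
{
    while (std::isspace(static_cast<unsigned char>(line.c_str()[_offset])))
        ++_offset;
    return line.c_str()[_offset] != 0;  // false once the end of line is reached
}

bool Scanner::ScanCode(const std::string& line)
{
    std::size_t begin = _offset;
    while (std::isalpha(static_cast<unsigned char>(line.c_str()[_offset])))
        ++_offset;
    return MatchCode(std::string(line, begin, _offset - begin));
}

bool Scanner::ScanNum(const std::string& line)
{
    std::size_t begin = _offset;
    while (std::isdigit(static_cast<unsigned char>(line.c_str()[_offset])))
        ++_offset;
    if (_offset == begin)               // no digits at this position
        return false;
    std::string number(line, begin, _offset - begin);
    _tokBuffer.push_back(static_cast<char>(std::atoi(number.c_str())));
    return true;
}

bool Scanner::ScanLine(const std::string& line)
{
    _offset = 0;
    _tokBuffer.clear();

    if (line.empty() || !SkipSpacing(line) || !ScanCode(line))
        return false;

    while (_offset < line.length())     // scan args until the end of line
    {
        if (!SkipSpacing(line))         // only trailing white space was left
            return true;
        if (!ScanNum(line))
            return false;
    }
    return true;
}
```

Under these assumptions, ScanLine("push 42") returns true and leaves the two values 1 and 42 in the token buffer, while ScanLine("push x") returns false because no digits follow the white space.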

The only thing not yet accounted for is the call to MatchCode() in ScanCode(). It separates the string manipulation code from the code that decides whether a string is actually an opcode. To complete the scanner, we need to add that identification step.
