GameDev.net -- MMOG Considerations

Operating Platform Issues

Your 'operating platform' - server-side, at least - is the entire server system. It may well be spread over multiple machines, multiple processes and applications, and even multiple continents. What do you need to think about when designing the operating platform for your server?

Load. How many players do you expect a given element in the operating platform to have to handle at any one time? How much spare capacity do you want to build into the system? Expecting, and thus designing for, 10,000 players in the beginning is fine, but if you later want to expand your market share and attract new players, you will need either the capacity to handle them from the start or a concrete plan for extending the platform later. (By a "concrete plan" I do mean something pretty specific: "buy another server" is not very good, while a detailed breakdown of what you will need, step-by-step instructions for integrating it into the platform, and an estimate of how much capacity it will add is better.) Also, be aware of the difference between total players and concurrent players: even if you have 25,000 total subscribers to your game, you may only have 3,000 players online at any one time. That may not affect the requirements for some parts of the system (account storage grows with total subscribers), but it can heavily affect others (bandwidth and processing load follow concurrent players).
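The total-versus-concurrent distinction can be turned into a rough sizing calculation. What follows is only a sketch: the function name, the figure of 1,000 players per server and the 50% headroom are assumptions made up for illustration, and only the subscriber counts come from the text above.

```python
import math

def servers_needed(concurrent_players, players_per_server, headroom=0.5):
    """Return how many servers are needed to host the expected concurrent
    players while keeping a fraction of spare capacity (headroom)."""
    if players_per_server <= 0:
        raise ValueError("players_per_server must be positive")
    target = concurrent_players * (1 + headroom)
    return math.ceil(target / players_per_server)

# Figures from the text: 25,000 subscribers, about 3,000 online at once.
total_subscribers = 25_000
peak_concurrent = 3_000
concurrency_ratio = peak_concurrent / total_subscribers  # 0.12

# Hypothetical: each game server copes with 1,000 concurrent players.
# 3,000 players plus 50% headroom is 4,500, which rounds up to 5 servers.
print(servers_needed(peak_concurrent, 1_000))
```

Sizing from `peak_concurrent` rather than `total_subscribers` is the point of the exercise; the concurrency ratio is something you would have to measure for your own game.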

Distribution. How are you going to spread the load over multiple servers? Will you have one server per subsystem, or will every server run all subsystems for some part of the game? (For example: will you have one server dedicated to physics processing and another dedicated to AI, or will both servers run physics and AI for separate parts of the world?) How will load distribution be controlled: will you use a single, central 'depot' server, or will individual servers 'pass off' excess work to each other?
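To make the central 'depot' option a little more concrete, here is a minimal sketch in which every server runs all subsystems for its own parts of the world, and a single dispatcher hands each zone to whichever server currently carries the fewest players. The class, the server names and the zone names are all hypothetical.

```python
class Dispatcher:
    """Central 'depot' that assigns world zones to the least-loaded server."""

    def __init__(self, server_names):
        self.load = {name: 0 for name in server_names}  # players per server
        self.zone_owner = {}                            # zone -> server

    def assign_zone(self, zone, expected_players):
        # Pick the server currently carrying the fewest players
        # (ties go to the server that was listed first).
        server = min(self.load, key=self.load.get)
        self.zone_owner[zone] = server
        self.load[server] += expected_players
        return server

    def server_for(self, zone):
        return self.zone_owner[zone]

depot = Dispatcher(["alpha", "beta"])
depot.assign_zone("forest", 400)  # both empty, so "alpha"
depot.assign_zone("desert", 250)  # "beta" is empty, so "beta"
depot.assign_zone("city", 300)    # "beta" (250) is lighter than "alpha" (400)
```

The obvious weakness is that the depot is a single point of failure, which is one reason you might prefer servers that pass work to each other directly.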

Security. You have to provide a way for the outside world to connect to your servers; otherwise, none of your clients could ever log in. But you must be careful: every open port is another potential avenue of attack on your platform. Take the utmost care when dealing with data from the client; your aim is to make it impossible for someone to crash your servers by sending in a carefully crafted packet. What measures will you take to verify the integrity of the data you're getting, or to protect the routines that work with that data? You also need to protect your internal services (things like database servers) by keeping them on a network completely separated from the Internet; how will you architect your network so that the machines which need those services can reach them, without providing a route to them from the outside world?
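As one example of checking client data before anything else touches it, the sketch below parses a made-up "move" packet. The wire format (a one-byte message type, a two-byte length, then two signed 16-bit offsets) and every constant are assumptions for illustration, not a real protocol. The function rejects anything malformed by returning None rather than raising.

```python
import struct

HEADER = struct.Struct("!BH")     # message type (1 byte), payload length (2 bytes)
MAX_PAYLOAD = 512                 # hypothetical upper bound on any payload
MOVE = 1                          # hypothetical message type
MOVE_BODY = struct.Struct("!hh")  # dx, dy as signed 16-bit integers
MAX_STEP = 10                     # hypothetical largest legal move per packet

def parse_move(packet: bytes):
    """Return (dx, dy) for a well-formed move packet, or None if the
    packet is malformed. Never raises on bad input."""
    if len(packet) < HEADER.size:
        return None
    msg_type, length = HEADER.unpack_from(packet)
    if msg_type != MOVE or length > MAX_PAYLOAD:
        return None
    # The declared length must match both what arrived and what a move needs.
    if len(packet) != HEADER.size + length or length != MOVE_BODY.size:
        return None
    dx, dy = MOVE_BODY.unpack_from(packet, HEADER.size)
    # Range-check the values too: a syntactically valid packet can still lie.
    if abs(dx) > MAX_STEP or abs(dy) > MAX_STEP:
        return None
    return dx, dy

good = HEADER.pack(MOVE, MOVE_BODY.size) + MOVE_BODY.pack(3, -2)
truncated = good[:-1]
```

Here `parse_move(good)` yields the two offsets, while `parse_move(truncated)` is rejected because the declared length no longer matches the bytes received.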

Redundancy. Say you have a hardware failure, a power cut, or your security fails and someone manages to crash your servers. How will you go about restoring the world to a state as close as possible to before the problem? You can go for backups; dumping the entire world database to a tape drive every day may be enough for your purposes (and besides, there may be reasons beyond technology failures for keeping backups - such as responding to endemic cheating), but how precisely will that be achieved? Which parts of the system will be backed up?

Another thing to consider might be something like RAID. It means a higher initial cost (instead of buying one 120GB disk for the server, you're buying two or three plus a RAID controller), but it gives you a second layer of defence before you have to go to your backups: if one disk fails, you can respond in a controlled way to replace it while the game continues to run (though you don't want to hang around, because the other disks could go too). If you do go with a RAID array, how many disks should you invest in? Which RAID level do you want? What's your plan of action when the RAID controller reports that one of the disks has failed (given that you usually can't hot-swap disks around)? How will you cope if the entire RAID array fails (say, the controller blows out)?

Even if you do include a RAID system, it will only account for what's on disk and not what's in memory; so how will you restore what was in memory? If you're not even going to try, how will you minimize the data being kept in memory at any given time, and ensure that it gets written to disk as often as possible without slowing the game down?
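One way to keep the amount of memory-only data small is to track which records have changed and flush just those to disk at regular intervals. The sketch below assumes one JSON file per player and player IDs that are safe to use as file names; the class and its methods are hypothetical, and a real game would more likely write to a database.

```python
import json
import os
import tempfile

class WorldState:
    """Keeps player records in memory and tracks which ones have changed
    since the last flush to disk."""

    def __init__(self, directory):
        self.directory = directory
        self.players = {}
        self.dirty = set()

    def update(self, player_id, **fields):
        self.players.setdefault(player_id, {}).update(fields)
        self.dirty.add(player_id)

    def flush(self):
        """Write every changed record to its own file; return how many
        records were written."""
        written = 0
        for player_id in sorted(self.dirty):
            path = os.path.join(self.directory, f"{player_id}.json")
            # Write to a temporary file first, then rename it over the old
            # one, so a crash mid-write never leaves a half-written record.
            fd, tmp = tempfile.mkstemp(dir=self.directory)
            with os.fdopen(fd, "w") as f:
                json.dump(self.players[player_id], f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, path)
            written += 1
        self.dirty.clear()
        return written
```

Calling `flush()` from a timer every few seconds bounds how much progress a crash can lose to that interval; how short you can make it without hurting performance is something to measure rather than guess.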

Synchronization. When you've got a system composed of multiple processes communicating with each other, you're bound to run into synchronization issues sooner or later. It's particularly a problem for the database servers: what if two machines are accessing the same record at the same time? Do you implement a 'locking' system so that the first one there gets it and the others have to wait or drop out with an error? Perhaps you go for atomic operations? Or just allow it to happen and to hell with the consequences? The last approach might be perfectly valid if you think the chances of several processes using a given record at the same time are low.
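The 'locking' option can be sketched with one lock per record, where a caller chooses between waiting and dropping out. This uses threads inside a single process purely for illustration; across machines you would rely on the database's own locking instead. The class, the `balances` dictionary and the record name are hypothetical.

```python
import threading
from collections import defaultdict

class RecordLocks:
    """One lock per record ID. The first caller gets the record; later
    callers either wait or give up, depending on `wait`."""

    def __init__(self):
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the dictionary itself

    def acquire(self, record_id, wait=True):
        with self._guard:
            lock = self._locks[record_id]
        # Returns True once the lock is held, or False if wait=False
        # and somebody else already holds it.
        return lock.acquire(blocking=wait)

    def release(self, record_id):
        with self._guard:
            lock = self._locks[record_id]
        lock.release()

locks = RecordLocks()
balances = {"player42": 100}  # hypothetical record

def add_gold(record_id, amount, wait=True):
    if not locks.acquire(record_id, wait=wait):
        raise RuntimeError("record busy")  # the 'drop out with an error' case
    try:
        balances[record_id] += amount  # read-modify-write, now safe
    finally:
        locks.release(record_id)
```

With `wait=True` contending callers queue up behind the first one; with `wait=False` they fail immediately and must decide for themselves whether to retry.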

