Upcoming Events
Unite 2010
11/10 - 11/12 @ Montréal, Canada

GDC China
12/5 - 12/7 @ Shanghai, China

Asia Game Show 2010
12/24 - 12/27  

GDC 2011
2/28 - 3/4 @ San Francisco, CA

More events...
Quick Stats
100 people currently visiting GDNet.
2406 articles in the reference section.

Help us fight cancer!
Join SETI Team GDNet!
Link to us Events 4 Gamers
Intel sponsors gamedev.net search:

About the article

Anyway, learning the basics of SQL shouldn't take more than a very short while to get started. If you're used to programming and some of the theory behind data storage this tutorial will be easily understood.

However, I'm more than eager to help everybody learn so I will try to make the examples easy. This is an article for beginners; you don't need to know much to get started.

This article does not show you examples in languages like PHP or ASP. It only introduces you to the powerful language of SQL.

Introduction to SQL

SQL (Structured Query Language), pronounced "sequel" (from the former name SEQL, Structured English Query Language) is the language generally used to speak to any kind of database out there. Examples of databases are mySql (my favourite, it's free, it's fast, it's usable), Microsoft Access, Microsoft SQL Server, Oracle or literally any alternative in existence. ODBC is a driver that makes (or made, it's pretty outdated) it easier to connect to the database, but nowadays it's more common to connect directly to the database since that's generally faster.

SQL has been out there for many years and a decent SQL administrator can easily earn some serious money if he/she knows enough about it. Optimizing SQL queries is an easy way to secure your job future, since most sites want to have faster web pages (duh!). However, becoming an SQL expert takes years in the field and I wouldn't even consider myself to be an expert.

Databases are becoming more relevant to all fields of programming. Modern games connect to a global database (perhaps via a web interface) and get data for high scores, member access (ever played/watched EverQuest? A lot of database storage there my friends), etc. Also on the application front databases are becoming more and more popular on the users' edge. For instance budgeting applications, but also any other kind of application where you'd want a storage solution that is expandable, searchable and fast using as little memory as possible.

Creating tables and getting started

Most database today come with a user interface which is like any windows/kde/gnome application out there, making it easy to watch, add, edit and remove tables in the database. Tables are the "groups" in the database where information is stored. Normally you start off with a name for the database (most software allows you to have several databases within the product). Call your database the name of your webpage for instance, to make it easier to relate the database to the software that is using it.

The tables in the database are containers of data. For instance, for a site like GameDev.net, the names of the tables might be "Member", "News" and "Article".

You can add tables through SQL queries but personally I think it is more trouble than its worth, so use your favourite GUI to try it out. If you are on Windows I really recommend you try Microsoft Access just because it's so easy to get started with. If you are on Linux/BSD I would suggest you to download one of the graphical interfaces available for mySql.

Inside the tables there are entities. Just as in a programming language, entities are like variables. There are different types, and the types are named differently between different databases but I'll try to describe what they should be and you can try to find the equivalent in the software you choose to use. They also have a name, and of course the data which is stored in them. So, to sum up, Name, Type and Data.

When you design your tables you only need to enter the Name and the Type, data will be added later. For now, we will do a simple news system for our favourite homepage. I'll try to be as brief as I can; please read carefully and try to understand each part I'm talking about.

Let's take a look at the following graphics for the tables and try to implement it in your database; I'm explaining each fuzziness under the image.

The member table

As we would like to store the member's name, password, email and URL to a web page (if the user has such), we need to store just that in the database.

First of all, the ID field hits you. Is this auto generated? Well, in a sense it is, in another it isn't. You need to create it yourself, but you should make sure you find the right data type for it. In Access you could choose "Autonumber", and while you're at it, make sure it runs with "no duplicates" and check that "required" is also set to yes.

The ID field you just created will need to be a primary key. I showed this with an empty circle in front of the name. There is only one primary key per table, and personally I think it should ALWAYS be a separate field called ID. The big thing about a primary key is that it has to be unique. You might think "yeah, but the email address is unique, can't I use that?". Well, you can. But I wouldn't recommend it. First of all, the email address is a text string containing up to 50 characters in our example. Making the database need to connect tables through textfields is not only slow, but a fairly risky process since the foreign keys (I'll talk about them in a moment) will need to be more complex. Second, it will also require a lot more space due to these foreign keys. Back to them later.

By the way, you can set any entity of the table to required and "allow zero length" to true, etc. This is up to you - how much you want the database to control for you. Personally, I do this check in the WebPages instead, not by checking error values from the database. I guess this is a matter of taste and also a matter of how you are using databases. Try different methods out and use whatever works for you.

The memberName, memberPassword, memberEmail and memberWeb are named using a "member" prefix, because when you do complex queries, it's nice to know which table uses which entities. The ID's are an exception to this, if I want to know which table they belong I will have to write [Table].id to get it. More about that later.

Each of these entities is a text field with a maximum length set to 50. This does not only restrict users from entering names that are too long, but also saves storage (in theory yes, in practical database implementations a string never takes up more than length(data_text)+1 so to speak, but you don't need to bother about this, it's good to set a limit anyhow). I guess the database also figures out a good strategy for saving the data intelligently according to the limits of each entity too so setting everything up is quite a good idea.

That's basically the member table. A good idea to speed up databases is to put the entity field "index" to true on such entities that are likely to be used in comparisons with other tables in the database. Primary and foreign keys are especially targeted.

The news table

Here we also have an ID field just as with the member table. Having the same name and settings on these fields in all tables simplifies things since we never have to check the database or write down what the settings are on each of these tables.

Second you will see a field called memberID, which is our first foreign key (open up the champagne ladies and gentlemen)! This is a reference (like a link, if that makes you more comfortable) to the member table. In our case it's a reference to the member table because each news item also has an author. Now, why didn't we call this authorID, memberAuthorID or something similar? Well, I don't know. Once again, you have to make up your own mind here, I've just stuck with calling references to other tables [Tablename]ID. Of course, there can be several links to the member table from other tables. For instance all of the following could be members of an article table: memberCreatorID, memberAuthorID, memberAuthorizationID etc, for the one who created the article item in the article table, one who is actually responsible for the contents of the article, and one that verifies the article for posting on the web. When I write memberID, I'm referring to the one who created the news item, author or not.

The newsSubject entity is the same type as the member specific entities, only this entity can be 255 characters long.

The newsBody item is a new one on the heap. The memo type is from Access and simply describes the type for a whole lot of text. I believe varchar is used for text in mySql mySql and memo in Microsoft SQL Server, but I think you get the idea. This is storage that in practical use doesn't have any limits. Great for storing bodies in news items for sure!

Last but not least is the newsDate. This datafield contains the date when the table row (data is stored in "rows", entities can be seen as "columns") was created. It's a good idea to automate this feature so you don't have to enter the date yourself in your SQL queries. This is how you do it (one more time in Access - forgive me, but it's easy to start off in Access!) In the "Default" field of the entity newsDate, enter "Now()". Now() is a method that gets the date. So when the row is created, instead of entering the date through the query, it will set the date automatically. Sweet!

Tables done

Now the tables are done. You could try to enter some data through the graphical interface, but please remove them when you're done so we can continue with the article without confusion. Btw, note how nicely the dates are generated in the news table.

In most databases, there is something called relations between tables. This is so you can set up the foreign keys to point at the primary keys of the targeted table. This is a safety issue. While we are just messing around I don't think we want to set these, but if you're building a serious web site you should consider using this. It simply restricts you from removing the user "George Smithers" if he has written news items. In that case, you will simply be forced to first remove all the news items he has written, and then, after that is completed, remove the user himself. This stops the possibility of removing a user and having unreferenced material in the database (trash so to speak). As I said, when messing around, it doesn't matter what you do, but when you gain experience, you should really look this up because it will save you some database space in the long run, trust me.

Now when we're done setting up the tables (we should be up to 60 minutes by now, damn me, hehe).



  Printable version
  Discuss this article