Elementary Sorting Algorithms There is a suprisingly diverse collection of algorithms that have been developed to solve the apparently simple problem of "Sorting". The general sorting problem is simple enough to describe: Given an initially unordered array of N records, with one field distinguished as the key, rearrange the records so they are sorted into increasing (or decreasing) order, according to each record's key. Sorting algorithms are used in all kinds of applications and are necessary, for instance, if we plan to use efficient searching algorithms like Binary Search or Interpolation Search, since these require their data to be sorted. Sorting algorithms are often subdivided into "elementary" algorithms that are simple to implement compared to more complex algorithms that, while more efficient, are also more difficult to understand, implement, and debug. It is not always true that the more complex algorithms are the preferred ones. Elementary algorithms are generally more appropriate in the following situations: * less than 100 values to be sorted * the values will be sorted just once * special cases such as: o the input data are "almost sorted" o there are many equal keys In general, elementary sorting methods require O(N2) steps for N random key values. The more complex methods can often sort the same data in just O(N log N) steps. Although it is rather difficult to prove, it can be shown that roughly N log N comparisons are required, in the general case. Internal vs. External Sorting Normally, when considering a sorting problem, we will assume that the number of records to be sorted is small enough that we can fit the entire data set in the computer's memory (RAM) all at once. When this is true, we can make use of an internal sorting algorithm, which assumes that any key or record can be accessed or moved at any time. That is, we have "random access" to the data. Sometimes, when sorting an extremely large data set such as Canada's Census Data, there are simply too many records for them to all fit in memory at once. In this case, we have to resort to external sorting algorithms that don't assume we have random access to the data. Instead, these algorithms assume the data is stored on magnetic tapes or disks and only portions of the data will fit in memory. These algorithms use "sequential access" to the data and proceed by reading in, processing, and writing out blocks of records at a time. These partially sorted blocks need to be combined or merged in some manner to eventually sort the entire list. Stability When a sorting algorithm is applied to a set of records, some of which share the same key, there are several different orderings that are all correctly sorted. If the ordering of records with identical keys is always the same as in the original input, then we say that the sorting algorithm used is "stable". This property can be useful. For instance, consider sorting a list of student records alphabetically by name, and then sorting the list again, but this time by letter grade in a particular course. If the sorting algorithm is stable, then all the students who got "A" will be listed alphabetically. Stability is a difficult property to achieve if we also want our sorting algorithm to be efficient. Indirect Sorting One final issue to keep in mind when implementing a sorting algorithm is the size of the records themselves. Many sorting algorithms move and interchange records in memory several times before they are moved into their final sorted position. For large records, this can add up to lots of execution time spent simply copying data. A popular solution to this problem is called "indirect sorting". The idea is to sort the indices of the records, rather than the records themselves. --------------------------------------------------------------------------- Discuss this article in the forums
See Also: © 1999-2011 Gamedev.net. All rights reserved. Terms of Use Privacy Policy
|