Upcoming Events
Unite 2010
11/10 - 11/12 @ Montréal, Canada

GDC China
12/5 - 12/7 @ Shanghai, China

Asia Game Show 2010
12/24 - 12/27  

GDC 2011
2/28 - 3/4 @ San Francisco, CA

More events...
Quick Stats
118 people currently visiting GDNet.
2406 articles in the reference section.

Help us fight cancer!
Join SETI Team GDNet!
Link to us Events 4 Gamers
Intel sponsors gamedev.net search:

The C++ Standard Library Part 2
std::vector


Iterators and Algorithms

std::vector<> doesn't provide member functions to do everything. In fact, its member functions are only related to managing and accessing elements. For other kinds of functionality, std::vector<> provides iterators that algorithms can use to manipulate the vector's data. For example to sort a vector you can use the std::sort<>() algorithm found in the <algorithm> header.

std::vector<int> int_vector;
int_vector.push_back(6);
int_vector.push_back(2);
int_vector.push_back(3);
int_vector.push_back(9);
std::sort(int_vector.begin(), int_vector.end()); // sorts the vector so it now contains 2, 3, 6, 9

Iterators are generalizations of the pointer concept. std::vector<> has what are called random access iterators, the type of iterator that are most like regular pointers. 5) The begin() member function returns a std::vector<T>::iterator that refers to the first element in the vector. The end() member function returns a std::vector<T>::iterator that points to place one after the last element. Between the two, the iterators form a half open range, [begin, end), that describes all the elements in the vector.

std::vector<int> int_vector;
// fill vector
std::vector<int>::iterator first = int_vector.begin();  // itr refers to the first element of int_vector;
int i = *first;                                         // you can dereference it to read the value
*first = 5;                                             // or you can dereference it to write to it
std::vector<int>::iterator middle = first + 4;          // middle now refers to the fifth element (int_vector[4])
std::sort(first, middle);                               // sorts the first four elements
                                                        // (from first to the element before middle)
for (std::vector<int>::iterator i = int_vector.begin(); // this loop (or the moral equivalent of it) 
     i != int_vector.end();                             // appears in many algorithms. The i != int_vector.end()
     ++i) {                                             // is why you need an iterator to one after the last element
  // do stuff
}
std::vector<int>::iterator j = find(int_vector.begin(), // get an iterator into the vector that refers to an
                                    int_vector.end(),   // element whose value is 100. If there is no such element
                                    100);               // then j will be equal to int_vector.end()

With iterators, you can also insert and erase elements at arbitrary points inside the vector. When you use insert() the elements inserted are inserted before the iterator, or at the end, if you specify the end() vector.

int_vector.insert(int_vector.begin(), 5);  // inserts an element with the value 5 at the beginning of the vector
int_vector.insert(int_vector.begin() + 4, 10, 0); // inserts 10 elements with the value 0 between the fourth
                                                  // and fifth element of the vector
int_vector.erase(int_vector.begin()); // erases the first element of the vector
int_vector.erase(int_vector.begin() + 5, int_vector.begin() + 7); // erases the sixth and seventh elements of the vector
int_vector.erase(int_vector.begin() + 2, int_vector.end()); // erases everything from the third element down

However, iterators to elements in a vector may not stay valid. Some actions will invalidate iterators, which means that those iterators become unsafe to use. Also, when an iterator is invalidated, all pointers or references to the element that the iterator refers to also become invalidated. Whenever a reallocation occurs, such as when you insert() or push_back() more elements than the vector currently has capacity, all existing iterators become invalidated. Additionally, whenever you use insert(), all iterators that referenced elements after the point of insertion become unsafe to use, and erase() invalidates all iterators, pointers or references to the erased elements or the elements after the point of erasure.

In addition to producing iterators, some of std::vector<>'s member functions also accept iterator inputs. In particular, you can construct a vector from an iterator range, insert() elements from an iterator range into a vector or assign() to a vector from an iterator range. All these function take input iterators, which are all iterators that you can read from, including the std::vector<>'s iterators. 6)

std::vector<int> int_vector;
// fill the vector
std::vector<int>::iterator middle = int_vector.begin() + int_vector.size() / 2;
std::vector<int> first_half(int_vector.begin(), middle);
std::vector<int> second_half(middle, int_vector.end());
second_half.insert(second_half.end(), first_half.begin(), first_half.end());

Generally these iterator range functions are more efficient that calling the equivalent single element functions multiple times.

The Remove-Erase Idiom

One frequent operation for vector is removing all elements of a certain value. One way you can do that is repeatedly calling std::find<>() and erase() on the vector until find() returns end(). This isn't very efficient, and writing the loop necessary to do that is annoying. Another approach is called the remove-erase idiom. It looks like:

std::vector<int> int_vector;
// fill the vector
int_vector.erase( std::remove(int_vector.begin(),
                              int_vector.end(),
                              number_to_remove)
                  int_vector.end() );

Here's what it does: std::remove<>() takes two iterators that describe a range, and a value to remove from that range. It removes the elements by copying on top of the elements to be removed with values from later on in the range. So if you had values like:

1 2 3 4 2 2 3

and told it to remove all the 2's, the algorithm would do something like:

1 2 3 4 2 2 3
^
|
Here's the first one. It's not a 2, let's move on.

1 2 3 4 2 2 3
  ^
  | 
Here's the next one. It is a 2, let's find the next non 2.

1 2 3 4 2 2 3
  ^ ^
  | | 
The next one is a 3. Ok, copy that on top of the 2.

1 3 3 4 2 2 3
    ^
    | 
Now we need to copy something on top of 3's old position. Let's find the next non 2.

1 3 3 4 2 2 3
    ^ ^
    | |
This guy's a 4, that's not a 2. Let's copy that on top of the old 3.

1 3 4 4 2 2 3
      ^
      |
Now we need to find something to copy on top of the old 4's position.

1 3 4 4 2 2 3
      ^ ^
      | |
No good. That's a 2.

1 3 4 4 2 2 3
      ^   ^
      |   |
No good. That's another 2.

1 3 4 4 2 2 3
      ^     ^
      |     |
Ok that's not a 2. Copy it on top of the old 4's position.

1 3 4 3 2 2 3
        ^     ^
        |     |
Now we're out of inputs. So return the iterator after the last item we copied.

std::remove<>() doesn't try to erase the remaining elements from the container, because it doesn't know about the container, it only knows about the iterators it was given. Once remove copies forward all the non 2's, it returns an iterator to just after the range that has all the 2's removed. You can use this iterator to actually erase the elements from the vector, which is what the erase() call does. You need both, which is why this is called the remove-erase idiom.

Similar to std::remove<>(), there's also std::remove_if<>(). Instead of removing elements that are equal to a given value, std::remove_if() removes values if the satisfy a given criterea. This critirea is given passed to std::remove_if<>() as a function or function object argument. For example:

bool greater_than_five(int n) {
  return n > 5;
}
 
std::vector<int> int_vector;
// fill the vector
int_vector.erase( std::remove_if(int_vector.begin(),
                                 int_vector.end(),
                                 &greater_than_five)
                  int_vector.end() );

This removes all elements from the vector that are greater than five. The third argument to std::remove_if<>() is called the predicate. The predicate can be anything that you can call using operator() that takes the element type and returns something convertible to bool. So instead of the greater_than_five() function, you could do something like:

struct GreaterThan {
  GreaterThan(int n) : val(n) {}
  int val;
  bool operator()(int n) {
    return n > val;
  }
}
 
int_vector.erase( std::remove_if(int_vector.begin(),
                                 int_vector.end(),
                                 GreaterThan(5))
                  int_vector.end() );

This creates a special kind of object, called a function object, that you can call using operator(). In this case it return true of the value passed via operator() is greater than some value set during construction.

Another common use of std::remove_if<>() is to remove an element if one of it's member functions returns true. For example:

struct Cow {
  enum Color { Spotted, Brown };
  Color color;
  Cow(Color c) : color(c) {}
 
  bool is_brown(void) { return color == Brown; }
};
 
int main(int, char **) {
  std::vector<Cow> cow_vector;
  cow_vector.push_back(Cow(Cow::Brown));
  cow_vector.push_back(Cow(Cow::Spotted));
  cow_vector.push_back(Cow(Cow::Brown));
  cow_vector.push_back(Cow(Cow::Spotted));
  cow_vector.push_back(Cow(Cow::Brown));
  cow_vector.push_back(Cow(Cow::Brown));
  cow_vector.push_back(Cow(Cow::Brown));
  
  cow_vector.erase(std::remove_if(cow_vector.begin(),
                                  cow_vector.end(),
                                  std::mem_fun_ref(&Cow::is_brown)),
                   cow_vector.end());
 
  return 0;
}

This removes all the brown cows from the vector. In this case you use std::mem_fun_ref<>() to turn a pointer to the member function to something that can be called by std::remove_if<>() as it checks every element in the iterator range.

Const Vectors

Sometimes you may have a const vector or a const reference to a vector. This can happen for a variety of reasons, but often occurs when defining a const member function for a class type that has a std::vector<> member variable. When you have a const vector or const reference to a vector, then some of the functions change behaviour. For example, using operator[] on the vector no longer returns a reference to the element indexed. Instead it returns a const reference to the element, so you cannot assign to it. Also begin() and end() return std::vector<>::const_iterator instead of std::vector<>::iterator. This means that you can't use those iterators with algorithms that modify the vector, such as std::remove<>().


5) Pointers are actually a special kind of random access iterator. In fact, in some std::vector<> implementations, std::vector<T>::iterator is just a T *.

6) There are five classes of iterators: input, output, forward, bidirectional and random. All forward iterators are also input and output iterators. All bidirectional iterators are also forward iterators, and all random access iterators are also bidirectional iterators. std::vector<>::iterator is a random access iterator, so all std::vector<>::iterators are also input, output, forward and bidirectional iterators. More on this in the next article.





Tips, Tricks and Gotchas


Contents
  Introduction
  Iterators and Algorithms
  Tips, Tricks and Gotchas

  Printable version
  Discuss this article

The Series
  Language Features
  std::vector