The C++ Standard Library Part 2
In this article I will introduce the basics of the std::vector<> class. This is one of the most frequently used classes in the standard library. It's also one of the first classes that people not familiar with the standard library are recommended to use.
In C++, there are several ways to use the new operator. There's the familiar new MyObject(); which allocates a single MyObject. There's also new MyObject[n]; which allocates n different MyObjects in a dynamic array. While dynamic arrays are useful, they are also error-prone and can be cumbersome to use. For instance, it's relatively easy to generate a memory leak by forgetting to delete  the array. Or you may accidentally call delete instead of delete  which may cause any number of problems. 1)
The std::vector<> class is a template container class that provides the functionality of a dynamic array that is less error prone, easier to use and largely just as efficient as manually manipulating dynamic arrays.
Standard Library Conventions
The standard library is exposed through a number of header files. std::vector<> lives in the <vector> header file. So when using std::vector<> you would do:
This is not #include <vector.h>, but #include <vector>. Similarly when you include any other standard library header file there should also be no extension.
#include <list> #include <algorithm> #include <iostream>
The C++ Standard Library header files inherited from C are also extensionless. However, they undergo a name transformation: take off the .h at the end and add c to the beginning. For example #include <cstdio> instead of #include <stdio.h> or #include <cstdlib> instead of #include <stdlib.h>.
As mentioned in the last article, almost all of the components of the C++ standard library are found in namespace std 2). In this article series I generally do not use using directives or declarations and preface all the identifiers with std::.
Another thing you may notice about the standard library is that it uses all lower cases identifiers for classes, member function and basically any identifier that isn't a macro. There are historical reasons for this practice; this is not an example that you should copy in your own code.
The Basics of std::vector<>
As I mentioned, std::vector<> is defined in the <vector> header file. As a template class it has two template arguments: the type of element the vector holds, which I will often refer to as T in this article, and an allocator. The allocator argument defaults to std::allocator<T>. 3)
std::vector<int> int_vector; // vector of ints std::vector<int, std::allocator<int> > another_int_vector; // same type as int_vector std::vector<int, boost::pool_allocator<int> > a_third_int_vector; // a different type from the other two
Most of the time the default allocator argument is fine, but knowing it is there is useful when deciphering error messages when you have a compiler error involving std::vector<>. More on that later.
There are a number of different ways to construct a std::vector<>. Some of them are:
std::vector<int> int_vector; // no constructor arguments, creates a vector of size 0 (no elements). std::vector<float> float_vector(5); // single integer argument, creates a vector of size 5 (5 elements). std::vector<double> double_vector(5, 3.14159265359); // integer argument followed by T, creates a vector // of size 5, where each element is 3.14159265359 std::vector<double> another_double_vector(double_vector); // copy constructor
A std::vector<> can hold types other than primitives.
std::vector<int *> int_ptr_vector; // holds pointers to integers std::vector<MyObject> my_object_vector; // holds my_objects std::vector<std::vector<int> > int_vector_vector; // holds a vector of ints
However, a std::vector<> can't hold just any kind of object. In order to be stored in a std::vector<> a type needs to be copy constructible and assignable. There are formal definitions for this, but basically it means that copy construction and assignment should do what you expect.
const T & t = /* get a reference somehow */; T u(t); // this should compile and t and u should be equivalent afterwards const T & v = /* get a reference somehow */; u = v; // this should compile and u and v shold be equivalent afterwards
In general, most types will be valid to put inside a std::vector<>, including most user created types 4). Notable exceptions include std::auto_ptr<> and most classes that have const or reference member variables.
And unlike operator new, the type stored in a vector does not need to be default constructible. However, if it is not default constructible, some member functions won't work like you may expect, like the std::vector<> constructor that takes a single integer argument.
Once you have a vector, you can access the elements in the vector via operator, like you would an array.
std::vector<float> float_vector(5); // single integer argument, creates a vector of size 5. float_vector = 3.0f; float_vector = -1.1f;
std::vector<> also provides special member functions to get a references to the first and the last elements in the vector.
float f = float_vector.front(); // equivalent to float f = float_vector float_vector.back() = f; // equivalent to float_vector = f; // though it changes properly if you add or remove elements
You can add a new element to the end of the vector (expanding its size) by using the push_back() member function. And you can remove elements from the end of the vector (shrinking its size) by using the pop_back() member function.
float_vector.push_back(4.0f); // puts 4.0f at the end of the vector, size is now 6 float_vector.pop_back(); // removes the last element of the vector, size is now 5 float_vector.pop_back(); // removes the last element of the vector, size is now 4
push_back() requires you to supply an object to copy into the vector; you can't call push_back() by itself.
Size and Capacity
In addition to the actual elements in the vector, std::vector<> keeps track of two more details: the vector's size and its capacity, which you can access via the size() and capacity() member functions. The size() function returns the number of elements that the vector contains. The capacity() member function returns how many elements the vector can store before it needs to allocate new memory.
std::vector<int> int_vector(4); // vector of 4 ints size_t size = int_vector.size(); // returns 4 size_t capacity = int_vector.capacity(); // This is implementation specific, it can be any number // greater than or equal to 4. Let's pretend this is 8 int_vector.push_back(5); // adds an element, size is now 5, capacity still 8 int_vector.push_back(3); // adds an element, size is now 6, capacity still 8 int_vector.push_back(9); // adds an element, size is now 7, capacity still 8 int_vector.push_back(4); // adds an element, size is now 8, capacity still 8 int_vector.push_back(1); // adds an element and causes vector to allocate new memory // size is now 9, capacity is implementation defined
When you add new elements that cause the capacity of the vector to be exceeded, the vector needs to allocate new memory large enough to hold all the new elements plus some room for later growth, and then copies all the elements from the old memory to the new memory. When a push_back() call causes the vector to grow, the vector usually allocates a new capacity somewhere between 1.5 to 3 times the old size. All this work means that reallocations are very expensive.
However, if you know in advance that the vector is going to grow to a given size, you can call the reserve() member function to tell the vector to allocate enough capacity for the number of elements you expect. For example: if you're reading data from a file and the first line tells you how many elements to process.
std::ifstream ifs("myfile.txt"); int number_of_lines; ifs >> number_of_lines; std::vector<int> numbers; numbers.reserve(number_of_lines); // read each line into the vector
You can also change the size of a vector by calling the resize() member function. std::vector<>::resize() takes two arguments, the first one is the new size, and the second one is the an object to copy into any new elements added to the vector. This defaults to T(). You can also call the clear() member function to erase all elements in the vector.
std::vector<int> int_vector; // size 0 int_vector.resize(5, 4); // resizes the vector to size 5 with all the elements equal to 4 int_vector.resize(10, 3); // resizes the vector to size 10 with all the new elements equal to 3 int_vector.resize(6, 2); // resizes the vector to size 6, the 2 is ignored int_vector.resize(20); // resizes the vector to size 20, all the new elements are equal to int() or 0 int_vector.clear(); // clears all elements, size is now 0
A related member function is assign(). When you pass a integer, n, as a first argument and a const T reference as the second argument, assign() fills the vector so that it contains exactly n elements that are copies of the reference.
std:vector<int> int_vector; int_vector.resize(5, 100); // vector contains 5 elements each with the value 100 int_vector.assign(4, 10); // vector now contains 4 elements each with the value 10
Also, std::vector<> also supplies an empty() member function, which returns true if and only if the vector has 0 elements. This may be faster than using size() == 0, depending on the implementation.
1) Such as not calling the destructors for all the elements in the array or corrupting the heap.
2) exceptions being macros and the like
3) Technically, the standard library implementation may also add additional template arguments following these two as long as they have default arguments.
4) However, you should be careful that you implemented copy construction and assignment properly for user defined types you put in a std::vector<>.