Vector Wish List

Our most tangible goal for this chapter is the creation of a generic, random access, container class. A container class is one that is intended to be used to hold collections of objects of some type, called the element_type of the container. A generic container class is one that is re-usable across a wide variety of element_types without any modifications to the container class code. A random access container is one in which all elements can be accessed efficiently using a bracket operator, as with arrays. (In later chapters, we will make the relationship between requirements and efficiency more precise. For now, it is sufficient to understand that random access means that bracket operators exist and that they run in a roughly constant amount of time that is independent of the position of the element being accessed.)

Other goals for this generic class include index bounds checking (preventing clients from using out-of-bounds index values), safe memory management, and giving the client the ability to re-size the container. We will call this container a vector class.

A good way to start defining a generic class is with a wish list of desired properties. The first item on the list is that it be a generic class, so that we can re-use the code. Now, if code is to be re-usable, it is worth spending some time making it really useful (as in, people really use it). Because we envision the vector template being a replacement for arrays, a good place to start is with properties that we wish arrays had, but don't. Just imagine the ways you can get into trouble using arrays, or the ways that arrays are inconvenient, and correct them!

A vector [array] wish list is shown in this slide. It would be nice if we could set the size of an array at runtime, and maybe change the size later if the first choice is inadequate. It would be nice if we would be told when we access an out-of-bounds array element. An assignment operator that actually works would also be helpful. It would be nice if memory management (using operators new[size] and delete[]) would go away. And, of course, we want vectors to be a generic class. Then we could write code that is more readable, safer, and more convenient. For example, here we declare vectors of various types:

vector < int >            intVector (30);      // declare vector of int           size 30
vector < char >           charVector (20);     // declare vector of char          size 20
vector < widget >         widgetVector (20);   // declare vector of widget        size 20
vector < pair < char > >  charPairVector (10); // declare vector of pairs of char size 10

And, later, change the sizes as follows:

intVector.set_size (50);        // expand to size 50
charVector.set_size (60);       // expand to size 60
widgetVector.set_size (70);     // expand to size 70
charPairVector.set_size (80);   // expand to size 80

We will design our vector class to behave exactly this way. The itemized wish list essentially becomes a set of requirements for the public interface of the vector class:

Templated for vectors of any proper type T
Set size at runtime
Change size at runtime
Safe bracket operator, with syntax modelled on ordinary arrays
Assignment operator
Constructors
Copy constructor
Destructor
Automatic memory management

A person making up a wish list for a class may not think about the need for constructors and destructors, but this person would get into immediate trouble without them. After figuring out why, the person would register strong and justified objections to the class developer. People rightfully expect that objects know how to keep their house in order and do it without explicit commands to do so. Any class in which dynamic memory is involved will need to give its objects the ability to allocate memory as they are created and de-allocate it as the objects go out of existence. For a class developer to provide less is to fail in the prime directive: write correct, robust code. Therefore -- we require constructors and a destructor for the vector class.

fsu::Vector<T> Interface

A wish list is a guide toward the two principal elements of class design: (1) the public interface and (2) the implementation plan. The former consists of the public methods and public operators (and public data, in the rare case when public data is appropriate), while the latter consists of protected data, protected methods, and a data management plan. The public interface is the way objects are used by client programs (and programmers), and thus determines the utility of the class. The implementation plan ultimately determines the functionality and efficiency behind the public interface. Both, together, determine the success or failure of the class as a utility.

Our vector wish list implies the following public methods and operators:

Bracket operator [] (), with syntax and semantics modelled on the array bracket operator
Assignment operator = ()
SetSize ()
SetCapacity ()
Constructors

Default constructor
Constructor taking a size parameter
Copy constructor

Destructor

It is natural, and expected, that we software designers take the attitude that the person requesting the software may not ask for all he or she needs, and therefore we have to save the requester from this lack of foresight. Before donning a cape and rushing to improve every requester's wish list, however, consider the following two points:

The public interface of a class can be safely expanded without invalidating any client program, because the interface used by that client is still there.
Restraint and judgement must be exercised when expanding the interface, because the human user will not like an interface that is too complicated, nor will the user like objects that are slowed down by unneeded feature bloat.

Thus there is a healthy tension between richness and leanness in class design, driven by two competing notions of efficiency: it is efficient to make re-usable components more capable, because it enhances the ability of programmers to create correct code quickly; it is also efficient to create re-usable components with only the features that are needed, because unused features bloat the code and slow down the run of the program. We will find various reasons to consider such an expansion of the vector class in this chapter and following chapters.

Vector <T> Implementation Plan

Next, we need to decide on a basic implementation plan, including protected data, protected methods, and an approach to implementation of the public interface. Clues to how this might be done may also come from the wish list. For example, if a vector object is to manage memory for the user, it will need a protected place content in which to store vector elements. (Content will need to be a primitive C array.) Because, as in ordinary arrays, the client's vector may have fewer items actually stored than currently allocated memory can hold, the vector object will need to distinguish notions of size (i.e., the number of elements the client is actually using) and capacity (i.e., the number of elements that the client could use with the current allocated memory).

Maintain a protected C array content where elements are stored
Maintain two protected data fields: size and capacity

capacity is the size of the array content
size is the number of elements in the client's vector object
size <= capacity

Define a protected method that safely allocates memory for content

Memory allocation is likely to be done in the implementation of several methods; this method will isolate how memory allocation is done in one place

Manage memory and data for the user when appropriate

When capacity must be increased or decreased, new memory must be allocated, content must be copied from the old allocation to the new allocation, and the old allocation must be de-allocated
Constructors allocate memory
Destructor de-allocates memory

Build safety check into bracket operator
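
The capacity-change step in the plan above can be sketched with a bare int array. This is only an illustration under stated assumptions: the free function Reallocate and its parameter names are hypothetical and are not part of the fsu library, where the same work is done inside the class.

```cpp
#include <cstddef>

// Hypothetical helper (not part of fsu::Vector): move the first `size`
// elements of `old_content` into a new allocation of `new_capacity`
// elements and release the old allocation.
// Assumes size <= new_capacity.
int* Reallocate(int* old_content, std::size_t size, std::size_t new_capacity)
{
  int* new_content = new int[new_capacity];   // 1. allocate new memory
  for (std::size_t i = 0; i < size; ++i)      // 2. copy the elements in use
    new_content[i] = old_content[i];
  delete [] old_content;                      // 3. de-allocate the old memory
  return new_content;
}
```

The caller replaces its old pointer with the returned one. The order of the three steps matters: the old allocation cannot be released until its elements have been copied.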

With the public interface and implementation plan, we can now write the class definition.

Defining Class Vector <T>

To clearly distinguish the container classes used in this course from those in the Standard Template Library, we will:

Capitalize the first letter of the class name, as in Vector, Deque, List, Stack, Queue.
Use the classic style file naming conventions for header files, as in vector.h, deque.h, list.h, stack.h, queue.h.
Put the classes in the fsu namespace.

In addition, please note that the actual code in the library usually has extra features not discussed in the lecture notes. Thus it is best to use the notes as a guide into the fsu library, but always go to the actual header files for a complete version of an API. In particular, the distributed version of fsu::Vector<T> has several features not fully explored in this chapter, including: container class protocol, iterator support, and special display methods. These will be discussed later as needed. For now, let's look carefully at the class definition displayed in the slide:

template <typename T>
class Vector
{
public:
  // scope Vector<T>:: type definitions
  typedef T ValueType;

  // constructors - specify size and an initial value
  Vector  ();                      // vector of size = 0 and capacity = defaultCapacity
  explicit Vector (size_t sz);     // vector of size = capacity = sz ...
  Vector  (size_t sz, const T& t); // ... and all elements = t
  Vector  (const Vector<T>&);     // copy constructor
  virtual ~Vector ();              // destructor

  // member operators
  Vector<T>& operator =  (const Vector<T>&); // assignment operator
  Vector<T>& operator += (const Vector<T>&); // expand to append argument
  T&          operator [] (size_t);            // bracket operator
  const T&    operator [] (size_t) const;      // const version

  // other methods
  bool     SetSize     (size_t);    // set size as specified, change capacity iff needed
  bool     SetSize     (size_t, const T&); // ... and initialize new elements
  bool     SetCapacity (size_t);    // force capacity change (up or down)
  size_t   Size        () const;    // return size
  size_t   Capacity    () const;    // return capacity

  // Container class protocol
  bool     Empty       () const;    // 1 iff empty
  bool     PushBack    (const T&);  // expand by 1 new element appended at end
  bool     PopBack     ();          // contract by 1 from end
  void     Clear       ();          // make size = 0
  T&       Front       ();          // return front element (index 0)
  const T& Front       () const;    // const version
  T&       Back        ();          // return back element (index size - 1)
  const T& Back        () const;    // const version

  // Iterator support
  typedef      VectorIterator<T> Iterator; 
  friend class VectorIterator<T>;
  Iterator     Begin       () const;
  Iterator     End         () const;
  Iterator     rBegin      () const;
  Iterator     rEnd        () const;

  // Generic display methods 
  void Display    (std::ostream& os, char ofc = '\0') const;
  void Dump       (std::ostream& os) const;

  // overload of const T* operator, facilitates use of previously defined array functions
  // operator const T* () const; // auto conversion of vector to array
  // removed 11/10/04: new standard does not allow, creates ambiguities

protected:
  // variables
  size_t size_,        // current size of vector, 
         capacity_;    // size of content_ array
  T*     content_;     // pointer to the primitive array elements

  // method
  static T*  NewArray (size_t);  // safe space allocator
} ;

Constructors and destructor
There are four constructors declared: (1) a constructor with no parameters, (2) a constructor taking a size parameter, (3) a constructor taking a size parameter and an initial element value, and (4) a copy constructor. There is an interesting situation regarding (1), the constructor with no parameters.

If no constructor is specified for the class, then a parameterless constructor for the class will be created automatically by the compiler
If any constructor is specified for the class, then no constructor will be created automatically; in particular, if any constructor is specified, but no parameterless constructor is specified, then there will be no parameterless constructor
The parameterless constructor is often called the default constructor, whether or not it is automatically created
Vectors of objects cannot be created unless the class has a default constructor

All this boils down to the following advice: if your clients need to make arrays (or vectors) of objects of a given class, then the class needs a default (parameterless) constructor, which needs to be explicitly specified in the class if any other constructor is specified. The vector class requires specified constructors, because object creation requires dynamic memory allocation.
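
The rules above can be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler. The three classes and the function ArrayOfObjects are hypothetical names invented for this illustration.

```cpp
#include <type_traits>

// Hypothetical classes for illustration only.
struct NoConstructors            // compiler supplies a default constructor
{
  int value;
};

struct OnlyParameterized         // declaring any constructor suppresses it
{
  explicit OnlyParameterized(int v) : value(v) {}
  int value;
};

struct BothConstructors          // default constructor declared explicitly
{
  BothConstructors() : value(0) {}
  explicit BothConstructors(int v) : value(v) {}
  int value;
};

static_assert(std::is_default_constructible<NoConstructors>::value,
              "compiler-generated default constructor exists");
static_assert(!std::is_default_constructible<OnlyParameterized>::value,
              "no default constructor: arrays of this type cannot be created");
static_assert(std::is_default_constructible<BothConstructors>::value,
              "explicitly declared default constructor exists");

// Returns the number of elements in a default-constructed array.
// Replacing BothConstructors with OnlyParameterized here would not compile.
int ArrayOfObjects()
{
  BothConstructors a[3];
  return static_cast<int>(sizeof(a) / sizeof(a[0]));
}
```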

A copy constructor is a bit mysterious at first. Simply recognizing one can be difficult, until you know the pattern:

  X (const X& );

is the prototype of the copy constructor for class X. This is the pattern to look for. (The language also accepts a copy constructor whose parameter is a non-const reference, but the const form is the standard one.) Note the features of this pattern. Like other constructors, there is no return value type; the name of the method is the same as the class name; the single parameter, passed by reference, is an object of type X, the same class name; and finally, the parameter is passed as a const reference, which means "read only", i.e., the referenced object cannot be modified by the call of the function.

The copy constructor is invoked implicitly whenever client code implies that an object copy is needed. (Explicit copies are usually made by invoking the assignment operator.) There are two common situations in which implicit copies of objects are required:

When an object is returned by value from a function
When object parameters are passed by value to a function

In both these instances, an object must be copied, and the copy constructor is invoked. Every class is required to have a copy constructor. One will be created by the compiler if none is specified in the class. The issue of whether the compiler-created default constructor and copy constructor are appropriate for a given class is a subtle one. [See Deitel, Chapters 6, 7.]
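
To see the copy constructor at work, here is a small sketch that counts copies. The class Tracker and the helper functions are hypothetical, made up for this illustration. Only the pass-by-value case is counted exactly, because compilers are permitted to eliminate some copies when an object is returned by value, so that count can vary.

```cpp
// Hypothetical class that counts calls to its copy constructor.
class Tracker
{
public:
  Tracker() {}
  Tracker(const Tracker&) { ++copies; }   // copy constructor pattern X(const X&)
  static int copies;                      // number of copies made so far
};

int Tracker::copies = 0;

// Parameter passed by value: the argument is copied into t.
void TakeByValue(Tracker t) { (void)t; }

// Parameter passed by const reference: no copy is made.
void TakeByReference(const Tracker& t) { (void)t; }

// Returns how many copies a single pass-by-value call makes.
int CopiesForPassByValue()
{
  Tracker a;
  int before = Tracker::copies;
  TakeByValue(a);
  return Tracker::copies - before;
}

// Returns how many copies a single pass-by-reference call makes.
int CopiesForPassByReference()
{
  Tracker a;
  int before = Tracker::copies;
  TakeByReference(a);
  return Tracker::copies - before;
}
```

The comparison also shows why the vector methods take their Vector<T> parameters by const reference: passing a large container by value would copy every element.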

Assignment operator
The assignment operator is used explicitly by client programs to make copies of objects. As with default and copy constructors, every class must have an assignment operator, and one is therefore created by the compiler unless one is specified in the class definition. Typically, the issues of whether the compiler-created assignment operator is appropriate are the same as for the copy constructor and default constructor. Both the copy constructor and the assignment operator must make an object copy. It is often useful to isolate the object copy code with a protected method, designed to be called by both the copy constructor and the assignment operator.

Note that the assignment operator takes a parameter with the same specification as the copy constructor. Note also that the assignment operator has as return type a reference to an object. We will discuss the import of this under implementation. The parameter specified for the assignment operator (and any other member operator) is really the second parameter of the operator, the first being the object that is making the call. To help understand this point, look at the following code fragment:

  Vector <char> v(2), w;
  v[0] = 'a';
  v[1] = 'b';
  w = v;

The first line declares two vectors of char: v has size 2 and w has no specified size. (The default constructor is called for w and the size-specific constructor is called for v.) The next two lines establish values for the two elements of v using the char assignment operator. The last line assigns v to w using the Vector<char> assignment operator. If we then compiled and ran the code

  v.Dump(cout);
  w.Dump(cout);

we would see on the screen:

  (a,b)
  (a,b)

In other words, w would be an exact copy of v, including size and content. This is exactly how we would expect assignment to work. Here is what is going on behind the scenes:

The line w = v is really w calling its assignment operator with parameter v. (In fact, a member operator can be called using function syntax. The following two lines of code have identical semantics:

  w = v;
  w.operator = (v);

The first line follows the familiar operator syntax, while the second line follows operator function syntax. Each of these lines amounts to the same thing -- a call by w to its assignment operator member with parameter v.)

To be assigned the value v, w must first prepare itself to receive new data, and then copy data from v to itself. You should come back to this point when we get to the implementation of assignment.

Bracket operator
The bracket operator is another member operator that takes one (explicit) parameter, an index value, and returns a reference to the vector element at that value. Because the return value is a reference, rather than a copy, it can be used on the left of an assignment and the actual vector element will be re-assigned a value. Here is the bracket operator used in this way, followed by the operator function syntax for the same call:

  w[1] = 'c';
  w.operator [] (1) = 'c';

First, w calls its bracket operator with index 1, returning a reference to the index 1 element of w, which is then assigned the value 'c'. Note that this is the char assignment operator here. A line of code such as

  w[0] = v[1];

results in calls to three different operators: First, w calls its bracket operator with index 0, returning a reference to the index 0 element of w. Second, v calls its bracket operator with index 1, returning a reference to the index 1 element of v. And third, the char reference on the left calls its assignment operator with the char reference on the right as its parameter.

Sizing methods
Clients often need to set the size of a vector object explicitly. More rarely, clients may need the ability to reserve memory of a certain size by setting vector capacity. They will therefore also need to be able to discover the current size and capacity of a vector object.

Container class protocol
These methods are part of a small set of standard methods that we will call collectively the container class protocol. The need for this pattern will become evident as we invent additional container classes and investigate abstract data types. For now, it suffices to understand what these methods do.

Empty() returns true (one) iff Size() returns zero
PushBack() adds an element to the container at the "back", increasing size by one
PopBack() removes the element at the back and decreases size by one
Clear() removes all elements and sets size to zero
Front() returns a reference to the front (index 0) element
Back() returns a reference to the back (index size - 1) element

A few more operators will be added to the container class protocol in subsequent chapters.

Display methods
Display(os, ofc) sends the vector elements to the ostream os, with the char ofc between each pair of elements (or nothing between, if ofc == '\0'). Dump(os) sends a structural picture of the object to os. Dump() is primarily used in class testing, and not intended for client use. It may be protected or removed entirely prior to public release of the class.

Protected data
As planned, we have three protected data items. Size and capacity denote the current size and memory storage capacity of a vector object. Content is a pointer to the element type, where an array of type T can be stored.

Static method NewArray()
This method meets our stated goal of isolating memory allocation in one place. We will discuss its implementation below. The only point to discuss here is the static keyword. Static methods have several interesting properties:

Static methods are associated with the class, not particular objects
Static methods do not have an implicit object parameter
Static methods may be called by an object using regular "dot" syntax or using the scope resolution operator without an object.

We will return to these points again, and you can read up on static class members on pages 441 (for data) and 485 (for methods) of Deitel.
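
The three properties can be seen in a short sketch. The class Counter and the function CallBothWays are hypothetical, invented for this illustration, and have nothing to do with the Vector code.

```cpp
// Hypothetical class for illustration only.
class Counter
{
public:
  Counter() : id_(next_++) {}
  int Id() const { return id_; }             // ordinary method: uses this object
  static int Created() { return next_; }     // static method: no implicit object
private:
  int id_;
  static int next_;                          // one variable shared by the class
};

int Counter::next_ = 0;

// Calls the static method both ways and reports whether the results agree.
bool CallBothWays()
{
  Counter a, b;
  int viaObject = a.Created();        // "dot" syntax through an object
  int viaClass  = Counter::Created(); // scope resolution, no object needed
  (void)b;
  return viaObject == viaClass;
}
```

Because Created() has no implicit object parameter, it cannot refer to id_. It can only use data that belongs to the class as a whole, such as next_.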

Non-Member Operator Overloads

We will overload three operators for class Vector<T>. Often, overloaded operators require friend status, but in this case they do not, hence they are declared outside the class definition.

template < typename T >
std::ostream& operator << (std::ostream& os, const Vector<T>& a);

template < typename T >
bool     operator == (const Vector<T>&, const Vector<T>&); 

template < typename T >
bool     operator != (const Vector<T>&, const Vector<T>&);

Be sure to review the concept and use of friend status in your C++ references.

The File vector.h

The class definition and its operator overload prototypes are contained in the file vector.h. The contents of this file follow a standard pattern. First, there is a file header, containing information about the file and its contents. The first line of the file is the name of the file. You might also expect to find a date and the name of the creator in the next two lines. Following this basic identification is a documentary statement about the file contents. (This is important information provided by the creator for the user.)

The rest of the file, the body, is code intended to be understood by either the preprocessor or compiler. The first non-comment line of the file is usually a preprocessor command designed to protect against multiple definitions of the same class or function. A typical trick is to use the name of the file in some mutated way as a preprocessor identifier, and conditionally enter the actual code. The pattern we would like to follow in this course is:

/*  vector.h
    author name
    creation date
    (optional modification dates)

    (general documentation for the code contained in the file body)
*/

#ifndef _VECTOR_H   // protect against multiple reads
#define _VECTOR_H

namespace fsu        // namespace for the fsu code library
{

(code goes here)

#include <vector.cpp> // include separate implementation file inside namespace
} // end namespace
#endif

The include statement just before #endif is used only if some or all of the implementation code is placed in a separate file.

Note that the effect of the last include statement is to make it appear that all of the code (declaration, definition, and implementation) is in the header file. This is the standard for template code, because template code cannot be compiled to object code without first substituting a real type for the typename parameter(s) in the template.

The File vector.cpp

We have made the choice of putting the implementation of the Vector<> code in a separate file. This choice is made only for convenience. When vector.h is included in any other file, the implementation file is automatically also included. We say that the implementation is logically in the same file as the class definition. This is the accepted practice for template code: implementations (whether template functions or template classes) are logically part of the header file that contains the function prototypes and/or class definitions.

/*  vector.cpp
    author name
    creation date
    (optional modification dates)

    (general documentation for the code contained in the file body)
    
    "slave" file: this file is a logical extension of vector.h
*/

// protect against multiple reads provided by master file
// namespace provided by master file

static const size_t  defaultCapacity = 10;

(implementation code goes here)

(end of file)

The .cpp file follows the same pattern as that described above for the .h file. The file header usually contains information about the implementation rather than client documentation. The body contains the code implementing all of the methods and functions defined in the corresponding .h file. We now proceed to implement the class Vector<T>.

Implementing Vector<T> NewArray()

This is a protected static method, which amounts to having a stand-alone function whose scope is limited to the two files vector.h and vector.cpp and which can only be accessed by members of the class. Using the static keyword seems an elegant way to accomplish this.

Before getting into the details of the implementation, it's worth looking at the form of the implementation. This is a pattern that will repeat itself often. The pattern is:

  template < typename T >
  ReturnType ClassName < T > :: MethodName (parameters)
  {
    // function body
  }

which is the same as the pattern for ordinary function templates, with scope resolution added for the method name. Scope resolution is necessary to identify properly the function name as a method in the class.

The NewArray() method is a key to the entire implementation, hence we are discussing its implementation first. The task to be performed is to allocate new memory for the content array. The benefit of isolating this task in a protected method is mainly one of organization: we could repeat essentially the same code in various places scattered around the implementation file. The disadvantages of doing so are, first, we would have a gaggle of code fragments that should all operate the same way, so if one is upgraded we must search for all the others and upgrade them as well. Second, we might even get lazy or careless and implement one or two of these incorrectly, creating memory allocation bugs that would be hard to track down. Having the allocation isolated here will provide a place to upgrade uniformly, a place to change the memory allocator, and a place to bring in exceptions in future software releases. Here is the code:

template <typename T>
T* Vector<T>::NewArray(size_t newcapacity)
// safe memory allocator
{
   T* Tptr;
   if (newcapacity > 0)
   {
      Tptr = new(std::nothrow) T [newcapacity];
      if (Tptr == 0)
      {
         std::cerr << "** Vector error: unable to allocate memory for array!\n";
         exit (EXIT_FAILURE);
      }
   }
   else
   {
      Tptr = 0;
   }
   return Tptr;
}

The implementation is not complicated. First look at the incoming parameter to see if it is zero; if it is, no memory is requested and we return zero (a legitimate pointer value). Otherwise, we request a new allocation of memory and check to see if our request has been granted. If it has been, we return a pointer to the new allocation. If it has not, this version writes an error message to cerr, "the error channel", and terminates the program by calling exit(EXIT_FAILURE). Thus the user is made aware of the failure, and client code never receives a pointer to memory that was not allocated. This is a simple implementation that provides safety and can be upgraded easily in the future.

One point to note is the use of the argument std::nothrow for operator new. This design choice prevents allocation failure from throwing an exception, thus confining handling the problem to this location.
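
For experimenting outside the class, the allocator shown above can be written as a stand-alone function template with the headers it needs. This is a sketch, assuming only the standard library. Making NewArray a free function (rather than a protected static method) is a change made here purely so the code compiles in isolation.

```cpp
#include <cstddef>   // std::size_t
#include <cstdlib>   // std::exit, EXIT_FAILURE
#include <iostream>  // std::cerr
#include <new>       // std::nothrow

// Free-function version of the allocator (assumption: written as a
// non-member template only so that it can be compiled in isolation).
template <typename T>
T* NewArray(std::size_t newcapacity)
{
  if (newcapacity == 0)
    return 0;                                   // nothing to allocate
  T* Tptr = new(std::nothrow) T [newcapacity];  // yields 0 instead of throwing
  if (Tptr == 0)
  {
    std::cerr << "** error: unable to allocate memory for array!\n";
    std::exit(EXIT_FAILURE);
  }
  return Tptr;
}
```

A caller owns the returned array and must eventually release it with delete [], which is exactly the job the Vector destructor takes on.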

Implementing Vector<T> Constructors and Destructor

The constructors provide the first uses of our memory allocator NewArray(), and it is a very good use: they become one-liners!

Before going further, do you remember that constant declared in file vector.cpp, the one called "defaultCapacity"? We didn't mention it at the time, just slipped it in. It is also declared using the keyword static. This is the legacy C use of "static", which means file scope. Note also that defaultCapacity is a constant, so even though it appears to be exposed, it is actually quite a safe device. Here are key properties of static const declarations (outside of classes):

They are visible only from within the file in which they are declared
They must be given a value where they are declared
Once given a value, the value cannot be changed

So, we have this constant defaultCapacity. Guess what? It becomes the default capacity of a vector object when no other is specified! The default constructor (the one with no parameters) makes a vector object with capacity_ = defaultCapacity and size_ = 0. These values are established in the initialization list before the body of the method. The method body consists only of a call to NewArray() to establish the content_ array.

template <typename T>
Vector<T>::Vector() : size_(0), capacity_(defaultCapacity), content_(0)
// Construct a vector of zero size and default capacity
{
  content_ = NewArray(capacity_);
}

The constructor with two parameters requires an initial size and an initializing value for the vector elements. In this case we set the initial size and initial capacity to the client-supplied size argument, then initialize the vector elements one at a time:

template <typename T>
Vector<T>::Vector(size_t sz, const T& Tval) : size_(sz), capacity_(sz), content_(0)
// Construct a vector of size and capacity sz, each element initialized to Tval
{
  content_ = NewArray(capacity_);
  for (size_t i = 0; i < size_; ++i)
    content_[i] = Tval;
}

A constructor with one parameter for size has been added to the class in the code library. One addition that you will see in the distributed version of vector.h is that this constructor is preceded by the keyword explicit. This keyword prevents the constructor from being called implicitly, as might otherwise happen when the compiler is trying to make sense out of a statement like "v = n", where v is a vector and n is an integer. This would certainly represent a typing error, but the existence of a 1-parameter constructor would allow the compiler to actually make sense of the statement: it would construct a temporary vector of size n and assign it to v, silently replacing v with a vector of size n. The explicit keyword prevents such implicit type conversions.

template <typename T>
Vector<T>::Vector(size_t sz) : size_(sz), capacity_(sz), content_(0)
// Construct a vector of size and capacity sz; elements are default-constructed
// (explicit appears only on the declaration inside the class definition)
{
  content_ = NewArray(capacity_);
}
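
The effect of the explicit keyword can be checked with two stripped-down classes. This is a sketch, assuming a C++11 compiler. ImplicitSized, ExplicitSized, and AccidentalAssignment are hypothetical names used only here, standing in for a vector class with and without the keyword.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins for a vector class, for illustration only.
struct ImplicitSized
{
  ImplicitSized(std::size_t sz) : size(sz) {}            // usable as a conversion
  std::size_t size;
};

struct ExplicitSized
{
  explicit ExplicitSized(std::size_t sz) : size(sz) {}   // direct calls only
  std::size_t size;
};

// An integer converts silently to ImplicitSized ...
static_assert(std::is_convertible<std::size_t, ImplicitSized>::value,
              "implicit conversion allowed");
// ... but not to ExplicitSized, so "v = n" is rejected by the compiler.
static_assert(!std::is_convertible<std::size_t, ExplicitSized>::value,
              "implicit conversion blocked by explicit");

// Returns the size of v after the accidental assignment "v = n".
std::size_t AccidentalAssignment()
{
  ImplicitSized v(10);
  std::size_t n = 3;
  v = n;              // compiles: builds a temporary ImplicitSized(n), then assigns
  return v.size;      // v now has size n, not 10
}
```

With ExplicitSized in place of ImplicitSized, the line "v = n;" fails to compile, which turns a silent logic error into a compile-time error.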

The destructor merely de-allocates memory, a simple but vital task:

template <typename T>
Vector<T>::~Vector()         
// destructor
{
   delete [] content_;
}

The copy constructor code is as follows:

template <typename T>
Vector<T>::Vector(const Vector<T>& source)  
  : size_(source.size_), capacity_(source.capacity_)
// copy constructor      
{
   content_ = NewArray(capacity_);
   for (size_t i = 0; i < size_; ++i)
   {
      content_[i] = source.content_[i];
   }
}

This code is repeated for the assignment operator, our next case.

Implementing Vector<T> Assignment Operator

The assignment operator header follows the general pattern for member operators

template < typename T >
ReturnType ClassName<T>::operator symbol (parameter list)
{
  // operator function body
}

as well as the more specialized established pattern for assignment operators

typename& operator = (const typename&)

which contains two references to the same typename, one the return value and the other the parameter. The only place the pattern allows for change is in typename. This assignment operator pattern has developed in order to facilitate multiple calls to =(), as in

x = y = z;

Operator =() associates from right to left, meaning that parentheses are implied as

(x = (y = z));

This illustrates the utility of returning the typename reference as a value. After the first call (y = z) returns a reference to (the new) y, the second call is (x = y), which is effectively the same as x = z since y has already been made equal to z. The entire result is as we would expect: x and y are now equal to z.
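
Here is a minimal sketch of chained assignment with a user-defined class. Cell and ChainAssign are hypothetical names made up for this illustration. The assignment operator follows the same pattern as the one declared for Vector<T>.

```cpp
// Hypothetical class for illustration only.
class Cell
{
public:
  explicit Cell(int v = 0) : value_(v) {}
  Cell& operator = (const Cell& source)   // same pattern as Vector<T>
  {
    value_ = source.value_;
    return *this;                         // reference to the assigned-to object
  }
  int Value() const { return value_; }
private:
  int value_;
};

// Returns true iff x and y both equal z after the chained assignment.
bool ChainAssign()
{
  Cell x(1), y(2), z(3);
  x = y = z;                              // parsed as x = (y = z)
  return x.Value() == 3 && y.Value() == 3 && z.Value() == 3;
}
```

If operator = returned void instead of a reference, the inner call (y = z) would produce nothing to hand to the outer call, and "x = y = z;" would not compile.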

Assignment is similar to copy with one huge caveat: The copy constructor is always building a new object, so it does not need to worry about proper care of an old object. Assignment, on the other hand, must always deal with an existing object that needs careful attention.

The first pitfall is the possibility of self-assignment. A client program may write something like

Vector <int> v, x;
// yada dada
v = v;

Now this may seem ridiculous, but it can happen. First, never underestimate the ability of a client (program) to do something unexpected, such as writing "v = v;" explicitly. But, second, it could happen in a more subtle way. The "yada dada" part of the code could have been very complex, with the result that the vector x has been significantly transformed, assigned to and from, and other things, and then we could have the statement "v = x;" without knowing whether x is v or not. The statement "v = x;" could very well be the equivalent of "v = v;" without the client's knowledge. We have to guard against that possibility, because self-assignment is often a disaster. Keeping all this advice in mind, here is an implementation for assignment in class Vector:

template <typename T>
Vector<T>& Vector<T>::operator = (const Vector<T>& source) 
// assignment operator
{
  if (this != &source)
  {
    // the NULL case
    if (source.capacity_ == 0)
    {
      if (capacity_ > 0)
        delete [] content_;
      size_ = capacity_ = 0;
      content_ = 0;
      return *this;
    }

    // set capacity 
    if (capacity_ != source.capacity_)
    {
      if (capacity_ > 0)
        delete [] content_;
      capacity_ = source.capacity_;
      content_ = NewArray(capacity_);
    }

    // set size_
    size_ = source.size_;

    // copy content
    for (size_t i = 0; i < size_; ++i)
    {
       content_[i] = source.content_[i];
    }
  }  // end if
  return *this;
}  // end assignment operator =

To assign, we first make sure we are not self-assigning by the test "(this != &source)". Recall that this is a keyword available inside every non-static member function: it is a pointer to the current object, that is, it holds the address of the object on which the method was called. Also recall that &x is the address of x. So this test asks whether the calling object *this has the same address as the parameter object source. That really is the appropriate test. We don't care whether the objects are equal; we care only whether they are the same object. Plainly, if the two objects are the same, we do not need to do anything. Not so plainly, omitting the test can be a disaster: an assignment operator that deletes the old content before copying would, under self-assignment, destroy the very data it is about to copy. (Trace through the code, assuming (this == &source), and see what happens. In this particular version the capacities necessarily match, so the delete is skipped and each element is merely copied onto itself; the test still avoids that wasted work.)

Once past this test, assignment is straightforward. The null case (a source with zero capacity) requires only releasing any memory held by *this and setting content_ to zero. The main case is more interesting. If the capacities differ, we first dispose of the memory allocated to *this, the receiving object, and then allocate the right amount of new memory; this step is skipped when the capacities are already the same. Finally, now that the capacities and sizes are the same, we copy all elements from the source to the target (*this again). The return value is, in all cases, *this (or rather, a reference to *this).

Implementing Vector<T> Bracket Operator

The bracket operator header follows the general pattern for member operators:

template < typename T >
ReturnType ClassName<T>::operator symbol (parameter list)
{
  // operator function body
}

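The bracket operator is interesting on two points: first, we can rig a safety mechanism for out-of-range indices; second, it is our first const method. The safety mechanism is straightforward to implement: an index at or beyond size_ produces an error message, and the program is terminated if the index is also at or beyond capacity_, where the access would fall outside allocated memory. You may wonder why there is no check for indices less than zero, until you realize that index values are of type size_t, an unsigned integer type, so a negative argument arrives as a very large positive value and is caught by the same test.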

There are actually two versions of the bracket operator. One version is an ordinary member operator that returns a reference to the element at that index. The other is a const member operator that returns a const reference to the element at that index. The compiler selects the appropriate version for a client program automatically.

As to const: placed after the parameter list in a method declaration, the keyword const means that the method is guaranteed not to change the state of the calling object. The const attribute means that the operator can be called by const objects, and it means that if we inadvertently do something "non-const" in the implementation, the compiler will complain. Here are the implementations:

template <typename T>
T& Vector<T>::operator [] (size_t i)
{
   if (i >= size_)
   {
      cerr << "** Vector<T> Error: vector index out of range!\n";
      if (i >= capacity_)
         exit(EXIT_FAILURE);
   }
   return content_[i];
}

template <typename T>
const T& Vector<T>::operator [] (size_t i) const
{
   if (i >= size_)
   {
      cerr << "** Vector<T> Error: vector index out of range!\n";
      if (i >= capacity_)
         exit(EXIT_FAILURE);
   }
   return content_[i];
}

The two implementations have identical bodies; they differ only in the return type and the const qualifier.

Implementing Vector<T> Non-Member Operator Overloads

Three non-member operators are overloaded for Vector:

template <typename T>
ostream& operator << (ostream& os, const Vector<T>& v)
{
  v.Display(os);
  return os;
}

template <typename T>
bool operator == (const Vector<T>& v1, const Vector<T>& v2)
{
  if (v1.Size() != v2.Size())
    return 0;
  for (size_t i = 0; i < v1.Size(); ++i)
    if (v1[i] != v2[i])
      return 0;
  return 1;
}

template <typename T>
bool operator != (const Vector<T>& v1, const Vector<T>& v2)
{
  return !(v1 == v2);
}

These operators are not class members, therefore their headers follow the non-member template pattern

template < typename T >
ReturnType FunctionName (parameter list)
{
  // function body
}

Even friends, which are declared inside the class, follow this non-member pattern.

We have overloaded the output operator using the output operator pattern

ostream& operator << (ostream& , typename&)

containing two references to ostream, one the return value and the other the first parameter. The only place the pattern allows for change is the typename in the second parameter slot. This output operator pattern has developed in order to facilitate multiple calls to <<(), as in

cout << x << y << z;

Operator <<() associates from left to right, meaning that parentheses are implied as

(((cout << x) << y) << z);

This illustrates the utility of returning the ostream reference as a value. After the first call (cout << x) returns a reference to cout, the second call is (cout << y), and so on.

Operators ==() and !=() return a bool; the code returns 0 and 1, which convert to false and true. The code for operator ==() has some efficiencies, stopping the test at any point when the answer is known. Ultimately, though, the cost remains linear: on the order of size/2 tests on the average, and all size tests whenever the two vectors are equal.

Implementing Vector<T> Display Methods

There are two methods in Vector designed to display vector data. The first, Display(), is a function that gives slightly more flexibility than is possible with the output operator via the second parameter char ofc.
The value of this parameter is used as an output format character (hence "ofc") that is placed after each vector element in the output stream. Common ofc instances are the null character '\0', which places nothing between elements, the blank character ' ', tab '\t', and end-of-line '\n'.

template <typename T>
void Vector<T>::Display(ostream& os, char ofc) const
{
  size_t i;
  if (ofc == '\0')
    for (i = 0; i < size_; ++i)
      os << content_[i];
  else
    for (i = 0; i < size_; ++i)
      os << content_[i] << ofc;
}  // end Display()

The second display method is Dump(), which is designed to help in testing and debugging the class.

template <typename T>
void Vector<T>::Dump(ostream& os) const
{
  size_t i;
  if (size_ == 0)
  {
    os << "()";
  }
  else
  {
    os << '(' << content_[0];
    for (i = 1; i < size_; ++i)
    {
      os << ',' << content_[i];
    }
    os << ')';
  }
}  // end Dump()

Dump() should display as accurate a view as possible of the actual internal structure of the container. Dump() might well be removed, or privatized, prior to public release of the software. Dump() will be a standard feature of our container classes in this course.

Implementing Vector<T> SetCapacity()

Setting capacity of a vector is under the direct control of a client program through the method SetCapacity(). This method would be used where exact control of allocated memory for the vector footprint is needed.

// Reserve more (or less) space for vector growth;
// this is where memory is allocated. Note that this is
// an expensive operation and should be used judiciously.
// SetCapacity() is called by SetSize() only when increased capacity
// is required. If the client needs to reduce capacity, a call must be
// made specifically to SetCapacity.
template <typename T>
bool Vector<T>::SetCapacity(size_t newcapacity)
{
  if (newcapacity == 0)
  {
    delete [] content_;
    content_ = 0;
    size_ = capacity_ = 0;
    return 1;
  }
  if (newcapacity != capacity_)
  {
    T* newcontent = NewArray(newcapacity);
    if (newcontent == 0)
      return 0;
    if (size_ > newcapacity)
      size_ = newcapacity;
    for (size_t i = 0; i < size_; ++i)
    {
      newcontent[i] = content_[i];
    }
    capacity_ = newcapacity;
    delete [] content_;
    content_ = newcontent;
  }
  return 1;
}  // end SetCapacity()

This public method gives the client control over how much, or how little, memory is allocated for their vector objects. There are special circumstances, such as embedded systems applications, when such control is desirable and even essential. For most cases, though, such control is unnecessary and the client would be well advised not to bother using this method at all. SetCapacity() will be called when necessary by other methods, such as SetSize() and PushBack(), and these will do so with runtime efficiency taken into account. SetCapacity() is a costly method to invoke: both NewArray() and the content copying routine require runtime proportional to the size of the vector.

The implementation is straightforward. First check to see if work can be avoided (it can, in the cases where newcapacity is zero or the same as old capacity_); then get a new content array allocated, and copy data from old to new space. Finally, delete old space.

Implementing Vector<T> SetSize() and Clear()

Setting size is a matter of changing the value of the data member size_ and checking to see that current capacity is not exceeded. A call to SetCapacity() is made if necessary. Note that SetSize() never lowers capacity.
A second version of SetSize() initializes all new elements to a specified value:

template <typename T>
bool Vector<T>::SetSize(size_t newsize)
// (re)set size
{
  if (newsize > capacity_)
    if (!SetCapacity(newsize))
      return 0;
  size_ = newsize;
  return 1;
}

template <typename T>
bool Vector<T>::SetSize(size_t newsize, const T& Tval)
// (re)set size with extra elements initialized to the same value
{
  size_t i, oldsize = size_;
  if (!SetSize(newsize))
    return 0;
  for (i = oldsize; i < newsize; ++i)
  {
    content_[i] = Tval;
  }
  return 1;
}

The 2-parameter version of SetSize() goes nicely with the initializing constructor discussed earlier.

The Clear() method is implemented by re-setting the size_ data member to zero:

template <typename T>
void Vector<T>::Clear()
{
  size_ = 0;
}

The simplicity of implementation of Clear() is startling. Remember, the policy is not to reduce capacity unless explicitly commanded by the user. There is nothing gained by overwriting the excess capacity uncovered by setting size to zero.

Implementing Vector<T> Size() and Capacity()

These two methods are typical information suppliers.

template <typename T>
size_t Vector<T>::Size() const
{
  return size_;
}

template <typename T>
size_t Vector<T>::Capacity() const
{
  return capacity_;
}

They are declared as const because they should not, and do not, change the state of the vector.

Implementing Vector<T> Front() and Back()

Two more informational methods provide the content at the beginning and end of the vector via references. It is an error to use them on an empty vector.
template <typename T>
T& Vector<T>::Front()
{
  if (size_ == 0)
  {
    std::cerr << "** Vector error: invalid Front() called on empty vector\n";
    exit (EXIT_FAILURE);
  }
  return content_[0];
}

template <typename T>
T& Vector<T>::Back()
{
  if (size_ == 0)
  {
    std::cerr << "** Vector error: invalid Back() called on empty vector\n";
    exit (EXIT_FAILURE);
  }
  return content_[size_ - 1];
}

template <typename T>
const T& Vector<T>::Front() const
{
  if (size_ == 0)
  {
    std::cerr << "** Vector error: invalid Front() called on empty vector\n";
    exit (EXIT_FAILURE);
  }
  return content_[0];
}

template <typename T>
const T& Vector<T>::Back() const
{
  if (size_ == 0)
  {
    std::cerr << "** Vector error: invalid Back() called on empty vector\n";
    exit (EXIT_FAILURE);
  }
  return content_[size_ - 1];
}

Like the bracket operator, these come in both const and regular flavor.

Implementing Vector<T> PopBack()

Like Clear(), this is a surprisingly simple method body. (The body of a function is the implementation, brace to brace.) There is no need to overwrite the element being "popped", and we are not reducing capacity as a matter of policy, unless explicitly commanded to do so by the client using SetCapacity().

template <typename T>
bool Vector<T>::PopBack()
{
  if (size_ == 0)
    return 0;
  --size_;
  return 1;
}

Implementing Vector<T> PushBack()

On one level the task here is simple: increase size by one and insert (a copy of) the parameter value into the newly opened vector element. A problem arises when size cannot be increased without increasing capacity. In this case, where size_ == capacity_, the following are issues of concern:

What should the new capacity be if size = 0?
What should the new capacity be if size > 0? (Remember, we want to keep calls to NewArray() as few as possible.)
Take care to test whether any call to NewArray() is successful.
Take care to keep current content when changing to new memory allocation.
Take care not to leave old memory orphaned.

Writing an implementation for PushBack() is left as an exercise.

Vector Complexity Requirements

Since Vector has already been designed, implemented, and placed into service, it is a little late to be adding requirements! The complexity requirements shown in the slide (repeated in the table below) can be thought of in two ways: either as actual properties of Vector that we determine from an analysis of the implementation, or as requirements that were in place and adhered to during the implementation. Either way, these complexity statements are in fact all true statements about Vector.

Vector Operation                         Runtime Complexity
                                         Actual                  Requirement
PopBack(), Clear(), Front(), Back(),     Θ(1)                    O(1)
  Empty(), Size(), Capacity(),
  bracket operator []
PushBack(t)                              Amortized Θ(1)          Amortized O(1)
SetSize(n), SetCapacity(n)               O(n)                    O(n)
assignment operator =                    Θ(n), n = Capacity()    O(n), n = Capacity()
Constructors, Destructor                 Θ(n), n = Capacity()    O(n), n = Capacity()
Display(os,ofc)                          Θ(n), n = Size()
Dump(os)                                 Θ(n), n = Capacity()

Note that Θ(1) and O(1) are equivalent.

The ISO standard for C++ does in fact impose such complexity requirements, particularly on the containers and algorithms of the Standard Template Library. In this course we will follow the same practice: impose both functionality and efficiency requirements on our data structures and algorithms as they are designed and placed into service in the course code library.

Most of the requirements (or findings) listed in the slide are completely straightforward to verify. For example, the code implementing PopBack() consists of a test and a simple decrement of the size_ datum, obviously a constant runtime algorithm. Similarly, the bracket operator just checks a condition and returns a value, again clearly an O(1) process.
There are a few subtleties that warrant extra attention, however.

Assertion: Vector<T>::PushBack(t) has amortized runtime complexity <= O(1).

First the definition of "amortized" in this context: amortized complexity of a function f(n) is the complexity of the average function

A(n) = (f(1) + f(2) + ... + f(n))/n .

Having amortized complexity O(1) is almost, but not quite, as good as having complexity O(1). "Amortized" complexity of an operation is a statement about the average cost of the operation, leaving open the possibility that any given instance of the operation may be quite costly, as long as most of the others are low cost so that the average cost stays low.

To evaluate the amortized complexity of Vector<T>::PushBack(t), we must evaluate the runtime for a sequence of operations and then take the average. Here, for reference, is the implementation code:

template <typename T>
bool Vector<T>::PushBack(const T& Tval)
// grow by doubling capacity
{
  if (size_ >= capacity_)
  {
    if (capacity_ == 0)
    {
      if (!SetCapacity(1))
        return 0;
    }
    else if (!SetCapacity(2 * capacity_))
      return 0;
  }
  content_[size_] = Tval;
  ++size_;
  return 1;
}

Let c(n) denote the runtime of the portion of this code that must always be executed, that is:

{
  if (size_ >= capacity_)
  {
  }
  content_[size_] = Tval;
  ++size_;
  return 1;
}

and let d(n) be the cost of the body of the if statement, that is:

{
  if (capacity_ == 0)
  {
    if (!SetCapacity(1))
      return 0;
  }
  else if (!SetCapacity(2 * capacity_))
    return 0;
}

where n is the size of the vector. Observe that c(n) is actually independent of size, so that c(n) is really a constant c. Also observe that, counting one unit of cost per element handled by SetCapacity(),

d(n) = n, if size_ == capacity_
     = 0, otherwise

If we start with an empty vector and issue n PushBack() calls, then the condition (size_ >= capacity_) is met exactly at the powers of 2, because of the doubling of capacity that takes place in the body of the if statement.

d(n) = n, if n is a power of 2
     = 0, otherwise

The total complexity of a given call to PushBack() is then

f(n) = c + d(n)

Now compute the average of a sequence of n calls, assuming for simplicity that n = 2^k, starting with an empty vector:

A(n) = (f(1) + f(2) + ... + f(n))/n
     = ((c + d(1)) + (c + d(2)) + ... + (c + d(n)))/n
     = c + (d(1) + d(2) + ... + d(n))/n
     = c + (1 + 2 + 0 + 4 + 0 + 0 + 0 + 8 + 0 + ... + 2^k)/2^k
     = c + (1 + 2 + 4 + ... + 2^k)/2^k
     = c + (2^{k + 1} - 1)/2^k
     < c + 2^{k + 1}/2^k
     = c + 2

Therefore A(n) <= O(1), which proves the assertion.

Assertion: Vector<T>::NewArray(n) has runtime complexity >= Ω(n).

This is really about the runtime of the more primitive new T [n] operation. It is surprising that this call is Θ(n), because we think of it as allocating a block of memory of size n * sizeof(T), which should not be an iterative process. After raw memory is allocated, however, each individual "T footprint" must be prepared. The preparation consists of calling the type T default constructor for each of the n T objects in the memory block. Therefore at least n atomic computations are made, proving that the runtime of any call to new T [n] is bounded below by Ω(n). (For a type with a trivial default constructor, such as int, the per-element preparation may be skipped; the bound is stated for an arbitrary element type T.)

Assertion: Assignment operators, copy constructors, and destructors have runtime complexity >= Ω(n) for a container of size n.

The reason behind this assertion is similar to that above. In every case, either a constructor or a destructor must be called for each object in the container.

A summary of all the Vector runtime requirements is shown in the slide.
