Reading: Deitel & Deitel, Sections 8.8, 7.1, 7.5
Alert: This is the hardest worksheet. Read it slowly and ponder every sentence.
It often happens that you want to create a new object and initialize it with the same values as another instance of the class which was created previously. This calls for a special constructor called a copy constructor.
    class someclass { ... };

    int main()
    {
        someclass A(...);
        someclass B(A);    // use of the copy constructor
        ...
    }
C++ automatically provides a default copy constructor which simply copies all data members to the new instance of the class from the existing one (i.e., from A to B). For many classes this is exactly what you want.
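As a minimal sketch of a class for which the default is enough, consider the following. The class point and the function demo_default_copy are hypothetical names invented for this illustration; the class holds no pointers, so the member-by-member copy yields two fully independent objects.

```cpp
#include <cassert>

// Hypothetical class with no pointer members. The compiler-generated
// copy constructor copies x and y member by member.
class point {
public:
    point(double x0, double y0) : x(x0), y(y0) {}
    double getx() const { return x; }
    double gety() const { return y; }
    void setx(double x0) { x = x0; }
private:
    double x, y;
};

// Checks that a copy made by the default copy constructor is
// independent of the object it was copied from.
void demo_default_copy()
{
    point A(1.0, 2.0);
    point B(A);               // default copy constructor
    A.setx(5.0);              // change the original only
    assert(A.getx() == 5.0);
    assert(B.getx() == 1.0);  // the copy still has the old value
    assert(B.gety() == 2.0);
}
```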
A very similar situation arises with the assignment operator if you wish to assign to one instance of the class the values of another instance of the class.
    int main()
    {
        ...
        someclass A(...);
        ...
        someclass B;
        B = A;
        ...
    }
As with the copy constructor, C++ automatically overloads the assignment operator = for new classes, and its action is to copy the member values of the object on the right side of the operator to those on the left side.
We have seen that there are problems when you use the = operator on strings or other pointers when you really wanted to use the strcpy function or the equivalent. Similar problems arise when you use the default copy constructor and assignment operator for classes which use pointers. Here is an example, found in /dept/cs/cs2/download/ex1bug.cpp:
    #include <iostream>
    #include <cstring>     // strlen, strcpy
    using namespace std;

    class simple {
    private:
        char *data;
    public:
        simple() {data = NULL;}             // default constructor
        simple(const char *s) {             // constructor
            data = new char[strlen(s)+1];
            strcpy(data, s);
        }
        ~simple() {                         // destructor
            if (data != NULL)
                delete [] data;
        }
        char *getdata() {return data;}
        void setdata(const char *s) {
            if (data != NULL)               // free up memory from previous string,
                delete [] data;             // if there was one
            data = new char[strlen(s)+1];   // allocate new memory
            strcpy(data, s);
        }
    };

    int main()
    {
        simple A("Mickey Mouse");
        simple B;
        B = A;
        simple C(A);
        cout << A.getdata() << endl;
        cout << B.getdata() << endl;
        cout << C.getdata() << endl;
        A.setdata("Donald Duck");
        cout << A.getdata() << endl;   // should print Donald Duck
        cout << B.getdata() << endl;   // should print Mickey Mouse
        cout << C.getdata() << endl;   // should print Mickey Mouse
        return 0;
    }
Study this code closely; there is a very subtle error in it which will plague you throughout your programming career if you do not understand it. The program creates three instances of the class simple, called A, B, and C. A is initialized using a constructor, B is set using the assignment operator, and C is initialized using the copy constructor. Note that there is no code to overload the assignment operator (=) and there is no code for a copy constructor. In both cases the defaults are used. The problem is that this code does not behave properly. The value of data is changed from Mickey Mouse to Donald Duck in object A, but this has the unwanted effect of changing the value in all three. In addition, the program crashes in the destructor function for object B at the end of main. (It is reported in Visual Studio as a "Debug Assertion Error".) Here is the output of this program before it crashes:
    Mickey Mouse
    Mickey Mouse
    Mickey Mouse
    Donald Duck
    Donald Duck
    Donald Duck
The reason this program does not work properly is that the value of data in instances B and C is set using the = operator, which means that data, which is a pointer, is pointing to the same memory address in all three objects. When the call to A.setdata occurs, that memory is deleted and reallocated. On our systems it just happens that the same memory which formerly held Mickey Mouse is reallocated to hold Donald Duck. Objects B and C are still pointing to this memory. The following diagram represents what has happened.

[Diagram: the data pointers in A, B, and C all point to the same block of memory, which now holds Donald Duck.]
At the end of main, the destructor is called for each of the three objects. Since they all point to the same memory, only the first destructor called will succeed in deleting the memory. When the second destructor call tries to delete the same memory, the crash occurs.
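The pointer sharing described above can be observed directly by comparing addresses. The following is only a sketch: the struct shallow and the function demo_shared_pointer are hypothetical, and the struct deliberately has no destructor so that the shared memory is deleted exactly once and the example runs without crashing.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stripped-down stand-in for the class simple. It has NO
// destructor, so the shared memory is never deleted twice.
struct shallow {
    char *data;
};

// Checks that default copying and default assignment duplicate only
// the pointer, not the characters it points to.
void demo_shared_pointer()
{
    shallow A;
    A.data = new char[7];
    std::strcpy(A.data, "Mickey");   // 6 characters plus '\0'

    shallow B = A;                   // default copy constructor
    shallow C;
    C = A;                           // default assignment operator

    assert(A.data == B.data);        // same address in all three objects
    assert(A.data == C.data);

    B.data[0] = 'N';                 // write through B ...
    assert(A.data[0] == 'N');        // ... and A sees the change
    assert(C.data[0] == 'N');

    delete [] A.data;                // delete the shared block exactly once
}
```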
The copies of instance A made here are called shallow copies: only the pointer values themselves were duplicated, not the memory they point to. What is needed here is a deep copy, so that each new instance receives its own copies of the dynamically allocated data members.
C++ allows you to write your own copy constructor and to overload the assignment operator for classes if you need to in order to correct this problem. For this example, adding the following two functions to the simple class will cause the program to behave correctly.
    class simple {
        ...
        simple(const simple &copy) {              // copy constructor
            if (copy.data != NULL) {
                data = new char[strlen(copy.data)+1];
                strcpy(data, copy.data);
            }
            else
                data = NULL;
        }

        // rhs stands for "right hand side"
        simple& operator=(const simple &rhs) {    // overload assignment operator
            if (this != &rhs) {                   // do not copy to yourself
                if (data != NULL)
                    delete [] data;               // free up memory if needed
                if (rhs.data != NULL) {           // copy string from rhs if it exists
                    data = new char[strlen(rhs.data)+1];
                    strcpy(data, rhs.data);
                }
                else
                    data = NULL;
            }
            return *this;                         // always end with this line
        }
        ...
    };
The corrected program is in /dept/cs/cs2/download/ex1fix.cpp.
The output and the corresponding memory diagrams now look like this:
    Mickey Mouse
    Mickey Mouse
    Mickey Mouse
    Donald Duck
    Mickey Mouse
    Mickey Mouse
Copy constructors and const declarations
Notice that the parameter for both the copy constructor and the overloaded = operator is a const reference (&) parameter. A copy constructor requires that its argument be passed by reference. If its parameter were passed by value instead, a copy of the object would have to be made for use inside the function, but making that copy is exactly the job of the copy constructor, so it would have to call itself without end. The keyword const means that the function is not permitted to change the values of any data members of the object copy being passed in.
When an object is passed into a function as a const reference
parameter, any statements which could change the values stored in that
object are caught as errors. We clearly don't want a copy constructor
to change the values stored in the object being copied. For example,
if you add the statement copy.data = NULL; to the copy
constructor above, you will get a compiler error. In Visual Studio 6,
the error will read:
error C2166: l-value specifies const object
What if you call a member function of a const reference object? How can the compiler be sure that the other function won't change the object's values? For example, what if we changed the simple copy constructor to look like this:
    simple(const simple &x) {
        if (x.data != NULL) {
            data = new char[strlen(x.getdata())+1];
            strcpy(data, x.getdata());
        }
        else
            data = NULL;
    }
The compiler can't guarantee that the call to getdata won't change the values of object x, so it gives an error message. The Visual Studio 6 message reads:
error C2662: 'getdata' : cannot convert 'this' pointer from
'const class simple' to 'class simple &'
To tell the compiler that a member function will not change the
object calling it, you must add the const keyword to that
function's prototype and definition. In this case, we have to change
the getdata member function to the following:
char * getdata() const {return data;}
The addition of the keyword const after the parameter list is how you specify that the function will not modify the object.
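The rule can be tried out on a small class of its own. In this sketch the class label and the functions length_of and demo_const_member are hypothetical names made up for the illustration; the commented-out line marks the call a compiler would reject.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class used only to illustrate const member functions.
class label {
public:
    label(const char *s) { settext(s); }
    // const after the parameter list: promises not to modify the object,
    // so this function may be called through a const reference.
    const char *gettext() const { return text; }
    // No const here: this function modifies the object.
    void settext(const char *s) {
        std::strncpy(text, s, sizeof(text) - 1);
        text[sizeof(text) - 1] = '\0';   // make sure the string is terminated
    }
private:
    char text[32];
};

// t is a const reference, so only const member functions may be called on it.
std::size_t length_of(const label &t)
{
    // t.settext("x");   // would not compile: settext is not a const function
    return std::strlen(t.gettext());
}

void demo_const_member()
{
    label m("Mickey");
    assert(length_of(m) == 6);
}
```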
Assignment operators and the this pointer
The code for an overloaded assignment operator should contain all of the lines in the copy constructor, but it has a few more things to do because it is operating on an object that has already been instantiated. (A copy constructor, in contrast, is creating a new instance.)
Consider how the = operator behaves for built-in types:

    1  int x, y, z;
    2  x = y = z = 3;
    3  while (x=y) {
    4      cout << x << ' ';
    5      y--;
    6  }
The = operator associates right to left (most other operators associate left to right). This means that the rightmost assignment is done first, and the return value is then used as the right operand for the next operation. You can see this in line 2. The operation z = 3 is done first. This operation returns 3, so this value is then assigned to y. This second assignment also returns 3, and this value is assigned to x.

In line 3, the statement inside the parentheses looks like it is wrong, but in fact it is assigning the value of y to x, and the resulting return value is tested as the while loop condition. Thus, this is a strange way of looping until y has the value zero (remember that zero means false and nonzero means true). This code would print

    3 2 1

and after the loop, both x and y would have the value zero.
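For class objects, the same chaining works only if the overloaded operator returns the object on its left side, which is what the line return *this; accomplishes: this holds the address of the object the operator was called on. A minimal sketch, using a hypothetical class counter and function demo_chained_assignment that are not part of the worksheet's examples:

```cpp
#include <cassert>

// Hypothetical class used only to show why operator= returns *this.
class counter {
public:
    counter(int v = 0) : value(v) {}
    counter& operator=(const counter &rhs) {
        if (this != &rhs)        // this is the address of the left-hand object
            value = rhs.value;
        return *this;            // hand back the left-hand object itself
    }
    int get() const { return value; }
private:
    int value;
};

void demo_chained_assignment()
{
    counter a(1), b(2), c(3);
    a = b = c;                   // evaluated as a = (b = c)
    assert(a.get() == 3);
    assert(b.get() == 3);
    assert(c.get() == 3);

    counter &alias = a;          // a second name for the same object
    a = alias;                   // self-assignment: the test this != &rhs fails
    assert(a.get() == 3);        // and the object is left unchanged
}
```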
Exercise 1: Write code for a class Bag which starts out as follows:
    class Bag {
    private:
        double *thedata[100];          // note that this is an array of pointers
        int size;                      // counter of how many items are stored in the bag
    public:
        Bag();
        Bag(const Bag&);               // copy constructor
        Bag& operator=(const Bag &);   // assignment operator
        int BagSize();                 // returns the number of elements in the bag
        void Insert(double);           // inserts a new element into the bag
        void PrintBag();               // prints each element in the bag
        ~Bag();                        // a destructor
    };
A main program has been provided for you in /dept/cs/cs2/wksht10/bagmain.cpp.
Implicit calls to the copy constructor and destructor
Even if there are no statements in your code which explicitly call a copy constructor or a destructor for a class, these are implicitly called whenever an instance of a class is passed by value into a function. This is because a function call makes a copy of whatever data is passed when the data is call-by-value (the default), and for a class, this is done by calling the copy constructor for the class. Likewise, when a function terminates, the destructor is called for all of its local variables and by-value parameters.
Consider this program, which you can find in /dept/cs/cs2/download/ex2bug.cpp:
    #include <iostream>
    #include <cstring>     // strlen, strcpy
    using namespace std;

    class buggyclass {
    private:
        char *aString;
    public:
        buggyclass() {aString = NULL;}
        buggyclass(const char *s) {
            aString = new char[strlen(s)+1];
            strcpy(aString, s);
        }
        buggyclass(const buggyclass& b)    // copy constructor, identical to
        {                                  // default provided by C++
            cout << " In copy constructor " << endl;
            aString = b.aString;           // what's wrong with this?
        }
        ~buggyclass() {
            // size_t is wide enough to hold an address on any platform
            cout << " In buggyclass destructor, deleting memory at address "
                 << (size_t)aString << endl;
            if (aString != NULL)
                delete [] aString;
        }
        char *getstring() {    // eventually needs to be a const function
            char *retval = new char[strlen(aString)+1];
            strcpy(retval, aString);
            return retval;
        }
    };

    void afunction(buggyclass b)
    {
        cout << " In afunction, b is " << b.getstring() << endl;
    }

    int main()
    {
        buggyclass rpi("Rensselaer");   // create an instance of the class
        cout << "Before first call" << endl;
        afunction(rpi);
        cout << "Before second call" << endl;
        afunction(rpi);
        cout << "About to exit" << endl;
        return 0;
    }
Note that nowhere in the code is either the copy constructor or the destructor for the class buggyclass ever explicitly called in the program. If you compile and run the program on Visual Studio 6, it will crash. You will see output that looks like this:
    Before first call
     In copy constructor
     In afunction, b is Rensselaer
     In buggyclass destructor, deleting memory at address 264288
    Before second call
     In copy constructor
     In afunction, b is Rensselaer
     In buggyclass destructor, deleting memory at address 264288
    <DEBUG ASSERTION ERROR>
If this program worked correctly, it would call the copy constructor twice, at the beginning of each function call when a new copy of the object rpi is made. It would call the destructor three times, at the termination of each function call and at the termination of main. As it is written, it attempts to delete the same memory location twice, resulting in a crash.
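The expected counts (two copy constructor calls and three destructor calls) can be checked with a class that simply counts these events. This is only a sketch: the class tracker, its static counters, and the functions by_value and demo_implicit_calls are hypothetical names chosen for the illustration, and the class owns no memory, so nothing can be deleted twice.

```cpp
#include <cassert>

// Hypothetical class that counts how often it is copied and destroyed.
class tracker {
public:
    static int copies;       // number of copy constructor calls
    static int destroyed;    // number of destructor calls
    tracker() {}
    tracker(const tracker &) { ++copies; }
    ~tracker() { ++destroyed; }
};
int tracker::copies = 0;
int tracker::destroyed = 0;

// The parameter is passed by value, so each call copies the argument
// and destroys that copy again when the call is finished.
void by_value(tracker) {}

void demo_implicit_calls()
{
    tracker::copies = 0;
    tracker::destroyed = 0;
    {
        tracker rpi;
        by_value(rpi);
        by_value(rpi);
        assert(tracker::copies == 2);      // one copy per call
        assert(tracker::destroyed == 2);   // each copy has been destroyed
    }                                      // rpi goes out of scope here
    assert(tracker::destroyed == 3);       // two copies plus rpi itself
}
```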
There are two ways to avoid and correct these problems, and it is important to understand both of them. You will often need to use both in your programs.
Method 1: Write the copy constructor correctly for the class buggyclass, since it uses pointers. (While you're at it, write a correct assignment operator for the class as well.)
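    buggyclass(const buggyclass& b)    // copy constructor
    {
        cout << " In copy constructor " << endl;
        if (b.aString != NULL) {
            // Copy directly from b.aString. Calling b.getstring() here would
            // not compile unless getstring were a const function, and each
            // call would leak the temporary copy that getstring allocates.
            aString = new char[strlen(b.aString)+1];
            strcpy(aString, b.aString);
        }
        else
            aString = NULL;
    }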
Method 2: When you pass an instance of a class to a function,
pass it as a const reference parameter. In this example,
the function header for afunction would be changed to
the following:
void afunction(const buggyclass& b)
When objects are passed in this way, no copy of the class instance needs
to be made. Remember, though, that some member functions will need to
be changed to const functions (such as getstring in this
example). Passing objects by reference also gives better performance,
since the function call does not create and then destroy a temporary
object.
    void afunction(const buggyclass& b)
    {
        cout << " In afunction, b is " << b.getstring() << endl;
    }
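To see that a const reference parameter really avoids the copy, the two ways of passing an object can be compared side by side. The class widget and the functions below are hypothetical and exist only for this sketch.

```cpp
#include <cassert>

// Hypothetical class that counts copy constructor calls.
class widget {
public:
    static int copies;
    widget() : id(7) {}
    widget(const widget &w) : id(w.id) { ++copies; }
    int getid() const { return id; }   // const, so usable on const references
private:
    int id;
};
int widget::copies = 0;

int by_value(widget w)            { return w.getid(); }  // copies its argument
int by_const_ref(const widget &w) { return w.getid(); }  // no copy is made

void demo_reference_passing()
{
    widget w;
    widget::copies = 0;

    assert(by_const_ref(w) == 7);
    assert(widget::copies == 0);   // passing by const reference: no copy

    assert(by_value(w) == 7);
    assert(widget::copies == 1);   // passing by value: exactly one copy
}
```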
The Moral of the Story
If you write a class which uses any pointers, it should always include the following member functions: a destructor, a copy constructor, and an overloaded assignment operator.
In addition, any member functions that manipulate pointers (such as setdata in the first example) must avoid memory leaks by properly deleting memory which is no longer being used.
If your class does not use any pointers, do not write these functions. Rely on the ones built into the language.
Pass objects to functions as const & unless you need to be able
to change the object being passed in.
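Putting these rules together, a class that owns dynamically allocated memory might look like the following sketch. The class name, its private helper duplicate, and the function demo_deep_copy are hypothetical. Unlike the class simple above, this version assumes the pointer is never NULL (an empty string is stored instead), and it allocates the new memory before deleting the old; both are design choices made for this illustration, not something the worksheet prescribes.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class that owns a dynamically allocated string.
class name {
public:
    name(const char *s = "") : text(duplicate(s)) {}
    name(const name &other) : text(duplicate(other.text)) {}  // copy constructor
    name& operator=(const name &rhs) {                        // assignment operator
        if (this != &rhs) {              // do not copy to yourself
            char *fresh = duplicate(rhs.text);
            delete [] text;              // release the old string
            text = fresh;
        }
        return *this;
    }
    ~name() { delete [] text; }                               // destructor
    const char *get() const { return text; }
    void set(const char *s) {
        char *fresh = duplicate(s);      // copy first, in case s points into text
        delete [] text;
        text = fresh;
    }
private:
    char *text;                          // always points to memory this object owns
    static char *duplicate(const char *s) {
        char *copy = new char[std::strlen(s) + 1];
        std::strcpy(copy, s);
        return copy;
    }
};

// Repeats the Mickey Mouse experiment with deep copies.
void demo_deep_copy()
{
    name A("Mickey Mouse");
    name B;
    B = A;                               // assignment operator
    name C(A);                           // copy constructor
    A.set("Donald Duck");
    assert(std::strcmp(A.get(), "Donald Duck") == 0);
    assert(std::strcmp(B.get(), "Mickey Mouse") == 0);
    assert(std::strcmp(C.get(), "Mickey Mouse") == 0);
    assert(B.get() != C.get());          // each object has its own memory
}
```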