66.4964, 66.6965 Design and Implementation of Generic Software
Prof. David Musser
Dr. Sibylle Schupp
Spring 1997
Generic Programming with Templates
Main application of generic programming: Developing and organizing libraries
of reusable software components.
- ``Reusable'' or ``generic'' means ``widely adaptable, but still efficient,''
where the adaptation is done by means of preprocessor or programming-language
mechanisms (not by manual text editing).
- Supported in C++ by templates, inlined function definitions,
and operator overloading.
We now focus on templates, paying some attention also to function inlining
and operator overloading.
Plan for this and next three sessions (subject to change):
- Template basic ideas and terminology; overview of STL
- Large example: Template counting classes for generic algorithm performance
measurement
- Identifying and documenting template parameter restrictions: the SGI
STL web pages; the BALM project
- Dealing with template problems (e.g., code bloat); far-out uses of
templates (expression templates, template meta-programs).
Template basics and terminology
We will assume some prior experience with templates, so the focus will
be on ensuring a common vocabulary and, based on your questions, noting
things we need to cover in the next three sessions.
Terminology used by ANSI/ISO committee
Subject to change, see Carroll and Ellis, p. ix.
- class template: A template of the form template <...>
X, where X is a class definition.
- function template: A template of the form template <...>
f, where f is a function definition.
- class template specialization: The class that results when the
parameters of a class template are replaced by actual arguments.
- function template specialization: The functionthat results when
the parameters of a function template are replaced by actual arguments.
- instantiation: The process of creating specializations.
The infamous pair class template
template <class T1, class T2>
struct pair {
T1 first;
T2 second;
pair() : first(T1()), second(T2()) {}
pair(const T1& a, const T2& b) : first(a), second(b) {}
};
Function inlining: another C++ feature that supports generic programming
template <class T>
inline const T& max(const T& a, const T& b) {
return a < b ? b : a;
}
Call:
u = max(x, max(y, z));
Because of the inline keyword in the definition of max, this
is expanded by the compiler into something like
temp = (y < z ? z : y);
u = x < temp ? temp : x;
and then that code is compiled, so there is no function call overhead
(passing of parameters or transfer of control).
Without the inline keyword, a function call is done ``out
of line''; i.e., the compiler generates code for parameter passing and
transfer of control to and from the code for the function body.
What are the advantages of inline functions versus out-of-line functions?
What are the disadvantages?
Class member functions can be inlined also. In fact that is the default
treatment when the body of the function is given inside the class definition.
When a class member function is defined outside of the class, you have
to use the inline keyword to get inlining.
Note how the template max function depends on having a definition of
< for type T.
Note how such a definition can be given with a function definition
in which the function name is operator<. This is user-defined
operator overloading.
As with the pair constructor, note the importance of using constant
reference parameters in defining the template max function.
Template member functions
Default template parameters
STL: Overview and Small Examples
The Standard Template Library is a(n almost standard) C++ library that
provides a set of easily composable data structures (template classes)
and generic algorithms (template functions).
- The data structures include vectors, lists, deques, sets, multisets,
maps, multimaps, stacks, queues and priority queues.
- The generic algorithms include a broad range of fundamental algorithms
for the most common kinds of data manipulations, such as searching, sorting,
merging, copying, and transforming.
At its July 1994 meeting, the ANSI/ISO C++ Standards Committee voted
to adopt STL as part of the standard C++ library. It has been incorporated
into the full standard and is now supported, at least partially, by several
major compiler vendors.
We will look at numerous small STL examples, mostly from
Chapters 1, 2, 5, and 6 of
David R. Musser & Atul Saini, STL Tutorial and Reference Guide:
C++ Programming with the Standard Template Library, Addison-Wesley,
1996. (Hereafter referred to as M&S.) The examples are also online in STL examples.
Links to particular programs are also included in the discussion in
these notes.
The assert macro
- Many of the examples we will examine use the assert macro,
from the header assert.h (which is not an STL header; it comes
with all C and C++ compilers). See Example
1-1, which uses assert independently of STL.
The assert macro with STL
- Example
1-2 shows the use of assert in conjunction reverse,
one of STL's generic algorithms defined as a template function.
- Example
1-3 shows the use of assert in conjunction with both reverse
and vector, which is an STL container class with features similar
to arrays but much more powerful.
Overview of STL components
STL contains six major kinds of components: containers, generic algorithms,
iterators, function objects, adaptors, and allocators. These
are all designed to fit together in many different combinations.
STL containers (data abstractions)
- Example
2-1 is an example very similar to Example
1-3 but uses an STL list instead of a vector. The STL vector, list,
and deque template classes are three members of the STL sequence family
of abstractions. Sequence containers are objects that store
collections of other objects in a strictly linear arrangement. There are
some differences between them in their interfaces, but mainly they differ
in their performance. There can be a tremendous performance advantage of
one kind of sequence over another, depending on what operations are performed
on sequences.
- vector<T> provides array-like random access to a sequence
of varying length, with constant time insertions and deletions at the end
- deque<T> provides random access to a sequence of varying
length, with constant time insertions and deletions at both ends
- list<T> provides linear time access to a sequence of
varying length, with constant time insertions and deletions anywhere
- The other STL family of abstractions is sorted associative containers,
which provide for fast retrieval of objects from the collection based on
keys. The size of the collection can vary at runtime. The collection is
maintained in order.
An example is the use of a map container to store a telephone directory,
as in Example
2-2.
The STL sorted associative containers are:
- set<Key, Compare> supports unique keys (contains at
most one of each key value) and provides for fast retrieval of the keys
themselves.
- multiset<Key, Compare> supports duplicate keys (possibly
contains multiple copies of the same key value) and provides for fast retrieval
of the keys themselves.
- map<Key, T, Compare> supports unique keys (contains
at most one of each key value) and provides for fast retrieval of another
type T based on the keys
- multimap<Key, T, Compare> supports duplicate keys (possibly
contains multiple copies of the same key value) and provides for fast retrieval
of another type T based on the keys
STL generic algorithms
There are many useful algorithms that can be done on sequence containers,
such as searching, merging, copying, sorting, etc., in many different variations.
In STL most of these these are not implemented as member functions
of the sequence container classes. Instead the containers and the algorithms
are designed so that each algorithm can be used with different container
classes. Because of their generality such algorithms are called generic
algorithms. Two of the simplest generic algorithms in STL are find
and merge.
- The generic find algorithm is illustrated in Example
2-3, which shows its use with an ordinary C++ array. Example
2-4 shows its use with a vector, and Example
2-5 shows its use with an STL list. Finally, Example
2-6 shows its use with an STL deque.
- An STL generic algorithm can even work with several different kinds
of sequences simultaneously, as shown by Example
2-7 in which the generic merge algorithm is used to merge
an array and a list and put the result into a deque.
This flexibility is due in large part to the way STL connects containers
and algorithms together using iterators, which are pointer-like
objects. Having different kinds of iterators for different containers allows
the same algorithm to work with different containers while still doing
``the same thing'' (abstractly). Iterators are discussed in detail in Section
2-3 and Chapter 4 of M&S.
Iterators
STL connects data structures and algorithms together using iterators,
which are pointer-like objects used to gain access to all elements of a
linear sequence. STL defines a set of requirements an object must satisfy
to be an iterator.
- Ordinary C/C++ pointers satisfy these requirements and are therefore
one form of iterator, but other kinds of iterators are defined and used
in the library.
- All iterators have a ++ operation defined on them so that
++i advances an iterator i to the next location in the
sequence.
- All iterators have a * (dereference) operation defined on
them so that *i returns the location i points to so that
it can be stored into, as in *i = x, or its value can be used
in an expression, as in x = *i.
- Some iterators also have other operations, such as --, +=,
and -=, defined on them, again in analogy to the way they are
defined on pointers.
The way STL generic algorithms use iterators is illustrated by the accumulate
algorithm.
template <class InputIterator, class T>
T accumulate(InputIterator first, InputIterator last, T init) {
while (first != last) {
init = init + *first;
++first;
}
return init;
}
- Example
2-9 shows a simple use of the accumulate function with vector
iterators. It could also be used, for example, with array iterators, which
are ordinary pointers, or with list iterators.
- In each case the abstract meaning is the same---adding to the initial
value the values in the range indicated by the iterators.
- All accumulate needs is minimal properties of iterators---the
properties guaranteed by input iterators, which get their name from
the requirement that their * operation be able to read values
but not (necessarily) store them.
- Output iterators are similar to input iterators, but require
* to be able to store values but not (necessarily) read them.
- Forward iterators can both read and store with *.
- Bidirectional iterators have, in addition to forward iterator
abilities, -- operations for traversing sequences in the opposite
of the normal direction.
- Random access iterators add the ability to jump to an arbitrary
position in a sequence in constant time, with += and -=.
The precise definition of these iterator categories is given in Chapter
4 of Musser & Saini. The key point to understand is some algorithms
need the more powerful iterators in order to operate efficiently. For example
linear searching only needs input iterators, but sorting needs random access
iterators.
Function objects
Definition of generic algorithms in terms of iterators makes the algorithms
very flexible, but even more flexibility is afforded by other parameters
called function objects.
- Example
2-10 shows how to use accumulate to accumulate a product,
by passing it a multiplication function to use instead of addition.
- Example
2-11 does the same thing but using a function object, an object
of a type defined by a class or struct in which the function call operator
is defined.
- Example
2-12 again does the same thing but using one of STL's predefined function
objects, times.
Adaptors
An STL adaptor is a component that modifies the interface of
another component.
- Example
2-13 shows the use of reverse iterators, which are obtained
from ordinary iterators using an iterator adaptor.
- STL also defines insert iterator adaptors and several container
adaptors and function adaptors
Organization and Examples of STL Generic Algorithms
In-place and copying versions
- Example
5-1 shows the use of an in-place sorting algorithm.
- Example
5-2 shows the use of reverse_copy, copying version of the reverse
algorithm.
Algorithms with Function Parameters
- Example
5-3 shows the use of a sort algorithm with a function object, a binary
predicate (a function that returns a bool value). The one used,
greater, makes the algorithm sort into descending order instead
of the usual ascending order.
Categories of STL Generic Algorithms
STL generic algorithms fall into four broad categories.
Nonmutating sequence algorithms.
If you need some algorithm for searching a sequence or counting elements
that have some property, check out the algorithms in this category. They
are called nonmutating because they don't change the sequence to which
they are applied.
Mutating sequence algorithms.
Those algorithms that do change sequences, but are not related to sorting
(the next category), reside here. E.g., copying, filling, generating, partitioning,
shuffling, removing elements, rotating, and transforming.
Sorting-related algorithms.
This category includes sorting; merging; and set operations on sorted
sequences, such as union and intersection. Maintaining a heap, which can
be used as a priority queue, is also provided. Also, comparing two sequences
lexicographically (a generalization of alphabetical ordering of strings)
and generating all permutations of a sequence in lexicographical order.
Generalized numeric algorithms.
This category contains algorithms like accumulate for computing
sums, sequences of partial sums, and inner products---all suitably generalized
with function object parameters so you can substitute other binary operations
for + and *.