CSCI 230 --- Data Structures and Algorithms
Project 4 --- Campaign Tour 2000
Due: April 27

You have been hired as a consultant to the political campaign of one of the Presidential hopefuls. The candidate is planing a nationwide grassroots tour, visiting 100 cities across the United States. Soft money contributions are down, so you have been hired to determine the tour route so as to minimize the total distance traveled.

To model this problem on the computer, you will construct a graph in which each city in the tour is represented by a vertex, and the edge between any two vertices has a weight equal to the distance between the corresponding cities. Note that a tour is simply a permutation of the list of cities, and the length of a tour can be easily computed from the graph.

A straightforward approach to finding the shortest possible tour is to simply examine all of the possibilities. Note that a tour is just a permutation of the vertices; this makes it easy to code using STL:

          std::vector tour, min_tour;
          for (int i=0; i < 100; ++i)  tour.push_back(i);
          double min_cost = cost(tour);
          std::copy(tour.begin(), tour.end(), min_tour.begin());
          while (std::next_permutation(tour.begin() + 1, tour.end())
            if (cost(tour) < min_cost) {
              min_cost = cost(tour);
              std::copy(tour.begin(), tour.end(), min_tour.begin());
            }
The function cost computes the total cost of the tour by summing the weights on the edges between each adjacent pair of vertices.

Unfortunately, the above approach has a running time that is unacceptable; the main loop will execute 99! times, which is almost 10 raised to the 166th power. To put this number in perspective, suppose we had started running the above program just after the Big Bang, about 15 billion years ago. Even if our computer executes a trillion operations per second (at least 1000 times faster than current technology), it still wouldn't be done checking all the combinations. In fact, we would need about 10 to the 126th power such computers working in parallel to have completed the job, which is far more than the total number of atoms in the universe.

Clearly we need a more efficient algorithm. However, no one has ever discovered a way to to find the shortest tour that is fast enough to be practical; in fact, many people suspect that this problem (better known as the Traveling Salesman Problem does not have an efficient algorithm.

Despite all this there is a relatively simple and efficient way to find a tour that, while not necessarily the shortest possible, is quite good. We start by finding a minimum spanning tree for the graph (using Prim's algorithm); then we can get a tour by "walking around the outside" of the spanning tree, as shown in the first figure below. This tour is not exactly what we want, since cities are visited more than once. We can fix this by skipping over cities we have already visited and removing the extra edges from the tour, as shown in the second figure below (we start from the vertex at the top of the graph).

The initial tour we created is exactly twice the length of the minimum spanning tree, and so at worst is only twice as long as the best tour (no tour can be shorter than the minimum spanning tree), and removing nodes already visited in the tour will only shorten it. Therefore the tour we find with this algorithm will be at most twice as long as the shortest possible tour.

Design and Implementation

Your program should do the following:
  1. Read in the list of cities and the distances between them; the input file will be specified on the command line.
  2. Compute a good tour:
    1. Find a MST of the graph formed by the cities.
    2. Print out the edges in the MST and its total length (the sum of the weights of the edges).
    3. Construct a tour walking around the outside of the MST, skipping any cities that would be visited twice.
  3. Print out the tour, by listing the cities in the order they are visited.
  4. Print out the length of the tour.

The first line of the input file will be the number of cities in the tour. The remainder of the file will consist of, for each city, the name of the city, followed by the distances (which can be floating point numbers) to the other cities, in order, skipping the current city. Therefore, if there are N cities, there will be only N-1 distances following each city. Code to read the input file and to print the final tour will be provided for you, along with with sample input and output.

A skeleton main program has also been provided, as well as the specification for a class to represent a graph. In filling in the implementation of the main program, you will not have to add any public members to the graph class; use only what is already provided. (You may, of course, add whatever private members you wish.) As you implement the main program, think about which graph data structure will allow for the most efficient implementation of the operations required (for example, an adjacency list can iterate over adjacent vertices more efficiently than an adjacency matrix).

Notes and assumptions