A Java Programmer's Guide to Reading C++ -- Part 2

This is part 2. There are two other parts: part 1 and part 3.

Moving on to the Fall 2005 competion, we see again a problem document of six problems, of which only two were solved in both C++ and Java. [For people interested in straight C programming, sample C solutions are posted for all six problems by the people running the contest.]

In ball.cpp we are introduced to the string stream type. The sstream library documentation is available. There is also a tutorial on how to use it. Basically the idea is to allow you to operate on a string as if it were a file (i.e., process it incrementally by reading and writing). This is similar to the old style C routines of sscanf and sprintf. The 3.5 minute video C++ Advanced Course :31- String Streams advocates this class as a way to solve the problem of what to do when you want to concatenate a string with a number to produce a string. The 12 minute video BeginnersC++ Lesson 17: Data Validation with String Stream mentions that it was introduced to C++ around 1999, so is not supported in some older systems. This second video shows how to deal with reading in an integer where you are worried about the user typing in something that isn't an integer.

Another thing interesting about ball.cpp is that more features of fstream are used. In particular we see how to open a file so that it can only be read from

  fs.open("ball.in", fstream::in);
We also see the peek and get functions used to check a character before it is read
   if (fs.peek()=='A') {
	current[0]=-1;
	fs.get();
   }
   else if (fs.peek()=='B') {
	current[0]=-2;
	fs.get();
   }

castles.cpp is short and introduces no new concepts.

Moving on to the Spring 2006 competition, we get six C++ solutions (three of which have corresponding Java solutions).

The first one to look at is money.cpp. This introduces the C++ vector template. Templates in C++ correspond roughly to generics in Java, particularly in the use of vectors in this program where they are a way to constrain the type of objects that can be members of the vector. A (C++ biased) discussion of the relation between Java generics and C++ templates can be found in Java Generics and C++ Templates by Vladimir Batov, July 01, 2004, Dr. Dobb's.

Example declarations of interest are:

  struct donations {
    vector< int > per_candidate;
    int total;
    bool violator;
  };
  vector< donations > donors(num_donors+1);
  vector< int > violators;
where violators is a vector of ints, the donations type has a vector of int field called per_candidate, and donors is a vector of donations with an initial capacity of num_donors+1.
  donors.at(i).per_candidate.resize(num_candidates+1);
  donors.at(i).per_candidate.at(k) = 0;
do what you would expect. In line with C++'s preference for terseness, at does the same as Java's elementAt. And, there are iterators for vectors in C++ just as there are in Java
   vector< int > violators;
   ...
   for (vector< int >::iterator iter = violators.begin();
	 iter != violators.end(); iter++)
       cout << *iter << endl;
:: is the C++ `scope resolution operator'. What it does is indicate where something is defined, in this case we are looking at iterator defined for vector < int >. Note that * and ++ have been overloaded so that ++ advances the iterator and * accesses the value that the iterator points to. In C, `for loops' that are driven across arrays typically progress by incrementing (++) an index variable. Also, in C, * is used to dereference an address (so we think of the iterator iter as not being a value of violators, but rather as the address of a value of violators). In the C++ FAQ, there is an interesting entry on operator overloading.

In C++, as in most languages, there are the standard libraries that come with the development environment (in this case, the C++ Standard Library [wikipedia overview]). We have already seen a few of them such as string, fstream, and sstream. Since we have been using the Gnu g++ compiler, it might be worthwhile looking at the documentation of the The GNU C++ Library (however, I have generally been linking to generic C++ library documentations). Part of the C++ standard library is the Standard Template Library [wikipedia overview] often abbreviated STL. Skimming through the two wikipedia overviews above gives a notion of what sort of functionality is being provided by default with C++. In the best cases, such standard libraries speed up programming by keeping you from reinventing the wheel as well as the even more time consuming issue of debugging the reinvented wheel.

One can spend a lot of time on the way C++ handles templates. The C++ FAQ has a nice entry on templates The wikipedia entry for C++ Templates already looks dense, but then there is also the wikipedia entry for Template Metaprogramming where one writes programs that the compiler executes in the process of compiling a C++ program, such as the wikipedia example:

  template < int n >
  struct factorial {
  	enum { value = n * factorial< n - 1 >::value };
  };
   
  template <>
  struct factorial< 0 > {
  	enum { value = 1 };
  };
   
  // Usage examples:
  // factorial< 0 >::value would yield 1;
  // factorial< 4 >::value would yield 24.
but, by focussing on actual code used to solve reasonable problems, we don't see much of this sort of thing, and so, at least for now, it is not worthwhile to spend time puzzling it out. Indeed, you will notice that although we find classes and templates being used as ways to organize libraries for programmers, we haven't yet seen examples where programmers really find the need to create their own classes and templates when working in a language like C++ that doesn't require everything to be a class (the way, for example, Java does).

Before leaving money.cpp, there is one other interesting piece of notation to observe:

  void process_record(ifstream & infile, int num_donors, 
   		      int num_candidates, int num_transactions)
Here, the parameter to the function process_record named infile is declared to be of type ifstream. The interesting part is the & symbol between the parameter and the type. This symbol indicates that infile is a reference to something of type ifstream (see Reference C++ wikipedia and references entry in C++ FAQ). However, there is no explicit dereference operator, so when you use the parameter infile, you are referring to the ifstream value rather than the address of the ifstream value. If you consider the declaration int num_donors in the above, you will note that when you pass the value x in that parameter, a copy of the value that x held is placed in num_donors. But if you pass a variable holding an ifstream to infile, then no copying is done and instead infile holds the same ifstream. This is similar to how passing objects in Java works (as opposed to passing Java basic types, where you would get a copy instead). This can be used in any variable declaration situation and is not just limited to parameter declarations.

Moving on to gerry.cpp, we see two dimensional arrays viewed as one dimensional arrays of addresses for other one dimensional arrays. Thus we have the two dimensional array preVotes introduced as

   int ** precVotes;
which tells us that precVotes holds an address of something that holds an address of an integer. To actually put a two-dimensional array out there, we then:
  precVotes = new int * [2];
  precVotes[0] = new int[numPrects];
  precVotes[1] = new int[numPrects];
The first statement giving precVotes the address of an actual array of two values, each of which is an address of an integer (or start of a sequence of integers). Then assigning to precVotes[0] and precVotes[1], we put actual arrays out there so that all the memory for the two dimensional array now exists. The values of precVotes are then filled and accessed as you would expect:
   input >> precVotes[0][i] >> precVotes[1][i];
   if (precVotes[0][i] > precVotes[1][i])
Recall that we have previously seen two-dimensional arrays in colors.cpp, but in that case the declaration was phrased differently
    double Contrast[N][N];
so that the underlying structure wasn't obvious as it is here.

The program speech.cpp is interesting because there is no C++ in it, instead it is pure C code (as mentioned, C++ is backward compatible with C). According to wikipedia, coinio.h is a DOS header file for some console i/o functions. There is a stackoverflow discussion of what to do with code that uses coinio.h when you are on a Linux machine.

Nothing new in stops.cpp

Nothing new in ballots.cpp. Although it mentions queue, it never uses it and although it mentions list and even declares one (currviolators), that too is never used. It does reference the cmath header, but this turns out to be the same as C's math.h .

The program winner.cpp shows vectors and strings being passed to functions by reference. We also see the usage of the keyword const

  bool valid(const string& vote, const vector< Cad >& rgc, int r)
indicating that the vector rgc references won't be modified. A start on the complications that arise from this concept can be found on the wikipedia page const. There is also a useful entry in the C++ FAQ on const correctness.

Moving along to Fall 2006, civil.cpp overs nothing new. The program biomed.cpp, has this puzzling piece of code

  string get_line(fstream &f)
  {
    char temp[1024];
    string t;
    f.getline(temp,1024);
    t = temp;
    return t;
  }
Why it was written, was to create a version of get_line that would ignore the end part of lines longer than 1024 characters, which is one way of avoiding a buffer overflow [wikipedia] bug. But the more interesting issue is that we are reading into a character array temp, but returning a string t. What we are seeing here is an aspect of the issues raised by C++ actually having two different notions of a string. One notion is the traditional C notion of an array of characters (temp), where the string starts at the beginning of the array and ends at the first null character (if no null character is in the array, then most string manipulation functions will walk outside the array reeking havoc in the surrounding declarations). The other notion is a class string that is part of the C++ standard library. Where the two types of strings are mixed, C++ promotes the C string type to the class string type, so the assignment of t to temp works fine.

But there is more here. If we returned temp, we would be returning the address of an array that is local to the function get_line, i.e., it exists on the stack that is used to handle function calls, rather than on the heap where class objects are kept. Thus, later function calls would overwrite the portion of the stack where the old location of temp was which would confuse someone who was reading the value of temp and if someone changed the value of temp, then they would possibly be making changes randomly into the functions that were currently using that stack space -- all of which would be quite confusing to the programmer trying to track down a bug that resulted from this. By assigning the value of temp to t, the characters got copied from the stack to the heap, where they would be safe and sound (hopefully).

** Does this really work this way? How could you test whether or not the assignment really made a copy of the characters? Should take at most a dozen lines of C++.

In ee.cpp, we see

  typedef float ff;
  struct point{ff x, y;};
which is traditional C for defining a new type ff that is the same as float. The struct line shows ff being used, just like it was a regular type. Although it might make the code a bit more difficult to read, if we decide that we should be using double rather than float for the calculations, it makes it very easy to change. We also see:
  cout.precision(2);
  cout << fixed << strength << endl;
which tells us we can control the output precision of floats going to cout using precision.

It is also worth noting that the STD library algorithm is included here. Algorithm includes some interesting functionality like sorting, searching, swapping, binary search, set operations, heap operations, and even sequencing through permutations. But, alas, here all it is used for is the definition of max.

** Nothing new in ame.cpp. However, if you do the following:

  wget 'http://contest.usc.edu/index.php/Fall06/Home?action=download&upname=ame.cpp.txt'
  mv 'Home?action=download&upname=ame.cpp.txt' ame.cpp.x
  cat ame.cpp.x | sed -e 's/\r$//' - | sed -e 7,+2d - > ame.cpp
you will have a copy of ame.cpp that you haven't yet seen and that has removed from it the declaration of the struct called stage. You should be able to figure out what the declaration should look like to get this program to compile and work against the provided test data.

The program ise.cpp shows us delete being used with C style address variables

   Point *stores = new Point[storecnt];
   Point *wares = new Point[warecnt];
   ...
   delete [] stores;
   delete [] wares;
thus avoiding a potential memory leak, as discussed earlier.

Moving on to Spring 2007, overlap-johngrotelueschen.cpp uses C++ programming constructs we have already seen.

The program studies-chengweilin.cpp shows use of nested vector declarations and how to access them

    vector< vector< int > > rgh(n, vector< int >(11, 0));
    for (int i=0; i < n; ++i)
	for (int j=0; j < 10; ++j)
   	    ifile >> rgh[i][j];

Nothing new in yesorno-andrewmcnamee.cpp.

In party-morganbrown.cpp, we get to see sort from the algorithm STL in use

    sort(locs[i].caps.rbegin(), locs[i].caps.rend());
where caps is a vector field of a struct and rbegin and rend are returning reverse iterators related to the vector.

** Note that sort sorts in ascending order, so giving it reverse iterators should produce what kind of sort? How would you test your answer?

Looking at the Fall 2007 contest solutions, format-chengweilin.cpp shows a usage of transform (another part of the algorithm STL). One usage of transform is where you want to change a bunch of values in a collection that can be iterated across. To do this, you supply an iterator that will traverse the collection, an iterator that is a marker for when you have reached the end of the collection, an iterator that will traverse where you want the result put, and a single arguement function you want to use to compute the modified values. It is also possible to transform a pair of input iterators, in which case you only need an `end marker iterator' for the first of the two input iterators. In this code, you see that transform is being used to walk down a string, applying modify to each character it encounters (so the input and output iterators are the same iterator). Tranform lets us handle many common situations while avoiding writing a for loop and having to manage the various iterators outself (possibly making a mistake in the process).

This code also demonstrates the erase method on strings as well as some of the useful functions cctype provides when you are doing character checking, such as: isalpha, ispunct, and tolower.

In prizes-morganbrown.cpp we see our first non-struct class definition:

class par
{
public:
	vector< prob > probs;
	int numGot;
	int totalTime;
	int parNum;
};
(note: prob was declared as a struct just above this definition of par). The original structs in C did not have methods associated with them (they only had public data fields), but both structs and classes in C++ can have methods associated with them (although we haven't seen an example of this yet).

There is also an interesting piece of iterator code in this example:

    for (vector< prob >::iterator jt = pars[i].probs.begin(); jt != pars[i].probs.end(); jt++)
       {
         jt->got = false;
	 jt->time = 0;
	 jt->numPens = 0;
       }
Since jt is an iterator, *jt could be used to get what jt references and then the got field of it could have been accessed as *jt.got, but jt->got is a common alternative notation that is used in this code. According to the wikipedia entry on operators in C and C++, the -> operator can be overloaded but the . operator can't, so there could be times when it mattered which one you used (if you would like to see an example where this comes up, look at the article Smart Pointers - What, Why, Which? by Yonat Sharon).

Moving on the Spring 2007 contest, none of the posted solutions used new features in C++. Hmmm, I seemed to have gotten out of order here and had Fall come before Spring, but it doesn't really matter.

** Read problem F in Spring 2007 Problems. Write a solution for it in C++. Hint: the posted solution built their answer around the data type loc

   struct loc
    {
	int numInt, numSob;
	vector< int > caps;
    };
and then their big data structure was:
  	loc locs[1000];
They found the vector methods clear and push_back (as well as rbegin and rend in conjunction with the sort algorithm) useful.

Moving on to Spring 2008, we finally find, in fingers-larrysequino.cpp a C++ program with a user-defined class with methods. The description of the problem being solved can be found in Spring 2008 Problems. Interestingly enough, the Java solution doesn't use any extra user defined classes, or any methods other than main.

The user defined class is

  class Fingerprint
  {
   public:
        char dots[25];
        int score;
	Fingerprint(){}
	void setprint(ifstream &fin){
		char dot;
		for(int i = 0; i <25; i++){
			fin >> dot;
			dots[i] = dot;
		}
	}
	int compare(Fingerprint F){
		int count = 0;
		for(int i = 0; i < 25; i++){
			if(dots[i] == F.dots[i])
				count++;
		}
		return count;
	}
};
which should work pretty much the way a Java programmer would expect, given the previous features of C++ we have looked at.

In response-chengweilin.cpp, we see a priority queue of double/int pairs (pair comes from the utility library) created

     priority_queue< pair< double, int > > q;
and then top, push, pop, and empty applied to it. The queue library contains priority_queue.

We also see an alternative notation for controlling the format of standard out

	cout << fixed << setprecision(2) << ans << endl << endl;
where cout, fixed, setprecision, and endl are all C++ predefined constructs. The iomanip library provided setprecision and fixed is part of iostream.

Also worth noting is that the second argument to a vector constructor is the value to initialize the entries with. The example usage in this program being:

    vector< double > rgf(mmap.size(), numeric_limits< double >::max());
    vector< bool > visited(mmap.size(), false);
The limits library contained numeric_limits.

Moving on to Fall 2008, we find zebras-chengweilin.cpp contains:

inline double dist(const point &a, const point &b)
   { return sqrt((a.x-b.x)*(a.x-b.x)+(a.y-b.y)*(a.y-b.y)); }
The keyword inline asks the compiler to not implement this as a function, but to rather replace any function code to dist with the instructions that would execute the calculation provided. This is sort of like a macro (like assembly or C programmers use), but is more flexible because you have normal language scoping going on. As discussed in the wikipedia entry inline function, this is generally done in an attempt to improve code performance. It is worthwhile noting that just because you didn't declare something inline, the compiler might go ahead and inline it anyway, if it `thought' that that was the better way to go (and indeed, Java programmers rely on the compiler to make these sorts of decisions). The C++ FAQ entry on inline is also worth taking a look at.

This code also contains an algorithm method we haven't seen before, min_element.

  cout<< fixed << setprecision(2) << 
         *min_element(dp.back().begin(), dp.back().end()) << endl << endl;
where dp was a vector of vectors of doubles. To see why there is a * in front of min_element, we can look at the code equivalent for it in the documentation. What we see is that it doesn't return the value found, but instead returns the iterator at the place where the value was found, thus requiring the extra level of dereferencing before printing.

Moving on to Spring 2009, we find bonus-tylerbreisacher.cpp demonstrating the set library with its insert, find, and end methods.

risk-darryldeweese.cpp demonstrates the map library which sort of behaves like an array that can be indexed with non-integer things like strings. For example, we have the declaration:

  map< string, double > valueFromModifier;
and then the assignment
   valueFromModifier["very"] = 2.0;
and the usages:
  if (valueFromModifier.count(newWord))
       multiplier *= valueFromModifier[newWord];
where count tells you how many instances of newWord are in valueFromModified (the answer will be either 0 or 1).

Also of interest in this code is the usage of strtok to split a string into pieces that were seperated by a delimiter in the original string. We also see the list library in use (push_back and an iterator).

In Fall 2009 we see the stack library being used in network-aadarshpatel.cpp.

In Spring 2010 no new C++ features show up.

In Fall 2010, explore-whang.cpp checks to see if opening a file failed

   input.open("explore.in");
   if (input.fail()) return 1;

In currents-deweese.cpp we actually see a function template defined:

    template < class T >
    void tokenize(string s, vector< T >& v)
    {
	istringstream ss(s);
	v.assign(istream_iterator< T >(ss), istream_iterator< T >());
    }
Note that this is built to work on any vector. Sadly, it isn't used, so we don't know what was intended by this code. When working under a time limit (as is common for programmers), dead code, like this, is often left behind in sources to puzzle later readers.

Incidently, starting with Fall 2010, the judges started posting C++ solutions. The judge's solution seymour.cpp gives an example of operator overloading

  struct point {
  int x, y;
  point() {}
  point(int _x, int _y) : x(_x), y(_y) {}
  bool operator< (const point &b) const { return g[y][x] < g[b.y][b.x]; }
  };
where it is defined what it means for one point to be less than another point.

Nothing new in Spring 2011. In Fall 2011 we get the judge's solution bartering_ryan.cpp, which makes significant usage of the multiset sublibrary of the set library. It also has a local class declaration

 struct State {
   int n;
   multiset< int > have;
   State() {}
   State(int _n, multiset< int > _have) : n(_n), have(_have) {}
   bool operator<(const State &b) const {
     return have < b.have;
   }
 };
that has two different constructors depending on the parameters it is invoked with. This shouldn't look odd to a Java programmer, but the usage of : followed by n(_n) to bind the field n to the parameter _n is yet another odd C++ notation.

For Spring 2012 , we have only judge's solutions in C++. None of them introduce new C++ concepts.

** If you try to implement the rations problem, the judge's solution: rations_noel.cpp found the data structure:

   vector< pair< int, int > > items;
useful, as well as the sort algorithm.

** To solve the numerology problem, the judges found it useful to declare a vector and a map. Can you figure out what vector and map would be useful for in this problem?

Moving on to Fall 2012, gives us 12 C++ programs, 6 student solutions and 6 judge's solutions. The judge solution water.cpp contained the following alternative to just including the standard namespace:

using std::cin;
using std::cout;
using std::string;
telling us that only certain parts of the standard namespace are being used. We also find the line using the string s:
   d[j][k] = d[j-1][k] + d[j][k-1] - d[j-1][k-1] + (s.find(c) != string::npos ? 1 : 0);
showing the use of npos to mark that something was not found in a sequence of size npos (where indices count from 0).

In the judge's solution to pranks.cpp, we see a more extensive use of typedef, than we saw before:

  typedef pair< double, double > PDD;
  typedef vector< PDD > VPDD;
  typedef vector< double > VD;
  typedef vector< VD > VVD;
  typedef vector< int > VI;
  typedef vector< VI > VVI;

** With regards to the pits problem, both the student and judge's solutions make use of a queue in the same way. Can you see why they both decided to use a queue for this problem and what that queue should be (as well as what objects should go in that queue)?

In Spring 2013 nothing new in terms of C++ concepts. However, there is an interesting programming style exhibited in fight.yu.cpp with the macros:

  #define FOR(i,a,b) for(int i=int(a);i<=int(b);++i)
  #define REP(i,n) FOR(i,0,(n)-1)
defined to handle standard forms of the for loop (they occur again in thinice.yu.cpp as well as in later contests). The fight problem involves pairs, so we also see more examples of operator overloading:
   pair< double,double > operator+(const pair< double,double >& a, const pair< double,double >& b) {
	return MP(a.first+b.first, a.second+b.second);
}

   pair< double,double > operator*(double a, const pair< double,double >& b) {
	return MP(a*b.first, a*b.second);
}

The Fall 2013 gives us 11 more C++ programs. In callback.zheng.cpp, we see:

    map < string, set< string > >::iterator iterator;
    for (iterator = ans.begin(); iterator!=ans.end(); ++iterator)
showing us that we can reuse standard terms, such as iterator, as regular variable names (just to make code more puzzling).

In codebreakers.darryl.cpp, we see usage of cassert (C++'s version of C's assert.h). In the usage:

 cin >> n >> e >> c;
 assert(n < 1000 * 1000 * 1000 && e <= 10000 && c < n);
we see assert checking that the inputs are valid. In the later usage:
   EGCD egcd = get_egcd(e, (p-1)*(q-1));
   // confirm e and (p-1)(q-1) are relatively prime
   assert(egcd.gcd == 1);
we see assert confirming a piece of program logic. As in C, checks done using C++ can be turned off by defining the macro NDEBUG before including cassert (this is usually done as a compile line option so that the source code doesn't have to be edited to turn the check on and off).

Nothing new wrt C++ in Spring 2014 although assert gets used in more of the solutions. Nothing new wrt C++ in Fall 2014 or Spring 2015 either (Spring 2015 is as far as the contest had gotten at this writing). As we have gotten further into the contests, the new material has gotten scarcer and more esoteric. There is also a temporal effect as different features of the language become popular at different times in different programmer communities.

One thing that is special about these programming contest programs is that they are all one file programs. So, in the next lesson, let's look at something that sits across multiple files.