From: Jeff Squyres
Date: January 22, 2008 7:07:05 PM CST
To: mpi-21@XXXXXXXXXXX,mpi-21@XXXXXXXXXXXXX
Subject: [mpi-21] C++ predefined MPI handles, const, IN/INOUT/OUT, etc.
Reply-To: mpi-21@XXXXXXXXXXXXX

The 3 proposals that I sent about C++ issues are both intertwined and represent a very complex set of issues.

Shorter version
===============

Does anyone know/remember why the "special case" for the definition of OUT parameters exists in MPI-1:2.2?

I ask because the C++ bindings were modeled off the IN/OUT/INOUT designations of the language neutral bindings. MPI_COMM_SET_NAME (and others) use the "special case" definition of the [IN]OUT designation for the MPI communicator handle parameter.

Two facts indicate that we should either override this INOUT designation for the C++ binding (and therefore make the method const) and/or revisit the "special case" language in MPI-1:2.2:

1. The C binding does not allow the implementation to change the handle value

2. The following is a valid MPI code:

     MPI::Intracomm cxx_comm = MPI::COMM_WORLD;
     cxx_comm.Set_name("foo");
     MPI::COMM_WORLD.Get_name(name, len);
     cout << name << endl;

   The output will be "foo" even though we set the name on cxx_comm and retrieved it from MPI::COMM_WORLD ***because the state changed on the underlying MPI object, not the upper-level handles*** (the same is true for error handlers). Hence, the Set_name() method should be const because the MPI handle will not (and cannot) change.

Similar arguments apply to keeping the MPI predefined C++ handles as "const" (MPI::INT, etc.) -- their values must never change during execution.

It then follows that unless there is a good reason for the "special case" language in MPI-1:2.2, it should be removed.

Longer version / more details
=============================

At the heart of the issue seems to be text from MPI-1:2.2 about the definition of IN, OUT, and INOUT parameters to MPI functions.
This text was used to guide many of the decisions about the C++ bindings, such as the const-ness (or not) of C++ methods and MPI predefined C++ handles. The text states:

-----
* the call uses but does not update an argument marked IN
* the call may update an argument marked OUT
* the call both uses and updates an argument marked INOUT

There is one special case -- if an argument is a handle to an opaque object (these terms are defined in Section 2.4.1) and the object is updated by the procedure call, then the argument is marked OUT. It is marked this way even though the handle itself is not modified -- we use the OUT attribute to denote that what the handle _references_ is updated.
-----

The special case for the OUT definition is important because the C++ bindings were created to mimic the IN, OUT, and INOUT behavior in a language that is stricter than C and Fortran: C++ will fail to compile if an application violates the defined semantics (which is a good thing).

*** The big question: does anyone know/remember why this special case
*** for the "OUT" definition exists?

The special case seems to imply that *explicit* changes to MPI objects should be marked as an [IN]OUT parameter (e.g., SET_NAME and SET_ERRHANDLER). Apparently, *implicit* changes to the underlying MPI object (such as MPI_ISEND) do not count / should be IN (i.e., many MPI implementations *do* change the state either on the communicator or something related to the communicator when a send or receive is initiated, even though the communicator is an IN argument).

But remember that MPI clearly states that the handle is separate from the underlying MPI object. So why does the binding care if the back-end object is updated (regardless of whether the change to the object is explicit or implicit)?

For example, the language-neutral binding for MPI_COMM_SET_NAME has the communicator as an INOUT argument.
This clearly falls within the "special case" definition because the function semantics explicitly change state on the underlying MPI object.

But note that the C binding is "int MPI_Comm_set_name(MPI_Comm comm, ...)". Notice that the comm is passed by value, not by reference. So even though the language neutral binding called that parameter INOUT, it's not possible for the MPI implementation to change the value of the handle.

My claim is that if we want to ensure that the C++ bindings match the C bindings (i.e., that the implementation cannot change the value of the MPI handle), then the method should be const (i.e., cxx_comm.Set_name(...)) *because the handle value will not, and ***cannot***, change*.

Simply put: regardless of language or implementation, MPI handles must have true handle semantics. For example:

    MPI::Intracomm cxx_comm = MPI::COMM_WORLD;
    cxx_comm.Set_name("C++ r00l3z!");
    MPI::COMM_WORLD.Get_name(name, len);
    cout << name << endl;

The above will output "C++ r00l3z!" because cxx_comm and MPI::COMM_WORLD are handles referring to the same underlying communicator. Hence, the only state that the handles have is whatever refers to their back-end MPI object. Having Set_name() be const keeps the *handle* const, not the underlying MPI object.

Tying this all together:

1. cxx_comm.Set_name() *cannot* change state on the cxx_comm handle because cxx_comm.Get_name() and MPI::COMM_WORLD.Get_name() must return the same results (the same is true for error handlers). Hence, regardless of the implementation of the C++ bindings, the handle value cannot change. Therefore, this method (and all the others like it) should be const.

2. As a related issue, if no one can remember why the "special case" exists for OUT, then I think we should remove this text and then change all those INOUT parameters for the functions I cited in my earlier proposal to IN. This would make the C++ bindings consistent with the IN/OUT/INOUT specifications of the language-neutral bindings.

3. All the MPI C++ predefined handles should be const for many of the same reasons. Regardless of what happens to the underlying MPI object, the value of the handle cannot ever change. This is guaranteed by MPI-2:2.5.4, page 10, lines 38-41:

   "All named constants, with the exceptions noted below for Fortran, can be used in initialization expressions or assignments. These constants do not change values during execution. Opaque objects accessed by constant handles are defined and do not change value between MPI initialization MPI_INIT and MPI completion MPI_FINALIZE."

   Hence, they should all be "const".

-----

In short: C++ gives us stronger protections to ensure that applications don't shoot themselves in the foot. If the MPI predefined handles are const, then statements like "MPI::INT = my_dtype;" will fail to compile. This is a Good Thing.

The original C++ bindings tried to take advantage of const, but missed a few points. Ballot two and one of the items in ballot 3 incorrectly tried to fix these points by removing const in several places. That "fixes" the problem, but removes many of the good qualities that we can get in C++ with "const".

So let's fix the real problem and leave "const" in the C++ bindings.

Are you confused yet? :-)

--
Jeff Squyres
Cisco Systems
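
The handle-versus-object distinction argued above can be illustrated with a minimal sketch that does not use MPI at all. Everything in it is an assumption made for illustration: Comm, CommObject, COMM_WORLD, and name_seen_through_comm_world are hypothetical stand-ins, not the real MPI::Intracomm or any implementation's internals. The sketch only shows that a const Set_name() can update the back-end object while the handle value, including a const predefined handle, stays untouched.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the opaque MPI object behind a communicator.
struct CommObject {
    std::string name;
};

// Hypothetical handle class (NOT the real MPI::Intracomm). Its only state
// is a reference to the back-end object.
class Comm {
public:
    explicit Comm(std::shared_ptr<CommObject> obj) : obj_(std::move(obj)) {}

    // const: updates what the handle references, never the handle itself.
    void Set_name(const std::string& name) const { obj_->name = name; }

    std::string Get_name() const { return obj_->name; }

    // Two handles are equal when they refer to the same back-end object.
    bool operator==(const Comm& other) const { return obj_ == other.obj_; }

private:
    std::shared_ptr<CommObject> obj_;
};

// Predefined handle, declared const as the message argues it should be.
const Comm COMM_WORLD(std::make_shared<CommObject>());

// Sets the name through a copied handle and reads it back via COMM_WORLD.
std::string name_seen_through_comm_world(const std::string& name) {
    Comm cxx_comm = COMM_WORLD;      // copies the handle, not the object
    cxx_comm.Set_name(name);         // changes state on the shared object
    assert(cxx_comm == COMM_WORLD);  // the handle value did not change
    // COMM_WORLD = cxx_comm;        // would fail to compile: handle is const
    return COMM_WORLD.Get_name();
}
```

Calling name_seen_through_comm_world("foo") returns the name that was set through the copy, which mirrors the cxx_comm / MPI::COMM_WORLD example in the message. Uncommenting the assignment to COMM_WORLD produces a compile error, which is the protection that keeping "const" on the predefined handles is meant to provide.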