From: Quincey Koziol
Date: January 10, 2008 7:00:30 AM CST
To: mpi-21@XXXXXXXXXXXXX
Subject: Re: [mpi-21] Proposal EH2: add const keyword to the C bindings
Reply-To: mpi-21@XXXXXXXXXXXXX

On Jan 10, 2008, at 6:37 AM, Dries Kimpe wrote:

> * Richard Treumann [2008-01-09 18:16:29]:
>> If the "subset of processes within a communicator" is reasonably stable you
>> can do better by making an additional communicator containing only those
>> processes and doing MPI_Bcast on this subset communicator. If the subset
>> keeps changing then making the subset communicator may be too costly
>> compared to the savings.
>
> Everybody keeps assuming *exactly* the same buffer is sent to the other
> ranks; What if you're sending different parts of the same datastructure to
> different ranks, but some of the parts overlap?

I have another use case, which has come up within the HDF5 library. We
have an API call "H5Dwrite" that accepts a const pointer to a buffer of
data elements to write to an HDF5 file. Since the MPI_File_write* calls
take non-const pointers to their buffers, I have to cast away the
'constness' of the buffer before passing it to them.

I definitely won't let this single, optional chance that an application
wants to write data with MPI (as opposed to other forms of I/O that HDF5
can perform) make me change the buffer parameter to H5Dwrite() to be
non-const, but I hate casting away the constness of the pointer when I
pass it to MPI.

This may be "wrong" from the MPI standard's point of view, but our library
is not solely dependent on using MPI for I/O and I want to assert to
applications that call H5Dwrite() that we won't modify their buffer when
we perform I/O on it. In a sense, I'm carrying up the semantics for the
POSIX write() call (which has a const pointer to its buffer).

So far, no MPI implementation has bitten me on this... (as Dries says
below, etc.)

Quincey

> A common 'pattern' to do this is to create the datatypes, and a mapping
> of (rank,datatype). Then do Isend for every rank,datatype and waitall at
> the end.
>
> Why not use alltoall/scatter? Well, if every rank is only communicating to
> a limited (non-growing with universe size) number of other ranks, using
> collectives serializes the transfer. (nonblocking collectives? ;-)
>
> Although I wouldn't have the user program read from an in-use send buffer,
> I for sure am a sinner by having implemented the above described scheme.
>
> I coincide with Gregor:
> - I've done this, and by consequence am among those (few??) that do write
>   wrong MPI programs.
> - The program ran (and runs) without problems on 3 different architectures
>   and at least 4 different MPI implementations.
> - Not being able to do this forces me to copy every time and reduces
>   performance
>
> Maybe it is reasonable to allow the case of using the send buffer multiple
> times, but not allowing the user program to touch it? This wouldn't
> prohibit byte-swapping tricks, and if there is an MPI implementation that
> has severe restrictions on send buffers, it might do a little bit extra
> bookkeeping (if it doesn't do it already) and handle the case of
> overlapping send buffers...
>
> Implementations that don't care (most current implementations it seems)
> won't need to do anything special.
>
> Greetings,
> Dries
>
> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
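
The const-cast workaround Quincey describes for H5Dwrite() can be sketched roughly as follows. This is only an illustration under stated assumptions, not HDF5 or MPI code: `legacy_write` is a hypothetical stand-in for a call such as MPI_File_write* that declares its buffer non-const but only reads from it, `lib_write` is a hypothetical library entry point that keeps the const promise to its callers, and `sink` / `sink_len` are made-up names for an in-memory "file".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory "file" that the stand-in writer appends to. */
char sink[64];
size_t sink_len = 0;

/* Hypothetical stand-in for an API whose buffer parameter is declared
 * non-const even though the call only reads from it. */
int legacy_write(void *buf, size_t count)
{
    assert(buf != NULL);
    if (count > sizeof sink - sink_len)
        return -1;                      /* not enough room left */
    memcpy(sink + sink_len, buf, count);
    sink_len += count;
    return 0;
}

/* Hypothetical library call in the spirit of H5Dwrite(): the const
 * qualifier tells callers their buffer will not be modified. */
int lib_write(const void *buf, size_t count)
{
    /* Constness is cast away only at the boundary to the non-const API.
     * This is safe only as long as that API never writes through the
     * pointer, which is exactly what the library is trusting here. */
    return legacy_write((void *)buf, count);
}
```

A caller can pass a `const char msg[] = "hello";` to `lib_write(msg, 5)` without any cast of its own; the single cast lives inside the library, which is why a const-qualified buffer parameter in the C bindings would remove it entirely.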