Asked  7 Months ago    Answers:  5   Viewed   45 times

I don't understand when I should use std::move and when I should let the compiler optimize... for example:

using SerialBuffer = vector< unsigned char >;

// let compiler optimize it
SerialBuffer read( size_t size ) const
{
    SerialBuffer buffer( size );
    read( begin( buffer ), end( buffer ) );
    // Return Value Optimization
    return buffer;
}

// explicit move
SerialBuffer read( size_t size ) const
{
    SerialBuffer buffer( size );
    read( begin( buffer ), end( buffer ) );
    return move( buffer );
}

Which should I use?

 Answers

100

Use exclusively the first method:

Foo f()
{
  Foo result;
  mangle(result);
  return result;
}

This will already allow the use of the move constructor, if one is available. In fact, a local variable can bind to an rvalue reference in a return statement precisely when copy elision is allowed.

Your second version actively prohibits copy elision. The first version is universally better.

Tuesday, June 1, 2021
 
ojrac
answered 7 Months ago
20

Wow, there is just so much to clean up here...

First, the Copy and Swap is not always the correct way to implement Copy Assignment. Almost certainly in the case of dumb_array, this is a sub-optimal solution.

The use of Copy and Swap is for dumb_array is a classic example of putting the most expensive operation with the fullest features at the bottom layer. It is perfect for clients who want the fullest feature and are willing to pay the performance penalty. They get exactly what they want.

But it is disastrous for clients who do not need the fullest feature and are instead looking for the highest performance. For them dumb_array is just another piece of software they have to rewrite because it is too slow. Had dumb_array been designed differently, it could have satisfied both clients with no compromises to either client.

The key to satisfying both clients is to build the fastest operations in at the lowest level, and then to add API on top of that for fuller features at more expense. I.e. you need the strong exception guarantee, fine, you pay for it. You don't need it? Here's a faster solution.

Let's get concrete: Here's the fast, basic exception guarantee Copy Assignment operator for dumb_array:

dumb_array& operator=(const dumb_array& other)
{
    if (this != &other)
    {
        if (mSize != other.mSize)
        {
            delete [] mArray;
            mArray = nullptr;
            mArray = other.mSize ? new int[other.mSize] : nullptr;
            mSize = other.mSize;
        }
        std::copy(other.mArray, other.mArray + mSize, mArray);
    }
    return *this;
}

Explanation:

One of the more expensive things you can do on modern hardware is make a trip to the heap. Anything you can do to avoid a trip to the heap is time & effort well spent. Clients of dumb_array may well want to often assign arrays of the same size. And when they do, all you need to do is a memcpy (hidden under std::copy). You don't want to allocate a new array of the same size and then deallocate the old one of the same size!

Now for your clients who actually want strong exception safety:

template <class C>
C&
strong_assign(C& lhs, C rhs)
{
    swap(lhs, rhs);
    return lhs;
}

Or maybe if you want to take advantage of move assignment in C++11 that should be:

template <class C>
C&
strong_assign(C& lhs, C rhs)
{
    lhs = std::move(rhs);
    return lhs;
}

If dumb_array's clients value speed, they should call the operator=. If they need strong exception safety, there are generic algorithms they can call that will work on a wide variety of objects and need only be implemented once.

Now back to the original question (which has a type-o at this point in time):

Class&
Class::operator=(Class&& rhs)
{
    if (this == &rhs)  // is this check needed?
    {
       // ...
    }
    return *this;
}

This is actually a controversial question. Some will say yes, absolutely, some will say no.

My personal opinion is no, you don't need this check.

Rationale:

When an object binds to an rvalue reference it is one of two things:

  1. A temporary.
  2. An object the caller wants you to believe is a temporary.

If you have a reference to an object that is an actual temporary, then by definition, you have a unique reference to that object. It can't possibly be referenced by anywhere else in your entire program. I.e. this == &temporary is not possible.

Now if your client has lied to you and promised you that you're getting a temporary when you're not, then it is the client's responsibility to be sure that you don't have to care. If you want to be really careful, I believe that this would be a better implementation:

Class&
Class::operator=(Class&& other)
{
    assert(this != &other);
    // ...
    return *this;
}

I.e. If you are passed a self reference, this is a bug on the part of the client that should be fixed.

For completeness, here is a move assignment operator for dumb_array:

dumb_array& operator=(dumb_array&& other)
{
    assert(this != &other);
    delete [] mArray;
    mSize = other.mSize;
    mArray = other.mArray;
    other.mSize = 0;
    other.mArray = nullptr;
    return *this;
}

In the typical use case of move assignment, *this will be a moved-from object and so delete [] mArray; should be a no-op. It is critical that implementations make delete on a nullptr as fast as possible.

Caveat:

Some will argue that swap(x, x) is a good idea, or just a necessary evil. And this, if the swap goes to the default swap, can cause a self-move-assignment.

I disagree that swap(x, x) is ever a good idea. If found in my own code, I will consider it a performance bug and fix it. But in case you want to allow it, realize that swap(x, x) only does self-move-assignemnet on a moved-from value. And in our dumb_array example this will be perfectly harmless if we simply omit the assert, or constrain it to the moved-from case:

dumb_array& operator=(dumb_array&& other)
{
    assert(this != &other || mSize == 0);
    delete [] mArray;
    mSize = other.mSize;
    mArray = other.mArray;
    other.mSize = 0;
    other.mArray = nullptr;
    return *this;
}

If you self-assign two moved-from (empty) dumb_array's, you don't do anything incorrect aside from inserting useless instructions into your program. This same observation can be made for the vast majority of objects.

<Update>

I've given this issue some more thought, and changed my position somewhat. I now believe that assignment should be tolerant of self assignment, but that the post conditions on copy assignment and move assignment are different:

For copy assignment:

x = y;

one should have a post-condition that the value of y should not be altered. When &x == &y then this postcondition translates into: self copy assignment should have no impact on the value of x.

For move assignment:

x = std::move(y);

one should have a post-condition that y has a valid but unspecified state. When &x == &y then this postcondition translates into: x has a valid but unspecified state. I.e. self move assignment does not have to be a no-op. But it should not crash. This post-condition is consistent with allowing swap(x, x) to just work:

template <class T>
void
swap(T& x, T& y)
{
    // assume &x == &y
    T tmp(std::move(x));
    // x and y now have a valid but unspecified state
    x = std::move(y);
    // x and y still have a valid but unspecified state
    y = std::move(tmp);
    // x and y have the value of tmp, which is the value they had on entry
}

The above works, as long as x = std::move(x) doesn't crash. It can leave x in any valid but unspecified state.

I see three ways to program the move assignment operator for dumb_array to achieve this:

dumb_array& operator=(dumb_array&& other)
{
    delete [] mArray;
    // set *this to a valid state before continuing
    mSize = 0;
    mArray = nullptr;
    // *this is now in a valid state, continue with move assignment
    mSize = other.mSize;
    mArray = other.mArray;
    other.mSize = 0;
    other.mArray = nullptr;
    return *this;
}

The above implementation tolerates self assignment, but *this and other end up being a zero-sized array after the self-move assignment, no matter what the original value of *this is. This is fine.

dumb_array& operator=(dumb_array&& other)
{
    if (this != &other)
    {
        delete [] mArray;
        mSize = other.mSize;
        mArray = other.mArray;
        other.mSize = 0;
        other.mArray = nullptr;
    }
    return *this;
}

The above implementation tolerates self assignment the same way the copy assignment operator does, by making it a no-op. This is also fine.

dumb_array& operator=(dumb_array&& other)
{
    swap(other);
    return *this;
}

The above is ok only if dumb_array does not hold resources that should be destructed "immediately". For example if the only resource is memory, the above is fine. If dumb_array could possibly hold mutex locks or the open state of files, the client could reasonably expect those resources on the lhs of the move assignment to be immediately released and therefore this implementation could be problematic.

The cost of the first is two extra stores. The cost of the second is a test-and-branch. Both work. Both meet all of the requirements of Table 22 MoveAssignable requirements in the C++11 standard. The third also works modulo the non-memory-resource-concern.

All three implementations can have different costs depending on the hardware: How expensive is a branch? Are there lots of registers or very few?

The take-away is that self-move-assignment, unlike self-copy-assignment, does not have to preserve the current value.

</Update>

One final (hopefully) edit inspired by Luc Danton's comment:

If you're writing a high level class that doesn't directly manage memory (but may have bases or members that do), then the best implementation of move assignment is often:

Class& operator=(Class&&) = default;

This will move assign each base and each member in turn, and will not include a this != &other check. This will give you the very highest performance and basic exception safety assuming no invariants need to be maintained among your bases and members. For your clients demanding strong exception safety, point them towards strong_assign.

Saturday, June 5, 2021
 
Norgul
answered 7 Months ago
60

RVO/NRVO are clearly allowed under the "as-if" rule in C.

In C++ you can get observable side-effects because you've overloaded the constructor, destructor, and/or assignment operator to give those side effects (e.g., print something out when one of those operations happens), but in C you don't have any ability to overload those operators, and the built-in ones have no observable side effects.

Without overloading them, you get no observable side-effects from copy elision, and therefore nothing to stop a compiler from doing it.

Monday, August 2, 2021
 
shin
answered 5 Months ago
73

I am used to optimizations being constrained such that they cannot change observable behaviour.

This is correct. As a general rule -- known as the as-if rule -- compilers can change code if the change is not observable.

This restriction does not seem to apply to RVO.

Yes. The clause quoted in the OP gives an exception to the as-if rule and allows copy construction to be omitted, even when it has side effects. Notice that the RVO is just one case of copy-elision (the first bullet point in C++11 12.8/31).

Do I ever need to worry about the side effects mentioned in the standard?

If the copy constructor has side effects such that copy elision when performed causes a problem, then you should reconsider the design. If this is not your code, you should probably consider a better alternative.

What do I as a programmer need to do (or not do) to allow this optimization to be performed?

Basically, if possible, return a local variable (or temporary) with the same cv unqualified type as the function return type. This allows RVO but doens't enforce it (the compiler might not perform RVO).

For example, does the following prohibit the use of copy elision (due to the move):

// notice that I fixed the OP's example by adding <double>
std::vector<double> foo(int bar){
    std::vector<double> quux(bar, 0);
    return std::move(quux);
}

Yes, it does because you're not returning the name of a local variable. This

std::vector<double> foo(int bar){
    std::vector<double> quux(bar,0);
    return quux;
}

allows RVO. One might be worried that if RVO is not performed then moving is better than coping (which would explain the use of std::move above). Don't worry about that. All major compilers will do the RVO here (at least in release build). Even if a compiler doesn't do RVO but the conditions for RVO are met then it will try to do a move rather than a copy. In summary, using std::move above will certainly make a move. Not using it will likely neither copy nor move anything and, in the worst (unlikely) case, will move.

(Update: As haohaolee's pointed out (see comments), the following paragraphs are not correct. However, I leave them here because they suggest an idea that might work for classes that don't have a constructor taking a std::initializer_list (see the reference at the bottom). For std::vector, haohaolee found a workaround.)

In this example you can force the RVO (strict speaking this is no longer RVO but let's keep calling this way for simplicity) by returning a braced-init-list from which the return type can be created:

std::vector<double> foo(int bar){
    return {bar, 0}; // <-- This doesn't work. Next line shows a workaround:
    // return {bar, 0.0, std::vector<double>::allocator_type{}};
}

See this post and R. Martinho Fernandes's brilliant answer.

Be carefull! Have the return type been std::vector<int> the last code above would have a different behavior from the original. (This is another story.)

Sunday, August 8, 2021
 
danjah
answered 4 Months ago
42

I like to measure, so I set up this Object:

#include <iostream>

struct Object
{
    Object() {}
    Object(const Object&) {std::cout << "Object(const Object&)n";}
    Object(Object&&) {std::cout << "Object(Object&&)n";}

    Object& makeChanges() {return *this;}
};

And I theorized that some solutions may give different answers for xvalues and prvalues (both of which are rvalues). And so I decided to test both of them (in addition to lvalues):

Object source() {return Object();}

int main()
{
    std::cout << "process lvalue:nn";
    Object x;
    Object t = process(x);
    std::cout << "nprocess xvalue:nn";
    Object u = process(std::move(x));
    std::cout << "nprocess prvalue:nn";
    Object v = process(source());
}

Now it is a simple matter of trying all of your possibilities, those contributed by others, and I threw one in myself:

#if PROCESS == 1

Object
process(Object arg)
{
    return arg.makeChanges();
}

#elif PROCESS == 2

Object
process(const Object& arg)
{
    return Object(arg).makeChanges();
}

Object
process(Object&& arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 3

Object
process(const Object& arg)
{
    Object retObj = arg;
    retObj.makeChanges();
    return retObj; 
}

Object
process(Object&& arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 4

Object
process(Object arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 5

Object
process(Object arg)
{
    arg.makeChanges();
    return arg;
}

#endif

The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:

+----+--------+--------+---------+
|    | lvalue | xvalue | prvalue |    legend: copies/moves
+----+--------+--------+---------+
| p1 |  2/0   |  1/1   |   1/0   |
+----+--------+--------+---------+
| p2 |  2/0   |  0/1   |   0/1   |
+----+--------+--------+---------+
| p3 |  1/0   |  0/1   |   0/1   |
+----+--------+--------+---------+
| p4 |  1/1   |  0/2   |   0/1   |
+----+--------+--------+---------+
| p5 |  1/1   |  0/2   |   0/1   |
+----+--------+--------+---------+

process3 looks like the best solution to me. However it does require two overloads. One to process lvalues and one to process rvalues. If for some reason this is problematic, solutions 4 and 5 do the job with only one overload at the cost of 1 extra move construction for glvalues (lvalues and xvalues). It is a judgement call as to whether one wants to pay an extra move construction to save overloading (and there is no one right answer).

(answered) Why does RVO kick in the last option and not the second?

For RVO to kick in, the return statement needs to look like:

return arg;

If you complicate that with:

return std::move(arg);

or:

return arg.makeChanges();

then RVO gets inhibited.

Is there a better way to do this?

My favorites are p3 and p5. My preference of p5 over p4 is merely stylistic. I shy away from putting move on the return statement when I know it will be applied automatically for fear of accidentally inhibiting RVO. However in p5 RVO is not an option anyway, even though the return statement does get an implicit move. So p5 and p4 really are equivalent. Pick your style.

Had we passed in a temporary, 2nd and 3rd options would call a move constructor while returning. Is is possible to eliminate that using (N)RVO?

The "prvalue" column vs "xvalue" column addresses this question. Some solutions add an extra move construction for xvalues and some don't.

Friday, September 10, 2021
 
user113716
answered 3 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :
 
Share