Asked  6 Months ago    Answers:  5   Viewed   50 times

In Go there are various ways to return a struct value or slice thereof. For individual ones I've seen:

type MyStruct struct {
    Val int
}

func myfunc() MyStruct {
    return MyStruct{Val: 1}
}

func myfunc() *MyStruct {
    return &MyStruct{}
}

func myfunc(s *MyStruct) {
    s.Val = 1
}

I understand the differences between these. The first returns a copy of the struct, the second a pointer to the struct value created within the function, the third expects an existing struct to be passed in and overrides the value.

I've seen all of these patterns be used in various contexts, I'm wondering what the best practices are regarding these. When would you use which? For instance, the first one could be ok for small structs (because the overhead is minimal), the second for bigger ones. And the third if you want to be extremely memory efficient, because you can easily reuse a single struct instance between calls. Are there any best practices for when to use which?

Similarly, the same question regarding slices:

func myfunc() []MyStruct {
    return []MyStruct{ MyStruct{Val: 1} }
}

func myfunc() []*MyStruct {
    return []MyStruct{ &MyStruct{Val: 1} }
}

func myfunc(s *[]MyStruct) {
    *s = []MyStruct{ MyStruct{Val: 1} }
}

func myfunc(s *[]*MyStruct) {
    *s = []MyStruct{ &MyStruct{Val: 1} }
}

Again: what are best practices here. I know slices are always pointers, so returning a pointer to a slice isn't useful. However, should I return a slice of struct values, a slice of pointers to structs, should I pass in a pointer to a slice as argument (a pattern used in the Go App Engine API)?

 Answers

23

tl;dr:

  • Methods using receiver pointers are common; the rule of thumb for receivers is, "If in doubt, use a pointer."
  • Slices, maps, channels, strings, function values, and interface values are implemented with pointers internally, and a pointer to them is often redundant.
  • Elsewhere, use pointers for big structs or structs you'll have to change, and otherwise pass values, because getting things changed by surprise via a pointer is confusing.

One case where you should often use a pointer:

  • Receivers are pointers more often than other arguments. It's not unusual for methods to modify the thing they're called on, or for named types to be large structs, so the guidance is to default to pointers except in rare cases.
    • Jeff Hodges' copyfighter tool automatically searches for non-tiny receivers passed by value.

Some situations where you don't need pointers:

  • Code review guidelines suggest passing small structs like type Point struct { latitude, longitude float64 }, and maybe even things a bit bigger, as values, unless the function you're calling needs to be able to modify them in place.

    • Value semantics avoid aliasing situations where an assignment over here changes a value over there by surprise.
    • It's not Go-y to sacrifice clean semantics for a little speed, and sometimes passing small structs by value is actually more efficient, because it avoids cache misses or heap allocations.
    • So, Go Wiki's code review comments page suggests passing by value when structs are small and likely to stay that way.
    • If the "large" cutoff seems vague, it is; arguably many structs are in a range where either a pointer or a value is OK. As a lower bound, the code review comments suggest slices (three machine words) are reasonable to use as value receivers. As something nearer an upper bound, bytes.Replace takes 10 words' worth of args (three slices and an int). You can find situations where copying even large structs turns out a performance win, but the rule of thumb is not to.
  • For slices, you don't need to pass a pointer to change elements of the array. io.Reader.Read(p []byte) changes the bytes of p, for instance. It's arguably a special case of "treat little structs like values," since internally you're passing around a little structure called a slice header (see Russ Cox (rsc)'s explanation). Similarly, you don't need a pointer to modify a map or communicate on a channel.

  • For slices you'll reslice (change the start/length/capacity of), built-in functions like append accept a slice value and return a new one. I'd imitate that; it avoids aliasing, returning a new slice helps call attention to the fact that a new array might be allocated, and it's familiar to callers.

    • It's not always practical follow that pattern. Some tools like database interfaces or serializers need to append to a slice whose type isn't known at compile time. They sometimes accept a pointer to a slice in an interface{} parameter.
  • Maps, channels, strings, and function and interface values, like slices, are internally references or structures that contain references already, so if you're just trying to avoid getting the underlying data copied, you don't need to pass pointers to them. (rsc wrote a separate post on how interface values are stored).

    • You still may need to pass pointers in the rarer case that you want to modify the caller's struct: flag.StringVar takes a *string for that reason, for example.

Where you use pointers:

  • Consider whether your function should be a method on whichever struct you need a pointer to. People expect a lot of methods on x to modify x, so making the modified struct the receiver may help to minimize surprise. There are guidelines on when receivers should be pointers.

  • Functions that have effects on their non-receiver params should make that clear in the godoc, or better yet, the godoc and the name (like reader.WriteTo(writer)).

  • You mention accepting a pointer to avoid allocations by allowing reuse; changing APIs for the sake of memory reuse is an optimization I'd delay until it's clear the allocations have a nontrivial cost, and then I'd look for a way that doesn't force the trickier API on all users:

    1. For avoiding allocations, Go's escape analysis is your friend. You can sometimes help it avoid heap allocations by making types that can be initialized with a trivial constructor, a plain literal, or a useful zero value like bytes.Buffer.
    2. Consider a Reset() method to put an object back in a blank state, like some stdlib types offer. Users who don't care or can't save an allocation don't have to call it.
    3. Consider writing modify-in-place methods and create-from-scratch functions as matching pairs, for convenience: existingUser.LoadFromJSON(json []byte) error could be wrapped by NewUserFromJSON(json []byte) (*User, error). Again, it pushes the choice between laziness and pinching allocations to the individual caller.
    4. Callers seeking to recycle memory can let sync.Pool handle some details. If a particular allocation creates a lot of memory pressure, you're confident you know when the alloc is no longer used, and you don't have a better optimization available, sync.Pool can help. (CloudFlare published a useful (pre-sync.Pool) blog post about recycling.)

Finally, on whether your slices should be of pointers: slices of values can be useful, and save you allocations and cache misses. There can be blockers:

  • The API to create your items might force pointers on you, e.g. you have to call NewFoo() *Foo rather than let Go initialize with the zero value.
  • The desired lifetimes of the items might not all be the same. The whole slice is freed at once; if 99% of the items are no longer useful but you have pointers to the other 1%, all of the array remains allocated.
  • Moving values around might cause you performance or correctness problems, making pointers more attractice. Notably, append copies items when it grows the underlying array. Pointers you got before the append point to the wrong place after, copying can be slower for huge structs, and for e.g. sync.Mutex copying isn't allowed. Insert/delete in the middle and sorting similarly move items around.

Broadly, value slices can make sense if either you get all of your items in place up front and don't move them (e.g., no more appends after initial setup), or if you do keep moving them around but you're sure that's OK (no/careful use of pointers to items, items are small enough to copy efficiently, etc.). Sometimes you have to think about or measure the specifics of your situation, but that's a rough guide.

Tuesday, June 1, 2021
 
hillz
answered 6 Months ago
25

Sydius outlined the types fairly well:

  • Normal pointers are just that - they point to some thing in memory somewhere. Who owns it? Only the comments will let you know. Who frees it? Hopefully the owner at some point.
  • Smart pointers are a blanket term that cover many types; I'll assume you meant scoped pointer which uses the RAII pattern. It is a stack-allocated object that wraps a pointer; when it goes out of scope, it calls delete on the pointer it wraps. It "owns" the contained pointer in that it is in charge of deleteing it at some point. They allow you to get a raw reference to the pointer they wrap for passing to other methods, as well as releasing the pointer, allowing someone else to own it. Copying them does not make sense.
  • Shared pointers is a stack-allocated object that wraps a pointer so that you don't have to know who owns it. When the last shared pointer for an object in memory is destructed, the wrapped pointer will also be deleted.

How about when you should use them? You will either make heavy use of scoped pointers or shared pointers. How many threads are running in your application? If the answer is "potentially a lot", shared pointers can turn out to be a performance bottleneck if used everywhere. The reason being that creating/copying/destructing a shared pointer needs to be an atomic operation, and this can hinder performance if you have many threads running. However, it won't always be the case - only testing will tell you for sure.

There is an argument (that I like) against shared pointers - by using them, you are allowing programmers to ignore who owns a pointer. This can lead to tricky situations with circular references (Java will detect these, but shared pointers cannot) or general programmer laziness in a large code base.

There are two reasons to use scoped pointers. The first is for simple exception safety and cleanup operations - if you want to guarantee that an object is cleaned up no matter what in the face of exceptions, and you don't want to stack allocate that object, put it in a scoped pointer. If the operation is a success, you can feel free to transfer it over to a shared pointer, but in the meantime save the overhead with a scoped pointer.

The other case is when you want clear object ownership. Some teams prefer this, some do not. For instance, a data structure may return pointers to internal objects. Under a scoped pointer, it would return a raw pointer or reference that should be treated as a weak reference - it is an error to access that pointer after the data structure that owns it is destructed, and it is an error to delete it. Under a shared pointer, the owning object can't destruct the internal data it returned if someone still holds a handle on it - this could leave resources open for much longer than necessary, or much worse depending on the code.

Tuesday, June 1, 2021
 
rojo
answered 6 Months ago
86

Danny Kalev explains this quite nicely:

The Underlying Representation of Pointers to Members

Although pointers to members behave like ordinary pointers, behind the scenes their representation is quite different. In fact, a pointer to member usually consists of a struct containing up to four fields in certain cases. This is because pointers to members have to support not only ordinary member functions, but also virtual member functions, member functions of objects that have multiple base classes, and member functions of virtual base classes. Thus, the simplest member function can be represented as a set of two pointers: one holding the physical memory address of the member function, and a second pointer that holds the this pointer. However, in cases like a virtual member function, multiple inheritance and virtual inheritance, the pointer to member must store additional information. Therefore, you can't cast pointers to members to ordinary pointers nor can you safely cast between pointers to members of different types.

To get a notion of how your compiler represents pointers to members, use the sizeof operator. In the following example, the sizes of a pointer to data member and a pointer to a member function are taken. As you can see, they have different sizes, hence, different representations:

struct A
{
 int x;
 void f();
};
int A::*pmi = &A::x;
void (A::*pmf)() = &A::f;
int n = sizeof (pmi); // 8 byte with my compiler
int m = sizeof (pmf); // 12 bytes with my compiler

Note that each of these pointers may have a different representation, depending on the class in question and whether the member function is virtual.

Saturday, July 3, 2021
 
Hilmi
answered 5 Months ago
32

In

int a = 5;
int *b = &a;   
int *c = &b;

You get a warning because &b is of type int **, and you try to initialize a variable of type int *. There's no implicit conversions between those two types, leading to the warning.

To take the longer example you want to work, if we try to dereference f the compiler will give us an int, not a pointer that we can further dereference.

Also note that on many systems int and int* are not the same size (e.g. a pointer may be 64 bits long and an int 32 bits long). If you dereference f and get an int, you lose half the value, and then you can't even cast it to a valid pointer.

Saturday, September 4, 2021
 
Martin Vseticka
answered 3 Months ago
74

In a function definition, you cannot use incomplete types: [dcl.fct]/12:

The type of a parameter or the return type for a function definition shall not be an incomplete (possibly cv-qualified) class type in the context of the function definition unless the function is deleted.

But a function declaration has no such restriction. By the time you define Bar::get and Bar::set, Foo is a complete type, so the program is fine.

Saturday, October 9, 2021
 
Michael Emerson
answered 2 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :
 
Share