It didn’t take long for Golang, or Go, to grow on me.
As someone with a background in C and Python, I find it interesting to analyze and compare how the same functionality is implemented in Go. For example, Go compiles down to native machine code just like C does. To me, Go is a weird phenomenon, as it seems to be a return to form of sorts for programmers. Take away all of the fancy classes, inheritance, and introspection and go back to creating low-level structures that describe actual data… kind of.
The more you play around with it, the less Go feels like a compiled language like C or C++ and the more it feels like an interpreted language like Python or Ruby.
Let’s look at how arrays compare in Go vs. C:
What even is an array?
Arrays in Go look similar to arrays in C:
myArr := [10]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
printRawData(uintptr(unsafe.Pointer(&myArr)), unsafe.Sizeof(myArr))
>>>
Printing ptr 000000c000094e28 size 80
000000c000094e28 : 01 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
000000c000094e38 : 03 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00
000000c000094e48 : 05 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
000000c000094e58 : 07 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00
000000c000094e68 : 09 00 00 00 00 00 00 00 0a 00 00 00 00 00 00 00
But how do we create a pointer to the array? In C, you can just create an int* and point it at the array itself. Then, you can use this value to traverse the array or pass it into another function by reference.
But when we try this with Go…
myArr2 := myArr
printRawData(uintptr(unsafe.Pointer(&myArr2)), unsafe.Sizeof(myArr2))
>>>
Printing ptr 000000c000094dd8 size 80
000000c000094dd8 : 01 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
000000c000094de8 : 03 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00
000000c000094df8 : 05 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
000000c000094e08 : 07 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00
000000c000094e18 : 09 00 00 00 00 00 00 00 0a 00 00 00 00 00 00 00
It doesn’t exactly do what we expected. It seems that Go has allocated a second buffer somewhere and copied the old array’s data to it. This myArr2 value is actually a [10]int, not an int* like we’d expect when programming with C. So, how do we get what we want?
The answer lies in the same syntax as C, the '&' character:
myArrPtr := &myArr
printRawData(uintptr(unsafe.Pointer(myArrPtr)), unsafe.Sizeof(*myArrPtr))
>>>
Printing ptr 000000c000094e28 size 80
000000c000094e28 : 01 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
000000c000094e38 : 03 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00
000000c000094e48 : 05 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
000000c000094e58 : 07 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00
000000c000094e68 : 09 00 00 00 00 00 00 00 0a 00 00 00 00 00 00 00
Finally, we have a pointer to our original array (a *[10]int, to be precise). So, what are we observing here? In Go, assigning an array to another variable copies the whole array. From the perspective of a C developer this is odd behavior because, in C, a second buffer never appears automatically: you have to set it aside explicitly and copy into it yourself. I mean, who’s going to free it later?
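To make the copy-versus-pointer distinction concrete, here is a minimal sketch. The function names zeroFirstByValue and zeroFirstByPointer are my own, made up for illustration. It shows that writes to a copied array never reach the original, while writes through a pointer do.

```go
package main

import "fmt"

// zeroFirstByValue (hypothetical name) receives a copy of the array,
// so the caller's array is left untouched.
func zeroFirstByValue(a [3]int) {
	a[0] = 0
}

// zeroFirstByPointer (hypothetical name) receives a pointer to the array,
// so it modifies the caller's array.
func zeroFirstByPointer(a *[3]int) {
	a[0] = 0 // Go dereferences automatically: same as (*a)[0] = 0
}

func main() {
	orig := [3]int{1, 2, 3}

	dup := orig // copies all three elements
	dup[0] = 100
	fmt.Println(orig[0], dup[0]) // 1 100

	zeroFirstByValue(orig)
	fmt.Println(orig[0]) // 1

	zeroFirstByPointer(&orig)
	fmt.Println(orig[0]) // 0
}
```

The same rule applies to function arguments: passing a [3]int hands the callee its own copy, so a C-style "pass the array in and let the function fill it" needs a *[3]int or a slice.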
Breaking the Rules
Go performs a lot of magic behind the scenes relating to how it allocates and uses memory, but with some caveats, it allows you to break its rules.
The most significant caveat, I’ve found, is that the package that lets you break the rules is aptly named unsafe. I used it earlier when playing with arrays. This package gives you introspection into where the data you’re accessing is and how big it is.
Below is my implementation of printRawData:
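// printRawData dumps size bytes starting at address ptr, 16 bytes per line.
func printRawData(ptr uintptr, size uintptr) {
	fmt.Printf("Printing ptr %016x size %d\n", ptr, size)
	i := ptr
	var offset uintptr
	for i < ptr + size {
		// Start each line with the address of its first byte.
		if offset % 16 == 0 {
			fmt.Printf("%016x : ", i)
		}
		// Reinterpret the raw address as a *byte and read through it.
		fmt.Printf("%02x", *(*byte)(unsafe.Pointer(i)))
		i++
		offset++
		// End the line after 16 bytes or at the end of the buffer.
		if offset % 16 == 0 || offset == size {
			fmt.Print("\n")
		} else if offset % 8 == 0 {
			fmt.Print(" ")
		} else {
			fmt.Print(" ")
		}
	}
}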
As you can see, it requires a memory address (void * in C) and a size. Thus, we need to use unsafe.Pointer and unsafe.Sizeof to query these values. In the function itself, I iterate over the given buffer byte by byte and print its contents, making sure to annotate it with the memory location along the way.
This function is relatively safe because it’s called by the context in which the input values exist, only reads the data in question, and doesn’t give the pointer to anybody else. At any rate, it’s a useful tool for this exercise.
Who needs arrays when you have slices?
For the most part, traditional arrays you’d see in C are replaced with what are called slices in Go. In many cases these are more functional. For example, here’s a slice:
mySlice := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
You don’t give it a size because the type is an int slice, or []int. Now, what happens if we dump the memory of mySlice?
>>>
Printing ptr 000000c000094f70 size 24
000000c000094f70 : 78 4e 09 00 c0 00 00 00 0a 00 00 00 00 00 00 00
000000c000094f80 : 0a 00 00 00 00 00 00 00
If this doesn’t look like an array, that’s because it’s not. However, if you look closely at the first 8 bytes, they look like a memory address. The subsequent two 8-byte regions each contain the value 0x0a (10 in decimal). In Go this is called a slice header. This structure is defined in the reflect package, and we can convert our slice into a slice header with the unsafe package:
mySliceHdr := *(*reflect.SliceHeader)(unsafe.Pointer(&mySlice))
fmt.Printf("%#v\n", mySliceHdr)
>>>
reflect.SliceHeader{Data:0xc000094e78, Len:10, Cap:10}
Now, the data at 0xc000094e78 looks like:
Printing ptr 000000c000094e78 size 80
000000c000094e78 : 01 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
000000c000094e88 : 03 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00
000000c000094e98 : 05 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
000000c000094ea8 : 07 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00
000000c000094eb8 : 09 00 00 00 00 00 00 00 0a 00 00 00 00 00 00 00
Perfect, there’s our array! Now that we have a slice, we can try to assign it to a separate value. Watch what happens:
mySlice2 := mySlice
printRawData(uintptr(unsafe.Pointer(&mySlice2)), unsafe.Sizeof(mySlice2))
mySliceHdr2 := *(*reflect.SliceHeader)(unsafe.Pointer(&mySlice2))
printRawData(mySliceHdr2.Data, uintptr(mySliceHdr2.Len * 8))
mySlice2 lives at a different memory address, but the slice header at that address still references the buffer in question:
Printing ptr 000000c000094f58 size 24
000000c000094f58 : 78 4e 09 00 c0 00 00 00 0a 00 00 00 00 00 00 00
000000c000094f68 : 0a 00 00 00 00 00 00 00
Printing ptr 000000c000094e78 size 80
000000c000094e78 : 01 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
000000c000094e88 : 03 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00
000000c000094e98 : 05 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
000000c000094ea8 : 07 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00
000000c000094eb8 : 09 00 00 00 00 00 00 00 0a 00 00 00 00 00 00 00
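The practical consequence of two headers sharing one buffer is aliasing, which the following sketch demonstrates without dumping memory. It assumes Go 1.20 or newer for unsafe.SliceData, and the helper name sameBacking is my own.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sameBacking (hypothetical name) reports whether two int slices start at
// the same backing-array address. unsafe.SliceData needs Go 1.20 or newer.
func sameBacking(a, b []int) bool {
	return unsafe.SliceData(a) == unsafe.SliceData(b)
}

func main() {
	s1 := []int{1, 2, 3}
	s2 := s1 // copies only the slice header, not the elements

	fmt.Println(sameBacking(s1, s2)) // true

	s2[0] = 100
	fmt.Println(s1[0]) // 100: the write is visible through both headers
}
```

Compare this with the array case earlier, where the assignment produced a fully independent copy.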
Let’s use a slice feature. What happens if we append a new value to mySlice2?
mySlice2 = append(mySlice2, 11)
>>>
Printing ptr 000000c000094f58 size 24
000000c000094f58 : 00 a0 09 00 c0 00 00 00 0b 00 00 00 00 00 00 00
000000c000094f68 : 14 00 00 00 00 00 00 00
reflect.SliceHeader{Data:0xc00009a000, Len:11, Cap:20}
Printing ptr 000000c00009a000 size 88
000000c00009a000 : 01 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
000000c00009a010 : 03 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00
000000c00009a020 : 05 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
000000c00009a030 : 07 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00
000000c00009a040 : 09 00 00 00 00 00 00 00 0a 00 00 00 00 00 00 00
000000c00009a050 : 0b 00 00 00 00 00 00 00
Go allocated a new buffer with double the capacity, copied the previous entries to it, and increased the length by one without any manual allocation by the user.
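Whether append moves the data depends on spare capacity, not on the call itself. The sketch below checks both cases. It assumes Go 1.20 or newer for unsafe.SliceData, and appendAndReport is a name I made up. I deliberately avoid asserting the new capacity after a reallocation, since the growth factor is an implementation detail of the runtime.

```go
package main

import (
	"fmt"
	"unsafe"
)

// appendAndReport (hypothetical name) appends v to s and reports whether
// the result lives in a different backing array than s did.
func appendAndReport(s []int, v int) (grown []int, moved bool) {
	before := unsafe.SliceData(s)
	grown = append(s, v)
	return grown, unsafe.SliceData(grown) != before
}

func main() {
	full := []int{1, 2, 3} // len 3, cap 3: no spare room
	grown, moved := appendAndReport(full, 4)
	fmt.Println(len(grown), moved) // 4 true

	roomy := make([]int, 3, 8) // len 3, cap 8: spare room
	grown, moved = appendAndReport(roomy, 4)
	fmt.Println(len(grown), cap(grown), moved) // 4 8 false
}
```

In the first case the original slice still points at the old buffer, so the two slices stop sharing data from that point on.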
Why slices? They can hold a variable amount of data and still be passed by value into a function, making them a more intelligent pointer. There’s also useful functionality built around them that masks the memory allocations from the user.
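"Passed by value" here means the header is copied, which has one consequence worth seeing in code. In this sketch (all three function names are hypothetical), element writes inside a function reach the caller, but an append inside the function only changes the function’s own copy of the header.

```go
package main

import "fmt"

// doubleAll modifies elements in place; the caller sees the change because
// both headers point at the same backing array.
func doubleAll(s []int) {
	for i := range s {
		s[i] *= 2
	}
}

// appendLocal appends to its own copy of the header, so the caller's
// length is unchanged.
func appendLocal(s []int) {
	s = append(s, 99) // only the local header is updated
}

// appendAndReturn hands the updated header back to the caller.
func appendAndReturn(s []int) []int {
	return append(s, 99)
}

func main() {
	nums := []int{1, 2, 3}

	doubleAll(nums)
	fmt.Println(nums) // [2 4 6]

	appendLocal(nums)
	fmt.Println(len(nums)) // 3

	nums = appendAndReturn(nums)
	fmt.Println(nums) // [2 4 6 99]
}
```

This is why the built-in append returns a slice and is almost always written as s = append(s, ...).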
All of this functionality is safe in Go as any data that’s not needed anymore will be garbage collected in the background.
For more information on arrays and slices, see https://blog.golang.org/go-slices-usage-and-internals and https://golang.org/doc/effective_go.html#arrays.
The Code
Here’s the example code I used to generate this output:
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

func printRawData(ptr uintptr, size uintptr) {
	fmt.Printf("Printing ptr %016x size %d\n", ptr, size)
	i := ptr
	var offset uintptr
	for i < ptr + size {
		if offset % 16 == 0 {
			fmt.Printf("%016x : ", i)
		}
		fmt.Printf("%02x", *(*byte)(unsafe.Pointer(i)))
		i++
		offset++
		if offset % 16 == 0 || offset == size {
			fmt.Print("\n")
		} else if offset % 8 == 0 {
			fmt.Print(" ")
		} else {
			fmt.Print(" ")
		}
	}
}

func main() {
	fmt.Println("Simple array:")
	myArr := [10]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	printRawData(uintptr(unsafe.Pointer(&myArr)), unsafe.Sizeof(myArr))

	fmt.Println("\nArray assignment:")
	myArr2 := myArr
	printRawData(uintptr(unsafe.Pointer(&myArr2)), unsafe.Sizeof(myArr2))

	fmt.Println("\nArray pointer:")
	myArrPtr := &myArr
	printRawData(uintptr(unsafe.Pointer(myArrPtr)), unsafe.Sizeof(*myArrPtr))

	fmt.Println("\nSimple slice:")
	mySlice := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	printRawData(uintptr(unsafe.Pointer(&mySlice)), unsafe.Sizeof(mySlice))

	fmt.Println("\nSlice headers:")
	mySliceHdr := *(*reflect.SliceHeader)(unsafe.Pointer(&mySlice))
	fmt.Printf("%#v\n", mySliceHdr)
	printRawData(mySliceHdr.Data, uintptr(mySliceHdr.Len * 8))

	fmt.Println("\nSlice assignment:")
	mySlice2 := mySlice
	printRawData(uintptr(unsafe.Pointer(&mySlice2)), unsafe.Sizeof(mySlice2))
	mySliceHdr2 := *(*reflect.SliceHeader)(unsafe.Pointer(&mySlice2))
	printRawData(mySliceHdr2.Data, uintptr(mySliceHdr2.Len * 8))

	fmt.Println("\nAppend to slice:")
	mySlice2 = append(mySlice2, 11)
	printRawData(uintptr(unsafe.Pointer(&mySlice2)), unsafe.Sizeof(mySlice2))
	mySliceHdr2 = *(*reflect.SliceHeader)(unsafe.Pointer(&mySlice2))
	fmt.Printf("%#v\n", mySliceHdr2)
	printRawData(mySliceHdr2.Data, uintptr(mySliceHdr2.Len * 8))
}
In Summary
As a C programmer, one of my biggest hurdles when learning Go was giving up control to a separate entity when I was used to having it myself.
Going through this exercise in particular has helped me understand what’s going on behind the scenes, specifically with arrays and slices, while also leaving some things up to the imagination or future experimentation.
Programming with Go is a significant departure from C in terms of memory management, concurrency, dependency management, etc., and there are many pieces of its implementation that could stop a developer in their tracks.
If you’re a C programmer, I recommend taking a look into Go. There’s a lot in there you might like.