Tuesday, February 1, 2011

Wrapping iconv in Go with cgo

Currently, Go's standard library doesn't have much support for character encoding conversions. Ideally, that support would be written in pure Go, which I'm sure will happen eventually. However, iconv is available on most systems, and Go has nice support for calling into C code. So I decided to write an iconv wrapper myself, both to fill a personal need and to get familiar with cgo. I made my work available on GitHub:


First, I'd like to acknowledge that this wrapper was not the first to exist. In fact, right on GitHub you can find https://github.com/oibore/go-iconv. This probably worked fine, though I did not try it myself since I wanted to use my own project as an exercise for understanding cgo. So let's look at some points I learned.

Converting Go Strings to C Strings

To open an iconv context, I needed to pass the source and destination encoding names to iconv_open. The C package (only available when a Go file is processed with cgo) provides a conversion function, C.CString, that copies the contents of a Go string into a newly allocated C string. I can then pass these values to iconv_open by calling it through the C package:
toEncodingC := C.CString(toEncoding)
fromEncodingC := C.CString(fromEncoding)
C.iconv_open(toEncodingC, fromEncodingC)
Very straightforward so far. However, there was another point that I didn't immediately realize: since the memory is allocated for C usage, it can't and won't be garbage collected by Go, so it has to be freed explicitly. This is where I initially made a mistake by passing the variables directly to C.free, which crashed my program. The correct way is to convert each pointer to unsafe.Pointer before passing it to C.free.

Getting at errno

The call to iconv_open returns a descriptor and sets errno if anything goes wrong - so to get at the value of errno we make the call with an expected second return value:
descriptor, err = C.iconv_open(toEncodingC, fromEncodingC)

Passing a char** value

For the actual conversion, the package has to call the iconv function, which takes two char** parameters. It took me a while to work out how to pass them. From the cgo documentation I knew that the data of a []byte could be passed as a char* by taking &sliceVar[0], but passing the data as a char** took some fiddling. I finally arrived at:
inputPointer := (*C.char)(unsafe.Pointer(&input[0]))
outputPointer := (*C.char)(unsafe.Pointer(&output[0]))
_, err = C.iconv(context, &inputPointer, &inputLeft, &outputPointer, &outputLeft)
