The code that Fabio proposes looks like this:

    sx_slock(&data->lock);
    if (data->buffer)
        a = *data->buffer;
    sx_sunlock(&data->lock);

The point is that without a memory barrier on the unlock, the CPU is
free to reorder the instructions into the order shown in his message.
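
To make the role of the barrier concrete, here is a minimal userland sketch using C11 atomics. It is not the sx(9) implementation: toy_lock, toy_data, toy_lock_acquire, toy_lock_release and toy_read are hypothetical names invented for illustration, and an exclusive spinlock stands in for the shared lock. The only point it shows is that the unlock is a release operation, so the read of *data->buffer cannot be moved past it.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the lock; NOT how sx(9) is implemented. */
struct toy_lock {
    atomic_flag held;
};

/* Hypothetical counterpart of the "data" structure in the snippet above. */
struct toy_data {
    struct toy_lock lock;
    int *buffer;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    /* Acquire: accesses after this point cannot be hoisted above it. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void toy_lock_release(struct toy_lock *l)
{
    /* Release: accesses before this point cannot sink below it. */
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Same shape as the proposed code: dereference buffer only under the lock. */
static int toy_read(struct toy_data *data, int fallback)
{
    int a = fallback;

    toy_lock_acquire(&data->lock);
    if (data->buffer != NULL)
        a = *data->buffer;
    toy_lock_release(&data->lock);
    return a;
}
```

If the release in toy_lock_release were weakened to memory_order_relaxed, nothing in the C11 memory model would keep the load of *data->buffer ordered before the store that clears the flag, which is the kind of reordering being discussed.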