[zeromq-dev] Introducing Caravan: Simple, Faithful ZMQ-Bindings for Objective Caml
Guillaume Yziquel
guillaume.yziquel at citycable.ch
Sun Apr 3 21:11:17 CEST 2011
On Sunday 03 Apr 2011 at 14:06:29 (-0400), Brian Ledger wrote:
> Guillaume,
>
>> OCaml GC memory management can get tricky. A simple piece of advice:
>> wrap up your pointers to stuff outside the heap within so-called custom
>> blocks. Take care of declaring appropriate finalisers, and your binding will
>> be much safer.
>
> Thank you for your insight. Indeed, I will look into this matter, and
> I will be sure to file it in my Hub's issues page.
>
> I will admit, I am not the most careful systems programmer, here in a
> community of powerful systems programmers. Furthermore, I relied heavily upon the
> older OCaml-ZMQ implementation to inform my code, because I was mainly
> interested in perfecting the language interface. I've published this binding to draw
> on your insight and support in perfecting the technical details.
Let's say that there is some impedance mismatch here between C and
OCaml. OCaml is supposed to be both type-safe and fast. Unfortunately,
when writing bindings, it is very hard to be both type-safe and as fast
as the C code being wrapped.
If you want to be fast, the OCaml-ZMQ bindings follow the right
approach: they mimic the C API and ask the programmer to close sockets
and do the C-style memory management of ZeroMQ's API themselves. In
that setting, they could legitimately cast a socket * pointer to a
value, provided it is the responsibility of the binding's user (with a
big red blinking warning in the binding's documentation) to make sure
that OCaml code doesn't reference the socket * pointer after it has
been properly freed using ZMQ's API. But as I mentioned, that's not
safe (I mean no safer than ZeroMQ itself): it feeds the socket * pointer
directly into the OCaml compiled code's call stacks and closure
environments. That is faster, but less safe than paying for a level of
indirection and a memory allocation by passing around a pointer to a
custom block instead. That's the kind of speed/safety tradeoff you have
(among other things).
If you want to be type-safe and integrate fully with the GC, then
there is indeed more work to do, such as putting ZeroMQ values inside
custom blocks with proper finalisers. However, as OCaml's GC is not a
reference-counting GC but a generational tracing collector, a finaliser
only runs when the collector eventually reclaims the block. It is
therefore possible that a value such as a socket goes out of scope
without ever being finalised (this is the same issue as for file
descriptors in OCaml's Unix library). So you'd have to implement both
the manual closing and the finaliser to be on the safe side, and make
sure they do not step on each other's toes.
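
To make that last point concrete, here is a minimal sketch in plain
OCaml of a manual close and a finaliser sharing one idempotent code
path. It is not taken from either binding: raw_close is a hypothetical
stand-in for the C stub that would call ZeroMQ, the int field stands in
for the socket * pointer, and Gc.finalise plays the role that the
finaliser of a custom block would play on the C side.

```ocaml
(* Hypothetical stand-in for the C stub that closes a ZeroMQ socket.
   It only counts calls, so the sketch runs without ZeroMQ installed. *)
let raw_close_calls = ref 0
let raw_close (_raw : int) : unit = incr raw_close_calls

(* The handle is a heap-allocated record, which Gc.finalise requires.
   [raw] stands in for the socket * pointer (an assumption). *)
type socket = { raw : int; mutable closed : bool }

(* Idempotent close: whichever of the user or the GC gets here first
   does the work; any later call is a no-op. *)
let close (s : socket) : unit =
  if not s.closed then begin
    s.closed <- true;
    raw_close s.raw
  end

let create (raw : int) : socket =
  let s = { raw; closed = false } in
  (* Safety net in case the user never calls [close]. *)
  Gc.finalise close s;
  s

(* Every operation checks the flag, so use-after-close raises an
   exception instead of touching freed memory. *)
let send (s : socket) (_msg : string) : unit =
  if s.closed then invalid_arg "send: socket already closed"
```

In a real binding the closed flag (or a NULLed pointer) would more
likely live inside the custom block itself, and the finaliser would be
registered through the block's custom operations rather than through
Gc.finalise, but the rule is the same: both paths go through one
function that checks whether the socket has already been released.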
As a side note, I noticed that you seem to be using
caml_enter_blocking_section() and caml_leave_blocking_section()
correctly for integrating system threads with OCaml's runtime.
I'm a bit more sceptical (but not sure) about the way you're using the
CAMLreturn macros inside C { } blocks. CAMLparam and CAMLreturn are a
bunch of rather tricky GC-related macros. Naïvely, I'd avoid having a
CAMLparam macro inside a block and a CAMLreturn inside a nested block,
as I wouldn't be confident as to how these macros operate under these
circumstances. Begin_roots() and End_roots() are a bit more flexible,
though perhaps trickier.
> Thank you very much, Guillaume, for your help, and I look forward to
> further insight from the community.
You're welcome. As I'm getting somewhat off-topic for the zeromq mailing
list, please feel free to contact me off-list.
> -Brian Ledger
--
Guillaume Yziquel