[zeromq-dev] Debugging zeromq apps
Kevin Sapper
kevinsapper88 at gmail.com
Wed Oct 5 13:37:51 CEST 2016
Hi Uri,
as with all distributed applications, debugging is problematic. IMO
printf works best if you're chasing a bug. But if you want to reproduce
a run of a distributed program, you need the causal relationship between
all events. The only solution I know of that offers this is Mattern's
vector clocks.
During a student project last semester we implemented a dynamic vector
clock for Zyre to order log messages according to their causal
relationship. Once we had assembled all peer logs, we were able to generate
a global log and a space-time diagram to see the event flow, including the
log messages. It handles joining peers well, but for leaving peers you'll
need global consensus, through Paxos for example. The drawback of vector
clocks is of course that they do not scale: the more peers join, the larger
the vector gets.
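To make the idea concrete, here is a minimal Python sketch of a dynamic vector clock and of merging peer logs into one causally consistent global log. It is not the zlogger code and does not use Zyre; the names tick, merge, happened_before and global_log are hypothetical, and the clock is assumed to be a plain dict from peer name to counter, so a newly joined peer simply shows up as a new key.

```python
from typing import Dict, List, Tuple

Clock = Dict[str, int]            # peer name -> event counter (assumed layout)
Entry = Tuple[Clock, str, str]    # (clock at the event, peer name, log text)


def tick(clock: Clock, peer: str) -> Clock:
    """Return a copy of the clock with this peer's own counter advanced."""
    new = dict(clock)
    new[peer] = new.get(peer, 0) + 1
    return new


def merge(local: Clock, received: Clock, peer: str) -> Clock:
    """On receive: take the component-wise maximum, then count the receive event."""
    keys = set(local) | set(received)
    merged = {k: max(local.get(k, 0), received.get(k, 0)) for k in keys}
    return tick(merged, peer)


def happened_before(a: Clock, b: Clock) -> bool:
    """True if a causally precedes b; missing components count as 0."""
    keys = set(a) | set(b)
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    some_less = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_greater and some_less


def global_log(entries: List[Entry]) -> List[Entry]:
    """Order the collected peer logs so that causes come before effects.

    If a happened before b, the sum of a's components is strictly smaller
    than the sum of b's, so sorting by that sum respects causality.
    Concurrent events are ordered arbitrarily (here: by peer name).
    """
    return sorted(entries, key=lambda e: (sum(e[0].values()), e[1]))


# Peer A logs an event and sends its clock along with a message.
a_send = tick({}, "A")
# Peer B logs a local event, then receives A's message.
b_start = tick({}, "B")
b_recv = merge(b_start, a_send, "B")

log = global_log([
    (b_recv, "B", "received request"),
    (a_send, "A", "sent request"),
    (b_start, "B", "started"),
])
for clock, peer, text in log:
    print(peer, sorted(clock.items()), text)
```

In this sketch "sent request" and "started" are concurrent, so only their position relative to "received request" is fixed. The scaling problem is visible too: every clock carries one entry per peer it has ever heard of.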
You can have a look at the project here:
https://zenon.cs.hs-rm.de/causality-logger/zlogger/.
//Kevin
On 04.10.2016 at 20:28, "Uri Moszkowicz" <uri at 4refs.com> wrote:
> Hi Per,
> Thanks for the links. I should have mentioned that the compiler is the
> tool we're developing, not one we're using. It is also a non-traditional
> compiler. It doesn't take a program as input or produce an executable as
> output. We're trying to make it look more like traditional compilers in
> that it can be compiled in pieces and assembled at the end.
>
> Uri
>
> On Tue, Oct 4, 2016 at 1:15 PM, Per Sandberg <per.s.sandberg at bahnhof.se>
> wrote:
>
>> Sounds like you are reinventing.
>>
>> the old distcc https://github.com/distcc/distcc
>>
>> for C-family code only.
>>
>> or
>>
>> the modern gprbuild https://github.com/AdaCore/gprbuild.
>>
>> For "almost any" compiled language.
>>
>> /Per
>>
>>
>> On 2016-10-04 at 20:07, Uri Moszkowicz wrote:
>>
>> Hi,
>> My team is looking at ZeroMQ for taking a big monolithic non-traditional
>> compiler and distributing it. The biggest problem that comes to mind is
>> debugging: how do you take a piece of your program and reproduce it in a
>> debugger after a crash?
>>
>> It seems to me that we need to checkpoint and order/log messages for this
>> to work but that seems very difficult to implement. There's plenty on the
>> topic in distributed systems literature but not much written about it in
>> practice. Did I miss it in the manual? What have you all done to solve this
>> problem?
>>
>> Thanks,
>> Uri
>>
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev