[zeromq-dev] czmq: Error Traceability with assert(...) and release code

Ivan Pechorin ivan.pechorin at gmail.com
Mon Mar 10 12:22:23 CET 2014


assert() gives you a core dump that has complete context, including the stack
trace.

How do you propose to provide the same if you replace assert with return?
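
On POSIX systems the core dump comes from abort(), which a failed assert()
calls: the process is killed by SIGABRT, and the kernel writes a core file if
the core size limit allows it. A minimal sketch (POSIX only; all helper names
are made up for illustration) that makes the two endings observable from a
parent process:

```cpp
#include <csignal>
#include <cstdlib>
#include <sys/wait.h>
#include <unistd.h>

// Runs fn in a forked child and reports how the child ended: the number of
// the signal that killed it, 0 for a normal exit, -1 if fork/waitpid failed.
int terminating_signal(void (*fn)()) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        fn();
        _exit(0);  // the child returned normally
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

// Stands in for a failed assert(): the assert macro calls abort() on failure.
void failed_invariant() { std::abort(); }

// Stands in for a library call that returns an error code instead.
void reported_error() {}
```

terminating_signal(failed_invariant) yields SIGABRT, while
terminating_signal(reported_error) yields 0. Only the first ending can leave a
core file behind; a returned error code carries no stack trace unless the
library records one itself.
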
On 11/03/2014 12:10 AM, "Christoph Zach" <czach at rst-automation.de> wrote:

> On Monday 10 March 2014 10:36:22 Pieter Hintjens wrote:
> > This theory is fine in theory.
> Please note that all the theoretical stuff I wrote is based on real-world
> projects, which have high requirements on safety and software resilience.
> Here's a requirement excerpt from one of our projects:
> 1) You have a customer who needs a test system (HIL tests).
> 2) The test system is for validating very safety-critical stuff.
> 3) Every test needs to be logged correctly. No matter what state the test
> is in, it must be ensured that when an engineer tests a hardware component,
> the test is logged.
> 4) The system must be in the customer-provided safe states at all times
> possible, to avoid any harm to the test engineers and to avoid destroying
> the tested hardware.
> 5) The customer wants to create its own test scripts, which run on top of
> various client RPC libraries (zmq).
> 6) The customer wants to use RAII (C++) or with/finally (Python) to ensure
> that he/she can clean up nicely.
>
> So by simply calling assert() and exiting the application, points 6, 5 and 4
> have just been violated. To solve this issue, all the assert(...) statements
> could be replaced with, e.g., ZMQ_ASSERT( toAssert, message ). The API would
> then internally check 'toAssert' and, in case of a violation, return a
> ZMQ_INVARIANT_ERROR error code. In addition, the message 'message' would be
> logged to give verbose information about the context. Later, in case of an
> error, the customer can simply send an error report with all the verbose
> information. This makes it easy to identify and fix the error.
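
A minimal sketch of what such a macro could look like. ZMQ_ASSERT and
ZMQ_INVARIANT_ERROR are the names proposed above; neither exists in libzmq or
CZMQ. The error value, the in-memory log and the take_pending() function are
assumptions made up for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical error code from the proposal; not defined by libzmq or CZMQ.
// The value is arbitrary.
constexpr int ZMQ_INVARIANT_ERROR = -1000;

// Stand-in log sink; a real library would route this to its own logging.
inline std::vector<std::string>& invariant_log() {
    static std::vector<std::string> log;
    return log;
}

// If toAssert is false: record file, line, the failed expression and the
// message, then return ZMQ_INVARIANT_ERROR from the enclosing function.
#define ZMQ_ASSERT(toAssert, message)                                     \
    do {                                                                  \
        if (!(toAssert)) {                                                \
            invariant_log().push_back(std::string(__FILE__) + ":" +       \
                                      std::to_string(__LINE__) +          \
                                      ": " #toAssert ": " + (message));   \
            return ZMQ_INVARIANT_ERROR;                                   \
        }                                                                 \
    } while (0)

// Hypothetical API function: takes the oldest pending item off a queue.
int take_pending(std::vector<int>& pending, int* out) {
    ZMQ_ASSERT(out != nullptr, "output pointer must not be null");
    ZMQ_ASSERT(!pending.empty(), "queue unexpectedly empty");
    *out = pending.front();
    pending.erase(pending.begin());
    return 0;
}
```

In this form the macro only works inside functions that return int, and every
caller has to check the result. It also records less than a core dump does:
the log entry holds the failed expression and its location, not the stack or
the heap.
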
>
>
> > In practice, could you provide a case
> > that reproduces the crash you got?
> https://github.com/imatix/zguide/blob/master/examples/C%2B%2B/mdcliapi.hpp
> lines 150-156. If somebody sends garbage data, the application will simply
> exit. In such a case the API should inform the user that it received garbage,
> so the user can clean up the context etc.
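
As a rough sketch of the alternative for data that arrives from the network,
a reply can be validated and the result reported to the caller. The frame
layout below (header, service name, then body) and all names are assumptions
for illustration; this is not the code in mdcliapi.hpp.

```cpp
#include <string>
#include <vector>

enum class ReplyStatus { Ok, TooFewFrames, BadHeader, WrongService };

// Assumed layout, for illustration only:
// frame 0 = protocol header, frame 1 = service name, frames 2+ = body.
ReplyStatus validate_reply(const std::vector<std::string>& frames,
                           const std::string& expected_header,
                           const std::string& expected_service) {
    if (frames.size() < 3) return ReplyStatus::TooFewFrames;
    if (frames[0] != expected_header) return ReplyStatus::BadHeader;
    if (frames[1] != expected_service) return ReplyStatus::WrongService;
    return ReplyStatus::Ok;
}
```

Input from a peer is outside the program's control, so rejecting it is
ordinary input validation rather than an internal invariant failure. The
caller can drop the message, log it, and keep its context intact.
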
>
>
> >
> >
> > On Mon, Mar 10, 2014 at 10:10 AM, Christoph Zach
> > <czach at rst-automation.de> wrote:
> > > On Friday 07 March 2014 17:36:21 Pieter Hintjens wrote:
> > >> On Fri, Mar 7, 2014 at 3:13 PM, Christoph Zach <czach at rst-automation.de> wrote:
> > >>
> > >> > To further use zyre/czmq we are planning on replacing all the
> > >> > assert(...) statements with actual error handling routines.
> > >>
> > >> As Olaf explains, the asserts cannot ever happen in practice unless
> > >> there is a coding bug in your app or in CZMQ.
> > >>
> > >> If you can reproduce an assert under "normal" conditions, that is a
> > >> bug that we take very seriously and fix.
> > >>
> > >> Code that has hit an internal error _cannot_ continue to operate
> > >> sanely. The extensive use of asserts is a deliberate and long-standing
> > >> design choice, and though you may do what you like with your forks of
> > >> the codebase, such patches would be rejected without much pity.
> > >>
> > >> I'd not trust a system that had asserts disabled. Production code (and
> > >> I've made that my profession for decades) should run with all asserts
> > >> enabled. The correct response to an internal failure is to crash fast,
> > >> recover fast. You cannot run a software system reliably when you have
> > >> internal errors. Adding error handling to recover from (by definition)
> > >> unforeseen internal errors makes things less, not more reliable.
> > > Semantically, we agree on detecting invalid/fatal states. Let me explain
> > > (in more detail) why error codes and not assertions should be used to
> > > detect these:
> > >
> > > 1) Context Awareness
> > > The issue with the old-school assert statements is that they will
> > > simply quit your application immediately. Even when you have enabled
> > > them. E.g., if you have a C++ app with RAII:
> > > [...]
> > > {
> > >     RAIIWrapperX x (...);
> > >
> > >     libraryPotentiallyGoingToAssert(...);
> > >
> > > } // Never reached. --> The dtor of x will never be called!
> > >
> > > The issue is that when the library has detected that it has reached
> > > an invalid/unknown/fatal state, it just quits and does not allow the
> > > RAIIWrapperX to clean up nicely.
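
For contrast, a small sketch of the error-code path, where the destructor
does run. RAIIWrapperX is the name from the snippet above;
libraryReportingError(), runTestStep() and the events list are hypothetical
and exist only to make the order of events observable.

```cpp
#include <string>
#include <vector>

// Records the order of events so the cleanup is observable.
std::vector<std::string> events;

// Same role as RAIIWrapperX in the snippet above: cleans up in its destructor.
struct RAIIWrapperX {
    ~RAIIWrapperX() { events.push_back("x cleaned up"); }
};

// Hypothetical library call that reports an internal error through its
// return value instead of asserting.
int libraryReportingError() { return -1; }

int runTestStep() {
    RAIIWrapperX x;
    int rc = libraryReportingError();
    if (rc != 0) {
        events.push_back("error reported");
        return rc;  // x's destructor runs as the scope is left
    }
    return 0;
}
```

After runTestStep() returns -1, events holds "error reported" followed by
"x cleaned up". If the library called abort() instead, the process would end
without running the destructors of automatic objects such as x.
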
> > >
> > > The issue with the assumption
> > >     "You cannot run a software system reliably when you have internal errors"
> > > is that 'reduced functionality' states are ignored.
> > > This means that when a library has entered an unknown/invalid state it
> > > does NOT mean that the other parts of the system have too!
> > > Therefore, the other parts must be given a chance to clean up as much
> > > as possible.
> > > Please note that this does not protect against Machiavellian errors, where
> > > someone simply corrupts the whole memory of your application. But then
> > > again there's Unit Testing and valgrind to determine such things.
> > >
> > > 2) Unit Testing
> > > When unit testing a library there are different kinds of tests. E.g. a test
> > > can validate that the function f() does what it should do.
> > > Then another test can validate that f() protects itself against invalid input.
> > > This means that no matter how invalid the given arguments are, the
> > > function f() will report an error and not crash the application.
> > > This test (a.k.a. 'invalid parameter detection') is only possible by using
> > > error codes. If assert(...) statements are used, it can never be fully tested.
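
The two kinds of test described above can be sketched as follows. The
function f() here is a made-up stand-in that sums bytes; its signature and
the use of EINVAL as the error code are assumptions for illustration.

```cpp
#include <cerrno>
#include <cstddef>

// Hypothetical f(): sums len bytes into *out.
// Returns 0 on success and EINVAL when an argument is invalid.
int f(const unsigned char* data, std::size_t len, unsigned* out) {
    if (data == nullptr || out == nullptr || len == 0) return EINVAL;
    unsigned sum = 0;
    for (std::size_t i = 0; i < len; ++i) sum += data[i];
    *out = sum;
    return 0;
}

// Test 1: f() does what it should do.
bool test_valid_input() {
    const unsigned char bytes[] = {1, 2, 3};
    unsigned sum = 0;
    return f(bytes, 3, &sum) == 0 && sum == 6;
}

// Test 2: f() rejects invalid input and leaves the output untouched.
bool test_invalid_input() {
    unsigned sum = 42;
    return f(nullptr, 3, &sum) == EINVAL && sum == 42;
}
```

Both tests run in the same process as f(), which is what the error-code
style makes possible: the second test observes the rejection as a return
value and carries on.
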
> > >
> > > 3) Design Principle: "An API must be easy to use correctly and hard to
> > > use incorrectly".
> > > This is part of Scott Meyers' article, called "The Most Important Design
> > > Guideline?". Besides this article he also wrote some pretty good books
> > > on how to write/design good C++ software. They have the same level as
> > > the books of Herb Sutter.
> > >
> > >>
> > >> What can be helpful is to replace the assert() macro with a more
> > >> extensive error reporting system.
> > > That was my original intention. Instead of calling assert() and killing the
> > > program, simply provide the user with a verbose error & message. Then it's the
> > > user's responsibility to handle it correctly and clean up everything else.
> > >
> > >> However be careful you don't try to
> > >> do too much: the state of the application when it hits an assert is
> > >> unknown. You can have arbitrary memory corruption, for instance. Doing
> > >> *anything* more than "print error & exit" leaves you open to worse
> > >> damage.
> > > To protect against such an issue the only thing we can do is to write
> > > defensive code:
> > >  * const as much as possible
> > >  * validate input
> > >  * report verbose errors (to better track the issue when the customer
> > >    reports it)
> > >  * use unit testing (test against good and bad cases)
> > >  * use the type system as much as possible
> > >  * use valgrind when running unit tests
> > >  * etc.
> > >
> > > By applying all these (and many more) methods it's possible to
> > > reduce the probability of such an event. That's everything we can
> > > do, because at run-time, if we detect an invariant violation, we cannot
> > > tell if it's wise to shut down immediately. Therefore, we shall try to
> > > clean up as much as possible.
> > >
> > >>
> > >> -Pieter
> > >> _______________________________________________
> > >> zeromq-dev mailing list
> > >> zeromq-dev at lists.zeromq.org
> > >> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > > Best Regards
> > >
> > > Christoph Zach
> > >
> > >
> > > -----------------------------------------------------------------------------
> > > RST Industrie Automation GmbH * Carl-Zeiss-Str. 51, D-85521 Ottobrunn
> > > Tel. +49-89-9616018-00 * Fax +49-89-9616018-10 * http://www.rst-automation.de
> > >
> > > Managing Director: Dipl.-Ing.(FH) Robert Schachner
> > > Munich District Court: HRB 103 626 * ID No. DE 811 466 035
> > >
> > > -----------------------------------------------------------------------------
> Best Regards
>
> Christoph Zach