[zeromq-dev] accept() and EMFILE/ENFILE

Dhammika Pathirana dhammika at gmail.com
Tue Jun 2 04:52:30 CEST 2009


Hi Martin,


On Fri, May 29, 2009 at 2:35 AM, Martin Sustrik <sustrik at fastmq.com> wrote:
> Hi Dhammika,
>
> Sorry for the delay...
>
>>>> IMHO, it sounds like a bad design.
>>>> Aren't we making assumptions about access to well-known ports, root
>>>> privileges, etc.? This rules out user-space apps.
>>>
>>> I've missed the point here, can you explain?
>>
>>  We discussed this before,
>> http://lists.zeromq.org/pipermail/zeromq-dev/2009-May/000749.html
>> http://lists.zeromq.org/pipermail/zeromq-dev/2009-May/000745.html
>>
>> Apart from security issues, how do we eliminate thread contention on
>> released fd?
>
> If there was a POSIX function to drop the backlog of a live listening
> socket, that would solve the problem. That not being the case, the only
> way afaict is to close the listening socket, try to reopen it and, if that
> fails, retry at periodic intervals.
>
> It may seem awkward; however, note that the same thing (periodic retries)
> has to be done anyway in case there are no free sockets when the listener is
> being created for the first time. Thus the solution doesn't add to the
> complexity of the system in any way.
>
> However, the main problem I see with your solution is philosophical rather
> than practical. Giving clients control of server behaviour (by requiring
> them to close timed-out connections, i.e. making them part of the server's
> state machine) is generally not a good idea. The point is that there is a
> difference in reliability requirements. Servers are often required to be
> highly reliable, able to recover from whatever happens on the network.
> Clients, on the other hand, are often considered unreliable or even
> deliberately misbehaving. The consequence for the design of a distributed
> system is to decouple server functionality from client functionality as
> much as possible.
>



True, this adds complexity, but I don't think it affects the server's state.
The server has to handle a client closing its socket immediately after the
TCP handshake anyway. A client timing out and closing its socket is not very
different. In both cases the server's accept() can return ECONNABORTED.
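
To make that concrete, here is a rough sketch in C of how an accept loop
might sort accept() failures into "retry now", "back off" and "give up".
The enum and function names are hypothetical, invented for illustration
only; this is not existing zeromq code, and whether ECONNABORTED is
actually reported for an aborted connection depends on the platform.

```c
#include <errno.h>

/* Hypothetical names, for illustration only (not part of zeromq). */
enum accept_action {
    ACCEPT_RETRY_NOW,  /* harmless: call accept() again straight away */
    ACCEPT_BACK_OFF,   /* out of descriptors: wait before retrying */
    ACCEPT_FATAL       /* anything else: report to the caller */
};

/* Map the errno left by a failed accept() to the action the loop takes. */
enum accept_action classify_accept_errno(int err)
{
    switch (err) {
    case ECONNABORTED: /* peer gave up between handshake and accept() */
    case EINTR:        /* interrupted by a signal */
        return ACCEPT_RETRY_NOW;
    case EMFILE:       /* per-process descriptor limit reached */
    case ENFILE:       /* system-wide file table full */
        return ACCEPT_BACK_OFF;
    default:
        return ACCEPT_FATAL;
    }
}
```

The point is that ECONNABORTED lands in the same bucket no matter why the
client went away, so a client-side timeout adds no new server state.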



>>>> Also, closing the listening socket makes new connection requests fail
>>>> with ECONNREFUSED; this is a hard TCP error.
>>>> How does the client know if the server is temporarily busy or not
>>>> running at all? The client can query zmq_server, but then aren't we
>>>> going to end up with one of those Byzantine problems?
>>>
>>> My feeling is that both out-of-sockets and component-not-running should
>>> be handled in the same way, specifically by attempting to reconnect
>>> after a while.
>>>
>>
>> Isn't it better to draw a distinction and let the application decide?
>
> Dunno. Can you think of a real-world use case where it makes sense to handle
> network-outage and out-of-sockets errors in a different manner?
>



How about email?
An invalid address returns a permanent error, but an MTA that is out of
resources, e.g. out of disk space, returns a transient error.
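
As a small illustration of that distinction, SMTP encodes it in the first
digit of the reply code: 4yz replies are transient failures and 5yz replies
are permanent ones. The sketch below shows how a sender could branch on
that; the enum and function names are made up for the example.

```c
/* Hypothetical names, for illustration only. */
enum smtp_outcome {
    SMTP_OK,         /* 2yz: request completed */
    SMTP_TRANSIENT,  /* 4yz: try again later, e.g. 452 insufficient storage */
    SMTP_PERMANENT,  /* 5yz: do not retry, e.g. 550 mailbox unavailable */
    SMTP_OTHER       /* anything else, including 3yz intermediate replies */
};

/* Classify a three-digit SMTP reply code by its first digit. */
enum smtp_outcome classify_smtp_reply(int code)
{
    switch (code / 100) {
    case 2:  return SMTP_OK;
    case 4:  return SMTP_TRANSIENT;
    case 5:  return SMTP_PERMANENT;
    default: return SMTP_OTHER;
    }
}
```

The application sees which kind of failure it got and decides for itself
whether a retry makes sense.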


