Subject: Re: [c-ares] reinit implementation thoughts and questions

Re: [c-ares] reinit implementation thoughts and questions

From: Brad House <brad_at_brad-house.com>
Date: Sat, 13 Jun 2020 21:32:39 -0400

Correct, any patch should take other OS's into account.  Even MacOS doesn't use resolv.conf these days.  A linux-only
solution is a non-starter.  That said, I'd think at least initially any focus should be on an externally-callable
re-init.  Some systems, like android, have notifications for network changes that you might have to capture from Java,
so there may never be a way to 100% implement a solution in c-ares for detecting network changes... however focusing on
a user-callable re-init should be able to support all OS's with no additional effort over supporting a single OS.

You'd definitely need to have a fair amount of understanding of the inner workings of c-ares to be able to properly
implement a re-init without having unintended consequences. Especially if you have outstanding queries it would most
certainly break things if you simply do what PR#272 is doing.  Many servers use only a single c-ares channel that may
basically *always* have outstanding queries, so a re-init that can only be called when there are no outstanding queries
would also not be acceptable either.

The most minimally invasive might be to queue any new requests while old requests process then reload the config when
there is nothing outstanding and then start processing... but that's not a great solution if the DNS changed caused a
condition where it is going to reach the max timeout all those queries would be on hold for a great deal of time. 
Terminating all outstanding queries (and processing those events) is also not ideal as it would throw errors back when
maybe the change wouldn't have affected the inflight-queries. Maybe a combination of the 2 would be acceptable, allow a
user-specifiable timeout for outstanding queries and block new requests.

The most complex solution of course would be some sort of shadow structure where each query would use its own 'view' of
the server configuration, and when the last query terminates, if that view is no longer the most current would be
cleaned up.  However, I'd imagine this would be fairly invasive, and would require a fair amount of review and would
need good test cases to ensure it works properly.  However, it would behave in the most expected manner with no caveats
(short of any bugs of course).

-Brad

On 6/12/20 8:50 AM, Nayanā Topolsky wrote:
>
> Hi Nicolas,
>
> I would also like to see the official support of c-ares for detection of resolv configuration change.
> However the main blocker was that the implementation of my patch was not taking into account other OS types .. and
> more general solution was needed. That is not just specific for Linux/Mac and its resolv.conf - which is fair point.
>
> I am just a user, so this is more like opinion, but as I see it (correct me if I am wrong, or missing a point), it
> must be designed to take into account also other systems  - from the very beginning.
> My solution is currently used at our company, and it suits our need - which is Linux specific.
> For instance I am not sure what exactly happens to the pending resolve requests when the change is detected and reinit
> is done.. and stuff like that.
> So it would be better to have real, verified solution - properly tested.
>
> Last thing - the support for resolv.conf change is already working in glibc.. and the inspiration was taken from that ..
> You can easily find the patch from 2017-07-03 ..
> In the patch I am checking ctime,mtime, size, inode ..
>
> Good luck,
>
> Nayanā
>
> Dňa 11. 6. 2020 o 20:48 Nicolas Sterchele napísal(a):
>> Hi all,
>>
>> This is would be my first contribution to the library and before starting to code, I would like to have your opinion on the following points:
>>
>> I am thinking on implementing a "reinit" function based on the following closed PR (https://github.com/c-ares/c-ares/pull/272).
>> My main concern is to trigger a "reinit" when the content of the resolv.conf file has changed.
>>
>> Ideas:
>> 1. "ares_reinit" would be a public function, library consumers must have the possibility to trigger a reinit.
>> 2. Is it possible to call the following functions "init_by_options", "init_by_env" or "init_by_resolv" without worrying to break
>> something on a channel? Would like to reuse those functions...
>> 3. Based on the resolvconf scenario, we should call "reinit" when the file's "mtime" has changed. In order to track a change, a new variable would be added to the "ares_channeldata" struct.
>> 3.1 Each time "ares_gethostbyname" is triggered, we should check if the file has changed (by comparing what we have in "ares_channeldata" and the actual "mtime").
>> 4. Having c-ares to trigger a reinit by itself should be optional, the consumer of the library must be aware of that.
>>
>> I look forward to have your feedback!
>>
>> Nicolas.
Received on 2020-06-14