Subject: Re: New release?

From: John Engelhart <john.engelhart_at_gmail.com>
Date: Thu, 29 Oct 2009 02:15:11 -0400

On Wed, Oct 28, 2009 at 6:37 AM, Steinar H. Gunderson <sesse_at_google.com> wrote:

>
> FWIW, if you're talking about the hash of doubly-linked lists, it was
> indeed added after profiling real-world performance problems here at
> Google. I don't know the specifics (I only forward-ported it from our
> internal repository into c-ares CVS), but I'm pretty sure it wasn't
> made for fun. :-)
>

Fair enough. Could you elaborate a bit more on this? ares_timeout() still
uses a linear scan of the list, so is this function not in your critical
path?
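
For reference, the kind of scan I mean is roughly this (a sketch only; the
struct and field names are mine, not the actual c-ares internals):

#include <stddef.h>
#include <sys/time.h>

/* Illustrative only: walk every pending query and keep the one that
 * expires soonest.  O(n) per call, with n = number of queries in flight. */
struct query {
    struct timeval expires;   /* absolute time this query times out */
    struct query *next;
};

static struct query *soonest_expiring(struct query *head)
{
    struct query *best = NULL;
    struct query *q;
    for (q = head; q != NULL; q = q->next) {
        if (best == NULL || timercmp(&q->expires, &best->expires, <))
            best = q;
    }
    return best;
}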

How many queries were you seeing in the queue? Intuitively it would seem
like it would have to be "a lot", probably even "an awful lot" (>1,000).
While I admittedly haven't had to deal with the gory details of managing
DNS packets, I would think that a timer resolution of ~0.1 second (10 times
a second) would be more than sufficient for most purposes, and a resolution
of 0.01 seconds (100 times per second) should pretty much cover everything,
and still leave enough resolution to add some random jitter to retry
attempts so things don't fall into some "global retry attempt"
synchronization mode. I have to admit I have a hard time believing that
even a linear scan of a list of 1,000 items, with nothing more than a
trivial "less than" test for each node, 100 times per second, would be a
performance problem that would even show up in profiling (I say this from
OpenGL experience, which typically involves tens of thousands of "nodes"
processed with something much more complex than a simple relational test
~60 times/second).
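
To be concrete about the jitter idea, all I have in mind is something like
this (a sketch; the constants are made up, and a real version would want a
better random source than rand()):

#include <stdlib.h>

/* Add up to 100 ms of random jitter to a base retry delay so a pile of
 * clients that failed at the same instant don't all retry in lock-step. */
static unsigned int retry_delay_ms(unsigned int base_ms)
{
    const unsigned int max_jitter_ms = 100;
    return base_ms + (unsigned int)(rand() % (max_jitter_ms + 1));
}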

I'm not saying you're lying or making things up; objective data is what it
is. It just strikes me as a bit unusual. The two explanations I can think
of are that you have a HUGE number of queries (>10,000), which seems
unlikely, or that the list is being scanned very, very frequently.

When I noticed my problem, c-ares was obviously a hot spot in the profiling
tools. ares_timeout(), ares_process(), and all the related bits and pieces
were all heavy hitters. ares_process() was being called on the order of
500,000 times/sec. So an inefficient process_timeouts() under these
conditions is likely to stand out, but the root of my problem turned out to
be that ares_timeout() was returning a 0+0 wait time for select(). Since
I'm developing on a Mac, I happen to use Menu Meters with a CPU graph, which
happens to show user and system time in different colors. In a bit of
serendipitous luck, I noticed that the system time went right through the
roof during the problem, so I immediately suspected some kind of
select()-ish problem even though the profiling tools were telling me
something else (they're user-land centric and didn't account for the kernel
time).
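
For anyone following along, the failure mode looks roughly like this (a
sketch of the pattern, not my actual code):

#include <sys/select.h>
#include <sys/time.h>

/* If the timeout handed to select() is 0 sec + 0 usec, select() just
 * polls and returns immediately, so this loop spins flat out -- which
 * shows up mostly as system (kernel) time rather than user time. */
static void spin_forever(int fd)
{
    for (;;) {
        fd_set readers;
        struct timeval tv = { 0, 0 };   /* what ares_timeout() was handing back */

        FD_ZERO(&readers);
        FD_SET(fd, &readers);
        select(fd + 1, &readers, NULL, NULL, &tv);
        /* ... ares_process()-style work here, ~500,000 times/sec ... */
    }
}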

I'm sort of wondering now if maybe the problem that you (or whoever) were
trying to fix by adding the hash table stuff was actually caused by
ares_timeout() returning 0+0, or something similar / related?

And, hypothetically speaking, if someone were to re-work the code, could you
provide feedback on whether or not the new code "works"? Sadly, the query
that was causing me problems now works just fine (returns 'instantly'), and
it's always hard to test a fix if you don't have a reliable way to reproduce
the problem. :)

Personally, my $0.02: regardless of any of the above or possible bugs, I
think it'd be a smart move to tweak the logic in ares_timeout() to return a
"minimum" timeval of at least on the order of 1000 microseconds / 1
millisecond, even if the loop scan returned a value less than this (and in
particular the current 0+0). This minimum timeval would obviously be
compared against any "max" timeval passed in, with the lesser of the two as
the final result. It just seems like a good idea not to hand back a timeval
of 0+0 unless you have a very, very good reason, particularly for timeouts
on the "main" work loop path.
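
Roughly what I have in mind, as a sketch (the function name and signature
are my own invention, not a patch against the real ares_timeout()):

#include <stddef.h>
#include <sys/time.h>

/* Never hand back less than a 1 ms timeout, but still respect any
 * smaller "max" timeout the caller passed in. */
static struct timeval clamp_timeout(struct timeval computed,
                                    const struct timeval *maxtv)
{
    const struct timeval minimum = { 0, 1000 };   /* 1000 usec == 1 ms */

    if (timercmp(&computed, &minimum, <))
        computed = minimum;                 /* raise 0+0 (and anything smaller) */
    if (maxtv != NULL && timercmp(maxtv, &computed, <))
        computed = *maxtv;                  /* the lesser of the two wins */
    return computed;
}
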
Received on 2009-10-29