Coda File System

Re: DNS lookups and disconnected mode

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Sat, 27 Oct 2007 01:34:19 -0400
On Fri, Oct 26, 2007 at 10:29:50AM +0200, u+codalist-p4pg_at_chalmers.se wrote:
> I have discovered a sad consequence of how Venus is using DNS.
> 
> - it relies on DNS to find realm servers -- perfectly correct
> - it tries to refresh its knowledge about the ip-addresses of the servers,
>   which is very, very good
> 
> Unfortunately, as soon as we physically disconnected a machine
> from the net so that it can not reach DNS, it becomes hardly possible to
> use Coda. Venus is making DNS queries all the time and waiting for answers
> which never come.

There must be something very different in the way your system is set up
compared to mine.

First of all, as far as I know we only refresh the realm addresses when
the client is restarted, so if you don't actually shut down the client or
machine it works just fine when the network disappears (modulo a 60-90
second RPC2 timeout).

Now if venus is restarted but the network is not available, DNS queries
actually do time out if there is no response from the DNS servers. In
that case Coda falls back on previously cached information for the realm.
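
The fallback described above can be sketched roughly as follows. This is
only an illustration in Python, not Venus code: the function name
resolve_with_fallback, the dict used as a cache, and the injectable
resolver argument are all assumptions made for the example.

```python
import socket


def resolve_with_fallback(name, cache, resolver=socket.getaddrinfo):
    """Return addresses for `name`, falling back on `cache` if lookup fails.

    `cache` is a plain dict standing in for whatever a client keeps
    between restarts (hypothetical, not Venus's actual data structure).
    """
    try:
        infos = resolver(name, None)
    except OSError:
        # socket.gaierror and socket timeouts are both OSError subclasses.
        if name in cache:
            return cache[name]
        raise
    # Each getaddrinfo entry ends with a sockaddr whose first field is the IP.
    addrs = sorted({info[4][0] for info in infos})
    cache[name] = addrs  # remember the fresh answer for the next failure
    return addrs
```

The lookup still has to fail (or time out) before the cached addresses
are used, so this does not shorten the wait by itself.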

The DNS timeout could be quite long, because there are several levels of
fallback going on. At the highest level, venus first tries SRV records
and then falls back on doing a normal A record lookup. Below that the
resolver library may try various aliases that are defined by the
'search' option in /etc/resolv.conf (although I think I've tried to
disable that type of expansion) as well as sequentially trying each
defined upstream DNS server. So if you have 3 servers it actually
iterates over each of those before it gives up. And on each following
query it will probably just try all of the servers again. libresolv by
itself doesn't do any caching, so each lookup has to go across the net.
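
To get a feel for how these levels multiply, here is a rough upper-bound
estimate in Python. It is a simplified model, not the resolver's actual
algorithm: it assumes every query waits out the full timeout on every
server, and it uses the defaults documented in resolv.conf(5) (timeout 5
seconds, 2 attempts). The function names and the record_types parameter
(2 for SRV followed by A) are made up for this sketch.

```python
def parse_resolv_conf(text):
    """Extract nameservers, search domains, timeout and attempts."""
    servers, search = [], []
    timeout, attempts = 5, 2  # defaults documented in resolv.conf(5)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            servers.append(fields[1])
        elif fields[0] == "search":
            search = fields[1:]
        elif fields[0] == "options":
            for opt in fields[1:]:
                key, _, value = opt.partition(":")
                if key == "timeout" and value.isdigit():
                    timeout = int(value)
                elif key == "attempts" and value.isdigit():
                    attempts = int(value)
    return servers, search, timeout, attempts


def worst_case_seconds(text, record_types=2, use_search=True):
    """Rough upper bound on how long one failed name resolution can take."""
    servers, search, timeout, attempts = parse_resolv_conf(text)
    # One try for the name as given, plus one per search-list expansion.
    names = 1 + (len(search) if use_search else 0)
    return record_types * names * len(servers) * attempts * timeout
```

With three nameservers, no search list and the default options, this
model gives 2 * 1 * 3 * 2 * 5 = 60 seconds for a single failed lookup.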

Having a local DNS cache helps a lot because it caches both successful
and failed lookups, which avoids a lot of network traffic, and in some
cases it also handles things like only sending DNS queries to servers
that are known to be reachable. I've successfully used both dnsmasq and
pdns.
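
The idea of caching both successful and failed lookups can be sketched
like this. It is a toy model in Python, not how dnsmasq is implemented;
the class name TTLCache, its methods and the negative_ttl default of 60
seconds are all assumptions for illustration.

```python
import time


class TTLCache:
    """Tiny cache that remembers both answers and failures for a while."""

    def __init__(self, negative_ttl=60, clock=time.monotonic):
        self._entries = {}  # name -> (expiry time, addresses or None)
        self._negative_ttl = negative_ttl
        self._clock = clock

    def put(self, name, addresses, ttl):
        """Remember a successful lookup for `ttl` seconds."""
        self._entries[name] = (self._clock() + ttl, addresses)

    def put_failure(self, name):
        """Remember that `name` failed to resolve (negative caching)."""
        self._entries[name] = (self._clock() + self._negative_ttl, None)

    def get(self, name):
        """Return (hit, addresses); addresses is None for a cached failure."""
        entry = self._entries.get(name)
        if entry is None:
            return False, None
        expires, addresses = entry
        if self._clock() >= expires:
            del self._entries[name]  # expired, force a fresh lookup
            return False, None
        return True, addresses
```

A cached failure is what lets repeated lookups return immediately while
the network is down, instead of waiting for every server to time out
again.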

> One workaround is to ensure that the network is logically shut down,
> there are no interfaces/routes for Venus to use.
>
Something like ifplugd or networkmanager can automatically bring the
interface up or down when the network cable is connected or removed.

> Another workaround is to ensure that /etc/resolv.conf does not contain
> any DNS server addresses.
> 
> Both have two major drawbacks though:
> - require root privileges and possibly questionable changes
>   in the local setup
> - represent an extra burden
>
> (Apparently, some setups tend to do it automatically at network disconnection
> but we can not rely on that!)
> 
> Any other workaround-via-client-setup like using "realms" file seems
> as bad or worse. Disconnected operation is an inherent feature of Coda
> and should be supported out of the box without relying on extra local tweaks.

Install networkmanager or ifplugd, or use a local dns cache like
dnsmasq, pdnsd, or even a caching only bind/named.

> Venus can gracefully handle rpc2 timeouts,
> it might be possibly taught to handle DNS timeouts gracefully as well?
> 
> How hard is it? Do we need a non-standard/non-existent DNS-resolver
> library?

Right, the lack of DNS caching is really a libresolv issue. DNS has
some knowledge about how long a record is valid. You can see it if you
use 'dig -ta coda.cs.cmu.edu'; the number before 'IN' is the number of
seconds after which that record needs to be revalidated. But this
information is not available to an application that uses the standard
gethostbyname or getaddrinfo libc calls. So really the application can't
be expected to correctly handle caching.
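
The missing TTL is easy to see from what getaddrinfo hands back. The
snippet below uses Python's wrapper around the libc call and a numeric
address with AI_NUMERICHOST, so no DNS query is sent; the address and
socket type are arbitrary choices for the example.

```python
import socket

# AI_NUMERICHOST keeps this example off the network: no DNS query is sent.
results = socket.getaddrinfo(
    "127.0.0.1", None, type=socket.SOCK_DGRAM, flags=socket.AI_NUMERICHOST
)

for family, socktype, proto, canonname, sockaddr in results:
    # Five fields per entry, and none of them carries the record's TTL.
    print(family, socktype, proto, repr(canonname), sockaddr)
```

Each entry describes how to reach the address, but nothing says how
long the answer may be kept.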

But libc6 or libresolv in many cases isn't caching either. Maybe it does
if you install something like nscd, but I found that installing dnsmasq
tends to be simpler and more predictable (and more configurable).

Jan
Received on 2007-10-27 01:36:34