[colug-432] Recent change in retry behavior from large e-mail sites, affecting greylisting?
wyang at gcfn.net
Thu Jan 13 02:58:13 EST 2011
On Wed, 2011-01-12 at 11:08 -0500, Angelo McComis wrote:
> A couple other ideas came to mind here as this thread was piling up
> * Your user logs that have all relevant troubleshooting information
> shaved off of them -- is there any indicator of at least the friendly
> error message? e.g. was it address not found, was it host/domain not
> found, etc?
The messages did not suggest DNS-related concerns to me, and -- to put
some of this silliness to rest -- I have been debugging mail problems
since the early '90s. This isn't my first (or even fiftieth) trip
around the block. They appeared to be bad-recipient rejections but,
given that they come without the numeric status code or the server's
verbose message, it's hard to say exactly how the sending server
interpreted the error. Again, there are no corresponding log entries
on either my primary or secondary servers.
> * On your server, can you tell if perhaps the comeback was hitting
> your secondary MX server, and was getting greylisted again there?
> Perhaps they are coming in and hitting your primary first, but
> returning to your secondary?
This hypothesis is not supported by the logs on the secondary; I already
checked this, even though greylist data is synchronized between both
primary and secondary and generally up to date within a few seconds.
> * Least likely scenario: the route to your IP is getting intermittent
> interruption, and you're only noticing the errors when try #2 from a
> greylist doesn't make it.... is it possible that try #1 isn't getting
> there at all in some cases?
This seems a bit far-fetched to me. If this were happening, I'd expect
it to show up in the connectivity logs for my ipsec tunnels. A first
attempt "not getting through" seems possible, but given the firewall
connection logs and SMTP logs I have, every way I can see that happening
amounts to asserting a network-layer fault (on the order of a routing
problem or a duplicate IP) with no evidence and, really, no way to
collect any. I'm not seeing any of the network effects from outside
that would suggest an ongoing fault. The problem really seems to be
limited to Hotmail/MSN and Yahoo addresses.
The simplest explanation, to me, would be that something's wrong on one
of my servers. While I've made no changes, there's a lot of state
information (like the greylist tuple db) that gets updated in near real
time without my intervention, and that could be the core of the problem.
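
To make the point about self-updating state concrete, here is a minimal
sketch in Python of how a greylist tuple db typically behaves. It is not
the software running on these servers: the class name, the 300-second
delay, and the 4-hour retry window are all assumptions chosen for
illustration.

```python
import time


class GreylistSketch:
    """Hypothetical in-memory greylist keyed on (client_ip, sender, recipient)."""

    def __init__(self, min_delay=300, retry_window=4 * 3600):
        self.min_delay = min_delay        # assumed: seconds a new tuple must wait
        self.retry_window = retry_window  # assumed: seconds before a tuple ages out
        self.first_seen = {}              # tuple -> timestamp of first attempt

    def check(self, client_ip, sender, recipient, now=None):
        """Return the SMTP-style verdict for one delivery attempt."""
        now = time.time() if now is None else now
        key = (client_ip, sender.lower(), recipient.lower())
        seen = self.first_seen.get(key)
        if seen is None or now - seen > self.retry_window:
            # Unknown tuple, or the earlier attempt aged out: start over.
            self.first_seen[key] = now
            return "450 temporary failure (greylisted)"
        if now - seen < self.min_delay:
            # Retry arrived too soon; keep deferring.
            return "450 temporary failure (greylisted)"
        return "250 accept"
```

In this sketch every delivery attempt reads or writes the table, and
entries age out purely with the passage of time, so the state changes
with no administrator action. Note also that a retry arriving from a
different client IP forms a new tuple and is deferred again.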