LDAP authentication troubleshooting
A couple of odd issues have arisen that suggest that I need to tweak my config files.
Context: new machine build. No Netlink users have logged in yet. Local
accounts log in fine. Auth stack config'd to talk to ldap1p on port 636.
The daemon I'm using (nslcd) is config'd to keep retrying until it
gets a response. NOTE: I can't find details on how long it waits before
reporting a failure (no obvious timeout settings), but the results suggest
that it's either getting back an "auth failed" message from the server,
or the timeout is incredibly short.
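For reference, the nslcd.conf(5) manual page does describe a handful of time-related options that look like the knobs in question. The fragment below is only a sketch: the option names are the ones documented for recent nss-pam-ldapd releases (older releases use different reconnect options, so check the local man page), and the values are illustrative, not what is on these machines.

```conf
# /etc/nslcd.conf (excerpt) - illustrative values, not recommendations

# Seconds allowed for connecting/binding to the LDAP server.
bind_timelimit 10

# Seconds to wait for a search result (0 = wait indefinitely).
timelimit 30

# Seconds to sleep between connection attempts when every server fails.
reconnect_sleeptime 1

# Seconds after which the server is treated as unavailable; past this
# point nslcd retries only once per period instead of continuously.
reconnect_retrytime 10
```

nslcd has to be restarted after editing the file for the changes to take effect.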
Sequence:
1) Netlink user attempts login on machine "carrot". Fails with either
"Authentication Failure" or "no user account available" message at GDM.
This recurs 3 or 4 times (messages can switch back and forth) until
finally the login succeeds.
2) User immediately goes to another identically built machine and
logs in successfully on the first attempt.
I've repeated this 3 times this morning. I'll continue to monitor it
over the next few days as I'm still wondering if the local machine
caches users: as in, will previously-auth'd Netlink users have no more
trouble logging in once they've succeeded?
Is it possible that if a user hasn't hit ldap1p for "a while" it'll take
several attempts to resuscitate the record from the db, but once it's
been accessed it responds more rapidly? One of the users logged in
successfully yesterday, but this morning (on a different machine) it
took 4 attempts to get auth'd. She then logged in to a third machine
with no trouble at all.
A conversation with an LDAP sysadmin showed a decided lag in query/response time. A completely local concern, though, is that logged-in users consistently (as in every hour or so) "lose" their name. If they launch a terminal they get a prompt like this:
I have no name!@carrot#:
I installed the ldap-utils package and ran ldapwhoami when the user lost their name and I got this:
ldap_sasl_interactive_bind_s: Can't contact LDAP server (-1)
I looked up the error and it's quite common when OpenLDAP clients try to connect to a Sun ONE Directory Server. OpenLDAP *wants* to use SASL, and SODS *won't* use it.
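One way to separate the SASL question from a plain connectivity problem is to retry with a simple bind. The (-1) "Can't contact LDAP server" code is also what the client library reports when it can't reach the server at all or the SSL handshake fails, and ldapwhoami attempts a SASL bind unless told otherwise. A minimal sketch, assuming ldap-utils is installed; the function name is made up for illustration and the URI is passed in rather than guessed:

```shell
#!/bin/sh
# simple_bind_check URI
# Attempt an anonymous *simple* bind (-x), so SASL is never negotiated.
# -H gives the server URI explicitly instead of relying on ldap.conf.
simple_bind_check() {
    uri=$1
    if ! command -v ldapwhoami >/dev/null 2>&1; then
        echo "ldapwhoami not found (install ldap-utils)" >&2
        return 127
    fi
    ldapwhoami -x -H "$uri"
}
```

Usage would be something like `simple_bind_check ldaps://ldap1p:636` (host and port as given above; the fully qualified name may be needed). If this also fails with "Can't contact LDAP server", the problem is in the connection or SSL layer rather than in SASL.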
Apparently, client machines have an ongoing conversation with the LDAP server, which may explain why we're getting the "I have no name!" problem: if the server doesn't respond, you have no name. I've also noticed that a test box that I have exhibits problems when coming out of screensaver mode. The screen remains black for as long as a minute before coming back with a GDM screen. I *think* it's not giving me a GDM screen until it makes contact with the server, which may be a while.
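The "I have no name!" prompt appears when the shell can't turn the numeric UID back into a username, so the lookup can be tested directly and timed. A rough sketch (the function name is hypothetical; getent goes through the same NSS configuration the shell uses, and timing is only to the nearest second):

```shell
#!/bin/sh
# check_uid UID
# Resolve a numeric UID to a name through NSS and report how long it took.
check_uid() {
    uid=$1
    start=$(date +%s)
    # getent exits non-zero when the key cannot be resolved
    entry=$(getent passwd "$uid") && rc=0 || rc=$?
    end=$(date +%s)
    if [ "$rc" -eq 0 ]; then
        # the first colon-separated field of a passwd entry is the username
        echo "uid $uid -> ${entry%%:*} ($((end - start))s)"
    else
        echo "uid $uid -> no name (getent exit $rc, $((end - start))s)"
    fi
    return "$rc"
}
```

Running `check_uid "$(id -u)"` in the affected user's terminal when the name disappears, and again once it comes back, should show whether the lookup is failing outright or just taking a long time.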
I've edited nslcd.conf to include:
sasl_mech ANONYMOUS
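which apparently selects the ANONYMOUS mechanism (i.e. unauthenticated guest access) rather than an authenticated SASL bind. Strictly speaking ANONYMOUS is itself a SASL mechanism, so this isn't quite "no SASL". If the "documentation" is accurate, this should stop the client from attempting SASL authentication, leaving all security resting on SSL.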
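Since all the security then rests on SSL, the SSL side of nslcd.conf is worth spelling out. The fragment below is a sketch, not the config from these machines: the search base and CA bundle path are placeholders, and the URI is just the host and port mentioned above. The option names are the ones documented in nslcd.conf(5). Per that man page, nslcd binds anonymously when no binddn is set.

```conf
# /etc/nslcd.conf (excerpt) - placeholder values

# Host and port from these notes; a fully qualified name may be required.
uri ldaps://ldap1p:636

# Hypothetical search base, replace with the directory's real base DN.
base dc=example,dc=com

# Use SSL for the connection (implied by an ldaps:// URI).
ssl on

# Require a valid server certificate, checked against this CA bundle
# (Debian-style path, an assumption).
tls_reqcert demand
tls_cacertfile /etc/ssl/certs/ca-certificates.crt
```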
NOTE: I got this hint from here.
Update: Looking at this, I *may* be able to cache credentials using the libpam-ccreds package. The only question is whether it plays nicely with the newer packages like nslcd.