802.1x : RADIUS : IAS : Fiasco

We use 802.1x for our wireless security at work. The wireless controller uses Microsoft’s IAS as its RADIUS server. Recently, during one of our maintenance windows, we installed a couple of critical patches and rebooted the IAS server. This was over the weekend, and we didn’t check whether wireless was working after the maintenance. (One of the lessons learnt from this story :) Put in automated monitoring so that you don’t have to worry about which services have or haven’t come back up after a maintenance window.)

On Monday, our helpdesk got swamped with calls of “wireless is not working”. We checked the controller and everything looked okay; the only error on the controller was that the RADIUS server was not responding. We checked the RADIUS server and the IAS service was running fine, but there were a ton of errors in the System event log with the following details:

Access request for user XXXX\XXXXXX was discarded.
Fully-Qualified-User-Name = XXX.XXXX.NET/XXX.XXX/TECHNOLOGY/DEVELOPMENT/XXX XXXXXX
NAS-IP-Address = 192.168.128.10
NAS-Identifier = ACHIAS01IT
Called-Station-Identifier = 00-0B-85-06-0C-A0:wacker
Calling-Station-Identifier = 00-14-A4-28-4C-EE
Client-Friendly-Name = ACHIAS01IT
Client-IP-Address = 192.168.128.10
NAS-Port-Type = Wireless - IEEE 802.11
NAS-Port = 1
Proxy-Policy-Name = Use Windows authentication for all users
Authentication-Provider = Windows
Authentication-Server =
Reason-Code = 23
Reason = Unexpected error. Possible error in server or client configuration.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
“Unexpected error. Possible error in server or client configuration.” Now that is really informative :). We scratched our heads. We thought it might be an issue with the controller (it was on older firmware), so we upgraded the firmware and rebooted the controller. Still no go: same error. Finally, frustrated, we opened a case with MSFT. Even the engineer from MSFT was flabbergasted, with the usual “Everything looks good, it should work!!”.
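Since the event text follows a simple "Name = value" layout, a monitoring script could pull the fields out and alert on the reason code. Here is a minimal Python sketch of that idea. The function name parse_ias_event is made up for illustration, and it assumes you have already got the event text out of the System event log by some other means; it does not read the log itself.

```python
def parse_ias_event(text):
    """Split the 'Name = value' lines of an IAS event body into a dict.

    Lines without an '=' (the leading "Access request ... was discarded."
    sentence, the help link) are skipped. Hypothetical helper, not part of IAS.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip():
            fields[name.strip()] = value.strip()
    return fields


# A shortened copy of the entry shown above.
sample = """Access request was discarded.
NAS-IP-Address = 192.168.128.10
Authentication-Provider = Windows
Authentication-Server =
Reason-Code = 23
Reason = Unexpected error. Possible error in server or client configuration."""

event = parse_ias_event(sample)
if event.get("Reason-Code") == "23":
    print("IAS discarded a request:", event.get("Reason"))
```

Counting entries like this after a maintenance window would have flagged the problem before Monday morning, even though the IAS service itself showed as running.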

Finally we traced the issue to an expired computer certificate on the IAS server. The certificate had expired a couple of weeks earlier, but it looks like the authentication was cached, and it was only when the server was rebooted that the IAS service started to error out. Renewing the cert got the wireless clients authenticating again immediately.
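An expiry check like the one below would have caught this weeks ahead. It is only a sketch in Python: it assumes you can already get the certificate's "not after" date as a string in the "Jan 15 00:00:00 2018 GMT" style (how you export that from the server's certificate store is left out), and the names days_until_expiry and expiry_status and the 30-day warning threshold are my own choices, not anything from IAS.

```python
import ssl
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now=None):
    """Days left before the cert's 'not after' time (negative if already expired).

    not_after uses the format accepted by ssl.cert_time_to_seconds,
    e.g. "Jan 15 00:00:00 2018 GMT". now is seconds since the epoch.
    """
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / SECONDS_PER_DAY


def expiry_status(not_after, warn_days=30, now=None):
    """Classify a cert as 'expired', 'expiring soon' or 'ok' (assumed thresholds)."""
    remaining = days_until_expiry(not_after, now)
    if remaining < 0:
        return "expired"
    if remaining < warn_days:
        return "expiring soon"
    return "ok"


# Example with a made-up expiry date, checked against the current time.
print(expiry_status("Jan 15 00:00:00 2018 GMT"))
```

Run daily from a scheduler, anything other than "ok" can be turned into an email or a ticket well before the cert actually lapses.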

I am now looking into what other services depend on a valid cert to work properly.