Apparently, if you run IIS,
and you run ColdFusion,
and maybe you run Commonspot,
a single badly behaved crawler can take your whole site down.
For over a week, the client's website had been going down.
I fixed this by swapping some cfinvoke calls for straight cfhttp calls to the web service. There's an issue with cfinvoke where, at certain stages of a web service invocation (such as waiting for a response), ColdFusion goes into a holding pattern and waits indefinitely, regardless of timeout settings. The cfhttp calls worked every bit as well and timed out properly, which eliminated the problem.
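The swap looks roughly like this. This is a sketch, not the actual code from the client's site: the service URL, method name, and argument are hypothetical, and a SOAP service would need its envelope POSTed in the request body rather than sent as form fields.

```cfml
<!--- Before: cfinvoke against a web service. In this failure mode,
      ColdFusion can hang waiting on the response no matter what
      the timeout attribute says. --->
<cfinvoke webservice="http://example.com/LookupService.cfc?wsdl"
          method="getRecord"
          returnvariable="record"
          timeout="10">
    <cfinvokeargument name="recordID" value="#recordID#">
</cfinvoke>

<!--- After: a straight cfhttp POST to the same endpoint. cfhttp honors
      its timeout attribute, so a dead backend fails fast with an error
      you can catch, instead of leaving the request hung. --->
<cfhttp url="http://example.com/LookupService.cfc"
        method="post"
        timeout="10"
        throwonerror="yes"
        result="httpResult">
    <cfhttpparam type="formfield" name="method" value="getRecord">
    <cfhttpparam type="formfield" name="recordID" value="#recordID#">
</cfhttp>
```

Wrapping the cfhttp call in cftry/cfcatch then lets the page degrade gracefully when the timeout fires, rather than tying up a request thread.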
A process serving application pool 'DefaultAppPool' exceeded time limits during shut down. The process id was '6332'. Event ID 5013
A worker process '6332' serving application pool 'DefaultAppPool' failed to stop a listener channel for protocol 'http' in the allotted time. The data field contains the error number. Event ID 5138
A process serving application pool 'DefaultAppPool' suffered a fatal communication error with the Windows Process Activation Service. The process id was '3748'. The data field contains the error number. Event ID 5011
This is not what you want to see.
Here's the truly diabolical part: these hung requests are tough. Persistent. They do not want to let go. IIS spends several minutes waiting for them to shut down so it can properly restart the application pool during a recycle, and that's the window when the website would be dead, stuck in limbo with an old application pool that refused to die and unable to fully transition to the new one.
I was right. I found the Chinese Spider's requests stuck in two more ColdFusion / Commonspot / IIS websites that we manage (several others got a clean bill of health, but we've nonetheless blocked the IP for all our clients for now as a safety precaution). In those cases it appears the crawler hadn't been hitting the sites as hard, so the application pool would recycle naturally before drowning in hung requests.
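For the curious, one common way to block a single IP at the IIS level (assuming IIS 7 or later with the IP and Domain Restrictions feature installed; the address below is a placeholder, not the actual crawler's IP) is a web.config entry like:

```xml
<configuration>
  <system.webServer>
    <security>
      <!-- allowUnlisted="true": everyone gets in except the entries below -->
      <ipSecurity allowUnlisted="true">
        <!-- Deny the misbehaving crawler's IP (placeholder address) -->
        <add ipAddress="203.0.113.50" allowed="false" />
      </ipSecurity>
    </security>
  </system.webServer>
</configuration>
```

The nice thing about doing it here, rather than in application code, is that the blocked requests never reach ColdFusion at all, so they can't pile up as hung requests in the first place.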