Ephemeral Port Exhaustion

We’ve been having some trouble with our main web server at work over the last few months. It all boils down to ephemeral port exhaustion, which sounds kind of like a post-COVID side effect, but is actually something that can happen to a Windows server if you’re opening too many outbound connections and not releasing the ephemeral ports they use. The post linked above contains some useful troubleshooting information regarding this problem.

I actually think the best explanation of this issue is in a 2008 TechNet article titled Port Exhaustion and You. (That link goes to the original version of the article via archive.org. Here’s a link to its current location at Microsoft’s site.)

The basic issue is that you can run out of ephemeral ports, and then anything that needs to open a new outbound connection starts failing, and you just need to reboot the server. So, not the end of the world, but not good for a production server. We’ve been working around it for a while. We had the server scheduled to reboot once a week, but upped that to twice a week when it seemed like once wasn’t enough. And now it’s gotten to the point where I really think we need to find the underlying issue and correct it.

In our case, the server is running a bunch of web services under IIS. There are more than a dozen separate services, written by various programmers, at various points in time. They’re all (probably) C# programs, but they’re written under various versions of .NET Framework and .NET Core. They’re grouped into three or four app pools.

The first thing that makes sense to look at here is how the individual programs are handling outgoing network connections. Normally, in C#, you’d use HttpClient for that. I wrote a blog post in 2018 about HttpClient and included a link to this article about how to properly use HttpClient without opening a bunch of unnecessary connections. I think I’ve got all of my own code using HttpClient correctly and efficiently, though I’m not sure about everyone else’s.
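To make that concrete, here’s a minimal sketch of the pattern that article recommends: keep one shared HttpClient around for the life of the process instead of creating (and disposing) a new one per request, since every short-lived client can leave sockets sitting in TIME_WAIT and chew through the ephemeral port range. The class name and endpoint below are made up for illustration; this isn’t from any of our actual services.

```csharp
// Minimal sketch: one shared HttpClient for the whole process.
// HttpClient is thread-safe for concurrent requests and will reuse
// its underlying connections instead of opening a new socket per call.
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class WeatherApiClient
{
    // Created once, never disposed while the process is running.
    private static readonly HttpClient _client = new HttpClient
    {
        BaseAddress = new Uri("https://example.com/api/")
    };

    public static async Task<string> GetForecastAsync(string city)
    {
        // Note: no "using (var client = new HttpClient())" here -- creating
        // and disposing a client per request is the anti-pattern that leads
        // to socket/port exhaustion.
        using (var response = await _client.GetAsync(
            $"forecast?city={Uri.EscapeDataString(city)}"))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

On the newer .NET Core services, IHttpClientFactory is the usual way to get the same effect without managing a static instance by hand.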

It can be hard to tell what’s going on behind the scenes, though, if you need to rely on closed-source third-party libraries that also open up HTTP connections. I’ve got a few of those, and I think they’re not causing problems, but I don’t really know.

To try to monitor and track down port exhaustion issues, there are a few tools you can use. A number of the articles I’ve linked above mention “netstat -anob” or some variation of that, and I’ve found that helpful. One issue with that, if you’re running a lot of web services, is that every IIS worker process just shows up as w3wp.exe, so you can’t easily see which service is causing a problem.

My big breakthrough yesterday was realizing that I could use “appcmd list wp” to get a list of the PIDs and app pool names associated with the various IIS worker processes. From that, you can tie the netstat output back to a specific app pool at least. (Of course, if you have ten web services under one app pool, then you’ve still got some more work to do.) See here for some info on appcmd.
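For what it’s worth, here’s a rough sketch of how you could script that correlation rather than eyeballing the two outputs side by side: run “appcmd list wp” and “netstat -ano”, join them on PID, and print connection counts per worker process. The parsing is simplified and the appcmd path is assumed to be the default one, so treat it as a starting point, not a finished tool.

```csharp
// Rough sketch: count open TCP connections per PID and map IIS worker-process
// PIDs back to their app pool names. Assumes appcmd.exe is in its default
// location and that its "list wp" output looks like:
//   WP "1234" (applicationPool:MyAppPool)
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

class ConnectionsByAppPool
{
    static string Run(string fileName, string args)
    {
        var psi = new ProcessStartInfo(fileName, args)
        {
            RedirectStandardOutput = true,
            UseShellExecute = false
        };
        using (var p = Process.Start(psi))
        {
            string output = p.StandardOutput.ReadToEnd();
            p.WaitForExit();
            return output;
        }
    }

    static void Main()
    {
        // Map worker-process PIDs to app pool names via "appcmd list wp".
        string appcmd = Environment.ExpandEnvironmentVariables(
            @"%windir%\system32\inetsrv\appcmd.exe");
        var poolByPid = new Dictionary<string, string>();
        foreach (Match m in Regex.Matches(Run(appcmd, "list wp"),
                 @"WP ""(\d+)"" \(applicationPool:(.+?)\)"))
        {
            poolByPid[m.Groups[1].Value] = m.Groups[2].Value;
        }

        // Each TCP line from "netstat -ano" ends with the owning PID.
        var counts = new Dictionary<string, int>();
        foreach (string line in Run("netstat.exe", "-ano").Split('\n'))
        {
            string trimmed = line.Trim();
            if (!trimmed.StartsWith("TCP")) continue;
            string pid = trimmed.Split(new[] { ' ' },
                StringSplitOptions.RemoveEmptyEntries).Last();
            counts[pid] = counts.TryGetValue(pid, out int c) ? c + 1 : 1;
        }

        // Highest connection counts first, tagged with the app pool
        // if the PID belongs to an IIS worker process.
        foreach (var kvp in counts.OrderByDescending(k => k.Value))
        {
            string pool = poolByPid.TryGetValue(kvp.Key, out string name)
                ? name : "(not an IIS worker process)";
            Console.WriteLine($"PID {kvp.Key,6}  {kvp.Value,5} connections  {pool}");
        }
    }
}
```

You’ll probably need to run it from an elevated prompt for appcmd to cooperate, and it still only gets you down to the app pool level, but it at least tells you where to start digging.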

Anyway, we still haven’t quite got our problem solved, but we’re getting closer. For now, we’ll just need to keep an eye on it and fall back on the old IT Crowd solution: “Have you tried turning it off and on again?”
