The Artima Developer Community
.NET Buzz Forum
MS Windows Services for Unix + Client for NFS + EMC = Kernel Memory Leak

Steve Hebert
Steve Hebert is a .NET developer who has created the .Math compiler library.
Posted: Dec 16, 2005 3:45 AM

This post originated from an RSS feed registered with .NET Buzz by Steve Hebert.
Original Post: MS Windows Services for Unix + Client for NFS + EMC = Kernel Memory Leak
Feed Title: Steve Hebert's Development Blog
Feed Description: .Steve's .Blog - Including .Net, SQL Server, .Math and everything in between
Here's a topic I thought I'd lend a little Google juice to, now that Microsoft has created a hotfix.  This problem is nasty: hard to diagnose and hard to track down.

We have an application that shares a NAS device with Unix servers.  In both our pre-production and production environments, a group of Windows 2000 boxes would go belly up after a week of use.  The machines would gradually slow down and then suddenly lose the ability to communicate on the network.  The event logs showed repeated errors from the point where network communication stopped; it looked as if someone had tripped over a network cable each time one of these systems went down.  To make diagnosis harder, I did not have physical access to the boxes, only VNC access.  Here's the path we took to diagnose the problem; perhaps it will save someone else some time.

After a few crashes we saw that socket creation was being denied because resources were low.  This led us to look at System PTEs (page table entries).  We monitored System PTEs in perfmon and saw that the leak didn't start for 4 hours, after which the count declined steadily at a rate loosely tied to traffic volume.  Without any traffic we saw PTEs decrease at a rate of ~5/hour; with traffic the rate ranged from 60 to 100 PTEs per hour.  The PTEs always decremented in blocks of 10.
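
For anyone who wants to put numbers on a similar decline, here is a minimal sketch of the arithmetic: fit a straight line to a handful of counter readings, then estimate how long the remaining entries would last at that rate. The sample readings and both function names are hypothetical, made up for illustration; they are not the values from our machines. A real box will also get into trouble before the counter reaches zero, so treat the result as an upper bound.

```python
from typing import Sequence, Tuple


def leak_rate_per_hour(samples: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope of free PTEs over time, sign-flipped so a leak is positive.

    Each sample is (hours_since_boot, free_system_ptes).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_p = sum(p for _, p in samples) / n
    num = sum((t - mean_t) * (p - mean_p) for t, p in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must cover more than one point in time")
    return -num / den


def hours_until_exhausted(free_now: float, rate: float) -> float:
    """Hours until the counter would hit zero at a constant leak rate."""
    if rate <= 0:
        return float("inf")  # not leaking, so no projected exhaustion
    return free_now / rate


# Hypothetical readings taken once an hour after the leak begins.
samples = [(4, 15000), (5, 14920), (6, 14840), (7, 14760)]

rate = leak_rate_per_hour(samples)
remaining = hours_until_exhausted(samples[-1][1], rate)
print(f"leak rate: {rate:.1f} PTEs/hour, about {remaining / 24:.1f} days left")
```

The same calculation shows why a machine that starts with more free System PTEs lasts longer at the same leak rate: the time to exhaustion scales directly with the starting count.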

At this point we weren't sure what was causing it.  The usual suspect is a driver, because drivers consume kernel memory.  After spending a couple of days trying to track this down, we found that Windows Services for Unix was at fault.  We contacted MS support and they shipped us the hotfix.  The problem disappeared and we haven't seen the behavior since.

I find it hard to believe that Microsoft has had this product in the field for so long and is only now seeing a leak this critical. For some reason we only saw the problem with our EMC/NFS connection; we have a Solaris/NFS connection in development that has never exhibited it. I guess Microsoft doesn't test wsFU against small 3rd party vendors like EMC. </sarcasm>  We spent a ton of time tracking this problem and questioned everything on these systems. It's interesting to note that the problem also happens in Windows 2003, but because 2K3 always has significantly more System PTEs than Win2k, the box will take much longer to fail.


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use