SnapDrive services failing to start on Windows Server 2008 x64

SnapDrive for Windows is NetApp’s storage management software that lets you easily provision storage and back up and restore your data on a Windows server. It’s a great tool when it works, but when it doesn’t, it’s a bear. I recently had the experience of troubleshooting some of our servers whose SnapDrive installations had trouble connecting to our filer. The iSCSI connections themselves were not affected, so the issue went unnoticed for some time until a request came in to expand LUNs. That’s when we discovered that the SnapDrive service was not running and was failing to start.

When SnapDrive was opened, the MMC would crash, which resulted in the following error in the SnapDrive MMC:

Web Service Client Channel was unable to connect to the LUNProvisioningService instance on machine ServerName.
Could not connect to ‘net.tcp://ServerNameSnapDrive/LUNProvisioningService.’ The connection attempt lasted for a time span of 00:00:00. TCP error code 10061: No connection could be made because the target machine actively refused it 
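TCP error code 10061 is the Windows socket error for a refused connection, which usually means nothing is listening on the target port. A quick way to confirm that from another machine is a small probe like the Python sketch below. The function name port_is_listening is made up for illustration, and the port you pass in is your call: 808 is the usual net.tcp default, but that is an assumption on my part and not something the error text states.

```python
import socket


def port_is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections (10061 on Windows), timeouts and DNS failures.
        return False
```

For example, port_is_listening("ServerName", 808) returning False would be consistent with the "actively refused" message above, though it cannot tell you why nothing is listening.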

The following event appeared in the Application log:

Log Name: Application
Source: SnapDrive
Date: 1/05/2013 10:41:33 AM
Event ID: 101
Task Category: Generic event
Level: Error
Keywords: Classic
User: N/A
SnapDrive service failed to start.
Error code : SnapDrive Web Service failed to start Reason: ‘The TransportManager failed to listen on the supplied URI using the NetTcpPortSharing service: failed to start the service. Refer to the Event Log for more details.’

I immediately jumped onto NetApp’s support site and started searching for known issues. One post suggested checking the permissions of the account accessing the filer and making sure it had local admin rights on the server. I knew that wasn’t the issue because the account already had local admin rights. Plus, SnapDrive had been working until recently, so permissions were at the bottom of my list of culprits. The next few hits on the forums indicated that the IIS Admin service needed to be enabled and that the Net.Tcp Port Sharing service had to be enabled as well. When I checked the services, IIS Admin wasn’t even installed and Net.Tcp Port Sharing was in a disabled state. I attempted to re-enable the service, but it failed, as I expected it to. Odd, I thought. Where is the IIS Admin service? What would prevent these services from starting?
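The service check described above can also be scripted, which is handy when several servers need a look. The Python sketch below shells out to the built-in sc command (sc query for the current state, sc qc for the start type) and pulls the relevant word out of the output. The helper names are made up for illustration, and the sketch assumes the service's short name is NetTcpPortSharing and that sc prints English field labels.

```python
import re
import subprocess


def parse_sc_field(sc_output, field):
    """Return the word after the numeric code on a line such as 'STATE : 1  STOPPED'."""
    pattern = r"^\s*" + re.escape(field) + r"\s*:\s*\d+\s+(\w+)"
    match = re.search(pattern, sc_output, re.MULTILINE)
    return match.group(1) if match else None


def query_service(name="NetTcpPortSharing"):
    """Run sc on Windows and return (state, start_type); either may be None."""
    # 'sc query' reports the current state, 'sc qc' reports the configured start type.
    state_out = subprocess.run(["sc", "query", name], capture_output=True, text=True).stdout
    config_out = subprocess.run(["sc", "qc", name], capture_output=True, text=True).stdout
    return parse_sc_field(state_out, "STATE"), parse_sc_field(config_out, "START_TYPE")
```

A result along the lines of ("STOPPED", "DISABLED") from query_service() would correspond to the state I found the service in. The parsing is kept separate from the sc calls so it can be tried against saved output on any machine.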

Since the IIS Admin service wasn’t available, I went to Server Manager, confirmed it wasn’t installed, and installed it from there. After the installation completed, I attempted to start the Net.Tcp Port Sharing service and the SnapDrive services again, but all of them failed. Back to scratching my head.

It took some digging, but eventually I came to NetApp KB2013168. The article noted the following: “.NetFramework and the Net.Tcp PortSharing Service. If .Net is not properly installed or the Net.Tcp PortSharing Service service are not functioning correctly, SnapDrive will not be able to connect to the LUNProvisioningServices and the ability to manage LUNs via the MMC can be impaired.”

Oh snap! Anybody who knows me in “real” life knows how much the word .Net gets under my skin. I’ve had to deal with so many issues involving corrupted .Net installs, or some Microsoft patch that would “break” .Net and the application that depended on it, that I’ve grown to hate the word.

Now that I had something to go on, I followed the steps in the KB article for issue #2 and issue #3 (the symptoms I was experiencing):

Issue 2:
Directory permissions to C:\WINDOWS\Microsoft.NET\Framework\v3.0\Windows Communication Foundation\SMSvcHost.exe.
For the NT Authority\Local Service account to be able to start this service, users must have read and execute permissions to the above path.

Resolution to Issue 2:
Incorrect permissions were configured on the C:\windows directory.
Verify that users have read and execute permissions to the path C:\WINDOWS\Microsoft.NET\Framework\v3.0\Windows Communication Foundation\SMSvcHost.exe.

Well, permissions weren’t it, because everything was in place. Now on to issue #3.

Issue 3:
SnapDrive 6.x service did not start because the ‘Net.Tcp Port Sharing service’ will not stay started. This is a dependency SnapDrive 6.x has that earlier versions do not.

Resolution to Issue 3:

Reinstall Microsoft .Net.

Reinstall .Net? Great, this should be fun, I thought to myself. I confirmed via Add/Remove Programs that .Net 3.5 was installed, but the document stated that SnapDrive required .Net 3.0 SP1, and that particular version was not listed anywhere. On a hunch, I went to Server Manager > Features to see whether the .Net 3.0 Framework features were installed, and yes, they were! Using the Server Manager wizard, I removed the .Net 3.0 Framework features, which requires a reboot to complete.

Once the uninstall was completed, I reinstalled the .Net 3.0 Framework using the same Server Manager wizard. When the installation completed, I rebooted the server for good measure, and once the server came back online the SnapDrive service was running again. Whew! What a morning. Now on to expanding the LUNs, as the application owner requested.