Cluster Validation Storage Test ‘List All Disks’ Fails with Status 87

April 5, 2010, 4:39 am

Greetings CORE blog fans! It has been a while, so I thought it was time for another blog. In recent weeks, we have seen an issue where the Windows Server 2008 R2 storage validation test List All Disks fails with a Status 87. Figure 1 is an example of what is displayed in the cluster validation report.

Figure 1: List All Disks failure in Cluster Validation Report.

This error is also reflected in the ValidateStorage log (Figure 2) located in the %systemroot%\Cluster\Reports directory.

000016f4.00001714::01:02:06.180 CreateNtFile: Path \Device\HarddiskVolume2, status 87
000016f4.00001714::01:02:06.180 GetNtldrDiskNumbers: Failed to open device \Device\HarddiskVolume2, status 87
000016f4.00001714::01:02:06.180 GetNtldrDiskNumbers: Exit GetNtldrDiskNumbers: status 87
000016f4.00001714::01:02:06.180 CprepPrepareNodePhase2: Failed to obtain boot disk list, status 87
000016f4.00001714::01:02:06.180 CprepPrepareNodePhase2: Exit CprepPrepareNodePhase2: hr 0x80070057, pulNumDisks 0

Figure 2: ValidateStorage log entry
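When a ValidateStorage log is long, it can help to pull out just the device paths that appear on failing lines. The following is a minimal Python sketch of that idea. The line layout is assumed from the excerpt in Figure 2 only, and the function name failing_devices is hypothetical, not part of any Microsoft tool.

```python
import re

# Layout assumed from the Figure 2 excerpt:
#   <pid>.<tid>::<time> <Function>: <message>, status <n>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::(?P<time>[\d:.]+)\s+"
    r"(?P<func>\w+):\s+(?P<msg>.*)$"
)
STATUS_RE = re.compile(r"status (\d+)")
DEVICE_RE = re.compile(r"\\Device\\\w+")


def failing_devices(lines):
    """Return the device paths named on lines that carry a non-zero status."""
    devices = set()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a log line in the assumed layout
        status = STATUS_RE.search(match.group("msg"))
        device = DEVICE_RE.search(match.group("msg"))
        if status and device and int(status.group(1)) != 0:
            devices.add(device.group(0))
    return devices


# Sample lines copied from Figure 2.
sample = [
    r"000016f4.00001714::01:02:06.180 CreateNtFile: Path \Device\HarddiskVolume2, status 87",
    r"000016f4.00001714::01:02:06.180 GetNtldrDiskNumbers: Failed to open device \Device\HarddiskVolume2, status 87",
    r"000016f4.00001714::01:02:06.180 CprepPrepareNodePhase2: Failed to obtain boot disk list, status 87",
]
print(failing_devices(sample))
```

Lines that report a status but name no device, such as the CprepPrepareNodePhase2 entry, are skipped on purpose so the result lists only devices.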

The decode for these errors is shown in Figure 3.

# for decimal 87 / hex 0x57 :
ERROR_INVALID_PARAMETER winerror.h
# The parameter is incorrect.

Figure 3: Error decode
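The log in Figure 2 reports the same failure two ways: as status 87 and as hr 0x80070057. The two are related by the HRESULT_FROM_WIN32 convention, which wraps a Win32 error code in the FACILITY_WIN32 facility with the failure bit set. A short Python sketch of that arithmetic follows (the function name is just illustrative):

```python
FACILITY_WIN32 = 7  # facility code used for wrapped Win32 errors


def hresult_from_win32(code: int) -> int:
    """Mirror the HRESULT_FROM_WIN32 arithmetic for a Win32 error code."""
    if code == 0:
        return 0  # success stays success
    # Failure bit | facility in bits 16..26 | original code in the low 16 bits
    return 0x80000000 | (FACILITY_WIN32 << 16) | (code & 0xFFFF)


print(hex(hresult_from_win32(87)))  # 0x80070057
```

So the hr value on the last log line carries no new information. It is ERROR_INVALID_PARAMETER (87, hex 0x57) in HRESULT form.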

The cause of this failure is unknown at this point. What we do know is that the path called out in Figure 2 above always points to the 100 megabyte partition that is created at the root of the system drive. This partition is created by default and is in place to support BitLocker. The approved workaround is to assign a drive letter to the 100 megabyte partition and re-run the validation process. The List All Disks storage test should pass at that point. There is no adverse impact to assigning the drive letter to this partition. As a reminder, BitLocker is not supported in a cluster environment. This is documented in KB 947302. If an attempt is made to enable BitLocker on a cluster node, the error in Figure 4 is displayed.

Figure 4: Error when trying to enable BitLocker on a cluster node

I have an ‘ask’ of our readership. If anyone reading this blog can ‘on demand’ repro this issue, we want to hear from you. This goes beyond just telling us, “Yeah, I’ve had that issue myself.” I am interested in hearing from anyone who has perhaps manipulated a setting in their controller card that can either cause validation to fail in this way or make it pass. I am interested in hearing from someone who had this failure, changed a setting of some kind, either in software or hardware, and the error went away. Be sure to provide the details (Make and model of controller, Firmware and driver versioning information, steps to reproduce the issue, etc…)

As always, we hope this has been informative for you.

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support

↧

Windows Server 2008 Failover Clusters: Networking (Part 4)

April 15, 2010, 12:00 pm

The Windows Server 2008 Failover Clustering: Networking three-part blog series has been out for a little while now. Hopefully, it has been helpful. Little did I know there would be an opportunity to write another part. This segment will be short as it covers a very specific scenario. It is one that we rarely see, but we have encountered it often enough that I felt it was worth writing about.

There are applications written to access resources that are being hosted in Microsoft clusters running on Windows Server 2008 (RTM + R2). The resource could be a File Server, a SQL database, or whatever. The point is that the required resource is being hosted in a Failover Cluster. It is hoped that applications that need to function in this manner are written properly to locate the required resource being hosted in a cluster. By that I mean I would expect an application to first query a name server (DNS server) and then use the information obtained to make a proper connection to the required cluster resource. In a Failover Cluster, that connection point is known as a Client Access Point (CAP). A CAP consists of a Network Name (NetBIOS) resource and one or more IP Address resources. The default behavior in a Windows Server 2008 cluster is to dynamically register CAP information in a DNS server, provided it is configured to support Dynamic Updates. This occurs when the CAP is brought Online in the cluster.

There are applications that are not written in this manner. Some applications are written in such a way that they make a local connection on a cluster node by binding to the first network adapter and then using the IP address configured for that adapter. The problem is that, in a cluster, the first connection listed in the binding order by default is the Microsoft Failover Cluster Virtual Adapter. This adapter uses an IP address drawn from the APIPA (Automatic Private IP Addressing) address range, which is non-routable and not registered in DNS.
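To make the problem concrete, here is a small Python sketch of how an application could avoid picking up an APIPA address when it walks a list of local addresses. The APIPA range is 169.254.0.0/16. The sample addresses and the helper names is_apipa and first_routable are hypothetical and only serve the illustration.

```python
import ipaddress

# APIPA (Automatic Private IP Addressing) range for IPv4.
APIPA_NET = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(addr: str) -> bool:
    """True if the IPv4 address falls inside the APIPA range."""
    return ipaddress.ip_address(addr) in APIPA_NET


def first_routable(addresses):
    """Return the first address that is not APIPA, or None if there is none."""
    for addr in addresses:
        if not is_apipa(addr):
            return addr
    return None


# Hypothetical addresses in binding order: cluster virtual adapter first.
in_binding_order = ["169.254.1.22", "192.168.0.171"]
print(first_routable(in_binding_order))  # 192.168.0.171
```

An application that simply takes the first entry would end up with 169.254.1.22 here. The better fix remains the one described above: resolve the CAP name through DNS (for example with socket.getaddrinfo) rather than binding to whatever adapter happens to be first.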

To help these types of applications work better, we can use a utility that has been released for public download on the Microsoft MSDN site. The utility is called ‘nvspbind.’ So, the first step is to download and install the utility on each cluster node. The options we will be using are shown in Figure 1.

Figure 1: Options for nvspbind

First we need to identify the adapter that is the Microsoft Failover Cluster Virtual Adapter by using the nvspbind /n command (Figure 2). The adapter is ‘Local area connection* 9’.

Figure 2: Identify the Microsoft Failover Cluster Virtual Adapter

Next, we use the ‘nvspbind /o ms_tcpip’ command to determine the binding order for IPv4 (Figure 3).

Figure 3: Listing the bindings for IPv4

We can see here that the adapter is listed at the top of the binding order for IPv4, which is causing the problem for some applications. We need to move the adapter down in the binding order, so we will use the following command to accomplish that –

C:\nvspbind /- "local area connection* 9" ms_tcpip (Figure 4).

Figure 4: Moving the adapter down in the binding order for IPv4

Note: The adapter can be moved further down by using /-- if desired.

Once the adapter has been positioned correctly in the binding order, the application can be tested to see if it now works as desired.
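Conceptually, the move-down operation is just a reordering of a list. The Python sketch below models that effect on a binding order. It does not call nvspbind. The adapter names other than ‘Local area connection* 9’ are made up, and the steps parameter is an illustrative modeling choice, not an nvspbind option.

```python
def move_down(binding_order, adapter, steps=1):
    """Return a new list with `adapter` moved `steps` positions toward the end."""
    order = list(binding_order)  # leave the caller's list untouched
    index = order.index(adapter)
    new_index = min(index + steps, len(order) - 1)  # stop at the bottom
    order.insert(new_index, order.pop(index))
    return order


# Hypothetical binding order with the cluster virtual adapter on top.
before = ["Local area connection* 9", "Public", "Private"]
after = move_down(before, "Local area connection* 9")
print(after)  # ['Public', 'Local area connection* 9', 'Private']
```

After the move, an application that binds to the first adapter in the list would get the ‘Public’ adapter in this example instead of the cluster virtual adapter.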

To further highlight the effect of this utility, we can inspect the registry. First, we need to locate some information for the Microsoft Failover Cluster Virtual Adapter. Navigate to the following registry key (Figure 5), and locate the adapter –

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E972-E325-11CE-BFC1-08002BE10318}

Figure 5: Microsoft Failover Cluster Virtual Adapter NetCfgInstanceId

The same information shown in Figure 5 is also displayed in Figure 2.

With the information in hand, navigate to the following registry key (Figure 6) to verify the adapter is no longer listed at the top of the binding order.

Figure 6: HKLM\SYSTEM\CurrentControlSet\services\Tcpip\Linkage
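The same check can be scripted. The Python sketch below looks for an adapter's NetCfgInstanceId in the list of bindings and reports its position. It assumes the binding order is stored in a multi-string value named Bind under the Linkage key, with entries of the form \Device\{GUID}. The function names are hypothetical, and the registry read only runs on Windows.

```python
import sys


def bind_position(bind_entries, net_cfg_instance_id):
    """Return the 0-based position of the adapter GUID in a Bind list, or -1."""
    wanted = net_cfg_instance_id.strip("{}").lower()
    for position, entry in enumerate(bind_entries):
        # Entries are assumed to look like \Device\{GUID}
        guid = entry.rsplit("\\", 1)[-1].strip("{}").lower()
        if guid == wanted:
            return position
    return -1


def read_tcpip_bind():
    """Read the Bind value (assumed REG_MULTI_SZ) from the Tcpip Linkage key."""
    import winreg  # standard library, Windows only

    key_path = r"SYSTEM\CurrentControlSet\services\Tcpip\Linkage"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        value, _value_type = winreg.QueryValueEx(key, "Bind")
    return list(value)


if __name__ == "__main__" and sys.platform == "win32":
    # Print the current order; compare against the GUID from Figure 5.
    for position, entry in enumerate(read_tcpip_bind()):
        print(position, entry)
```

A position of 0 means the adapter is still at the top of the IPv4 binding order. Anything greater means the move took effect.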

That’s about it. Thanks for your time and, as always, we hope the information here has been useful to you.

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support

↧

Help us, help you.

June 29, 2010, 10:41 am

Catchy title, huh. Perhaps not, but it is really what we want you to do. This will be a pretty short blog to get out some information that is important for you to know as it may help resolve a Hyper-V issue quickly, or, better yet, prevent one from happening at all. Inside Microsoft, we have what we call Supportability Program Managers (SPM). They help drive product quality by looking at the types of issues that come through our Customer Support organization. They also look at issues being reported in technology forum posts. They track trends so we can improve the product. In a conversation I had recently with the Hyper-V SPM, I was made aware of a number of issues that were resolved last quarter by simply installing a hotfix. So, here I am. Help us, help you by spending some time checking out these two online resources:

Hyper-V Update List for Windows Server 2008: http://technet.microsoft.com/en-us/library/dd430893(WS.10).aspx

Hyper-V Update List for Windows Server 2008 R2: http://technet.microsoft.com/en-us/library/ff394763(WS.10).aspx

While not updated on a daily basis, these resources should be the first stop when you run into an issue with Hyper-V. We cannot make every fix for the operating system and its components available via Windows Update. Some may require downloading using a link provided in a KB article.

As always, we hope this has been informative for you.

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support

↧

Working with File Shares in Windows Server 2008 (R2) Failover Clusters

August 19, 2010, 10:47 am

I know what you are thinking, “How hard can it be to work with cluster file shares?” I would be willing to bet a lot of you have been working with File Server clusters since the NT 4.0 days. If you are still working with them today in Windows Server 2008 R2, you know things have changed. In this blog, I hope to give you some insight into a piece of functionality both within Failover Cluster and Explorer that may alter the way you work with file shares in your organization. It may even help finally solve a mystery that has been plaguing some of you for a while now.

I will be working with a 2-Node Windows Server 2008 R2 Failover Cluster (Figure 1).

Figure 1

In the cluster, I created a highly available File Server (CONTOSO-FS1). I created a series of folders, using the Explorer interface, on the storage in the File Server resource group (Figure 2).

Figure 2

I use the folders to make shares highly available in the CONTOSO-FS1 File Server resource group.

There are three main ways to provision shares in a Failover Cluster using built-in GUI tools.

1. Failover Cluster Management snap-in

2. Share and Storage Manager snap-in

3. Explorer interface

In the Failover Cluster Management interface, the Add a shared folder function is available in the Actions pane (Figure 3).

Figure 3

In the Share and Storage Management interface, the Provision Share function is available in the Actions pane (Figure 4).

Figure 4

In Explorer, you simply Right-Click on the folder and Share with users (or nobody to stop sharing) (Figure 5).

Figure 5

The end result using any of these three methodologies is shared folders appearing in the Failover Cluster Manager snap-in in the CONTOSO-FS1 resource group (Figure 6).

Figure 6

A similar display can be seen in Share and Storage Manager (Figure 7).

Figure 7

Inspecting the cluster registry hive, we can see the shares defined under the appropriate File Server Resource (FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))) (Figure 8).

Figure 8

At this point you may be thinking, “So what Chuck. This isn’t rocket science. We know all this stuff.” And, you may be right. Setting up the shares is the easy part, and we provide you with several methods with which to accomplish this, but what happens when you no longer want to share ‘stuff’ anymore? This is where it could get a little interesting.

If you do not want to share a folder anymore, there are three correct ways to do this.

Option 1: In the Failover Cluster Management interface, Right-Click on the shared folder and select Stop Sharing (Figure 9).

Figure 9

Option 2: In the Share and Storage Manager interface, Right-Click on the share and select Stop Sharing (Figure 10).

Figure 10

Option 3: In Windows Explorer, Right-Click on the folder and select Share with Nobody (Figure 11).

Figure 11

The unexpected behavior occurs in the Explorer interface if instead of choosing to stop sharing by executing the process in Figure 11, the user chooses to Delete the folder (Figure 12). There could be unintended consequences for that action.

Figure 12

In Explorer, when the folder is selected for deletion, a pop-up Confirmation window is displayed. An example of one is shown in Figure 13.

Figure 13

If Yes is selected, the folder is deleted. In the Failover Cluster Management interface, however, the shared folder that was just deleted in Explorer is still displayed and appears to be Online (Figure 14).

Figure 14

Even the cluster registry hive will show the share present under the File Server resource (Figure 15).

Figure 15

Note: In previous versions of clustering, the cluster service maintained cluster file share information in the registry key HKLM\System\CurrentControlSet\Services\LanmanServer\Shares.

Here is the punch line – the next time the File Server Resource is cycled Offline and then back Online again (like during a Failover of the resource group to another node in the cluster), an Error (Event ID 1588) will be registered in the System Event Log (Figure 16). The error indicates that the share cannot be found and therefore cannot be brought Online by the File Server resource.

Figure 16

The cluster log reports a problem as well but it is only a Warning (Figure 17).

00000944.00000688::2010/08/07-18:05:31.183 WARN [RES] File Server <FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))>: Failed in NetShareGetInfo(CONTOSO-FS1, Pictures), status 2310. Tolerating...

00000944.00000b04::2010/08/07-18:06:31.185 WARN [RES] File Server <FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))>: Failed in NetShareGetInfo(CONTOSO-FS1, Pictures), status 2310. Tolerating...

00000944.00000590::2010/08/07-18:07:31.190 WARN [RES] File Server <FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))>: Failed in NetShareGetInfo(CONTOSO-FS1, Pictures), status 2310. Tolerating...

00000944.00000830::2010/08/07-18:08:31.194 WARN [RES] File Server <FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))>: Failed in NetShareGetInfo(CONTOSO-FS1, Pictures), status 2310. Tolerating...

00000944.00000b48::2010/08/07-18:09:31.197 WARN [RES] File Server <FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))>: Failed in NetShareGetInfo(CONTOSO-FS1, Pictures), status 2310. Tolerating...

Figure 17
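Because the same warning repeats every minute, it can be handy to summarize a cluster log by share. Here is a minimal Python sketch that counts NetShareGetInfo failures per share and status. The pattern is based only on the lines shown in Figure 17, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern assumed from the Figure 17 excerpt.
SHARE_FAIL_RE = re.compile(
    r"Failed in NetShareGetInfo\((?P<server>[^,]+),\s*(?P<share>[^)]+)\),"
    r"\s*status (?P<status>\d+)"
)


def count_share_failures(lines):
    """Count NetShareGetInfo failures keyed by (share name, status code)."""
    counts = Counter()
    for line in lines:
        match = SHARE_FAIL_RE.search(line)
        if match:
            counts[(match.group("share"), int(match.group("status")))] += 1
    return counts


# Two of the lines from Figure 17.
sample = [
    "00000944.00000688::2010/08/07-18:05:31.183 WARN [RES] File Server "
    "<FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))>: Failed in "
    "NetShareGetInfo(CONTOSO-FS1, Pictures), status 2310. Tolerating...",
    "00000944.00000b04::2010/08/07-18:06:31.185 WARN [RES] File Server "
    "<FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))>: Failed in "
    "NetShareGetInfo(CONTOSO-FS1, Pictures), status 2310. Tolerating...",
]
print(count_share_failures(sample))
```

Each distinct share name in the result is a candidate for an orphaned entry in the cluster registry hive.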

Decoding Status 2310 (Figure 18)

Figure 18
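For reference, status 2310 is a network management (NERR) error code. Those codes start at a base of 2100, so 2310 is base + 210, which is NERR_NetNameNotFound. The Python sketch below spells that out, along with the plain Win32 error 2 that shows up in the cluster log later in this blog. The lookup table is a hand-built illustration and not an exhaustive decoder.

```python
NERR_BASE = 2100  # starting value for network management error codes

# Small hand-built table covering the codes seen in this blog.
KNOWN_CODES = {
    2: ("ERROR_FILE_NOT_FOUND", "The system cannot find the file specified."),
    NERR_BASE + 210: ("NERR_NetNameNotFound", "This shared resource does not exist."),
}


def decode_status(code: int) -> str:
    """Return 'NAME: message' for a known code, or a placeholder otherwise."""
    name, message = KNOWN_CODES.get(code, ("UNKNOWN", "not in this table"))
    return f"{name}: {message}"


print(decode_status(2310))  # NERR_NetNameNotFound: This shared resource does not exist.
```

In other words, the File Server resource asked for a share that no longer exists, which matches what happened when the folder was deleted in Explorer.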

These errors in the System Event Log do not prevent the File Server resource from coming Online and bringing all the other valid shared folders Online (except if it were the last shared folder associated with the File Server resource. See the ‘bonus material’ at the end of the blog). However, I think you can quickly see that the process of deleting shared folders instead of just stopping them from being shared can, over time, accumulate orphaned entries in the cluster registry hive and the Event ID 1588 Error messages will continue to be registered for each of the ‘orphaned’ shares.

One way this behavior manifests itself is if a shared folder is created in Failover Cluster Manager or Share and Storage Manager, and is then deleted in Explorer. The Event ID 1588 is registered because the cluster registry hive is not ‘cleaned’ up properly. If the folder is shared in Explorer and then subsequently deleted in Explorer, a different pop-up Warning is displayed (Figure 19).

Figure 19

If folders are not deleted but instead are just stopped from being shared, then the cluster is cleaned up properly and the error should not be registered. If the pop-up in Figure 19 is displayed (as opposed to the pop-up shown in Figure 13), then the share will be properly removed from the Failover Cluster and the cluster registry hive will be properly cleaned up.
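If you suspect orphaned shares have piled up, the check itself is simple: compare the share definitions against the folders that actually exist. The Python sketch below shows the idea with a temporary directory standing in for the clustered disk. The share-to-path mapping is supplied by hand here. Reading it out of the cluster registry hive is not shown, and the function name is hypothetical.

```python
import os
import tempfile


def find_orphaned_shares(shares):
    """Return the names of shares whose backing folder no longer exists."""
    return sorted(name for name, path in shares.items() if not os.path.isdir(path))


# A temporary directory stands in for the clustered disk in this example.
with tempfile.TemporaryDirectory() as root:
    documents = os.path.join(root, "Documents")
    os.mkdir(documents)
    shares = {
        "Documents": documents,                    # folder exists
        "Pictures": os.path.join(root, "Pictures"),  # folder was "deleted"
    }
    orphaned = find_orphaned_shares(shares)

print(orphaned)  # ['Pictures']
```

Any share reported this way should be removed with one of the Stop Sharing options described earlier rather than left behind.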

Another scenario where we could see an Event ID 1588 registered, but not be the result of the cluster registry hive not being cleaned up properly, would be where the System account had been removed from the default security setting for a folder that was shared in a Failover Cluster.

Bonus Material:

What happens if the final shared folder that is associated with a File Server Resource is deleted? At the first LooksAlive\IsAlive check, the File Server resource will fail. A failover will be initiated, but in the end, the File Server Resource will remain in a Failed state. An Event ID 1587 (Figure 20) could be registered along with the customary Event ID 1069 reporting a cluster resource failure.

Figure 20

The cluster log entry will be different from the previous entry (Figure 17) as shown in the highlighted section below (Figure 21). This time it is not a Warning but an Error ([ERR]) that is seen in the cluster log.

00000720.00000a70::2010/08/10-22:25:13.616 INFO [RES] File Server <FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))>: Shares 'are being scoped to virtual name CONTOSO-FS1

00000720.00000a70::2010/08/10-22:25:13.616 DBG [RHS] Resource FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk)) called SetResourceStatus: checkpoint 2. Old state OnlinePending, new state OnlinePending

00000720.00000a70::2010/08/10-22:25:13.616 WARN [RES] File Server <FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))>: Failed to open path e:\Documents. Error: 2. Maybe a reparse point...

00000720.00000a70::2010/08/10-22:25:13.616 ERR [RES] File Server <FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))>: Failed to open path e:\Documents with reparse flag. Error: 2.

00000720.00000a70::2010/08/10-22:25:13.616 ERR [RES] File Server <FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk))>: Failed to online a single share among 1 shares.

00000720.00000a70::2010/08/10-22:25:13.616 DBG [RHS] Resource FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk)) called SetResourceStatus: checkpoint 2. Old state OnlinePending, new state Failed

00000720.00000a70::2010/08/10-22:25:13.616 ERR [RHS] Online for resource FileServer-(CONTOSO-FS1)(Contoso-FS1 (Disk)) failed.

Bonus Material [Resolution]: If you have intentionally deleted the last shared folder via Explorer and found yourself in the above described state, you can safely delete the File Server resource. If you need to share out folders on that same drive, the first time you do so, the File Server resource will automatically recreate.

I hope this information has been helpful and perhaps solved a few mysteries out there.

Thanks for your attention and come back.

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support

↧

Using Multiple Client Access Points (CAP) in a Windows Server 2008 (R2) Failover Cluster

August 24, 2010, 4:17 am

Quite a while back I wrote a blog on new functionality in Windows Server 2008 Failover Clusters called ‘file share scoping’ (http://blogs.technet.com/b/askcore/archive/2009/01/09/file-share-scoping-in-windows-server-2008-failover-clusters.aspx). I was informed recently that our Networking Support Team refers to this blog frequently when working with customers who are migrating to Windows Server 2008 Failover Clusters and discover that CNAME (Canonical Name) records in DNS, which had been in place to support their Windows Server 2003 File Server clusters, no longer work with Windows Server 2008 Failover Clusters. Users keep asking if there is a way to disable this functionality or if it can be changed by adding a registry key or something. At this time, there is no way to disable this behavior, and our Product Team has been made aware of the feedback we have been receiving on this. No official plans have been announced with respect to making any changes in future releases of the Operating System.

While we wait and see what the future holds, I have been asked to write a short blog on how users can better work within the constraints of this functionality. In a File Server Resource Group you typically have a Client Access Point (CAP), a File Server Resource, a Physical Disk resource and some Shared Folders (Figure 1).

Figure 1

Suppose, in a Windows Server 2003 cluster environment, there were several CNAME records created in DNS that pointed to the same File Server Cluster so users from various organizations within a company could access their data files. For example, suppose we had CNAME records for OPS-FS1, Academics-FS1 and Executive-FS1. After completing a migration to a Windows Server 2008 R2 File Server cluster, these CNAME records no longer work and end users can no longer access their data. How can we fix that?

To remedy the situation, create additional CAPs in the File Server Resource group that contains the shared folders that contain the data the users need to access. To do this will require stepping outside of the normal wizard-based process that was used to create the original highly available File Server resource group and instead use the procedures described in KB 947050.

Start by selecting the File Server resource group and in the Right-hand Actions pane select Add a resource (Figure 2).

Figure 2

From the list of available resources, select Client Access Point (Figure 3).

Figure 3

Provide the requested information and complete the wizard. Do this for all required Client Access Points. When completed, bring all the CAPs Online. Here is my result (Figure 4).

Figure 4

At this point, decide which shared folders need to be available to users when each Client Access Point connection is made. Then, create the shared folders in the correct context. Figure 5 shows the selections available when executing the Add shared folder action in the Actions pane.

Figure 5

As an example, in my 2-Node cluster, all folders shown in Figure 1 were shared in the context of CONTOSO-FS1. After adding the additional Client Access Points that were needed, a decision was made that the Academics share was needed in the Academics-FS1 context, the Executive and Archive folders were needed in the Executive-FS1 context and finally the Operations folder was needed in the OPS-FS1 context. When sharing folders in multiple contexts, the display can start getting a little cluttered (Figure 6).
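One way to picture file share scoping is as a lookup table keyed by network name. The Python sketch below models the layout from my example. It assumes the four folders named above are the full set shared under CONTOSO-FS1, which is a simplification for illustration. This is a model of the behavior, not how the cluster stores the data.

```python
# Shares visible through each Client Access Point (network name).
# Assumption: the four folders named in the text are the complete set.
SCOPED_SHARES = {
    "CONTOSO-FS1": {"Academics", "Executive", "Archive", "Operations"},
    "ACADEMICS-FS1": {"Academics"},
    "EXECUTIVE-FS1": {"Executive", "Archive"},
    "OPS-FS1": {"Operations"},
}


def can_connect(network_name, share):
    """True if `share` is scoped to the given network name."""
    return share in SCOPED_SHARES.get(network_name.upper(), set())


print(can_connect("OPS-FS1", "Operations"))    # True
print(can_connect("OPS-FS1", "Executive"))     # False
print(can_connect("Some-Alias", "Operations")) # False
```

The last call is the CNAME situation in a nutshell: a name that is not a CAP in the cluster has no shares scoped to it, so the connection fails even though the alias resolves to the same server.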

Figure 6

When all File Server resources are Online, all shared folders associated with those resources are displayed. If multiple File Server resources are associated with the same shared folder, multiple entries are displayed (Figure 6). This is in addition to the administrative share for the associated physical disk resource.

To help clarify some of the confusion, modify the Description on the Sharing tab for the Property page of the shared folder to reflect its associated File Server resource (Figure 7).

Figure 7

This provides some organization to what can be a cluttered display (Figure 8).

Figure 8

Additional administrative overhead is incurred here as well because multiple Access Control List (ACL) entries must be maintained on the same set of folders. Depending on the tools used to migrate the data to a Windows Server 2008 Failover Cluster, that information could already be present on the storage and not be an issue.

I hope this helps provide a solution for your organization. See you next time.

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support

↧

Microsoft Professional Advisory Services

September 14, 2010, 5:10 am

I am sure many of you are aware that Microsoft provides several options for our customers in terms of support services. The Support website provides information about our support offerings. We have Consumer support, Professional support and various levels of Premier Support. There are even several Self Support options available. These solutions are primarily focused on break-fix scenarios. What if you do not have something that is broken that needs fixing but instead would like some help implementing one of Microsoft’s technologies? We can help with that as well. This kind of help can be provided via Advisory type services.

If you are a small company, or even just an individual, and usually obtain support on a pay-per-incident basis, it is difficult to obtain advisory services. This is where Pro Advisory services can assist. Microsoft now offers Professional Advisory Services that are paid for on an hourly basis without having to have a Premier contract or having to work through Microsoft Consulting Services. The service is still in pilot, and only covers specific scenarios, but more are being added all the time. Each group has its own supported scenarios, and there are too many to list here. Here is a list of what the CORE Team has to offer at this point:

2276908 Windows Server 2008 R2 - RDWeb Access and RemoteApp Configuration (http://support.microsoft.com/kb/2276908)

2276905 Windows Server 2008 R2 - Microsoft VDI Configuration (http://support.microsoft.com/kb/2276905)

2276880 Windows 2008 Session Broker Load Balancing (http://support.microsoft.com/kb/2276880)

2276874 Windows Server 2008 R2 RD Web Single Sign On (http://support.microsoft.com/kb/2276874)

2275811 TS Web Access And RemoteApp Configuration (http://support.microsoft.com/kb/2275811)

2275629 Windows Server 2003 Server Print Queue Migration (http://support.microsoft.com/kb/2275629)

2253278 Windows Server 2008 R2 RD Connection Broker (http://support.microsoft.com/kb/2253278)

2253250 Windows Server 2008 R2 Hyper-V Installation (http://support.microsoft.com/kb/2253250)

982909 Windows Server 2003 Server Cluster Disaster Recovery Planning (http://support.microsoft.com/kb/982909)

982908 Windows Server 2008 or Windows Server 2008 R2 Failover Cluster Disaster Recovery Planning (http://support.microsoft.com/kb/982908)

982872 Windows Server 2008 R2 RD Web Single Sign On (http://support.microsoft.com/kb/982872)

980643 Windows 2008 R2 Cluster Installation with Hyper-V (http://support.microsoft.com/kb/980643)

980459 Windows 2008 R2 Cluster Installation (http://support.microsoft.com/kb/980459)

979130 Windows 7 Deployment Activation Guidance (http://support.microsoft.com/kb/979130)

979129 Demonstration of Microsoft Deployment Toolkit With Q&A (http://support.microsoft.com/kb/979129)

978867 Windows 7 Deployment Question and Answer (http://support.microsoft.com/kb/978867)

974386 Platform Application Compatibility (http://support.microsoft.com/kb/974386)

What can you expect from Microsoft Professional Advisory services? The process is pretty straightforward:

1. Expect to be contacted by a Support Engineer who specializes in the technology area you are interested in.

2. The Support Engineer will review the Professional Advisory Services offering with you as it applies to the scenario you selected to ensure you both understand the scope of the work involved before an official support incident is created and work can begin.

3. The Support Engineer will carefully track the time involved in providing the solution so you will not be overcharged.

4. Once the work has been completed, and both you and the Support Engineer agree the solution has been provided, a summary will be provided and the case will be closed.

If you are interested in seeing other technology offerings that are available, navigate to http://support.microsoft.com and search on the keyword ‘kbProAdvisory’ and you will be able to browse the current offerings.

Hope this helps.

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support

↧

Troubleshooting ‘Redirected Access’ on a Cluster Shared Volume (CSV)

December 16, 2010, 5:26 am

Cluster Shared Volumes (CSV) is a new feature implemented in Windows Server 2008 R2 to assist with new scale-up\out scenarios. CSV provides a scalable, fault-tolerant solution for clustered applications that require NTFS file system access from anywhere in the cluster. In Windows Server 2008 R2, CSV is only supported for use by the Hyper-V role.

The purpose of this blog is to provide some basic troubleshooting steps that can be executed to address CSV volumes that show a Redirected Access status in Failover Cluster Manager. It is not my intention to cover the Cluster Shared Volumes feature. For more information on Cluster Shared Volumes consult TechNet.

Before diving into some troubleshooting techniques that can be used to resolve Redirected Access issues on Cluster Shared Volumes, let’s list some of the basic requirements for CSV as this may help resolve other issues not specifically related to Redirected Access.

Disks that will be used in the CSV namespace must be MBR or GPT with an NTFS partition.
The drive letter for the system disk must be the same on all nodes in the cluster.
The NTLM protocol must be enabled on all nodes in the cluster.
Only the in-box cluster “Physical Disk” resource type can be added to the CSV namespace. No third party storage resource types are supported.
Pass-through disk configurations cannot be used in the CSV namespace.
All networks enabled for cluster communications must have Client for Microsoft Networks and File and Printer Sharing for Microsoft Networks protocols enabled.
All nodes in the cluster must share the same IP subnets between them as CSV network traffic cannot be routed. For multi-site clusters, this means stretched VLANs must be used.
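The last requirement is easy to check on paper before building a cluster. The Python sketch below takes each node's interface addresses and reports which subnets are present on every node. The node names are borrowed from my lab, and the addresses and function name are hypothetical.

```python
import ipaddress


def shared_subnets(node_interfaces):
    """Return the set of subnets that appear on every node."""
    per_node = [
        {ipaddress.ip_interface(addr).network for addr in addrs}
        for addrs in node_interfaces.values()
    ]
    return set.intersection(*per_node) if per_node else set()


# Hypothetical addressing: the second network differs between the nodes.
nodes = {
    "R2-NODE1": ["192.168.0.11/24", "10.0.0.11/24"],
    "R2-NODE2": ["192.168.0.12/24", "10.0.1.12/24"],
}
common = shared_subnets(nodes)
print(common)
```

In this example only 192.168.0.0/24 is common to both nodes. The 10.0.x.x networks sit on different subnets, so CSV traffic could not use them without routing, which is not supported.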

Let’s start off by looking at the CSV namespace in a Failover Cluster when all things appear to be ‘normal.’ In Figure 1, all CSV volumes show Online in the Failover Cluster Management interface.

Figure 1

Looking at a CSV volume from the perspective of a highly available Virtual Machine group (Figure 2), the Virtual Machine is Online on one node of the cluster (R2-NODE1), while the CSV volume hosting the Virtual Machine files is Online on another node (R2-NODE2) thus demonstrating how CSV completely disassociates the Virtual Machine resources (Virtual Machine; Virtual Machine Configuration) from the storage hosting them.

Figure 2

When all things are working normally (no backups in progress, etc…) in a Failover Cluster with respect to CSV, the vast majority of all storage I/O is Direct I/O, meaning each node hosting a virtual machine(s) is writing directly (via Fibre Channel, iSCSI, or SAS connectivity) to the CSV volume supporting the files associated with the virtual machine(s). A CSV volume showing a Redirected Access status indicates that all I/O to that volume, from the perspective of a particular node in the cluster, is being redirected over the CSV network to another node in the cluster which still has direct access to the storage supporting the CSV volume. This is, for all intents and purposes, a ‘recovery’ mode. This functionality prevents the loss of all connectivity to storage. Instead, all storage related I/O is redirected over the CSV network. This is very powerful technology as it prevents a total loss of connectivity, thereby allowing virtual machine workloads to continue functioning. This provides the cluster administrator an opportunity to evaluate the situation and live migrate workloads to other nodes in the cluster not experiencing connectivity issues. All this happens behind the scenes without users knowing what is going on. The end result may be slower performance (depending on the speed of the network interconnect, for example, 10 Gb vs. 1 Gb) since we are no longer using direct, local, block level access to storage. We are, instead, using remote file system access via the network using SMB.

There are basically four reasons a CSV volume may be in a Redirected Access mode.

1. The user intentionally places the CSV volume in Redirected Access mode.
2. There is a storage connectivity failure for a node, in which case all I/O is redirected over a cluster network designated for CSV traffic to another node.
3. A backup of a CSV volume is in progress or has failed.
4. An incompatible filter driver is installed on the node.

Let’s take a look at a CSV volume in Redirected Access mode (Figure 3).

Figure 3

When a CSV volume is placed in Redirected Access mode, a Warning message (Event ID 5136) is registered in the System Event log (Figure 4).

Figure 4

For additional information on event messages that pertain specifically to Cluster Shared Volumes please consult TechNet.

Let’s look at each one of the four reasons I mentioned and propose some troubleshooting steps that can help resolve the issue.

1. User intentionally places a CSV volume in Redirected Access mode: Users can manually place a CSV volume in Redirected Access mode by right-clicking the CSV volume, selecting More Actions, and then selecting Turn on redirected access for this Cluster shared volume (Figure 5).

Figure 5

Therefore, the first troubleshooting step should be to try turning off Redirected Access mode in the Failover Cluster Management interface.

2. There is a storage connectivity issue: When a node loses connectivity to attached storage that is supporting a CSV volume, the cluster implements a recovery mode by redirecting storage I/O to another node in the cluster over a network that CSV can use. The status of the cluster Physical Disk resource associated with the CSV volume is Redirected Access, and all storage I/O for the virtual machines hosted on that volume is redirected over the network to another node in the cluster that has direct access to the CSV volume. This is by far the number one reason CSV volumes are placed in Redirected Access mode. Troubleshoot this as you would any other loss of storage connectivity on a server. Involve the storage vendor as needed. Since this is a cluster, the cluster validation process can also be used as part of the troubleshooting process to test storage connectivity.

Look for the following event ID in the system event log.

Log Name: System
Source: Microsoft-Windows-FailoverClustering
Date: 10/8/2010 6:16:39 PM
Event ID: 5121
Task Category: Cluster Shared Volume
Level: Error
Keywords:
User: SYSTEM
Computer: Node1.cluster.com
Description: Cluster Shared Volume 'DATA-LUN1' ('DATA-LUN1') is no longer directly accessible from this cluster node. I/O access will be redirected to the storage device over the network through the node that owns the volume. This may result in degraded performance. If redirected access is turned on for this volume, please turn it off. If redirected access is turned off, please troubleshoot this node's connectivity to the storage device and I/O will resume to a healthy state once connectivity to the storage device is reestablished.

3. A backup of a CSV volume fails: When a backup is initiated on a CSV volume, the volume is placed in Redirected Access mode. The type of backup being executed determines how long a CSV volume stays in redirected mode. If a software backup is being executed, the CSV volume remains in redirected mode until the backup completes. If hardware snapshots are being used as part of the backup process, the amount of time a CSV volume stays in redirected mode will be very short. For a backup scenario, the CSV volume status is slightly modified. The status actually shows as Backup in progress, Redirected Access (Figure 6) to allow you to better understand why the volume was placed in Redirected Access mode. When the backup application completes the backup of the volume, the cluster must be properly notified so the volume can be brought out of redirected mode.

Figure 6

A couple of things can happen here. Before proceeding down this road, ensure a backup is really not in progress. The first possibility is that the backup completed but the application did not properly notify the cluster, so the volume was never brought out of redirected mode. The proper call for the backup application to make is ClusterClearBackupStateForSharedVolume, which is documented on MSDN. If that is the case, you should be able to clear the Backup in progress, Redirected Access status by simulating a failure on the CSV volume using the cluster PowerShell cmdlet Test-ClusterResourceFailure. Using the CSV volume shown in Figure 6, an example would be:

Test-ClusterResourceFailure "35 GB Disk"

If this clears the redirected status, then the backup application vendor needs to be notified so they can fix their application.

The second consideration concerns a backup that fails without the application properly notifying the cluster of the failure, so the cluster still thinks the backup is in progress. If a backup fails, and the failure occurs before a snapshot of the volume being backed up is created, then the status of the CSV volume should reset by itself after a 30-minute delay. If, however, a software snapshot was actually created during the backup (assuming the application creates software snapshots as part of the backup process), then we need to use a slightly different approach.

To determine if any volume shadow copies exist on a CSV volume, use the vssadmin command line utility and run vssadmin list shadows (Figure 7).

Figure 7

Figure 7 shows there is a shadow copy that exists on the CSV volume that is in Redirected Access mode. Use the vssadmin utility to delete the shadow copy (Figure 8). Once that completes, the CSV volume should come Online normally. If not, change the Coordinator node by moving the volume to another node in the cluster and verify the volume comes Online.

Figure 8

4. An incompatible filter driver is installed in the cluster: The last item in the list has to do with filter drivers introduced by third-party applications that may be running on a cluster node and are incompatible with CSV. When these filter drivers are detected by the cluster, the CSV volume is placed in redirected mode to help prevent potential data corruption on the CSV volume. When this occurs, an Event ID 5125 Warning message is registered in the System Event Log. Here is a sample message:

17416 06/23/2010 04:18:12 AM Warning <node_name> 5125 Microsoft-Windows-FailoverClusterin Cluster Shared Vol NT AUTHORITY\SYSTEM Cluster Shared Volume 'Volume2' ('Cluster Disk 6') has identified one or more active filter drivers on this device stack that could interfere with CSV operations. I/O access will be redirected to the storage device over the network through another Cluster node. This may result in degraded performance. Please contact the filter driver vendor to verify interoperability with Cluster Shared Volumes. Active filter drivers found: <filter_driver_1>,<filter_driver_2>,<filter_driver_3>

The cluster log will record warning messages similar to these:

7c8:088.06/10[06:26:07.394](000000) WARN [DCM] filter <filter_name> found at unsafe altitude <altitude_numeric>
7c8:088.06/10[06:26:07.394](000000) WARN [DCM] filter <filter_name> found at unsafe altitude <altitude_numeric>
7c8:088.06/10[06:26:07.394](000000) WARN [DCM] filter <filter_name> found at unsafe altitude <altitude_numeric>

Event ID 5125 is specific to a file system filter driver. If, instead, an incompatible volume filter driver were detected, an Event ID 5126 would be registered. For more information on the difference between file and volume filter drivers, consult MSDN.
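As a quick triage aid, the event IDs discussed in this post can be collected into a small lookup table. The following is only a minimal sketch in Python: the event IDs and their meanings come from the text above, while the dictionary and function names are hypothetical and not part of any Microsoft tool.

```python
# Event IDs and their meanings are taken from this post.
# The names CSV_REDIRECT_EVENTS and explain_csv_event are illustrative only.
CSV_REDIRECT_EVENTS = {
    5136: "Volume was placed in Redirected Access mode (Warning)",
    5121: "Node lost direct access to storage; I/O is redirected over the network",
    5125: "Incompatible file system filter driver detected",
    5126: "Incompatible volume filter driver detected",
}


def explain_csv_event(event_id: int) -> str:
    """Return the cause associated with a CSV redirection event ID."""
    return CSV_REDIRECT_EVENTS.get(
        event_id, "Not a CSV redirection event covered in this post"
    )


for event_id in (5121, 5125, 5126):
    print(event_id, "->", explain_csv_event(event_id))
```

A table like this is only a starting point; the event description itself remains the authoritative source for the next troubleshooting step.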

Note: Specific filter driver names and altitudes have been intentionally left out. The information can be decoded by downloading the ‘File System Minifilter Allocated Altitudes’ spreadsheet posted on the Windows Hardware Developer Central public website.

Additionally, the fltmc.exe command line utility can be run to enumerate filter drivers. An example is shown in Figure 9.

Figure 9

Once the third-party filter driver has been identified, the application should be removed and/or the vendor contacted to report the problem. Problems involving third-party filter drivers are rarely seen but still need to be considered.

UPDATE 4/9: A Hotfix has been released to address an issue where filter drivers can cause the 'redirected access' issue:

FIXED: Cluster Shared Volumes (CSV) in redirected access mode after installing McAfee VSE 8.7 Patch 5 or 8.8 Patch 1

Hopefully, I have provided information here that will get you started down the right path to resolving issues that involve CSV volumes running in a Redirected Access mode.

Thanks!

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support

CNO Blog Series: Increasing Awareness around the Cluster Name Object (CNO)

September 25, 2012, 4:38 am

I am starting a 'CNO Blog Series', which will consist of blogs written by the CORE team cluster engineers and will focus primarily on the Cluster Name Object (CNO). The CNO is the computer object in Active Directory associated with the Cluster Name; it is used as a common identity in the cluster. If you have been working with Failover Clusters since Windows Server 2008, you should be very familiar with the CNO and the role it plays with respect to the cluster security model. Looking over the CORE Team blog site, there have already been some blogs written that focus primarily on the CNO:

Recovering a Deleted Cluster Name Object (CNO) in a Windows Server 2008 Failover Cluster (April 2009)

Recovering a Deleted Cluster Name Object (CNO) in a Windows Server 2008 Failover Cluster, Part 2 (May 2011)

Why is the CNO in a Failed State? (March 2012)

Rights needed for user account when pre-creating a Cluster Name Object (CNO) on Windows Server 2008 R2 Failover Cluster (June 2011)

With the release of Windows Server 2012, several enhancements have been added to the Failover Clustering feature that provide better integration with Active Directory. The Product Team blog (http://blogs.msdn.com/b/clustering/) has a post that discusses creating Windows Server 2012 Failover Clusters in more restrictive Active Directory environments. That post discusses some of the changes made in the product that directly involve the CNO.

On to today's blog - increasing awareness around the Cluster Name Object (CNO)….

Beginning with Windows Server 2008, when a cluster is created, the computer object associated with the CNO is placed, by default, in the Computers container unless it has been pre-staged in some other container. Windows Server 2012 Failover Clusters give cluster administrators more control over the computer object representing the CNO. The Product Group's blog mentioned earlier details new functionality in Windows Server 2012, which includes:

Using Distinguished Names when creating the cluster to manually control CNO placement

New default behavior where a CNO is placed in the same container as the computer objects for the nodes in the cluster

The Virtual Computer Objects (VCOs) created by a CNO are placed in the same container as the CNO

Having more control over cluster computer object(s) placement, while desirable, requires a bit more 'awareness' on the part of a cluster administrator. This 'awareness' involves knowing that, by default, the CNO when placed in the non-default location may not have the rights it needs for other cluster operations such as creating other cluster computer objects (VCOs). The first indication of a problem may be when a Role is made highly available in the cluster and that Role requires a Client Access Point (CAP). After the Role creation process completes, and the Network Name associated with the CAP attempts to come Online, it fails with an Event ID 1194.

Log Name:	System
Source:	Microsoft-Windows-FailoverClustering
Event ID:	1194
Level:	Error

This event reports that a computer object associated with a cluster Network Name resource could not be created. The error message itself provides good troubleshooting guidance to help resolve the issue.

In this case, it is simply a matter of modifying the security on the AD container so the CNO is allowed to Create Computer Objects. Once this setting is in place, the Network Name comes online without issue. Additionally, the CNO is also given another critical right, the right to change the password for any VCO it creates.

If Active Directory is properly configured (more on that in a bit), the VCO, along with the CNO, can also be protected from accidental deletion.

Protecting Cluster Computer Objects

A call often handled by our support engineers involves the accidental, or semi-intentional, deletion of the computer objects associated with Failover Clusters. There are a variety of reasons this happens, but we will not go into those here. Suffice it to say, things function more smoothly if the computer objects associated with a cluster are protected.

I mentioned new functionality in Windows Server 2012 Failover Clusters where cluster objects are automatically placed in targeted Active Directory containers (OUs). Using this methodology also makes it easier to discern which objects are associated with a Failover Cluster. As you can see in this screenshot of a custom OU (Clusters) that I created in my domain, the objects associated with the cluster carry the description of Failover cluster virtual network name account. The cluster nodes, which are located in the same OU, are traditional computer objects, which do not carry this description.

Examining the properties of one of these accounts using the Attribute Editor, one can see it is clearly an attribute (Description field) of the computer object.

Properly protecting cluster computer objects (from accidental deletion) requires Domain Administrator intervention. This can be either a 'proactive' or a 'reactive' intervention. A proactive intervention requires a Domain Administrator to set a Deny ACE (Access Control Entry) for Delete all child objects for the Everyone group on the container where the cluster computer objects will be located.

A reactive intervention occurs after a CNO is placed in the designated container. At this point, the Domain Administrator has a choice. He can either:

1. Set the Deny ACE for Delete all child objects on the container, or

2. Check the Protect object from accidental deletion checkbox on the CNO computer object (which would then set the correct Deny ACE on the container)

Let us step through a scenario from a recent case I worked for one of our customers deploying a new Windows Server 2012 Failover Cluster.

Customer Case Study

In this case, a customer was deploying a 2-Node Windows Server 2012 Hyper-V Failover Cluster dedicated to supporting virtualized workloads. The cluster creation process was completed without issue and the Cluster Core Resources group could move freely between the nodes without any resource failures. The customer had already created four highly available virtual machines, some of which were already in production. The customer wanted to test live migration for the virtual machines. When he attempted to execute a live migration for a virtual machine, it failed immediately on the source cluster node. He attempted a quick migration and that succeeded.

Reviewing the cluster logs obtained from the customer, the live migration error appeared in the cluster log of the source cluster node. The live migration failure was registered with an error code of 1326.

00001274.00001c24::2012/09/18-17:50:16.301 ERR [RES] Virtual Machine <Virtual Machine MRS1SAPPBW31>: Live migration of 'Virtual Machine MRS1SAPPBW31' failed.

00001274.00001c24::2012/09/18-17:50:16.301 ERR [RHS] Resource Virtual Machine MRS1SAPPBW31 has cancelled offline with error code 1326.

00000aa8.00001cf4::2012/09/18-17:50:16.301 INFO [RCM] HandleMonitorReply: OFFLINERESOURCE for 'Virtual Machine MRS1SAPPBW31', gen(0) result 0/1326.

The error code resolved to - 'The user name or password is incorrect'.

Examining the rest of the cluster log indicated the CNO could not log on to the domain controller to obtain necessary tokens. This failure was also causing a failure registering with DNS (customer is using Microsoft dynamic DNS).

00001228.00001a20::2012/09/18-17:43:00.466 WARN [RES] Network Name: [NNLIB] LogonUserEx fails for user HPVCLU03$: 1326 (useSecondaryPassword: 0)

00001228.00001a20::2012/09/18-17:43:00.550 WARN [RES] Network Name: [NNLIB] LogonUserEx fails for user HPVCLU03$: 1326 (useSecondaryPassword: 1)

00001228.00001a20::2012/09/18-17:43:00.550 INFO [RES] Network Name: [NNLIB] Logon failed for user HPVCLU03$ (Error 1326), DC \\<FQDN_of_DC_here>

00001228.00001a20::2012/09/18-17:43:00.550 INFO [RES] Network Name <Cluster Name>: Identity: Obtaining Windows Token for Name: HPVCLU03, SamName: HPVCLU03$, Type: Singleton, Result: 1326, LastDC: \\<FQDN_of_DC_here>

00001228.00001a20::2012/09/18-17:43:00.550 INFO [RES] Network Name <Cluster Name>: Identity: Slow Operation, FinishWithReply: 1326

00001228.00001a20::2012/09/18-17:43:00.550 INFO [RES] Network Name <Cluster Name>: Identity: InternalReplyHandler with event: 1326

00001228.00001a20::2012/09/18-17:43:00.550 INFO [RES] Network Name <Cluster Name>: Identity: End of Slow Operation, state: Error/Idle, prevWorkState: Idle

00001228.00001a8c::2012/09/18-17:43:00.550 WARN [RES] Network Name <Cluster Name>: Identity: Get Token Request, currently doesn't have a token!

00001228.00001a8c::2012/09/18-17:43:00.550 INFO [RES] Network Name: [NN] got sync reply: 0

00001228.00001e0c::2012/09/18-17:43:00.550 ERR [RES] Network Name <Cluster Name>: Dns: Obtaining token threw exception, error 6

00001228.00001e0c::2012/09/18-17:43:00.550 ERR [RES] Network Name <Cluster Name>: Dns: Failed DNS registration with error 6 for Name: HPVCLU03 (Type: Singleton)

Examination of the DNS zone verified there was no A-Record for the cluster name.
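When a cluster log is large, it can help to filter it for the warning and error lines before reading it in detail. The following is a minimal Python sketch that does so for the three sample lines copied from the log excerpt above. The regular expressions and function names are my own assumptions based on the line layout shown in this post (process and thread ID, timestamp, level, message); they are not an official log format specification.

```python
import re
from collections import Counter

# Layout assumed from the excerpts in this post:
# <pid>.<tid>::<yyyy/mm/dd-hh:mm:ss.mmm> <LEVEL> <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<timestamp>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)
LOGON_RE = re.compile(r"LogonUserEx fails for user (?P<account>\S+): (?P<code>\d+)")

# Three lines copied verbatim from the cluster log excerpt above.
SAMPLE = [
    "00001228.00001a20::2012/09/18-17:43:00.466 WARN [RES] Network Name: [NNLIB] LogonUserEx fails for user HPVCLU03$: 1326 (useSecondaryPassword: 0)",
    "00001228.00001a20::2012/09/18-17:43:00.550 WARN [RES] Network Name: [NNLIB] LogonUserEx fails for user HPVCLU03$: 1326 (useSecondaryPassword: 1)",
    "00001228.00001e0c::2012/09/18-17:43:00.550 ERR [RES] Network Name <Cluster Name>: Dns: Failed DNS registration with error 6 for Name: HPVCLU03 (Type: Singleton)",
]


def parse_line(line):
    """Split one cluster log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def problem_lines(lines):
    """Keep only the entries logged at WARN or ERR level."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] in ("WARN", "ERR")]


def logon_failures(lines):
    """Count LogonUserEx failures per (account, error code) pair."""
    counts = Counter()
    for entry in problem_lines(lines):
        match = LOGON_RE.search(entry["message"])
        if match:
            counts[(match["account"], int(match["code"]))] += 1
    return counts


for (account, code), count in logon_failures(SAMPLE).items():
    print(f"{account}: error {code} seen {count} time(s)")
```

For the sample lines, this reports two logon failures for HPVCLU03$ with error 1326, which matches the failed attempts with the primary and secondary passwords seen in the log.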

At this point, we logged into the domain controller the cluster was communicating with and tried to locate the CNO using the Active Directory Users and Computers (ADUC) snap-in. When the computer object was not found in the Computers container, a full search of Active Directory revealed it was located in a nested OU structure four levels deep. It was located with the cluster node computer accounts, which is the expected new behavior beginning with Windows Server 2012 Failover Clusters, as previously described. It was clear to me, however, that the cluster administrator was not aware of this new behavior.

At this point, it appeared to be a case of the CNO account password being out of sync in the domain. I had the customer execute the following process:

Temporarily move the CNO account into the Computers container

Log into one of the cluster nodes with a domain account that had the Reset Password right in the domain

Simulate failures for the cluster Network Name resource until it was in a permanent failed state

Once the resource was in a Failed state, right-click on the resource, choose More Actions and then click Repair

The previous action caused the password for the CNO to be reset in the domain

After executing the procedure, the cluster name came back online, and the customer noticed an automatic registration in DNS. He then executed a live migration for a virtual machine and it worked flawlessly. He also checked and verified the dNSHostName attribute on the computer object was now correctly populated. Issue resolved. Case closed.

Moral of the story - Not only do cluster administrators need to become familiar with the new functionality in Windows Server 2012 Failover Clusters (and there are many), but they should also realize that the CNO can have impact in areas that are not necessarily obvious.

Thanks, and come back again soon.

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support
High Availability\Virtualization Team

Logon Failures Involving Virtual Machines in Windows Server 2012

October 31, 2012, 8:19 am

Welcome back to the CORE Team blog. The General Availability date for Windows 8 and Windows Server 2012 has come and gone, and we here on the CORE Team expect more of you will be diving in and taking part in all of the excitement around these new products. To make sure you have a great experience, we endeavor, whenever possible, to make you aware of situations that may temporarily 'inconvenience' you. That is the purpose of this blog.

We have recently encountered several instances where a specific group policy configuration can affect the proper functioning of a Windows Server 2012 Hyper-V virtualization solution. This is due primarily to changes made in Windows Server 2012 Hyper-V functionality that will also be explained here.

The two scenarios we have seen thus far are:

Virtual machines failing to start
Virtual machines failing to live migrate

In both of these scenarios, the problem is the result of a logon failure. Here is an example of a pop-up error message you may see when a virtual machine fails to start.

The critical piece of the error message is, "Logon failure: the user has not been granted the requested logon type at this computer (0x80070569.)"

In the second scenario where a virtual machine fails to live migrate, an Event ID 21502 error message is registered in the Hyper-V-High-Availability log. The critical piece of the error message is, "Failed to create Planned Virtual Machine at migration destination. Logon failure: the user has not been granted the requested logon type at this computer (0x80070569.)"
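The value 0x80070569 in both messages is an HRESULT that wraps a Win32 error code. As a worked example, the sketch below splits it into its standard fields in Python. The function name is hypothetical; the bit layout (failure bit, 11-bit facility, 16-bit code) is the standard HRESULT layout, and facility 7 is FACILITY_WIN32.

```python
FACILITY_WIN32 = 7  # HRESULTs in this facility wrap a Win32 error code


def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility, and code."""
    hr &= 0xFFFFFFFF
    return {
        "failure": bool(hr >> 31),         # top bit set means failure
        "facility": (hr >> 16) & 0x7FF,    # 11-bit facility field
        "code": hr & 0xFFFF,               # low 16 bits: the underlying error code
    }


info = decode_hresult(0x80070569)
print(info)  # {'failure': True, 'facility': 7, 'code': 1385}
```

Win32 error 1385 (0x569) is ERROR_LOGON_TYPE_NOT_GRANTED, which is the "user has not been granted the requested logon type" text quoted in the messages above.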

Investigation of these events revealed that a custom Group Policy was modifying the user accounts that are allowed to Log on as a Service on each Hyper-V server.

In Windows Server 2012, a special security group, NT VIRTUAL MACHINE\Virtual Machines, is created when the Hyper-V Role is installed. Members of this group require the right to Create Symbolic Links (SeCreateSymbolicLinkPrivilege) and to Log on as a Service (SeServiceLogonRight). The SID associated with the group is S-1-5-83-0. The security group is maintained by the Hyper-V Virtual Machine Management Service (VMMS). To ensure members of the NT VIRTUAL MACHINE\Virtual Machines security group maintain the rights they need, VMMS registers with Group Policy in order to update the local security policy whenever Group Policy is refreshed.

The NT VIRTUAL MACHINE\Virtual Machines group did not exist in previous versions of Hyper-V. As each virtual machine is started on a Hyper-V server, its account (Virtual Machine ID (VM_ID)) is added to the NT VIRTUAL MACHINE\Virtual Machines group and VMMS creates a Virtual Machine Worker Process (vmwp.exe). Examples of these processes are visible in Task Manager:

The VM_ID is the virtual machine account that is used to gain access to its own resources and prevent other virtual machines from gaining access to those same resources. As an example, if I run the following PowerShell command, it is easy to see the rights given to the virtual machine account to one of its resources (a virtual hard disk in this case):

Get-Acl -Path 'E:\Virtual Machines\Contoso-FS1\Virtual Hard Disks\contoso-fs1.vhdx' | FL AccessToString

AccessToString : NT VIRTUAL MACHINE\E57917F3-31C3-456E-B1BA-5E45B4CC7E0C Allow Write, Read, Synchronize

BUILTIN\Administrators Allow FullControl

NT AUTHORITY\SYSTEM Allow FullControl

NT AUTHORITY\Authenticated Users Allow Modify, Synchronize

BUILTIN\Users Allow ReadAndExecute, Synchronize

Since the VMWP is an extension of VMMS, VMMS performs a service logon to create an access token that is used to run the VMWP. In order for this to work, the NT VIRTUAL MACHINE\Virtual Machines security group must be granted the Log on as a Service right. In previous versions of Hyper-V, the VMWP ran in the context of a different account, NETWORK SERVICE, which is a predefined local account.

Windows Server 2008 R2 SP1 Hyper-V Server

To find out more information about the NETWORK SERVICE account, review this MSDN resource (http://msdn.microsoft.com/en-us/library/windows/desktop/ms684272(v=vs.85).aspx).

The error message, previously mentioned, refers to a 'user' not being granted a 'logon type'. That user, again as seen in Task Manager, is the Virtual Machine ID (VM_ID), and the logon type is 'Log on as a Service.'

Now that we understand the new changes, what needs to be done? A detailed Knowledge Base (KB) article was written in cooperation with the Directory Services team that provides additional details.

KB2779204
Starting or Live Migrating Hyper-V virtual machines may fail with error 0x80070569 on Windows Server 2012-based computers
http://support.microsoft.com/kb/2779204

Briefly, one of two things must happen:

1. Hyper-V Administrators need to work with their Domain Administrators to review Group Policies to see if any involve specific user accounts being granted the Log on as a Service right, and, if so, have the policy modified appropriately
2. Create an OU in Active Directory, place all Hyper-V servers in that OU, and block policy inheritance

Note: Option (2) is recommended by the Hyper-V Product Team

Tip: Administrators can temporarily, but quickly, recover from this error by opening an elevated command prompt and running gpupdate /force which forces a group policy refresh

Before we wrap up, I would like to restate one of Microsoft's long-standing 'best practices' with respect to Hyper-V servers: the only Roles or Services that should ever be installed on a Hyper-V server are the Hyper-V Role and those additional Roles or Features that directly support virtualization. The classic example is Hyper-V Failover Clusters, where the Hyper-V Role and the Failover Clustering Feature complement each other by providing highly available virtualized workloads, which are the foundation of Microsoft's Cloud Strategy. If this 'best practice' is followed, no user rights modifications that could impact virtualization services should be needed.

I hope this has been helpful.

Thanks, and come back again soon.

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support
High Availability\Virtualization Team

Just when you thought…..(Part 1)

November 19, 2012, 9:37 am

Just when you thought you had things figured out - in the words of the legendary Bob Dylan, "the times they are a-changin." With the release of Windows Server 2012, Microsoft introduces a load of new features, which, in some cases, translates into doing some of the same things in different ways. Up to now, highly available virtualized workloads meant multi-node Hyper-V Failover Clusters configured with Cluster Shared Volumes (CSV) hosting virtual machines. In Windows Server 2012 Hyper-V, the rules have changed. Now, virtual machine files can be stored on SMB 3.0 file shares hosted in standalone Windows Server 2012 File Servers, or in Windows Server 2012 Scale-Out File Servers.

This multi-part blog will walk through a new scenario, one that we may start seeing more and more as IT Professionals realize they can capitalize on their high-speed networking infrastructure investment while at the same time saving themselves a little money. The scenario involves both Windows Server 2012 Hyper-V Failover Clusters and Windows Server 2012 Scale-Out File Servers.

In this multi-part blog, I will cover the following:

Setting up a Windows Server 2012 Hyper-V Failover Cluster with no shared storage
Setting up a Windows Server 2012 Failover Cluster with the Scale-Out File Services Role
Configuring an SMB Share that supports Application Data with Continuous Availability in the Scale-Out File Server
Deploying virtual machines in the Hyper-V Failover Cluster while using the Scale-Out File Server SMB 3.0 shares to host the virtual machine files

To demonstrate the scenario, I created a 3-Node Windows Server 2012 Hyper-V Failover Cluster with no shared storage and a 2-Node Windows Server 2012 Failover Cluster connected to iSCSI storage to provide the shared storage for the Scale-Out File Server Role.

Create a 3-Node Windows Server 2012 Hyper-V Failover Cluster

First, create the 3-Node Hyper-V Failover Cluster. Since the cluster will not be connected to storage, and it is always a 'best practice' from a Quorum calculation perspective to keep the number of votes in the cluster odd, I chose a 3-Node cluster. I could have just as easily configured a 2-Node cluster and manually modified the Quorum Model to Node and File Share Witness. To support this Quorum Model, the Scale-Out File Server could be configured with a General Purpose file share to support the File Share Witness resource.
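The reasoning behind keeping the vote count odd can be checked with simple arithmetic. The sketch below, in Python, assumes a plain static majority calculation (more than half of the votes must be present) and deliberately ignores any dynamic vote adjustment the cluster may perform; the function name is hypothetical.

```python
def quorum_summary(nodes: int, witness: bool = False) -> dict:
    """Static majority math: each node has one vote, a witness adds one more."""
    votes = nodes + (1 if witness else 0)
    majority = votes // 2 + 1  # more than half of all votes
    return {
        "votes": votes,
        "majority": majority,
        "tolerated_failures": votes - majority,
    }


print(quorum_summary(3))                # {'votes': 3, 'majority': 2, 'tolerated_failures': 1}
print(quorum_summary(2, witness=True))  # {'votes': 3, 'majority': 2, 'tolerated_failures': 1}
print(quorum_summary(2))                # {'votes': 2, 'majority': 2, 'tolerated_failures': 0}
```

Under these assumptions, a 3-Node cluster and a 2-Node cluster with a File Share Witness both have three votes and survive the loss of one vote, while a 2-Node cluster without a witness cannot lose any.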

Recommendation: Since the cluster is not connected to storage, you do not have to run the storage tests in the cluster validation process.

In the interest of highlighting some of the other new features in Windows Server 2012 Failover Clustering, I created the cluster using a Distinguished Name format, which provides greater control over the placement of cluster computer objects in a custom Organizational Unit (OU) I created in Active Directory. It is recommended that you configure the OU to protect the Failover Cluster computer objects from 'accidental' deletion prior to creating the cluster. To accomplish this, implement a custom Access Control Entry (ACE) on the OU to deny Everyone the right to Delete all child objects.

This configuration on the container automatically checks the Protect object from accidental deletion checkbox on cluster computer objects when they are created.

Specify a Distinguished Name for the Cluster Name when creating the cluster (Create Cluster Wizard).

The Create Cluster report reflects the Active Directory path (container) where the CNO computer object is located.
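To make the relationship between the Distinguished Name and the CNO's location concrete, here is a small Python sketch that splits a DN into the cluster name and the container path. The DN used here is a made-up example (the cluster name HVCLUSTER1 and the domain components are hypothetical), the function name is my own, and the parsing is deliberately naive: it assumes no escaped commas in any component.

```python
def split_cluster_dn(dn: str):
    """Return (cluster_name, container_dn) from a DN such as CN=Name,OU=...,DC=..."""
    parts = [part.strip() for part in dn.split(",")]
    key, _, name = parts[0].partition("=")
    if key.upper() != "CN" or not name:
        raise ValueError("expected the DN to start with CN=<cluster name>")
    return name, ",".join(parts[1:])


# Hypothetical DN; substitute the values for your own domain.
name, container = split_cluster_dn("CN=HVCLUSTER1,OU=Clusters,DC=contoso,DC=com")
print(name)       # HVCLUSTER1
print(container)  # OU=Clusters,DC=contoso,DC=com
```

The container portion is the path that should appear in the Create Cluster report, and it is the OU where the Create Computer Objects right and the Deny ACE discussed above would need to be in place.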

Create a 2-Node Windows Server 2012 Scale-Out File Server

Configure a 2-Node Windows Server 2012 Failover Cluster to provide Scale-Out File Services to the virtual machines hosted by the 3-Node Hyper-V Failover Cluster.

Note: To read about Scale-Out File Services access the TechNet content here - http://technet.microsoft.com/en-us/library/hh831349.aspx

The Scale-Out File Services cluster requires storage to support the Cluster Shared Volumes (CSV) that will host the virtual machine files. To ensure the entire configuration is supported, run a complete cluster validation process, including the storage tests, before creating the cluster. Be sure to create the cluster with sufficient storage to support a Node and Disk Majority Quorum Model (Witness disk required) and the CSV volumes to host the virtual machine files.

Note: While a single CSV volume supports multiple virtual machines, a 'best practice' is to place virtual machines across several CSV volumes to distribute the I/O to the backend storage. Additionally, consider enabling CSV caching (scenario dependent). To find out more about CSV Caching, review the Product Team blog on the topic - http://blogs.msdn.com/b/clustering/archive/2012/03/22/10286676.aspx

With the cluster up and running, configure the Scale-Out File Server Role by following these steps:

In Failover Cluster Manager, in the left-hand pane, right-click on Roles and choose Configure Role to start the High Availability Wizard
Review the Before You Begin screen and click Next
In the Select Role screen, choose File Server and click Next
For the File Server Type, choose Scale-Out File Server for application data and click Next
Provide a properly formatted NetBIOS name for the Client Access Point and click Next
Review the Confirmation screen information and click Next
Verify the wizard completes and the Role comes Online properly in Failover Cluster Manager

A properly configured Scale-Out File Server Role should look something like this -

What happens if the Scale-Out File Server Role fails to start? Check the Cluster Events and you may find an Event ID: 1194 indicating a Network Name Resource failure occurred.

The Event Details section provides information for proper corrective action. In this case, since we are placing the cluster computer objects in a custom OU, we need to give the Scale-Out File Server CNO the right to Create Computer Objects. Once this is accomplished, and Active Directory replication has occurred, the Scale-Out File Server Role should start properly. Verify the Role comes online on all nodes in the cluster.

To review what we have accomplished:

Active Directory is configured properly to protect cluster computer objects from accidental deletion
A 3-Node Hyper-V Failover Cluster has been created and validated
A 2-Node Scale-Out File Server Failover Cluster has been created and validated
The Scale-Out File Server CNO permissions have been properly configured on a custom OU

Well CORE Blog fans, that wraps it up for Part 1. Stay tuned for Part 2 where we will:

Configure SMB 3.0 shares on the Scale-Out File Server
Configure highly available virtual machines in the Hyper-V Failover Cluster using the SMB shares on the Scale-Out File Server Cluster
Demonstrate Live Migration of virtual machines in the Hyper-V Failover Cluster

Thanks, and come back soon.

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support
High Availability\Virtualization Team

Just when you thought… (Part 2)

November 26, 2012, 4:31 am

In Part 1, I covered configuring the Hyper-V Failover Cluster and the Scale-Out File Server solution. In Part 2, I will cover:

Creating the file shares in the Scale-Out File Server
Creating a virtual machine to use the SMB 3.0 shares in the Scale-Out File Server
Verifying we can Live Migrate the virtual machines in the Hyper-V Failover Cluster

Creating the File Share

Execute the following steps to create a file share in the Scale-Out File Server

In Failover Cluster Manager, right-click on the Scale-Out File Server role in the center pane and choose Add File Share. This starts the New Share Wizard
In the Select Profile screen, choose SMB Share - Applications and click Next
For the Share Location, choose one of the CSV Volumes and click Next
Provide a Share Name, verify the path information and click Next
In the Other Settings screen, Enable Continuous Availability is checked by default. Click Next
Note: Some selections are greyed-out. This is because they are not supported for this share profile in a Failover Cluster
In the Permissions screen, click Customize Permissions. In the Advanced Security Settings screen, note the default NTFS and Share permissions and then proceed to add the Hyper-V Failover Cluster Nodes Computer Accounts to the NTFS permissions for the share and ensure they have Full Control. If the permissions listing does not include the cluster administrator(s), add it and give the account (or Security Group) Full Control. Click Apply when finished

Complete configuring the file shares.

As a test, connect to each of the shares from the Hyper-V Failover Cluster and verify you can write to each location before proceeding to the next step.

Creating a Virtual Machine to use an SMB 3.0 Share

Execute the following steps to create a new virtual machine

On one of the nodes in the Hyper-V Cluster, open Failover Cluster Manager
In the left-hand pane, click on Roles and then in the right-hand Actions pane click on Virtual Machines and choose New Virtual Machine
Choose one of the cluster nodes to be the target for the virtual machine and click OK
This starts the New Virtual Machine Wizard. Review the Before You Begin screen and click Next
In the Specify Name and Location screen, provide a name for the virtual machine and enter a UNC path to a share on the Scale-Out File Server and then click Next
Configure memory settings and click Next
Configure network settings and click Next
In the Connect Virtual Hard Disk screen, make a selection and click Next
Review the Summary screen and click Finish
Verify the process completes successfully and click Finish

Testing Live Migration

Once all the virtual machines are created, you may want to test Live Migration. Depending on how many simultaneous live migrations you want to support, you may have to modify the Live Migration settings on each of the Hyper-V Failover Cluster nodes. The default is to allow two simultaneous live migrations. Here is a little PowerShell script you can run to take care of the settings for all the nodes in the cluster -

$Cred = Get-Credential

Invoke-Command -Computername Fabrikam-N21,Fabrikam-N22,Fabrikam-N23 -Credential $Cred -scriptblock {Set-VMHost -MaximumVirtualMachineMigrations 6}

In my cluster, I have all the virtual machines running on the same node -

I will use a new feature in Windows Server 2012 Failover Clusters, multi-select, and select all of the virtual machines and live migrate them to another node in the cluster -

Since there are only four virtual machines and the maximum number of simultaneous live migrations is set to six, all four migrate at the same time.

If I rerun my script and change the value back to two, two migrations start immediately and the other two are queued until at least one of the in-progress migrations completes.
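The queueing behavior described here can be modeled in a few lines of Python. This is a rough sketch of the arithmetic only, not of how Hyper-V schedules migrations internally; the function name and the virtual machine names are hypothetical.

```python
def plan_live_migrations(vm_names, max_simultaneous):
    """Split VMs into those that start migrating now and those that wait.

    max_simultaneous plays the role of the value passed to
    Set-VMHost -MaximumVirtualMachineMigrations.
    """
    if max_simultaneous < 1:
        raise ValueError("max_simultaneous must be at least 1")
    started = list(vm_names[:max_simultaneous])
    queued = list(vm_names[max_simultaneous:])
    return started, queued


vms = ["VM1", "VM2", "VM3", "VM4"]  # hypothetical virtual machine names
for limit in (6, 2):
    started, queued = plan_live_migrations(vms, limit)
    print(f"limit={limit}: {len(started)} start immediately, {len(queued)} queued")
```

With a limit of six, all four virtual machines start at once; with a limit of two, two start and two wait.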

You can use the Get-SmbSession PowerShell cmdlet on any node in the Scale-Out File Server to determine the number of sessions. For illustration purposes, I have all virtual machines running on the same Hyper-V Failover Cluster node (Fabrikam-N21) and the CSV volumes are running on the same node in the Scale-Out File Server (Fabrikam-N1) -

Distributing the virtual machines across the multi-node Hyper-V Failover Cluster (Fabrikam-N21, Fabrikam-N22, and Fabrikam-N23) is reflected on the Scale-Out File Server -

Finally, I re-distribute the CSV volumes across the Scale-Out File Server nodes as shown here -

This is reflected in the Get-SmbSession PowerShell cmdlet output -

Thanks, and come back again soon.

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support
High Availability\Virtualization Team

Working with multiple network adapters in a virtual machine

November 28, 2012, 1:56 pm

Thanks for coming back to the CORE Team blog site. This blog will address working with multiple network adapters in a virtual machine. Many of you out there may not be interested in this because you work with virtual machines that only use a single network adapter. However, for those of us that frequently work with virtualized Failover Clusters, virtualized iSCSI Target Servers or even virtualized RRAS servers, we find ourselves in a position where virtual machines require more than one network adapter. I hope that the information here will provide some needed relief for you.

We all know that we cannot 'hot-add' network adapters to running virtual machines. The choices we have are to either configure all the network adapters before starting the virtual machine, or configure them one at a time, which requires the virtual machine to be shut down first. As an example, here is what I typically have to deal with when configuring nodes in a Failover Cluster. I require three networks: one for Public access, one for Cluster-only communications, and one for connectivity to the shared storage provided by an iSCSI target.

I think we can all agree that the information displayed is insufficient to assist with the configuration of each of the networks. Let us address each scenario individually.

Windows Server 2012

The great thing about Windows Server 2012 is that there is lots of PowerShell help available to assist. We will use PowerShell to work through configuring the networks in the virtual machine. As shown above, here is the starting point -

One thing that I like to do is to make sure the Hyper-V virtual switch configuration makes sense for what I am doing. My virtual switch names are Public, Cluster and iSCSI because they make sense and meet my needs. Using that information, I use the Get-VMNetworkAdapter cmdlet to get the information I will need.

Get-VMNetworkAdapter -VMName 2012-Test | ft -Autosize Name,SwitchName,MacAddress,IPAddresses

In the virtual machine, I use the Get-NetAdapter cmdlet to get additional information I will need.

Get-NetAdapter | ft -Autosize Name,InterfaceDescription,ifIndex, MacAddress

Using the MacAddress information, I can sort out the 'players.'
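The matching step is a simple join on the MAC address. Here is a small Python sketch of that logic, assuming the host-side output lists MAC addresses without separators and the guest-side output lists them with dashes. The function names and the sample MAC addresses are hypothetical, made up for illustration and not taken from my environment.

```python
import re


def normalize_mac(mac):
    """Strip separators and upper-case a MAC address so both formats compare equal."""
    hex_digits = re.sub(r"[^0-9A-Fa-f]", "", mac).upper()
    if len(hex_digits) != 12:
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return hex_digits


def match_adapters(host_adapters, guest_adapters):
    """Map each guest connection name to the virtual switch with the same MAC.

    host_adapters:  {switch name: MAC as seen on the Hyper-V host}
    guest_adapters: {connection name: MAC as seen inside the virtual machine}
    """
    switch_by_mac = {normalize_mac(mac): switch for switch, mac in host_adapters.items()}
    matches = {}
    for connection, mac in guest_adapters.items():
        key = normalize_mac(mac)
        if key in switch_by_mac:
            matches[connection] = switch_by_mac[key]
    return matches


# Hypothetical sample data
host = {"Public": "00155D010203", "Cluster": "00155D010201", "iSCSI": "00155D010202"}
guest = {
    "Ethernet": "00-15-5D-01-02-01",
    "Ethernet 2": "00-15-5D-01-02-02",
    "Ethernet 3": "00-15-5D-01-02-03",
}
print(match_adapters(host, guest))
```

Each guest connection name ends up paired with the virtual switch it is attached to, which tells you what to rename it to.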

Using the Get-NetAdapter and Rename-NetAdapter cmdlets, change the name of the connections in the virtual machine.

Get-NetAdapter -Name 'Ethernet' | Rename-NetAdapter -NewName Cluster

Get-NetAdapter -Name 'Ethernet 2' | Rename-NetAdapter -NewName ISCSI

Get-NetAdapter -Name 'Ethernet 3' | Rename-NetAdapter -NewName Public

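Once the names of the adapters are changed, it is time to configure the IP addressing. To accomplish this, use the New-NetIPAddress cmdlet with the ifIndex values reported by Get-NetAdapter (13, 14 and 15 in my virtual machine; yours will likely differ).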

New-NetIPAddress -InterfaceIndex 13 -IPAddress 1.0.0.3 -PrefixLength 8 -DefaultGateway 1.0.0.10

New-NetIPAddress -InterfaceIndex 14 -IPAddress 192.168.0.3 -PrefixLength 24

New-NetIPAddress -InterfaceIndex 15 -IPAddress 172.16.0.3 -PrefixLength 16

If name resolution is required, configure a DNS server address on the Public interface

Set-DnsClientServerAddress -InterfaceIndex 13 -ServerAddresses ("1.0.0.110","1.0.0.100")

To verify the new IP addresses, in the Hyper-V Host, re-run the Get-VMNetworkAdapter cmdlet or in the virtual machine run ipconfig /all.

This completes the configuration of the network adapters and it was accomplished without having to reboot the virtual machine.

Windows Server 2008 R2

Windows Server 2008 R2 includes PowerShell as well, but it does not come close to being as useful, or as powerful, as the PowerShell functionality found in Windows Server 2012. To complete the very same process as was executed in Windows Server 2012 requires a little different strategy. As shown above, here is the starting point.

On the Hyper-V host, use the Get-VMNetworkAdapter cmdlet to get the information I will need.

Get-VMNetworkAdapter -VMName Contoso-FS2 | ft -Autosize Name,SwitchName,MacAddress,IPAddresses

Windows Server 2008 R2 comes with PowerShell Version 2.0 installed. In the virtual machine, use PowerShell to obtain the network adapter information we need.

Get-WmiObject -query "select * from Win32_NetworkAdapter where name like 'Microsoft Hyper-V Network Adapter%'" | FL Name,MACAddress

Using the MacAddress information, I can sort out the 'players.'

Next, use the netsh command to finish the configuration. First, rename the adapters.

Netsh interface set interface name="Local Area Connection" NewName="Public"

Netsh interface set interface name="Local Area Connection 2" NewName="Cluster"

Netsh interface set interface name="Local Area Connection 3" NewName="ISCSI"

Use the netsh interface show interface command sequence to show the new names for the interfaces.

Set the IP Address configuration (set the Default Gateway on Public) on each interface and verify using ipconfig /all.

Netsh interface ip set address name="Public" static 1.0.0.4 255.0.0.0 1.0.0.10 1

Netsh interface ip set address name="Cluster" static 172.16.0.4 255.255.0.0

Netsh interface ip set address name="ISCSI" static 192.168.0.4 255.255.255.0

If name resolution is required, configure a DNS server.

Netsh interface ip set dnsservers name="Public" static 1.0.0.110 primary

This completes the configuration for the Windows Server 2008 R2 virtual machine network adapters. Again, the configuration was accomplished without rebooting the virtual machine.
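One difference between the two approaches is easy to trip over: New-NetIPAddress takes the subnet as a prefix length, while netsh takes a dotted subnet mask. The following Python sketch, using only the standard ipaddress module, shows how the two notations relate. The helper function names are hypothetical.

```python
import ipaddress


def prefix_to_mask(prefix_length):
    """Convert a prefix length (as used by -PrefixLength) to a dotted subnet mask."""
    return str(ipaddress.ip_network(f"0.0.0.0/{prefix_length}").netmask)


def mask_to_prefix(mask):
    """Convert a dotted subnet mask (as used by netsh) to a prefix length."""
    return ipaddress.ip_network(f"0.0.0.0/{mask}").prefixlen


# The three prefix lengths used in the Windows Server 2012 example
for prefix in (8, 24, 16):
    print(prefix, prefix_to_mask(prefix))
```

The prefix lengths 8, 24 and 16 correspond to the masks 255.0.0.0, 255.255.255.0 and 255.255.0.0 used in the netsh commands.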

I would also like to acknowledge the help from my teammate - Sean Dwyer. Thanks, and come back again soon.

Chuck Timon
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support
High Availability\Virtualization Team

Sean Dwyer
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support
High Availability\Virtualization Team

Establishing Network Connectivity to a Share in the Windows Recovery Environment

June 10, 2016, 5:22 pm

Hi there! My name is Neil Dsouza and I’m a Support Escalation Engineer with the Windows Core team.

Today I’m going to cover a scenario where you have a server that fails to boot and all you want to do is copy the data off the machine to a network share. In most cases connecting a USB flash drive/hard drive is the easiest solution to copy off the data. However, if you don’t have physical access to the server, but you do have remote console access, then you can copy the data to a network share. These steps will also help gather logs or data when troubleshooting Windows in a no-boot scenario.

For Windows 7 and newer operating systems, the Windows Recovery Environment (WinRE) is installed by default, unless this was changed during deployment/installation of Windows. The steps should work for most operating systems from Windows 7 onward.

When the Operating system fails to boot, by default it will take you to a boot menu with an option to boot into WinRE which would say ‘Repair your computer’ or ‘Launch Startup repair’.

Image 1: Boot menu to go to WinRE in Windows 7 or Windows Server 2008 R2

Image 2: Boot Menu to go to WinRE in Windows 8 / 2012 / 2012 R2 / 8.1 / 10

Choosing the ‘Startup Repair’ option will run the ‘Startup Repair Wizard’ and attempt to fix the most common issues that cause operating system boot failures.

Image 3: Startup Repair running in Windows 8 and newer OS

Image 4: Startup Repair running in Windows versions before Windows 8

In the end a report is provided with the tests that ran to detect issues and what the result was. This information can be useful to understand why Windows failed to boot.

Image 5: Startup repair Results

If you miss seeing this in the wizard, you can always go to the Command Prompt in WinRE and open the below file which has the information logged: %WINDIR%\System32\LogFiles\SrtTrail.txt

If you do not see the ‘Repair Your Computer’ or ‘Launch Startup Repair’ option, it means that WinRE was not installed when the OS was installed. In such cases you can still boot to WinRE by using the operating system disk and selecting ‘Repair your computer’ at the install screen.

Image 6: Boot from CD/DVD/ISO screen for Windows 7

Image 7: Boot from CD/DVD/ISO screens for Windows 8 / 2012 / 8.1 / 2012 R2 / 10

On Windows 8 and newer OS’s, you have to navigate through the options further as shown below:

Select ‘Troubleshoot’

Select ‘Advanced Options’

Select ‘Command Prompt’, or you could run the ‘Startup Repair’ from here

For OS versions from Windows Vista through Windows Server 2008 R2

Click ‘Next’

Select ‘Command Prompt’, or you could run the ‘Startup Repair’ from here

Once we are at the command prompt we can do our magic.

First thing we want to do is see what the drive letters are for each partition. DISKPART is our friend here.

Run the command: diskpart

List Volume

Now we come to the most interesting part.

To establish network connectivity with a file share on some machine where you need to copy your data, log files or that memory dump Microsoft support is asking for when the machine blue screens on startup, you can run ‘wpeinit’ from the command prompt. This is a program built into WinPE from which WinRE is created.

Now you can run ‘ipconfig’ and you will see that an IP address is assigned to the WinRE session. This will work only if you have a DHCP server assigning IP addresses.

In certain cases, ‘wpeinit’ runs but does not initialize the NIC or does not assign an IP address. There are a couple of reasons why that happens.

1. NIC driver is not loaded

In this scenario you can manually load the NIC driver. First you need to identify the right driver, which may already reside on the machine. All drivers that were installed on the machine are maintained, unless explicitly removed, in the path %WINDIR%\System32\DriverStore\FileRepository, in a folder whose name starts with the driver inf file name followed by an architecture tag and a unique suffix. You may have multiple folders starting with the same inf file name if you have installed multiple versions of the same driver. Alternatively, you can download the driver and extract it onto a USB stick. We need the .sys, .inf and related files uncompressed to be able to load the driver manually.

An example of the driver files in FileRepository is below.

Run the below command to load the NIC driver from the above image:

drvload c:\Windows\system32\DriverStore\FileRepository\netwew01.inf_amd64_9963f911be06feae\netwew01.inf
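To make the folder naming in that path easier to read, here is a short Python sketch that splits a FileRepository folder name into its parts and rebuilds the path handed to drvload. It is an illustration only, meant for a working machine rather than WinRE. It assumes the folder name follows the same inf-name, architecture, suffix pattern as the example above, and the function names are hypothetical.

```python
from pathlib import PureWindowsPath


def split_driver_folder(folder_name):
    """Split a driver store folder name into (inf name, architecture, suffix).

    Splitting from the right keeps inf names that contain underscores intact.
    """
    inf_name, architecture, suffix = folder_name.rsplit("_", 2)
    return inf_name, architecture, suffix


def inf_path(repository, folder_name):
    """Build the full path to the .inf file inside a driver store folder."""
    inf_name, _, _ = split_driver_folder(folder_name)
    return str(PureWindowsPath(repository) / folder_name / inf_name)


repository = r"c:\Windows\system32\DriverStore\FileRepository"
folder = "netwew01.inf_amd64_9963f911be06feae"
print(split_driver_folder(folder))
print(inf_path(repository, folder))
```

The second line printed is the same path passed to drvload in the example above.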

2. There’s no DHCP Server in the environment that could automatically assign an IP address

What do you do if there isn’t a DHCP server assigning IP addresses? Well, you can assign a static IP address using the below netsh command. You may use the same IP address as the server; however, if you have trouble with that, use a different IP address.

netsh int ipv4 set address "<Connection Name>" static <IP> <Subnet Mask> <Default Gateway>

The Connection Name can be obtained by running ‘ipconfig /all’ command. It’s the text highlighted in blue in the below image.

Once you have an IP address, you can map a network drive using the command below to a file server or a simple share on another machine.

net use y: \\ServerName\ShareName

ServerName is the Computer Name of the server or IP Address in case name resolution is not working and ShareName is the name of the share. You will be asked for credentials to access the network share.

You could instead run the command below to have it use the next available drive letter and display that letter.

net use * \\ServerName\ShareName

Now you can copy files and folders from the non-booting machine to a network share using copy, xcopy or, better still, robocopy.

I hope this helps you save some time when you have a machine that is not booting up, whether it’s a server or a client machine, and helps you copy or back up important data or logs to investigate the issue.

Neil Dsouza
Support Escalation Engineer
Windows Core Team

Display Scaling changes for the Windows 10 Anniversary Update

August 16, 2016, 5:03 pm

Today we have a guest author for this blog. Peter Felts is a Senior Program Manager in the developer platform group and is going to discuss Display Scaling improvements and changes with Windows 10 Anniversary Update (version 1607).

Overview

Steve Wright’s previous blog post about display scaling for high dots-per-inch (DPI) displays in Windows 10 does a great job of giving an overview of the concepts of how Windows handles DPI scaling. In this article I’m going to focus more on the technical side of what we’ve been working on for the Windows 10 Anniversary Update to help improve the display-scaling story for desktop applications. Note that most of what this article discusses does not apply to Universal Windows Applications (UWA) as they already handle display scaling well.

During Windows 10 development, significant work was done to improve the display-scaling story for Windows itself, which Steve’s article covers. While this resulted in an improved experience for some of the in-box UI of Windows itself and for UWA, many third-party (and Microsoft’s own) desktop applications were not able to benefit from this work and could still appear blurry or be sized incorrectly in some common scenarios. For the Windows 10 Anniversary Update we wanted to tackle this problem, so we focused on making it easier (and less expensive) for software developers to update their desktop applications to scale properly.

Problem Statement:

As was discussed in Steve’s article, many desktop applications do not render well on some of the latest high DPI displays. There are three symptoms of display-scaling problems we typically see with desktop applications:

1. Blurry text and UI components.

2. Applications sized incorrectly (too big or too small).

3. Applications are sized correctly and are not blurry, but have other layout issues (such as clipped text or other UI components).

These problems are most frequently seen whenever the display scale factor of a Windows PC changes while the user is logged in and/or if an application is moved from the “main display” to a display that has a different display scale factor.

One very common scenario where applications start to experience these problems is when a device with a high display scale factor (say 200% display scaling) is docked or undocked with an external display that has a different display scale factor (and the external display is used as the “main display” or the PC uses “Second screen only” display mode). In this scenario applications render as expected on the internal display (#1 below) before the PC is docked but once connected to the external display they are stretched by Windows such that they are sized correctly on the external display (#2 below). This stretching results in the application looking blurry. At this point the only thing that the end user can do is to close all of their applications and completely log out and back into Windows. Once the user has logged out and logged back into Windows, most applications should render correctly on the external display (#3 below). Needless to say this is not an acceptable workaround as it interrupts a user’s workflow. To add insult to injury, if the user does completely log out and back into Windows, once they un-dock their device the same problem will occur in the reverse (#4 below). This scenario forms the cycle shown below:

Background:

There are many reasons why some desktop applications do not render correctly, but at a high level one of the biggest challenges desktop applications face in this space is that many apps were written without considering that the display scale factor on Windows could change while the app was running, and so they don’t respond to those changes. This works fine as long as you never connect an external display that has a different display scale factor, remote into Windows from a device with a different display scale factor, or change the display scaling settings. Once any of these things happen, though, the scale factor of the system is suddenly different from what the application was told when it launched. Applications that are not expecting the display scale factor to change are not made aware of this change, and therefore they do not know that they should respond. When this happens, Windows jumps in and stretches the on-screen image of the application so that it is sized appropriately for the new scale factor. This, at least, results in applications being physically sized correctly on a display, but they can be blurry as a result of being stretched.

PCs today are being equipped with displays with increasingly high pixel densities, also referred to as dots-per-inch (DPI). The Surface Pro 4, for example, ships with a display that accommodates a 200% display scale factor. This means that if you were to connect a Surface Pro 4 to a “standard” external display (a display with 96 DPI or 100% display scaling) there would be a 2-to-1 difference between the scale of an application on the Surface display and that of when it was rendered on the external display. For applications that don’t handle dynamic display scaling, this means that the image of the application shown on screen would either be reduced by half when displayed on an external display or doubled (depending on which monitor was configured as the “main display” when the user logged into Windows). As the difference between the display scaling of monitors increases, the blurriness of applications becomes more and more noticeable. This is a problem that is only going to get worse as display manufacturers produce displays with even higher DPI.
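The 2-to-1 relationship is simple arithmetic, sketched here in Python. This only illustrates the numbers in the paragraph above, not how Windows computes anything internally. The function names are hypothetical, and the only facts used are that 96 DPI corresponds to 100% scaling and that the Surface Pro 4 example uses 200%.

```python
BASELINE_DPI = 96  # 96 DPI corresponds to 100% display scaling


def scale_percent_to_dpi(scale_percent):
    """Convert a display scale factor in percent to an effective DPI."""
    return BASELINE_DPI * scale_percent / 100


def stretch_factor(app_scale_percent, display_scale_percent):
    """How much Windows must stretch an application that was laid out for
    app_scale_percent when it is shown on a display at display_scale_percent."""
    return display_scale_percent / app_scale_percent


print(scale_percent_to_dpi(200))  # effective DPI at 200% scaling
print(stretch_factor(200, 100))   # app launched at 200%, shown on a 100% display
print(stretch_factor(100, 200))   # app launched at 100%, shown on a 200% display
```

A factor below 1 means the application image is shrunk and a factor above 1 means it is enlarged; either way the bitmap is resampled, which is where the blurriness comes from.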

While it is technically possible for legacy desktop applications to be updated to understand the concept that the display scale factor can change at any time, it is clear that not all applications will be updated and that some may never be updated for this. Furthermore, until the Windows 10 Anniversary Update, Windows did not even offer enough of the functionality that an application developer required in order to do this work. So much key functionality was missing that it was not practical for developers to update their desktop applications in many cases, even if the will was there to do so. So, this is what we focused on for the Windows 10 Anniversary Update.

During the development cycle for the first Windows 10 release we started to tackle this problem by updating the Windows File Explorer application to dynamically handle a display-scale-factor change. Through this process we learned a great deal about the type of challenges Windows desktop developers will hit when trying to update their applications, and we wanted to address as many of those as we could.

Why doesn’t Microsoft Just Fix Display Scaling on Windows?

This is a valid question that many of us have asked ourselves when joining teams that are working on this problem space. The main challenge that we face, however, is that many, many, applications that run on Windows are using a design pattern where they ask Windows for information about the system when they launch (questions such as how big the display is, what is the display scale factor, what is the size of the font that should be used for default text, as well as others) and then cache this information and never expect it to change. Because of this, even if Windows did start giving these applications information about a DPI change, most, if not all, of these applications wouldn’t even be asking and therefore would not respond correctly. Furthermore, if Windows did start providing dynamic display-scale-factor-related information this would be a nightmare for application compatibility and would probably cause more problems for application stability than it would help in terms of high DPI display issues.

RS1 Improvements

Non-client area scaling

One of the first (and biggest) blockers that any desktop application developer runs into when they try to update their application to handle dynamic display-scale changes is that what we refer to as the “non-client area” of a window does not respond to scale-factor changes. The term non-client area (NCA) refers to parts of a typical desktop application window that the application itself does not draw… such as the title/caption bar, system menus, traditional menu bars (such as in Notepad), scrollbars, and other UI that application developers require the system to handle on their behalf. In other words: all the standard “Windows stuff” that makes up a typical desktop application window, but that isn’t drawn by the applications themselves.

Before the Windows 10 Anniversary Update, if an application developer tried to update their desktop application to respond to display-scale-factor changes, they would soon discover that the NCA would not resize when the scale factor changed. This meant that their application would have undersized or oversized title bars when the scale factor changed (Figure 1). This is not something that application developers could live with, and the only option available to them was to handle all of the drawing of these UI components themselves, which is a prohibitively expensive proposition for most developers (note that some applications, such as Google Chrome and the Firefox browser, do draw most of the UI that is typically NCA themselves, because they have highly stylized application UI).

Figure 1. Non-Client Area not scaling for DPI (left) and scaling correctly (right)

For the Windows 10 Anniversary Update we now support automatic scaling of NCA via use of a new “EnableNonClientDpiScaling” API.

Mixed-Mode DPI scaling

One lesson that we learned while making File Explorer dynamically handle display-scale-factor changes was that the current model for an application to tell Windows how it wanted to handle display scaling was too inflexible for complicated applications. The model has been that an application would either tell Windows that it knew how to scale when it started (System DPI awareness), that it could handle dynamic display-scale factor changes (Per-Monitor DPI awareness), or that it would say nothing and Windows would stretch/scale it appropriately.

This is an adequate model for simple applications, or applications that are being created from scratch, but when a developer tries to update even a moderately-complicated application with many windows, they’ll be in a position where they have to update all of their UI… an all-or-nothing proposition. They either update all of their UI or live with some UI not rendering at the correct size. For any application with many windows this could become a ton of work quickly. Also, developers for applications that present third-party content (such as plugins) might not even have access to the source code for this content, so handling the display scaling for these windows wouldn’t even be an option.

To make it easier for desktop applications to be updated to handle display scaling well, we realized that this had to be changed. So we’ve broken the process-wide constraint on an application’s display-scaling mode such that developers can now specify a different scaling mode for each (top-level) window. In other words: developers can focus their development time on making the important parts of their UI handle display scaling well, while letting Windows handle the other windows in the application. The API that enables this functionality is SetThreadDpiAwarenessContext. This should significantly reduce the cost for developers to update their desktop applications.

Figure 2 and Figure 3 show an example of an application that utilizes this functionality to make its primary UI render crisply while having Windows handle DPI scaling of less-frequently used UI. Notepad’s primary window renders natively at the DPI of the display it’s primarily located on while the Print dialog is scaled by Windows (and may be blurry). Figure 3 shows a close-up of the two windows, showing that the system-scaled Print dialog is somewhat blurry while the Notepad UI is crisp:

Figure 2. Notepad’s primary window is natively scaling while Windows is scaling the Print dialog

Figure 3. Close up of the Notepad window and the system-scaled Print dialog

Office

Some of the biggest feedback we’ve received about display scaling has been related to Lync/Skype for Business and PowerPoint being sized incorrectly in scenarios such as docking a Surface Pro, Surface Book, or any high-DPI device to a standard DPI display (or any display with a different display scale factor). I’m happy to say that with the new functionality (mentioned above), that is part of the Windows 10 Anniversary Update, the Office team is now working on updates to these applications that will enable them to render at the correct size when the display scale factor changes (the updates to these Office applications will only apply when running on PCs with the Windows 10 Anniversary Update (or newer)).

WPF

Windows Presentation Foundation (WPF) is a heavily used application framework for building desktop applications. Unfortunately, WPF applications hit the same problems with non-client area (NCA) scaling as other desktop applications did (mentioned above) when they were updated to handle display-scale-factor changes on the fly. For the Windows 10 Anniversary Update WPF is being updated to support automatic NCA scaling.

What we didn’t get to:

For the Windows 10 Anniversary Update we focused on some of the biggest rocks that needed to be moved in order to make it easier for developers to update desktop applications to handle dynamic display-scale-factor changes, but there are still more things we need to tackle:

Desktop Icon Scaling

In previous Windows releases, desktop icons would not scale properly when the scale factor changed (they would be too big or too small in some scenarios). We’ve improved this for many common scenarios such as docking and undocking with displays that have a different scale factor, but desktop icons still do not scale on a per-display basis if you are in “extend” display mode. This means that if a user has their desktop spread across displays (“extend” display mode) with different scale factors, the icons will be sized incorrectly on some displays.

Common Control scaling and WinForms

Unfortunately, we weren’t able to deliver per-monitor display-scale-factor scaling support for Win32 common controls in the Windows 10 Anniversary Update. Application developers that want to create a native Win32 application that uses common controls and want to natively scale the controls will still face challenges due to lack of support for this in Windows.

WinForms is a very widely used framework for creating desktop applications, which is partially built upon Win32 common controls. Unfortunately, WinForms controls have not been updated to support dynamic display-scale-factor changes.

Addressing the need to log out and log back into Windows after a display-scale factor change

Due to the common architecture of applications asking Windows what the display-scale factor is once at startup, and not asking again while they’re running, often the only way to have an application pick up the new display-scale factor is to log out of Windows and log back in. Until we find a way to work around the constraints that this pattern imposes on applications, users will continue to have to log out and log back into Windows, unfortunately.
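To make the “ask once at startup” pattern concrete, here is a minimal Python simulation. It is not Win32 code: the class names, the 80-pixel button, and the on_dpi_changed method are invented for illustration (a real desktop application would respond to the WM_DPICHANGED window message instead). The only fixed fact it relies on is that 96 DPI corresponds to 100% scaling.

```python
BASELINE_DPI = 96  # 96 DPI corresponds to 100% display scaling


def scale(pixels_at_100_percent: int, dpi: int) -> int:
    """Convert a size designed at 100% scaling to the given DPI."""
    return round(pixels_at_100_percent * dpi / BASELINE_DPI)


class StartupOnlyApp:
    """Hypothetical app that reads the DPI once and never asks again."""

    def __init__(self, dpi_at_startup: int):
        self._dpi = dpi_at_startup

    def on_dpi_changed(self, new_dpi: int) -> None:
        pass  # the change is ignored, so the cached value goes stale

    def button_width(self) -> int:
        return scale(80, self._dpi)


class DynamicDpiApp(StartupOnlyApp):
    """Hypothetical app that updates its cached DPI when notified."""

    def on_dpi_changed(self, new_dpi: int) -> None:
        self._dpi = new_dpi


stale = StartupOnlyApp(96)
fresh = DynamicDpiApp(96)
for app in (stale, fresh):
    app.on_dpi_changed(192)  # e.g. the user switches to 200% scaling

print(stale.button_width())  # still sized for 100%
print(fresh.button_width())  # resized for 200%
```

In this model the first application keeps laying out for the old scale factor until it is restarted, which is the situation a log out and log in resolves.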

Conclusion

For the Windows 10 Anniversary Update our goal was to make it easier and less expensive for application developers to update their desktop applications to handle display-scale-factor changes while they’re running, so that they don’t show up blurry or sized incorrectly in common use cases. Hopefully developers will find this work useful and we’ll start to see more desktop applications updated to render correctly. We’ve still got a lot of work ahead of us in the high-DPI space until we get to a point where most desktop applications scale properly, but we recognize how critical this is for Windows users. We feel the same pain ourselves.

↧

Surface Ethernet Drivers

August 18, 2016, 7:56 pm

≫ Next: New Office Update to address some of the scaling issues with Skype for Business 2016 and PowerPoint 2016

≪ Previous: Display Scaling changes for the Windows 10 Anniversary Update

Hi,

My name is Scott McArthur and I am a Supportability Program Manager for Surface. Today I have a quick blog on some important deployment information regarding the Surface Ethernet Drivers. When doing a deployment, it is necessary to add the Surface Ethernet drivers to boot images (ConfigMgr, MDT, WDS). This blog will discuss where to get the drivers.

First some technical details on the drivers themselves:

PNPID: VID_045E&PID_07C6 (this is a partial PNPID)
Windows 10 X86
- msux86w10.inf
- msux86w10.sys
Windows 10 X64
- msux64w10.inf
- msux64w10.sys
Windows 8.1 X86
- msu30x86w8.inf
- msu30x86w8.sys
Windows 8.1 X64
- msu30x64w8.inf
- msu30x64w8.sys
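
As a rough illustration of how these details might be used in a deployment script, the Python sketch below records the driver files listed above and checks whether a hardware ID string contains the partial PNPID. The sample hardware ID strings and the function name are made up for illustration; real IDs would come from Device Manager or your deployment tooling.

```python
SURFACE_ETHERNET_PARTIAL_PNPID = "VID_045E&PID_07C6"

# Driver files per operating system and architecture, as listed above.
DRIVER_FILES = {
    ("Windows 10", "x86"): ("msux86w10.inf", "msux86w10.sys"),
    ("Windows 10", "x64"): ("msux64w10.inf", "msux64w10.sys"),
    ("Windows 8.1", "x86"): ("msu30x86w8.inf", "msu30x86w8.sys"),
    ("Windows 8.1", "x64"): ("msu30x64w8.inf", "msu30x64w8.sys"),
}


def matches_surface_ethernet(hardware_id: str) -> bool:
    """Return True if the hardware ID contains the partial PNPID (case-insensitive)."""
    return SURFACE_ETHERNET_PARTIAL_PNPID in hardware_id.upper()


# Hypothetical hardware ID strings, for illustration only.
print(matches_surface_ethernet(r"USB\VID_045E&PID_07C6&REV_3000"))
print(matches_surface_ethernet(r"USB\VID_0000&PID_0000"))
print(DRIVER_FILES[("Windows 10", "x64")])
```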

Where to get the driver:

Starting with Windows 10 Version 1607, the Surface Ethernet driver is included in the box, so deployments of that version should just work. Note that the inbox version is 10.2.504.2016, and a later version has already been released.
The .MSI package, which is used to update existing installs of Surface, contains the Surface Ethernet drivers.
The .ZIP package used for bare-metal deployments (for example, Surface Pro 4) does not contain the Surface Ethernet drivers.

To download the latest drivers for Surface Ethernet adapter, do the following:

Browse to the Microsoft Update Catalog.
In the search box enter Surface Ethernet Drivers.

When looking at the drivers pay special attention to the version column. At the time of this blog publication the latest drivers were:

Windows 10: 10.2.704.2016
Windows 8.1: 8.18.303.2015
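
Because driver versions are dotted numbers, they should be compared field by field rather than as plain strings. The following Python sketch shows one way to check whether a catalog version is newer than the inbox version mentioned earlier; the helper names are illustrative and not part of any Microsoft tool.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '10.2.704.2016' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def is_newer(candidate: str, baseline: str) -> bool:
    """Return True if candidate is a later version than baseline."""
    return parse_version(candidate) > parse_version(baseline)


INBOX_WINDOWS_10_1607 = "10.2.504.2016"   # inbox version noted in this post
LATEST_WINDOWS_10 = "10.2.704.2016"       # latest at time of publication

print(is_newer(LATEST_WINDOWS_10, INBOX_WINDOWS_10_1607))
```

Comparing integer tuples avoids the trap where a string comparison would rank "10.2.99.2016" above "10.2.704.2016".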

Notes:

Check back regularly for any new versions.
There are two entries: one for x86 and one for x64. The x86 driver is primarily for customers who have purchased the Surface Ethernet Adapter and are using it on another device, but you may also be booting an x86 boot image while installing an x64 operating system. Best practice is to include both in your deployments.
The drivers support the older 10/100 adapter (model 1552), the newer USB 3.0 Gigabit adapter (model 1663), and Surface docks.

Hope this helps with your deployments.

↧

New Office Update to address some of the scaling issues with Skype for Business 2016 and PowerPoint 2016

October 14, 2016, 3:04 pm

≫ Next: Building a KMS host on Windows 7

≪ Previous: Surface Ethernet Drivers

Hi Everyone,

This is Kim Johnson from the Windows 10 client supportability team. I wanted to share some exciting news about new updates for display scaling issues in Skype for Business 2016 and PowerPoint 2016. In previous blogs we discussed some of the issues encountered with display scaling on high-DPI devices:

· Display Scaling changes for the Windows 10 Anniversary Update

· Display Scaling in Windows 10

We also have a KB article that discusses these issues: Windows scaling issues for high-DPI devices

The Office team has released updates to address some of these scaling issues with Skype for Business 2016 and PowerPoint 2016. Please take a look at the following support article which tells you prerequisites and how to get the update:

· Office apps appear the wrong size or blurry on external monitors

↧

Building a KMS host on Windows 7

October 14, 2016, 9:31 pm

≫ Next: Windows Server 2016 Volume Activation Tips

≪ Previous: New Office Update to address some of the scaling issues with Skype for Business 2016 and PowerPoint 2016

Windows 7 with SP1

Support Lifecycle: https://support.microsoft.com/en-us/lifecycle?C2=14019

This blog post is part of a series of posts detailing the build process and activation capabilities of a KMS host on a particular host operating system. The operating system dictates which KMS host key (CSVLK) can be installed on that particular host, and that CSVLK determines which KMS-capable clients can be activated. When implementing KMS activation in an environment, it is best to determine all of the potential volume license operating systems for your KMS clients and then pick the best key. To simplify this, it is recommended that the most current KMS CSVLK be used, ensuring that all KMS-capable operating systems that have been released at that time can be activated. For newer KMS CSVLKs to be hosted on previously released operating systems, a hotfix is needed to make the host operating system aware of the newer operating system.

Note: Desktop KMS CSVLKs can only be installed on hosts with desktop operating systems (that support that CSVLK) and Server KMS CSVLKs can only be installed on hosts with server operating systems (that support that CSVLK).

This blog post pertains to a KMS host with Windows 7 with SP1 as the operating system.

Windows 7 can host the following desktop KMS CSVLKs:

Windows 7
Windows 8
Windows 8.1
Windows 10

The KMS CSVLKs can activate the following KMS clients:

KMS CSVLK	KMS Clients Activated	Hotfix Required
Windows 7	Windows Vista, Windows 7	None needed.
Windows 8	Windows Vista, Windows 7, Windows 8	As Windows 7 was released prior to Windows 8, it is not aware of Windows 8. KB Article 2757817 will address this.
Windows 8.1	Windows Vista, Windows 7, Windows 8, Windows 8.1	As Windows 7 was released prior to Windows 8.1, it is not aware of Windows 8.1. KB Article 2885698 will address this.
Windows 10	Windows Vista, Windows 7, Windows 8, Windows 8.1, Windows 10	As Windows 7 was released prior to Windows 10, it is not aware of Windows 10. KB Article 3079821 will address this.
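
The table above can also be expressed as data, which makes it easy to look up the hotfix for a given key. The Python sketch below is one possible encoding; the function name is an assumption for illustration, and it returns the oldest CSVLK that covers a set of clients, whereas this post recommends simply using the most current CSVLK.

```python
# CSVLK -> clients it activates and the KB hotfix a Windows 7 host needs (None if none).
CSVLK_TABLE = {
    "Windows 7": {
        "activates": {"Windows Vista", "Windows 7"},
        "hotfix_kb": None,
    },
    "Windows 8": {
        "activates": {"Windows Vista", "Windows 7", "Windows 8"},
        "hotfix_kb": "2757817",
    },
    "Windows 8.1": {
        "activates": {"Windows Vista", "Windows 7", "Windows 8", "Windows 8.1"},
        "hotfix_kb": "2885698",
    },
    "Windows 10": {
        "activates": {"Windows Vista", "Windows 7", "Windows 8", "Windows 8.1", "Windows 10"},
        "hotfix_kb": "3079821",
    },
}


def oldest_sufficient_csvlk(required_clients: set):
    """Return (csvlk_name, hotfix_kb) for the first key that covers every required client."""
    for name, entry in CSVLK_TABLE.items():  # dicts keep insertion order (oldest first)
        if required_clients <= entry["activates"]:
            return name, entry["hotfix_kb"]
    raise ValueError("No CSVLK hostable on Windows 7 covers: %s" % sorted(required_clients))


print(oldest_sufficient_csvlk({"Windows 7", "Windows 10"}))
```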

KMS Host Build Steps:

1. Install Windows 7 with SP1
2. Patch completely
3. If a firewall is used, verify that there is an exception for KMS
4. Obtain the desired CSVLK from the VLSC site
5. If the KMS CSVLK is newer than Windows 7, install the required hotfix as per the table above
6. Install the KMS CSVLK

a. Open an elevated command prompt
b. Run cscript.exe slmgr.vbs /ipk XXXXX-XXXXX-XXXXX-XXXXX-XXXXX using your KMS CSVLK
c. Wait for success message

7. Activate the KMS CSVLK

a. If system has external internet connectivity:

i. Open an elevated command prompt
ii. Run cscript.exe slmgr.vbs /ato
iii. Wait for success message

b. If system does not have external internet connectivity:

i. Phone activate with UI

1. Open an elevated command prompt
2. Run slui.exe 4 to open the Phone Activation wizard
3. Follow the prompts to complete

ii. Phone activate via command prompt

1. Open an elevated command prompt
2. Run cscript.exe slmgr.vbs /dti to obtain the installation ID
3. Call Microsoft’s Phone Activation using a phone number listed in %SystemRoot%\System32\SPPUI\Phone.inf
4. Follow the prompts to obtain the confirmation ID
5. Run cscript.exe slmgr.vbs /atp <ConfirmationID w/o hyphens> to apply the confirmation ID
6. Wait for a success message

8. Run cscript.exe slmgr.vbs /dlv and verify that the License Status indicates that the KMS host is licensed.
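
For the online path (steps 6, 7a, and 8), the sequence of commands can be sketched in Python as follows. This only builds and prints the command lines; it does not run them. The function name and the key-format check are assumptions added for illustration, and the key shown is the same placeholder used in the steps above.

```python
import re

SLMGR = ["cscript.exe", "slmgr.vbs"]

# Product keys are written as five groups of five characters separated by hyphens.
KEY_PATTERN = re.compile(r"^[A-Z0-9]{5}(-[A-Z0-9]{5}){4}$")


def kms_host_commands(csvlk: str) -> list:
    """Return the install, activate, and verify commands for an online KMS host."""
    if not KEY_PATTERN.match(csvlk.upper()):
        raise ValueError("CSVLK must look like XXXXX-XXXXX-XXXXX-XXXXX-XXXXX")
    return [
        SLMGR + ["/ipk", csvlk],  # step 6: install the KMS CSVLK
        SLMGR + ["/ato"],         # step 7a: activate online
        SLMGR + ["/dlv"],         # step 8: display detailed license information
    ]


for command in kms_host_commands("XXXXX-XXXXX-XXXXX-XXXXX-XXXXX"):
    print(" ".join(command))
```

Each command still needs to be run from an elevated command prompt, and each one should report success before you move on to the next.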

The Windows 7 KMS host is now ready to begin accepting KMS activation requests. The host needs to meet the minimum threshold of twenty-five unique KMS (desktop) client requests before it will begin activating KMS (desktop) clients. Until the minimum threshold is met, KMS (desktop) clients attempting to activate against this host will report error 0xC004F038, which indicates that the count reported by the KMS host is insufficient.

When the threshold is met, all KMS (desktop) clients requesting activation (that are supported by the CSVLK installed) will begin to activate. Those KMS clients that previously erred out with 0xC004F038 will re-request activation (default interval is 120 minutes) and will be successfully activated without any user interaction. An activation request can be prompted on a KMS client immediately by running cscript.exe slmgr.vbs /ato in an elevated command prompt.
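
The threshold behavior can be pictured with a deliberately simplified Python model. It is not the KMS protocol: the class, the client names, and the assumption that the request which brings the count to exactly twenty-five succeeds are all illustrative, and the real host tracks its count with more nuance than a simple set.

```python
class KmsHostCountModel:
    """Toy model: activation succeeds only once enough unique clients have asked."""

    THRESHOLD = 25  # minimum unique desktop client requests, per this post

    def __init__(self):
        self._clients = set()

    def request_activation(self, client_id: str) -> bool:
        """Record the client and report whether activation would succeed."""
        self._clients.add(client_id)
        return len(self._clients) >= self.THRESHOLD


host = KmsHostCountModel()
results = [host.request_activation("client-%02d" % n) for n in range(1, 26)]

print(results[:24].count(True))               # early requests fail (0xC004F038)
print(results[24])                            # threshold reached
print(host.request_activation("client-01"))   # an earlier client retries and succeeds
```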

Scenario:

You want to build a KMS host on Windows 7, to activate Windows 7 and Windows 10 KMS clients. Here are the steps necessary to achieve your goal.

1. Determine what CSVLK is needed – You determine that the CSVLK needed to activate both Windows 7 and Windows 10 is the Windows 10 CSVLK, as per this TechNet article, under the “Plan for Key Management Services activation” section.
2. Obtain the CSVLK – Log onto your Volume License Service Center site and locate the Windows 10 KMS key listed. Note this for Step #5.
3. Build a Windows 7 system from Volume License media and patch – Using volume license media, build a system or utilize a system that is already built. Completely patch the system using Windows Update or whatever solution you use for applying updates/hotfixes.
4. Apply the required hotfix – Because Windows 7 was released before Windows 10, the system needs to become aware of the newer operating system. Applying the hotfix from KB Article 3079821 will accomplish this and enable your Windows 7 KMS host to activate Windows 10 KMS clients (along with Windows 7, Windows 8, and Windows 8.1 KMS clients).
5. Install the CSVLK – Open an elevated command prompt. Install the CSVLK on the KMS host by running the following command: cscript.exe slmgr.vbs /ipk <your CSVLK as it appears on the VLSC site>
6. Activate the CSVLK – In the elevated command prompt, activate the CSVLK by running the following command: cscript.exe slmgr.vbs /ato
7. Verify – In the elevated command prompt, display the licensing information by running the following command: cscript.exe slmgr.vbs /dlv
8. Phone activate if necessary – If you have issues with online activation in Step #6, you can open the Phone Activation wizard by running slui.exe 0x4 and following the prompts to activate your system. Once complete, repeat the verification in Step #7 if necessary.

The KMS host is now ready to begin activating any Windows 7, Windows 8, Windows 8.1, and Windows 10 KMS clients. Here is a quick video to show the steps.

Note: As a reminder, the minimum threshold of twenty-five KMS client activation requests must be met before this new KMS host begins activating clients, as described after Step #8 under “KMS Host Build Steps” above.

References:

“Planning for Volume Activation” – https://technet.microsoft.com/en-us/library/dd996589.aspx
“Determine Product Keys Needs” – https://technet.microsoft.com/en-us/library/ff793411.aspx
“Understanding KMS” – https://technet.microsoft.com/en-us/library/ff793434.aspx
“Deploying KMS Activation” – https://technet.microsoft.com/en-us/library/ff793409.aspx
“Reactivating Computers” – https://technet.microsoft.com/en-us/library/ff793428.aspx

Links for other blogs in this series:

Windows Server 2016 Volume Activation Tips

https://blogs.technet.microsoft.com/askcore/2016/10/19/windows-server-2016-volume-activation-tips/

↧

Windows Server 2016 Volume Activation Tips

October 19, 2016, 6:29 pm

≫ Next: Adobe Flash support on Windows Server 2016

≪ Previous: Building a KMS host on Windows 7

Hi,

This is Scott McArthur, a Supportability Program Manager for Windows and Surface. With the launch of Windows Server 2016 I wanted to share some information on volume activation:

Updating your existing KMS hosts to support Windows Server 2016
Setting up a new Windows Server 2016 KMS host
Activating Windows 10 Enterprise 2016 LTSB

Updating existing KMS Hosts

If your KMS host is Windows Server 2012 you need to install the following updates

If your KMS host is Windows Server 2012 R2 you need to install the following updates:

Once updated, you need to obtain a Windows Server 2016 CSVLK. Do the following:

Log on to the Volume Licensing Service Center (VLSC).
Click License.
Click Relationship Summary.
Click the License ID of your current Active License.
After the page loads, click Product Keys.
Look for a key called “Windows Srv 2016 DataCtr/Std KMS”.

If you are unable to locate your product key, please contact the Volume Licensing Service Center.

Once you have the key, run the following commands at an elevated command prompt:

1. Install the Windows Server 2016 CSVLK

Cscript.exe %windir%\system32\slmgr.vbs /ipk <insert Windows Srv 2016 DataCtr/Std KMS CSVLK here>

2. Activate the Windows Server 2016 CSVLK

Cscript.exe %windir%\system32\slmgr.vbs /ato

Windows Server 2008 R2 is not supported as a KMS host for Windows Server 2016 or Windows 10 Enterprise 2016 LTSB edition.
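
The host requirements in this section can be summarized as a small lookup. The Python sketch below only restates what this post says; the dictionary and function names are illustrative, and the specific updates for Windows Server 2012 and 2012 R2 are not encoded here.

```python
# Support status for hosting the Windows Server 2016 CSVLK, as described in this post.
HOST_SUPPORT_FOR_2016_CSVLK = {
    "Windows Server 2008 R2": "not supported",
    "Windows Server 2012": "supported after installing the required updates",
    "Windows Server 2012 R2": "supported after installing the required updates",
    "Windows Server 2016": "supported",
}


def host_status(os_name: str) -> str:
    """Look up whether an existing KMS host can serve the Windows Server 2016 CSVLK."""
    return HOST_SUPPORT_FOR_2016_CSVLK.get(os_name, "not covered by this post")


for name in ("Windows Server 2008 R2", "Windows Server 2012 R2"):
    print(name, "->", host_status(name))
```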

Setting up new Windows Server 2016 KMS host

If you want to set up a new Windows Server 2016 KMS host, you can normally use the Volume Activation Services role wizard or the command line to configure the KMS host.

We are aware of an issue where the Volume Activation Services role wizard reports the error “vmw.exe has stopped working” during the product key management phase.

Microsoft is investigating this issue and will update this blog when a fix is available. In the meantime, you will need to configure the KMS host using the steps below:

1. Open elevated cmd prompt

2. Install the Windows Server 2016 CSVLK

cscript.exe %windir%\system32\slmgr.vbs /ipk <insert Windows Srv 2016 DataCtr/Std KMS CSVLK here>

3. Activate the Windows Server 2016 CSVLK

Cscript.exe %windir%\system32\slmgr.vbs /ato

If the system does not have internet connectivity, do the following to activate via the command line:

1. Open an elevated command prompt

2. Obtain the Installation ID

Cscript.exe %windir%\system32\slmgr.vbs /dti

3. Look up the Microsoft phone activation number for your region in %windir%\System32\SPPUI\Phone.inf

4. Call the number and follow the prompts to obtain the confirmation ID

5. Apply the confirmation ID (do not include hyphens)

Cscript.exe %windir%\system32\slmgr.vbs /atp <ConfirmationID>

6. Wait for a success message

7. Verify that the license status shows licensed:

Cscript.exe %windir%\system32\slmgr.vbs /dlv
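
Since step 5 requires the confirmation ID without hyphens, a small helper can strip them before the command is assembled. The Python sketch below is one way to do that; the function names and the sample confirmation ID are made up for illustration, and the script only prints the command lines to paste into an elevated command prompt.

```python
SLMGR = r"cscript.exe %windir%\system32\slmgr.vbs"


def normalize_confirmation_id(raw: str) -> str:
    """Remove hyphens and spaces, and make sure only digits remain."""
    digits = raw.replace("-", "").replace(" ", "")
    if not digits.isdigit():
        raise ValueError("Confirmation ID should contain only digits and hyphens")
    return digits


def offline_activation_commands(confirmation_id: str) -> list:
    """Return the command lines for applying the confirmation ID and verifying."""
    return [
        SLMGR + " /atp " + normalize_confirmation_id(confirmation_id),  # step 5
        SLMGR + " /dlv",                                                # step 7
    ]


# Made-up confirmation ID, for illustration only.
for line in offline_activation_commands("000000-111111-222222"):
    print(line)
```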

Windows 10 Enterprise 2016 LTSB Edition volume activation

Note: In addition to activating Windows Server 2016, the “Windows Srv 2016 DataCtr/Std KMS” KMS host key (CSVLK) also activates Windows 10 Enterprise 2016 LTSB edition.

Hope this helps with your Windows Server 2016 deployments.

Scott McArthur

↧

Adobe Flash support on Windows Server 2016

November 8, 2016, 3:08 pm

≫ Next: Options for Installing and Maintaining Windows Applications

≪ Previous: Windows Server 2016 Volume Activation Tips

With the changes that have occurred to further align the Windows Server and Client desktop experiences, one of the most commonly asked questions is “Does Windows Server 2016 ship with Adobe Flash?”

Answer: Adobe Flash is included and does ship on the Windows Server 2016 media. It can be installed by adding the Remote Desktop Session Host (RDSH) role.

Your thoughts, questions and feedback are very valuable to us and we encourage you to share them in the comments section below.

Derk Benisch
Senior Program Manager

↧

Options for Installing and Maintaining Windows Applications

November 11, 2016, 2:21 pm

≫ Next: Why Bitlocker takes longer to complete the encryption in Windows 10 as compared to Windows 7

≪ Previous: Adobe Flash support on Windows Server 2016

Starting with the Windows 1607 release, MSN News and MSN Finance are no longer included with the Windows installation, but they are still available via Windows Store.

For most, this change will be seamless.

If users upgrade from a previous version of Windows 10, applications that were previously installed will update automatically.
If users upgrade to Windows 10 1607 from a previous version of Windows (7 or 8) or do a clean installation (no previous install), the system will have the default apps available, and any modern apps associated with a Microsoft account will be listed in the Store app under My Library. MSN News and MSN Finance can be quickly reinstalled on the machine from there.

In some Enterprise or Domain scenarios, the standard Windows Store app may not be available or network restrictions make access less than optimal. I want to outline the three methods for adding apps to Windows 10 and call out caveats to each.

First is the default option where users access and download apps via Windows Store.

Users need to use a Microsoft Account (MSA) to get apps
All apps will be tied to that user’s MSA account (as seen in My Library)
Systems need access to the Internet to install and update apps

The second option is the Windows Store for Business.

System Admins/Organizations/Enterprises can choose to publish only the apps that are desired in their environment.
This requires an Azure Active Directory (AAD) account, either one for admins only or AAD accounts for all users.

Note: The standard Windows Store can then be hidden from users by Group Policy (User Configuration\Policies\Administrative Templates\Windows Components\Store\Only display the private store within the Windows Store app)

This is documented here:

https://technet.microsoft.com/en-us/itpro/windows/manage/windows-store-for-business

The third option for installing Apps is called “sideloading”.

For applications with an offline license, Appx packages can be added into a base image for deployment or pushed via Group Policy.
This can be used when Windows Store is disabled via Group Policy, MSA accounts are blocked or AAD accounts are not available.
The drawback here is that the applications will not update via Windows Update or the Store engine. Admins will have to check for updated versions and push updates manually.

This is documented here:

https://technet.microsoft.com/en-us/itpro/windows/deploy/sideload-apps-in-windows-10
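
To summarize the three options, the Python sketch below maps the constraints described above to the installation methods that remain available. The function and parameter names are invented for illustration, and the rules are only a reading of the caveats listed in this post, not an official decision procedure.

```python
from typing import List


def available_install_options(
    store_enabled: bool,
    msa_allowed: bool,
    internet_access: bool,
    aad_available: bool,
    offline_license: bool,
) -> List[str]:
    """Return the app installation methods that fit the given environment."""
    options = []
    # Windows Store: users need a Microsoft Account and Internet access.
    if store_enabled and msa_allowed and internet_access:
        options.append("Windows Store")
    # Windows Store for Business: requires an Azure Active Directory account.
    if aad_available:
        options.append("Windows Store for Business")
    # Sideloading: works for apps with an offline license, even without Store, MSA, or AAD.
    if offline_license:
        options.append("Sideloading")
    return options


# Example: Store disabled by Group Policy, no AAD, but an offline-licensed Appx is on hand.
print(available_install_options(
    store_enabled=False,
    msa_allowed=False,
    internet_access=True,
    aad_available=False,
    offline_license=True,
))
```

Keep in mind that sideloaded applications do not update through Windows Update or the Store, so that option carries a manual maintenance cost.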

In conjunction with System Center:

https://blogs.msdn.microsoft.com/teju_shyamsundar/2016/05/30/deploy-an-application-from-windows-store-for-business-via-system-center-configuration-Manager/

Here is more information on managing access to the Windows Store:

https://technet.microsoft.com/en-us/itpro/windows/manage/stop-employees-from-using-the-windows-store

Kim Johnson
Sr. Support Escalation Engineer
Windows Beta

↧