Lessons Learned with Windows Update for Business

Windows Update for Business (WUfB) changed the paradigm for patch management in Windows. In the WSUS days, multiple patches were assessed, tested, and released every month. It was a model of “press go when ready”: each patch had to be released individually. With WUfB in Windows 10, that model changed. Instead of tasking IT with releasing a patch, the patch was released by default, and IT was tasked with “pressing stop” if an issue was encountered. Updates were also split into two categories: quality updates (which mostly contained security patches) and feature updates (which delivered new major versions of the OS).

In my last role, I embraced the WUfB model from the time we released Windows 10. Overall, I think that approach matured into a very successful strategy. However, we worked through a number of challenges as we balanced the twin desires to be as forward-leaning as possible while minimizing the impact of a bad update.

Quality Updates

In managing quality updates, we settled on a deferral period for the bulk of the fleet while a slice of the fleet received the patch on day 0. Not only did that allow us to potentially catch issues before the bulk of the enterprise got the update, but realistically it also gave the wider community time to identify issues. The most valuable feedback I got, in fact, came from my news feed, through articles about boot issues, internet access, and, my personal favorite, Microsoft Defender turning off! So instead of hunting for needles in our own haystack, we bought ourselves time to see what others had found and assess those issues in our environment. In combination with our deadlines policy, we were able to maintain a 95%+ patch compliance rate month to month, and we had the capability to tactically pause patching on the bulk of the fleet if we encountered anything worrisome.

Feature Updates

For feature updates, we faced a steeper learning curve, mostly because we leaned so far forward on staying on the latest version of Windows. We had embraced the Windows as a Service model and strove to start updating the bulk of our enterprise between 90 and 120 days after a feature update was released. As we heard often from our contacts on the product team, few of our peers were doing that.

To make that strategy work, we first had to put a lot of thought into our rings. Some rings were assigned automatically using dynamic groups; others were opt-in, using assigned groups. We had to ensure that members of the opt-in rings were excluded from the dynamic policies so that a user would only ever be a member of one ring. The opt-in rings included rings for Insider builds, 0-day installs, and 30-day deferrals. Using similar logic, we also built a "canary" ring (think canary in the coal mine) that we aimed to start upgrading about 60 days after release, with a target of 5-10% of the enterprise in that group.

We divided the rest of the enterprise into 16 general rings so we could space each ring about 3 days apart, starting at a 90-ish day deferral. That schedule changed every cycle and did require upkeep; if the new feature update came out in September, we did not want to start patching the masses in December over the holidays. We came up with what I thought was a clever solution for assigning users to a ring: use the first character of their User ID. Since the user ID is a GUID, and a GUID contains only hex characters, that gave us 16 effectively random buckets that distributed and spaced out the load. Lastly, we had a Late Adopters ring with a significantly longer deferral time (180-270 days) that ended up proving invaluable for managing issues. No one was permanently assigned to that group; however, if we found in our canary testing that a certain model or the users of a particular application ran into critical issues, we could add them there before the update hit the general population.
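As a rough illustration of the general-ring logic described above, here is a minimal Python sketch. It is not the dynamic group rule we actually used; the function names, the mapping of hex digit to ring number, and the default values of 90 days and 3 days are assumptions taken from the approximate figures in the text. It also assumes the GUIDs are randomly generated, so that the first character is spread roughly evenly across the 16 hex digits.

```python
from datetime import date, timedelta

HEX_DIGITS = "0123456789abcdef"


def ring_for_user(user_id: str) -> int:
    """Map a GUID-style user ID to one of 16 rings (0-15) by its first hex character."""
    cleaned = user_id.strip().lower()
    if not cleaned or cleaned[0] not in HEX_DIGITS:
        raise ValueError(f"not a hex GUID: {user_id!r}")
    return HEX_DIGITS.index(cleaned[0])


def ring_start_date(release: date, ring: int,
                    base_deferral_days: int = 90,
                    spacing_days: int = 3) -> date:
    """Date a ring starts upgrading: base deferral plus a fixed gap per ring (assumed values)."""
    if not 0 <= ring < 16:
        raise ValueError("ring must be between 0 and 15")
    return release + timedelta(days=base_deferral_days + ring * spacing_days)


if __name__ == "__main__":
    release = date(2020, 1, 1)  # hypothetical release date
    ring = ring_for_user("a3f1c2d4-0000-4000-8000-000000000000")  # made-up GUID
    print(ring, ring_start_date(release, ring))
```

With these assumed defaults, the last general ring starts 135 days after release, which is why the base deferral needs adjusting each cycle to steer clear of periods like the December holidays.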

Feedback and Analytics

However, that strategy is only as good as the feedback you get on how well the upgrade actually works. We couldn’t rely on the news feeds as much because few enterprises were as on board with WUfB as we were. And feature updates were definitely not as reliable as quality updates. We learned some hard lessons early on about audio driver incompatibility, and one especially hard lesson where the whole Windows Hello stack crashed on a particular model that had started life on a particular build of Windows.

What matured our strategy was our use of ticketing data and the analytics capabilities built into ServiceNow. We had gotten all our Help Desk teams onto ServiceNow (a monumental achievement in itself), which enabled us to build reporting on those early rings. We built reports that looked only at people in those early rings so we could analyze overall ticketing trends: if there was a spike within 7 days of a ring going live, we knew there was an issue. We could then generate charts in ServiceNow that looked for key words in the ticket summary and description for failures we had seen before (“audio,” “authentication,” “update,” etc.), as well as read through incidents to look for trends. While individual help tickets may lack details (your help desk is made of people, and some people are better at writing stuff down than others), at our scale patterns still emerged. And by running these reports between 0 and 90 days after a feature update release, we bought ourselves time to dive into root cause and react intelligently using our Late Adopters ring.
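The reporting itself lived in ServiceNow, but the underlying idea is simple enough to sketch in Python. The example below is only an illustration under stated assumptions: the ticket fields (`opened`, `summary`, `description`), the sample tickets, and the function names are all hypothetical, and it does not call any ServiceNow API. It filters tickets to those opened within 7 days of a ring's go-live date and counts how many mention each key word.

```python
from collections import Counter
from datetime import date

KEYWORDS = ("audio", "authentication", "update")  # key words from failures seen before


def tickets_in_window(tickets, go_live, window_days=7):
    """Keep tickets opened on or within `window_days` days after the ring went live."""
    return [t for t in tickets
            if 0 <= (t["opened"] - go_live).days <= window_days]


def keyword_counts(tickets, keywords=KEYWORDS):
    """Count tickets whose summary or description mentions each key word."""
    counts = Counter()
    for t in tickets:
        text = f"{t['summary']} {t['description']}".lower()
        for kw in keywords:
            if kw in text:
                counts[kw] += 1
    return counts


# Hypothetical tickets from users in an early ring.
SAMPLE_TICKETS = [
    {"opened": date(2021, 3, 2), "summary": "No sound after reboot",
     "description": "Audio driver missing after the update"},
    {"opened": date(2021, 3, 5), "summary": "Cannot sign in",
     "description": "Windows Hello authentication fails"},
    {"opened": date(2021, 4, 20), "summary": "Audio crackling",
     "description": "Headset crackles on calls"},
]

if __name__ == "__main__":
    recent = tickets_in_window(SAMPLE_TICKETS, go_live=date(2021, 3, 1))
    print(len(recent), dict(keyword_counts(recent)))
```

A plain substring match like this is crude (it will also match words such as "updated"), which is one reason reading through the incidents themselves remained part of the process.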

Summary

In summary, I think of IT management as a balance of two goals: doing good and avoiding evil. You want to do the good of patching systems as quickly as possible and offering your users the latest features. But you want to avoid the evil of a bad patch that cripples your fleet. Leveraging WUfB with phased deployments, together with the insight that ServiceNow analytics provides, helped us manage those two goals in our fleet.


* The views expressed are my own and do not necessarily represent those of my employer.

Pat Esposito

Driving business outcomes, accelerating digital transformation and creating better user experiences

3y

Good triumphs over evil. #SamRocksIT

Akshay Mehta

Technology Strategist - Software and Digital Platforms

3y

Sam you and the team have been instrumental in helping us create better product. You were the first ones to embrace the new deployment model and have set benchmark for other enterprises to follow! Cheers 😊


This!!! Sam, we've advanced a ton with WUfB in part due to the experiences and advice you and team shared. We're overdue for an update (see what I did there).

Joel Thomas

Cyber Security Specialist

3y

This is a great, Well Written!



More articles by Samuel Grummons
