Straight and Narrow Code For Safe Windows Path Updates

Windows automation inevitably involves updating path environment variables. The most common is the Windows path (Variable: PATH), but increasingly we are all having to pay attention to the PowerShell module path (Variable: PSModulePath).

Over time I have learned most of the mistakes you can make when manipulating these paths first hand (Yep, there is a special flat spot on my forehead just for where I’ve banged it on the desk due to path issues).

The four main items that trip most people up are:

  1. That the Process level PATH variable is a combination of both the Machine and User paths.
  2. That any environment variables stored in the registry strings for Machine and User are fully expanded in the Process level PATH variable.
  3. Forgetting that many other installations update the path - which means you really cannot have a “global” view of what is in all paths in your entire company, which makes the next point even more problematic…
  4. Manipulating the path as a string with search and replace or regular expressions - especially given that there is generally no way to know the sum total of all paths that might be contained on all the machines that you will target with your automation - there is a better way ;)
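
To see points 1 and 2 for yourself, it helps to look at each scope separately. The following is only a minimal sketch; the helper name Get-PathByScope is made up for this illustration and is not part of the functions later in this post.

```powershell
# Hypothetical helper: returns the entries of a path variable for each scope
function Get-PathByScope ($PathVariable = 'PATH')
{
  $result = [ordered]@{}
  foreach ($scope in 'Machine','User','Process')
  {
    # Quoting $raw means a scope with no value yields an empty list, not an error
    $raw = [Environment]::GetEnvironmentVariable($PathVariable, $scope)
    $result[$scope] = @("$raw".Split(';') | Where-Object { $_ })
  }
  $result
}

$paths = Get-PathByScope
$paths['Machine']   # entries stored machine wide
$paths['User']      # entries stored for the current user
$paths['Process']   # what the current process sees: both combined, fully expanded
```

Comparing the three lists on a real machine usually makes it obvious why writing the Process value back to the Machine scope is a bad idea.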

Let’s jump right into a list of path manipulation mistakes so that you might find the one that brought you here, and so that you gain an appreciation for the challenge of coding to address some of the ones you haven’t even tripped over yet ;)

Path Manipulation Mistakes

  • Assuming nice clean paths - that none of the following mistakes have ALREADY BEEN MADE by something that ran before your automation (either automation scripting or software installers).
  • Blindly adding a path without checking if the path you wish to add already exists on the path you wish to add it to (thereby duplicating it).
  • Forgetting to check if the path you wish to add is already there both with and without a trailing slash - both are valid and it’s surprising how many times this detail causes duplicated paths.
  • Using case-sensitive checks when checking if the path already exists.
  • Not handling environment variables embedded in paths correctly. If you check the process environment variable (%PATH%), environment variable expansions are already done. If you have Java installed your “PATH” probably contains "%JAVA_HOME%\bin", however, if you examine $env:path, you will see that %JAVA_HOME% is expanded. So if you’re not careful you will read $env:path and then overwrite the registry path after adding your path. You end up changing %JAVA_HOME% to a literal reference - which you won’t catch until the next Java update when %JAVA_HOME% changes. So you want to ensure that you are looking at and replacing values with UNEXPANDED environment variables.
  • Reading the expanded, combined PATH variable - adding to it - and overwriting the system path. The PATH variable in a process is [a] a combination of both user and machine paths and [b] has all environment variables expanded. When you read this conglomerate path and write it back to the system path, you propagate user paths into the system wide path. Some of those user paths may point to locations that are only relevant to the user profile where the path update was made.
  • Not being able to remove paths with embedded environment variables due to checking for and attempting to remove the EXPANDED value.
  • Checking the process PATH variable for existence of a path and skipping the machine path update because there was a matching value in the user path - the result is that the path is not available to other users.
  • Updating the Machine path, but not also updating the current PROCESS so that the path is available immediately - sometimes automators take reboots to resolve this - but the included code can be used to update the current process so that the new location resolves on the system path immediately.
  • Not removing all duplicate instances of a path when doing removals - duplicates are sometimes there due to bad installers or your own previous practices - removals should remove all identical copies of the same path.
  • Parsing problems with too many or too few semi-colon delimiters (';')
  • Not using the same discipline and logic with path variables other than PATH with all the same functionality (e.g. PsModulePath).
  • Not appropriately handling the fact that existing paths and some that you are working with contain spaces (or killing many hours coding to handle it properly).
  • Adding an unexpanded environment variable to the Process path - this does not work because Windows path variable processing assumes that all environment expansions were done at process setup time.
  • Sometimes parts of a path are in the PATH twice because a path and a subpath of that path are both on path (e.g. “C:\Program Files\ABC” and “C:\Program Files\ABC\Templates”). If you manipulate it with string or regex functions and do not remove the longer path first, you are at risk of soiling the path with an entry for the exact string “\Templates”.
  • Using regex or standard search / replace logic to manipulate path - I could write a book on this one - it is very hard to craft a regex or search replace that does what you expect, but nothing more. Add to that all kinds of character escaping, the possibility of partial matches and the handling of spaces, and it gets rough fast.
  • All of the above mistaken practices can lead to a longer and longer path with invalid or duplicate entries - this means any calls anywhere in the system that rely on the path start to take longer and longer.
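
The sub-path mistake is easy to reproduce with nothing but strings. This is a small sketch using made-up sample paths (including a doubled semi-colon and a differently cased duplicate with a trailing slash); it compares a plain string replace with the array approach used later in this post.

```powershell
# Sample data only - these directories are invented for the illustration
$path   = 'C:\Program Files\ABC;C:\Program Files\ABC\Templates;;c:\program files\abc\;C:\Windows'
$remove = 'C:\Program Files\ABC'

# Naive: a plain string replace also hits the longer sub-path, misses the
# differently cased duplicate and leaves stray delimiters behind
$naive = $path.Replace($remove, '')
$naive   # ;\Templates;;c:\program files\abc\;C:\Windows

# Array based: compare whole entries, ignoring case and a trailing backslash,
# and drop empty entries along the way
$kept  = @($path -split ';' | Where-Object { $_ -and ($_.TrimEnd('\') -ine $remove) })
$clean = $kept -join ';'
$clean   # C:\Program Files\ABC\Templates;C:\Windows
```

The array version never looks inside an entry, so a path and its sub-paths cannot interfere with each other.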

Design Elements Of the Included Code To Reduce Path Manipulation Mistakes

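  • The goal is to deal with all path data with the environment variables UNEXPANDED. The [Environment] type reads the stored Machine or User value instead of the combined Process value, but be aware that its GetEnvironmentVariable method expands REG_EXPAND_SZ registry values on read, so a truly unexpanded read needs the registry API with the DoNotExpandEnvironmentNames option.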
  • Using the [Environment] type forces the picking of a path scope or context => Machine, User or Process - you have to pick one.
  • Breaking the path into an array: [a] avoids regular expressions, search and replace and a myriad of other parsing issues (remember many other things may have updated the path and so you truly can’t know all the items that might match your search expression), [b] easy removal of duplicate paths, [c] eliminates semi-colon delimiter frustrations.
  • Idempotent code makes sure it only adds the path if it does not already exist. This is why the function name starts with “Ensure”.
  • case insensitivity ensures things work when your path case does not match exactly.
  • supports adding to the start or end of the path - defaults to the end as that is most common.
  • cleans up the ever so common empty path entries from previous bad removals (double semi-colons).
  • Errors upon an attempt to add an unexpanded environment variable to a Process level path. The environment variable will not be expanded by Windows when searching the Process path as it assumes all environment variable expansion was done at process setup time. This only applies to Process level paths, which are never persistently stored when the process ends.
  • These functions are designed for compactness so that you can insert them inline into existing single purpose scripts - so in the name of compactness, they do not follow PowerShell best practices for parameters, embedded help or coding style. I am finding I especially favor this approach when building PowerShell chunks that run as part of a larger orchestration system like Packer or CloudFormation.
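
When it matters that entries like %JAVA_HOME%\bin stay exactly as stored, one option is to read the registry value directly and ask .NET not to expand it. The sketch below is an assumption-laden illustration rather than part of the functions in this post: Get-UnexpandedPath is a hypothetical name, and the key locations are the standard registry homes for Machine and User environment variables. It is Windows only.

```powershell
# Hypothetical helper: reads a stored path variable without expanding %VARS%
function Get-UnexpandedPath ($PathVariable = 'PATH', $Scope = 'Machine')
{
  if ($Scope -ieq 'Process')
  {
    # Process values live only in memory and are always already expanded
    throw 'Only Machine and User values are stored in the registry'
  }
  if ($Scope -ieq 'Machine')
  {
    $hive   = [Microsoft.Win32.Registry]::LocalMachine
    $subKey = 'SYSTEM\CurrentControlSet\Control\Session Manager\Environment'
  }
  else
  {
    $hive   = [Microsoft.Win32.Registry]::CurrentUser
    $subKey = 'Environment'
  }
  $key = $hive.OpenSubKey($subKey)   # opened read-only
  if (-not $key) { return '' }
  try
  {
    # DoNotExpandEnvironmentNames returns the raw string as it was written
    $key.GetValue($PathVariable, '', [Microsoft.Win32.RegistryValueOptions]::DoNotExpandEnvironmentNames)
  }
  finally
  {
    $key.Close()
  }
}
```

On a Windows machine, Get-UnexpandedPath 'PATH' 'Machine' can be compared with [Environment]::GetEnvironmentVariable('PATH','Machine') to check whether any entries differ by expansion.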

Some Other Nice-To-Knows

  • If you need the path immediately added or removed from your current process - then also execute the function using the “Process” scope.
  • Service Manager only reads its path on system bootup. You might update the Machine path with something you wish to use in subsequent automation, and even update the current process path, but find that the path still isn’t available. This happens with orchestration technologies that use a service to run each segment of code in a new process. The current process is not relevant to new segments of code, and the service hosting the automation won’t see the update until you reboot the system and Service Manager gets the new path. This is exactly the problem with Packer automation where you install some software with a system path update in one “provisioner” and try to use it in another. There are two solutions: [1] reboot, or [2] keep adding that path to any subsequent automation that needs it until a reboot is performed. Option 1 is the best if you can’t predict what might need the path before the next reboot or you have complex or “dynamically composed” automation stacks.
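
Option 2 can be as small as a guard at the top of each later segment. This is a minimal sketch, assuming the hypothetical helper name Add-ToProcessPathIfMissing and a made-up install directory; it only touches the current process, so nothing is persisted.

```powershell
# Hypothetical helper: makes a directory resolvable in the current process only
function Add-ToProcessPathIfMissing ($Dir)
{
  $sep     = [IO.Path]::PathSeparator   # ';' on Windows
  $entries = @("$env:PATH".Split($sep) | Where-Object { $_ })
  $trimmed = "$Dir".TrimEnd('\')
  # Match with or without a trailing backslash, ignoring case
  if (($entries -inotcontains $trimmed) -and ($entries -inotcontains "$trimmed\"))
  {
    $env:PATH = (@($entries) + $trimmed) -join $sep
  }
}

# Example call at the start of a later provisioner (directory is invented)
Add-ToProcessPathIfMissing 'C:\Program Files\ABC'
```

Because the check is idempotent, the same line can be pasted into every segment without growing the path on repeated runs.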

This Code In Production

Although it looks quite a bit different, I used this code to update a Chocolatey helper to work correctly: https://github.com/chocolatey/choco/blob/master/src/chocolatey.resources/helpers/functions/Install-ChocolateyPath.ps1 and to create a new Chocolatey helper for uninstall: https://github.com/chocolatey/choco/blob/master/src/chocolatey.resources/helpers/functions/Uninstall-ChocolateyPath.ps1. In both cases this code is not in the links indicated until the pull requests are merged for the Chocolatey 0.10.4 release.

Updated Code

The below code’s primary home is on the following repository (where it might be improved upon compared to the below): https://gitlab.com/missionimpossiblecode/MissionImpossibleCode

Ensure-OnPath Function

Function Ensure-OnPath ($PathToAdd,$Scope,$PathVariable,$AddToStartOrEnd)
{
  # Defaults are set inline to keep the function compact
  If (!$Scope) {$Scope='Machine'}
  If (!$PathVariable) {$PathVariable='PATH'}
  If (!$AddToStartOrEnd) {$AddToStartOrEnd='END'}
  # Windows does not expand variables when searching the Process level path
  If (($PathToAdd -ilike '*%*') -AND ($Scope -ieq 'Process')) {Throw 'Unexpanded environment variables do not work on the Process level path'}
  write-host "Ensuring `"$pathtoadd`" is added to the $AddToStartOrEnd of variable `"$PathVariable`" for scope `"$scope`" "
  # Quoting the result guards against a variable that does not exist yet; empty entries are dropped
  $ExistingPathArray = @("$([Environment]::GetEnvironmentVariable($PathVariable,$Scope))".split(';') | Where-Object {$_})
  # Compare with and without a trailing backslash, whichever form was passed in
  $Trimmed = "$PathToAdd".TrimEnd('\')
  if (($ExistingPathArray -inotcontains $Trimmed) -AND ($ExistingPathArray -inotcontains "$Trimmed\"))
  {
    If ($AddToStartOrEnd -ieq 'START')
    { $Newpath = @("$PathToAdd") + $ExistingPathArray }
    else 
    { $Newpath = $ExistingPathArray + @("$PathToAdd")  }
    $AssembledNewPath = ($newpath -join(';')).trimend(';')
    [Environment]::SetEnvironmentVariable("$PathVariable",$AssembledNewPath,"$Scope")
  }
}

#Test code
Ensure-OnPath '%TEST%\bin'
$env:ABC = 'C:\ABC'
Ensure-OnPath '%ABC%' 'Machine' 'PSModulePath' 'START'
Ensure-OnPath 'C:\ABC' 'Process' 'PSModulePath' 'START' #Make available in current process, can't use environment variables

#Show Modification Results
[Environment]::GetEnvironmentVariable("PATH","Process")
[Environment]::GetEnvironmentVariable("PSModulePath","Machine")
[Environment]::GetEnvironmentVariable("PSModulePath","Process")

Ensure-RemovedFromPath Function

Function Ensure-RemovedFromPath ($PathToRemove,$Scope,$PathVariable)
{
  If (!$Scope) {$Scope='Machine'}
  If (!$PathVariable) {$PathVariable='PATH'}
  # Quoting the result guards against a variable that does not exist yet
  $ExistingPathArray = @("$([Environment]::GetEnvironmentVariable($PathVariable,$Scope))".split(';'))
  write-host "Ensuring `"$PathToRemove`" is removed from variable `"$PathVariable`" for scope `"$scope`" "
  # Match with and without a trailing backslash, whichever form was passed in
  $Trimmed = "$PathToRemove".TrimEnd('\')
  if (($ExistingPathArray -icontains $Trimmed) -OR ($ExistingPathArray -icontains "$Trimmed\"))
  {
    # Keep every non-empty entry that is not a copy of the path being removed
    [string[]]$Newpath = @()
    foreach ($path in $ExistingPathArray)
    {
      If ($Path)
      {
        If (($path -ine $Trimmed) -AND ($path -ine "$Trimmed\"))
        {
          $Newpath += "$path"
        }
      }
    }
    $AssembledNewPath = ($Newpath -join(';')).trimend(';')
    [Environment]::SetEnvironmentVariable("$PathVariable",$AssembledNewPath,"$Scope")
  }
}

#Test code (undoes changes from Ensure-OnPath test code)
Ensure-RemovedFromPath '%TEST%\bin'
Ensure-RemovedFromPath '%ABC%' 'Machine' 'PSModulePath'
Ensure-RemovedFromPath 'C:\ABC' 'Machine' 'PSModulePath'

#Show Modification Results
[Environment]::GetEnvironmentVariable("PATH","Machine")
[Environment]::GetEnvironmentVariable("PSModulePath","Machine")
[Environment]::GetEnvironmentVariable("PSModulePath","Process")