If you use -split as often as I do, you have probably had the occasional problem understanding why certain splits work as expected while others produce results different from what you anticipated. Recently one of the MVPs, Oisin Grehan, shed some light on why this is the case. I thought it might be worth sharing in case you've had this issue and are pulling your hair out trying to figure out what's going on.
Let's start with something simple. Normally you want to split a larger string on a single character. A simple example is a MAC address:
'00-27-10-1A-55-14' -split '-'
The problem starts when you need a more complex regular expression as the splitting pattern. I often found myself using groups within the pattern, and the split would then return more elements than I needed or wanted, some of them unexpected. Very often I don't have to use these groups at all; I just use them because the pattern is sometimes easier to read that way. Example:
$Mac = '00-27-10-1A-55-14'
$Mac -split '(-|:)'
This time I have no idea which separator the user will pick, so I split on both possible options. The result is different because the text captured by the group is also added to the output: each matched delimiter shows up as its own element between the octets. There are two ways to avoid these additional elements: don't use grouping when it isn't necessary (the example above works without it), or use a non-capturing group:
$Mac -split '(?:-|:)'
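To make the difference concrete, here is a small sketch that counts the elements each variant returns for the same MAC address. The variable names ($withGroup, $nonCapturing, $charClass) and the character-class variant are my own additions for illustration, not part of Oisin's explanation:

```powershell
$Mac = '00-27-10-1A-55-14'

# Capturing group: every matched delimiter is returned between the octets.
$withGroup = $Mac -split '(-|:)'
$withGroup.Count       # 11 (6 octets + 5 delimiters)

# Non-capturing group: only the octets are returned.
$nonCapturing = $Mac -split '(?:-|:)'
$nonCapturing.Count    # 6

# Character class: no group at all, same result as the non-capturing version.
$charClass = $Mac -split '[-:]'
$charClass.Count       # 6
```

For single-character alternatives like these, the character class is probably the easiest form to read.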
But this also means that if we want to keep part of the pattern we are splitting on, we can do that with a capturing group, because whatever the group captures is returned in the output:
'Test123Text456MixedWith789Digits' -split '\d(\d+)\d'
It's a rather silly example, but this can save the time otherwise spent coming up with a more complex regex that does the same thing:
'Test123Text456MixedWith789Digits' -split '(?<=[a-z])\d|\d(?=[a-z])'
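A quick way to check that the two patterns really are equivalent for this input is to join both results and compare them. This is only a sketch for this one string; $text, $viaGroup and $viaLookaround are names I made up for the comparison:

```powershell
$text = 'Test123Text456MixedWith789Digits'

# Capturing group keeps the middle digit(s) of each digit run.
$viaGroup = $text -split '\d(\d+)\d'

# Lookarounds split only on the digits that touch a letter.
$viaLookaround = $text -split '(?<=[a-z])\d|\d(?=[a-z])'

$viaGroup -join ','                                       # Test,2,Text,5,MixedWith,8,Digits
($viaGroup -join ',') -eq ($viaLookaround -join ',')      # True
```

Note that -split is case-insensitive by default, which is why [a-z] also matches the uppercase letters here.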
For me it was an eye-opener (better late than never), and I hope it will help others too. Thanks to Kirk for bringing up this topic and to Oisin for explaining what is going on.