After using various types, architectures and generations of computers over the years, there is always the habit of “you go to what you know!” In other words, once you figure out a solution to an issue, you use it repeatedly in the future because you know the steps involved. This aptly describes me when it comes to doing certain command line tasks. If I were being a bit more unkind to myself, I could also use the saying “If all you have is a hammer, all you see are nails”.
Sometimes I like to mix it up and combine PowerShell commands with the cmd prompt, since I have known ways of doing certain tasks there. This is all good until you start parsing output from PowerShell in the cmd prompt and get no matches on the data, even though you know that there are matches within it.
For example you might do the following:
- Use PowerShell to parse IIS or Exchange logs
- Use PowerShell to remove duplicates from a list
- Use the venerable FOR command in a cmd prompt to pull certain tokens out of the returned data
However there are no results returned from the cmd prompt.
This is a case of data types and their evolution over time. Let’s take a look at an example and how to address it.
Current Day - 1.21 GIGAWATTS of PowerShell Awesomeness
In this example, we shall use PowerShell to parse the IIS logs to get a list of all the Outlook 2010 users from a particular domain in the forest who hit the /Autodiscover virtual directory. We will look for hits to the Autodiscover.xml file from Outlook version 14.0 and ensure that the user is from one specific domain in our AD forest, the “Contoso” domain. The results will be written to a text file called Autodiscover.txt in $PWD, the present working directory:
Get-ChildItem -Recurse -Filter *.log | Get-Content | Where-Object {$_ -Match "Microsoft\+Outlook\+14\.0" -And $_ -Match "Contoso" -And $_ -Match "POST /Autodiscover/autodiscover\.xml"} | Out-File $PWD\Autodiscover.txt
As you can see, the command completes and the output file exists in the current folder. All good! You may be wondering about the -Recurse option. If we were parsing IIS logs from lots of servers, then typically they would be copied to one central location in the format of TopFolder\Server\IIS logs. Or expressed another way:
TopFolder
    Server1
        Log1.log
        Log2.log
    Server2
        Log1.log
        Log2.log
Using Get-Content, we can then look at the content of the Autodiscover.txt file. The content is what we’d expect: lines containing the phrases we specified in the PowerShell search.
An example line would be:
2010-11-12 09:07:02 192.168.2.15 POST /Autodiscover/Autodiscover.xml - - 443 CONTOSO\DFunker 192.168.5.10 Microsoft+Office/14.0+(Windows+NT+6.1;+Microsoft+Outlook+14.0.4760;+Pro) 200 0 0 3411
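One detail worth checking is that -Match treats its right-hand side as a regular expression, so a literal + or . in the search phrase needs escaping. The following is a minimal sketch, assuming the sample line above with a backslash between the domain and the user name, that tests the three conditions against that one line. The variable names are purely illustrative.

```powershell
# Sample line based on the example above; the CONTOSO\DFunker backslash is assumed.
$line = '2010-11-12 09:07:02 192.168.2.15 POST /Autodiscover/Autodiscover.xml - - 443 CONTOSO\DFunker 192.168.5.10 Microsoft+Office/14.0+(Windows+NT+6.1;+Microsoft+Outlook+14.0.4760;+Pro) 200 0 0 3411'

# [regex]::Escape turns a literal phrase into a pattern that matches it exactly.
$outlook = [regex]::Escape('Microsoft+Outlook+14.0')   # Microsoft\+Outlook\+14\.0
$request = [regex]::Escape('POST /Autodiscover/autodiscover.xml')

# -match is case-insensitive, so "Contoso" also matches "CONTOSO".
$isMatch = ($line -match $outlook) -and ($line -match 'Contoso') -and ($line -match $request)
$isMatch                                   # True

# Unescaped, + means "one or more of the previous character", so this fails.
$line -match 'Microsoft+Outlook+14.0'      # False
```

Escaping by hand with backslashes works just as well; [regex]::Escape simply saves having to remember which characters are special.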
Set Time Circuits To November 5, 1995
Now, I have a penchant for old favourites like Findstr.exe and the FOR command. Let’s try to search the output from the above command using the cmd prompt. The example below parses the file, splitting each line on the specified delimiters (comma, semicolon and space) and then retrieving tokens 10 and 13 as %a and %b:
FOR /F "Tokens=10,13 Delims=,; " %a IN (Autodiscover.txt) DO @ECHO %a %b
Hmm. No results, but we already saw that there is content in the file. What gives?
Great Scott!
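The issue is that we are looking at utilities that were created in different computing eras. A lot has changed in computing, and localisation of content is one such change. Previously ASCII could be used quite happily, but nowadays UNICODE is typically the default option as it can represent characters from many more languages. The file that PowerShell wrote here is UTF-16, which stores each of our characters in two bytes rather than one, and that is not something the FOR command can tokenise the way it does an ASCII file.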
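To make the difference concrete, here is a small sketch that encodes the same two letters both ways and prints the bytes as hex. It uses only the .NET encoding classes, so no files are touched. The variable names are mine.

```powershell
$text = 'AB'

# One byte per character in ASCII.
$asciiBytes = [System.Text.Encoding]::ASCII.GetBytes($text)

# Two bytes per character in what .NET and PowerShell call "Unicode" (UTF-16 little-endian).
$unicodeBytes = [System.Text.Encoding]::Unicode.GetBytes($text)

# The byte order mark that is written at the start of a UTF-16 LE file.
$preamble = [System.Text.Encoding]::Unicode.GetPreamble()

[System.BitConverter]::ToString($asciiBytes)     # 41-42
[System.BitConverter]::ToString($unicodeBytes)   # 41-00-42-00
[System.BitConverter]::ToString($preamble)       # FF-FE
```

Those extra 00 bytes, plus the FF-FE marker at the front of the file, are what an ASCII-era tool trips over.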
To detect the current format of the file, we can use PowerShell to inspect it with a script. Alternatively, open up the file in Notepad and do a “Save As”. Notepad will default to the current encoding type and file location. We can see that the output file from PowerShell was encoded in UNICODE.
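As a rough idea of what such a script could look like, the sketch below reads the first few bytes of a file and compares them with the well-known byte order marks. Get-FileBom is a hypothetical helper name of my own, not a built-in cmdlet, and a file without a marker cannot be identified this way.

```powershell
function Get-FileBom {
    param([Parameter(Mandatory = $true)][string]$Path)

    # .NET resolves relative paths against the process directory rather than $PWD,
    # so convert to a full path first.
    $fullPath = Convert-Path -LiteralPath $Path
    $buffer = New-Object byte[] 4
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        $read = $stream.Read($buffer, 0, $buffer.Length)
    }
    finally {
        $stream.Dispose()
    }

    # Hex string of the bytes actually read, for example FF-FE-32-00.
    $hex = [System.BitConverter]::ToString($buffer, 0, $read)

    # UTF-32 LE is checked first because its marker also begins with FF-FE.
    if     ($hex.StartsWith('FF-FE-00-00')) { 'UTF-32 LE' }
    elseif ($hex.StartsWith('FF-FE'))       { 'UTF-16 LE (Out-File calls this Unicode)' }
    elseif ($hex.StartsWith('FE-FF'))       { 'UTF-16 BE (BigEndianUnicode)' }
    elseif ($hex.StartsWith('EF-BB-BF'))    { 'UTF-8 with BOM' }
    else                                    { 'No BOM (ASCII, ANSI or UTF-8 without BOM)' }
}

# Example usage:
# Get-FileBom .\Autodiscover.txt
```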
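UNICODE, which here means UTF-16 little-endian, is the default encoding from the Out-File cmdlet in Windows PowerShell (PowerShell 6 and later default to UTF-8 without a byte order mark instead). We can change this quite easily by adding the Encoding parameter. Valid values are "Unicode", "UTF7", "UTF8", "UTF32", "ASCII", "BigEndianUnicode", "Default", and "OEM".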
"Default" uses the encoding of the system's current ANSI code page.
Back To The Future
So this time around, let’s tell Out-File to encode as ASCII and save as Autodiscover-v2.txt:
Get-ChildItem -Recurse -Filter *.log | Get-Content | Where-Object {$_ -Match "Microsoft\+Outlook\+14\.0" -And $_ -Match "Contoso" -And $_ -Match "POST /Autodiscover/autodiscover\.xml"} | Out-File $PWD\Autodiscover-v2.txt -Encoding ASCII
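If the UNICODE file has already been produced and re-running the search over all the logs is a chore, the existing file can be re-encoded instead. This is only a sketch: Convert-ToAsciiFile and the Autodiscover-converted.txt name are hypothetical, and any character that ASCII cannot represent will come out as a question mark.

```powershell
function Convert-ToAsciiFile {
    param(
        [Parameter(Mandatory = $true)][string]$Source,
        [Parameter(Mandatory = $true)][string]$Target
    )

    # Get-Content spots the byte order mark and decodes the UTF-16 text;
    # Out-File then writes the same lines back out as single-byte ASCII.
    Get-Content -LiteralPath $Source | Out-File -FilePath $Target -Encoding ASCII
}

# Example usage (hypothetical target name):
# Convert-ToAsciiFile -Source "$PWD\Autodiscover.txt" -Target "$PWD\Autodiscover-converted.txt"
```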
Now, if we run the same FOR command against Autodiscover-v2.txt, it gets results.
The net result is that I get to keep on running batch file commands from the 90s!
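For comparison, the token picking that FOR /F does can also be roughly sketched in PowerShell itself, where Get-Content copes with either encoding. Get-LogTokens is a hypothetical helper name, and this is an approximation of FOR /F rather than an exact clone: it splits on runs of comma, semicolon or space and returns the requested tokens (1-based, as in the FOR command) joined by a space.

```powershell
function Get-LogTokens {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [int[]]$Tokens = @(10, 13)
    )

    Get-Content -LiteralPath $Path | ForEach-Object {
        # Split on runs of delimiters and drop any empty entries.
        $fields = @($_ -split '[,; ]+' | Where-Object { $_ -ne '' })

        # Skip blank lines; a token number past the end simply yields an empty value.
        if ($fields.Count -gt 0) {
            ($Tokens | ForEach-Object { $fields[$_ - 1] }) -join ' '
        }
    }
}

# Example usage:
# Get-LogTokens -Path .\Autodiscover.txt
```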
Cheers,
Rhoderick