Directories popping up named %SystemDrive%

Had an issue today where people were noticing that on some machines, whenever a utility was run it would create a new %SystemDrive% folder with some subfolders leading to a set of .db files. That is literally %SystemDrive%, which you would expect to expand automatically to “c:” on most systems.

A quick Google search on the GUID .db filenames turned up many projects with a similar issue (uTorrent especially). I think we tracked down the cause, though.

Investigation

First, we need to illustrate the problem.

|h ProcessStartTest.cs
using System.Diagnostics;
using System.Linq;
 
/// <summary>
/// Runs a command in a subprocess.
/// </summary>
class ProcessStartTest
{
    static int Main(string[] args)
    {
        int commandReturnInt = RunCommand(args);
        return commandReturnInt;
    }
 
    // In a shell that is missing both the ProgramData and SystemDrive environment variables, this produces:
    // .\%SystemDrive%\ProgramData
    // .\%SystemDrive%\ProgramData\Microsoft
    // .\%SystemDrive%\ProgramData\Microsoft\Windows
    // .\%SystemDrive%\ProgramData\Microsoft\Windows\Caches
    // .\%SystemDrive%\ProgramData\Microsoft\Windows\Caches\cversions.2.db
    // .\%SystemDrive%\ProgramData\Microsoft\Windows\Caches\{4E4260A4-7E39-442E-BC22-7FF751D1C161}.2.ver0x0000000000000001.db
    // .\%SystemDrive%\ProgramData\Microsoft\Windows\Caches\{6AF0698E-D558-4F6E-9B3C-3716689AF493}.2.ver0x0000000000000001.db
    // .\%SystemDrive%\ProgramData\Microsoft\Windows\Caches\{DDF571F2-BE98-426D-8288-1A9A39C3FDA2}.2.ver0x0000000000000001.db
    // This runs through System.Diagnostics.Process.StartWithShellExecuteEx
    public static int RunCommand (string[] args)
    {
        var filename = args[0];
        var argumentArray = args.Skip(1).ToArray();
        var argumentString = string.Join(" ", argumentArray);
        var subProcess = Process.Start(filename, argumentString);
        subProcess.WaitForExit();
 
        return subProcess.ExitCode;
    }
}   

As the comment says, if you run this normally, it should work without problem. But if you clear out the two environment variables ProgramData and SystemDrive, it will run but also create a new, unwanted subfolder.

After some digging, this appears to be a bug in System.Diagnostics.Process.StartWithShellExecuteEx - which is the default method called if you do a standard Process.Start(“command”). A lot of projects use this form and can be subject to this bug.

To get around it, we can have the subprocess start without creating a shell. This is usually what I want when I am calling a subprocess, anyway - it should be hidden and if I want to display the output I will gather the output from the process and display it within my app. It isn't a big change to remove the shell.

|h ProcessStartFix.cs
using System.Diagnostics;
using System.Linq;
 
/// <summary>
/// Runs a command in a subprocess.
/// </summary>
class ProcessStartFix
{
    static int Main(string[] args)
    {
        int commandReturnInt = RunCommand(args);
        return commandReturnInt;
    }
 
    // This runs through System.Diagnostics.Process.StartWithCreateProcess
    public static int RunCommand (string[] args)
    {
        var subProcess = new Process();
        var filename = args[0];
        var argumentArray = args.Skip(1).ToArray();
        var argumentString = string.Join(" ", argumentArray);
        subProcess.StartInfo = new ProcessStartInfo
        {
            FileName = filename,
            Arguments = argumentString,
            UseShellExecute = false,
        };
        subProcess.Start();
        subProcess.WaitForExit();
 
        return subProcess.ExitCode;
    }
}

If you actually need to have a shell execute, you can follow the second method and just inspect the StartInfo object to make sure that either or both of those environment variables are set or set them yourself.
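One way to make sure the variables exist before a shell-execute launch is to set them on the current process, since a child process inherits its parent's environment by default. The sketch below does that. The class and method names are made up for illustration, and deriving the fallback values from Environment.SystemDirectory and the CommonApplicationData folder is my assumption about what the "right" values are on a given machine. I set the variables on the current process rather than on StartInfo because, as far as I know, .NET Framework refuses to shell-execute once StartInfo's environment variables have been touched; treat that as something to verify.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original programs.
static class ShellEnvironment
{
    // Returns the existing value when it is set and non-empty,
    // otherwise the supplied fallback.
    public static string ValueOrFallback(string current, string fallback)
    {
        return string.IsNullOrEmpty(current) ? fallback : current;
    }

    // Sets SystemDrive and ProgramData on the current process if they are missing.
    // Variables that are already set are left unchanged.
    public static void EnsureShellVariables()
    {
        // Assumption: the system drive is the root of the system directory,
        // e.g. "C:\Windows\system32" -> "C:\" -> "C:".
        string driveRoot = Path.GetPathRoot(Environment.SystemDirectory) ?? "";
        string systemDrive = driveRoot.TrimEnd('\\');

        // Assumption: ProgramData should match the common application data folder.
        string programData = Environment.GetFolderPath(
            Environment.SpecialFolder.CommonApplicationData);

        Environment.SetEnvironmentVariable("SystemDrive",
            ValueOrFallback(Environment.GetEnvironmentVariable("SystemDrive"), systemDrive));
        Environment.SetEnvironmentVariable("ProgramData",
            ValueOrFallback(Environment.GetEnvironmentVariable("ProgramData"), programData));
    }
}
```

Calling ShellEnvironment.EnsureShellVariables() once at the top of Main, before the first Process.Start, should be enough for the test program above.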

I wouldn't think this would come up very often, as those variables both seem to be expected by many programs. Even the Windows Start Menu shortcuts depend on the %WinDir% environment variable. In any case, there are several ways in which those variables can be lost and cause the bug to occur.

Conclusion

It is a bug in System.Diagnostics.Process.StartWithShellExecuteEx if both the ProgramData and SystemDrive environment variables are unset, apparently. Either make sure one of these variables exists before calling Process.Start or instead create a ProcessStartInfo and disable shell execution.

Acknowledgements

Thanks to Adam Meyer for both the original mass mail with the problem and help digging in to see what was wrong.

2011/08/04 17:54

Clean Dated Directories in Batch

Log files, CI build artifacts, daily reports - I seem to have an ever growing list of files and folders that are important for a week or so and then need to be purged. Seems like a good job for a batch file.

Step 1 was making sure that the side effects were all grouped into files or folders by date.

Step 2 took a little bit of batch date math.

Requirements

  • Perform an operation on a set of files or folders that are named by date. (e.g., Log-2011-07-01)
  • Limit the operation to a range of dates between now and X days ago, possibly omitting the most recent Y days. (e.g., delete folders with dates older than a week ago)

Investigation

Assuming our target files or directories have a set date format, if we can get a list of dates relative to the current one, we can use that as input to another batch file to either move, zip, or erase those targets individually. There doesn't seem to be any built-in mechanism for date math in batch without calling into another executable or VBS.

Date math is a pretty complex subject, but if we limit the output to only dates and do not consider time or days of the week, the only special case we need to be concerned about is leap years and the number of days in February.
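As a reference point for that special case, here is a minimal C# sketch of the leap-year rule and the days-per-month lookup. The class and method names are mine, chosen for illustration. .NET already provides DateTime.DaysInMonth, so this is only useful for checking hand-written date math against.

```csharp
using System;

// Hypothetical helper written to mirror the rule described above.
static class DateRules
{
    // Gregorian rule: divisible by 400 is a leap year, otherwise divisible
    // by 100 is not, otherwise divisible by 4 is.
    public static bool IsLeapYear(int year)
    {
        if (year % 400 == 0) return true;
        if (year % 100 == 0) return false;
        return year % 4 == 0;
    }

    // Default to 31 days, handle the 30-day months, and special-case February.
    public static int DaysInMonth(int year, int month)
    {
        switch (month)
        {
            case 4:
            case 6:
            case 9:
            case 11:
                return 30;
            case 2:
                return IsLeapYear(year) ? 29 : 28;
            default:
                return 31;
        }
    }
}
```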

Implementation

To get a list of recent dates, we start by pulling the current date from %date% and parsing it to find individual values for day, month, and year. We then count backwards by subtracting days. When we roll over a month, we check to see how many days the new month has. If the month is February, we do some extra checking to see if it is a leap year or not.

Since the primary use case is to keep Y number of days and delete X older ones, we optionally take a second argument of a number of most recent days to skip.

|h recentdates.bat
@setlocal
@echo off
REM recentdates.bat [DAYSTOPRINT] [DAYSTOSKIP]
REM Prints a list of dates, one per line, starting with today and going backwards.
REM By default, it will print the most recent 7 days, including today.
REM
REM murrayjw 2010-10-04 New.
 
set /a DAYSTOPRINT=7
set /a DAYSTOSKIP=0
 
if not ("%1") == ("") set /a DAYSTOPRINT=%1
if not ("%2") == ("") set /a DAYSTOSKIP=%2
 
if %DAYSTOPRINT% lss 1 goto END
 
for /f "tokens=2-4 delims=/ " %%i in ("%date%") do set /a CURRENTYEAR=%%k&&set /a CURRENTMONTH=1%%i - 100&&set /a CURRENTDAY=1%%j - 100
goto PRINTDATE
 
:MINUSDAY
if %CURRENTDAY% equ 1 goto MINUSMONTH
set /a CURRENTDAY=%CURRENTDAY% - 1
goto PRINTDATE
 
:MINUSMONTH
if %CURRENTMONTH% equ 1 goto MINUSYEAR
set /a CURRENTMONTH=%CURRENTMONTH% - 1

REM Default to 31 days, check 30 day months, and special case February.
set /a CURRENTDAY=31
if %CURRENTMONTH% equ 4 set /a CURRENTDAY=30 && goto MINUSMONTHEND
if %CURRENTMONTH% equ 6 set /a CURRENTDAY=30 && goto MINUSMONTHEND
if %CURRENTMONTH% equ 9 set /a CURRENTDAY=30 && goto MINUSMONTHEND
if %CURRENTMONTH% equ 11 set /a CURRENTDAY=30 && goto MINUSMONTHEND
if %CURRENTMONTH% neq 2 goto MINUSMONTHEND

REM Calculate leap years for Feb 29.
REM if year modulo 400 is 0 then is_leap_year
set /a CURRENTYEARMOD400=%CURRENTYEAR% %% 400
if %CURRENTYEARMOD400% equ 0 set /a CURRENTDAY=29 && goto MINUSMONTHEND
REM else if year modulo 100 is 0 then not_leap_year
set /a CURRENTYEARMOD100=%CURRENTYEAR% %% 100
if %CURRENTYEARMOD100% equ 0 set /a CURRENTDAY=28 && goto MINUSMONTHEND
REM else if year modulo 4 is 0 then is_leap_year
set /a CURRENTYEARMOD4=%CURRENTYEAR% %% 4
if %CURRENTYEARMOD4% equ 0 set /a CURRENTDAY=29 && goto MINUSMONTHEND
REM else not_leap_year
set /a CURRENTDAY=28
goto MINUSMONTHEND
 
:MINUSMONTHEND
goto PRINTDATE
 
:MINUSYEAR
set /a CURRENTYEAR=%CURRENTYEAR% - 1
set /a CURRENTMONTH=12
set /a CURRENTDAY=31
goto PRINTDATE
 
:PRINTDATE
if %DAYSTOSKIP% gtr 0 set /a DAYSTOSKIP=%DAYSTOSKIP%-1 && goto PRINTDATEEND
 
set /a DAYSTOPRINT=%DAYSTOPRINT%-1
set CURRENTMONTHSTR=%CURRENTMONTH%
set CURRENTDAYSTR=%CURRENTDAY%
call :ZERO_PAD_WIDTH_2 CURRENTMONTHSTR %CURRENTMONTHSTR%
call :ZERO_PAD_WIDTH_2 CURRENTDAYSTR %CURRENTDAYSTR%
echo %CURRENTYEAR%-%CURRENTMONTHSTR%-%CURRENTDAYSTR%
if %DAYSTOPRINT% equ 0 goto END
goto PRINTDATEEND
 
:PRINTDATEEND
goto :MINUSDAY
 
:ZERO_PAD_WIDTH_2
if %~2 LSS 10 set "%~1=0%~2"
goto :eof
 
:END
endlocal

This could be customized for individual date setups (like MM-DD-YYYY, for example) by just changing :PRINTDATE. Also look out for localization issues when reading %date%.
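For cross-checking the batch output, or where a compiled helper is acceptable, the same list can be sketched in C#. This is not a port of the batch file: it leans on DateTime.AddDays instead of manual rollover, and takes the output format as a parameter. The class name, method name, and parameter names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical C# counterpart to recentdates.bat.
static class RecentDates
{
    // Returns daysToPrint dates, newest first, starting daysToSkip days
    // before 'today'. 'format' is a .NET custom date format string such
    // as "yyyy-MM-dd" or "MM-dd-yyyy".
    public static List<string> Generate(DateTime today, int daysToPrint, int daysToSkip, string format)
    {
        var result = new List<string>();
        for (int i = 0; i < daysToPrint; i++)
        {
            DateTime day = today.Date.AddDays(-(daysToSkip + i));
            // InvariantCulture keeps the output independent of the machine's locale.
            result.Add(day.ToString(format, CultureInfo.InvariantCulture));
        }
        return result;
    }
}
```

Passing the date in, rather than reading DateTime.Now inside the method, also sidesteps the localization concern with %date% and makes the function easy to test against a fixed day.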

We can wrap a call to recentdates.bat in another batch to delete the folders with those date names. By default, I like to keep only the last week. Since we're running this fairly often, we can restrict the number of dates it returns to the last month, just to keep it small and still allow for some accidental script downtime.

|h removeoldbuilds.bat
@echo off
@setlocal

REM removeoldbuilds.bat [BASEFOLDER]
REM Finds a list of recent dates and removes directories with those names.
 
if ("%1") == ("") goto ERROR_USAGE
set BASEPATH=%1
 
if not exist %BASEPATH% goto ERROR_BASE_MISSING
 
for /f %%a in ('call recentdates.bat 30 8') do if exist %BASEPATH%\%%a rmdir /s /q %BASEPATH%\%%a && echo %BASEPATH%\%%a (deleted)
goto END_SUCCESS
 
:ERROR_BASE_MISSING
echo %BASEPATH% does not exist.
goto END_FAIL
 
:ERROR_USAGE
echo removeoldbuilds.bat [BASEFOLDER]
goto END_FAIL
 
:END_FAIL
endlocal
exit /b 1
 
:END_SUCCESS
endlocal
exit /b 0

Results

recentdates.bat output
C:\>recentdates
2011-07-05
2011-07-04
2011-07-03
2011-07-02
2011-07-01
2011-06-30
2011-06-29
 
C:\>recentdates 2
2011-07-05
2011-07-04
 
C:\>recentdates 10 8
2011-06-27
2011-06-26
2011-06-25
2011-06-24
2011-06-23
2011-06-22
2011-06-21
2011-06-20
2011-06-19
2011-06-18

removeoldbuilds.bat Jenkins output
A timer trigger started this job
Building remotely on buildmachine1
 
d:\hudsonslave\workspace\Cluster - Clean Output Folders>call removeoldbuilds.bat \\fileshare\builds\cluster
\\fileshare\builds\cluster\2011-06-27 (deleted)
Finished: SUCCESS

Conclusion

This works great. I put removeoldbuilds.bat in a Jenkins job for each of the temporary destinations. It outputs which folders it deleted and lets me know of any issues. Having a simple script to do this clean up has saved me some recurring headaches over disk space. It also lets me publish data retention policies and avoid misunderstandings.

It has also proved surprisingly versatile - it only took a couple of small edits to use recentdates.bat to mass-rename a bunch of dated files from MMDDYYYY to YYYYMMDD and then zip them by month.

2011/07/05 20:03

Parallel Batch

Due to limitations in the threading of a source control system I was using, it became apparent that checkout times could be improved by instead checking out individual subfolders at the same time in different command windows. This exposed a need to be able to run tasks in parallel from a batch file while being able to limit the number of parallel jobs running simultaneously.

Parallel batch files?!? Oh yeah!

Specification

  • Run multiple sub-commands simultaneously
  • Limit the number of simultaneous commands
  • Do not return control until all commands have completed
  • Must run on a bare install of Win XP or Win 7 - no extra libraries

Investigation

We will assume that our batch script is running as a process. We want to limit the number of subprocesses of a specific name running under the batch to a max number. To do this, we will need a way to both identify our own PID (Process Identifier) as well as identifying all other processes which are children of our PID. (Probably not grandchildren or lower, in case a command splits off copies of itself.)

You can spawn a new process to run independent of a batch file via start. Adding the /b flag will prevent start from spawning a new command window. When created this way, the originating batch file has no way to see the output of the process or even to be able to tell if it has finished or not.

When a process is begun through start, it will generate a new process as a child of the originating batch command process. Unfortunately, there is no good way that I can find to give you the PID of the currently running process from the available batch commands. You can get a list of all processes pretty easily, but not your specific PID or the PID of your parent process.

Windows XP and 7 both have Visual Basic Script support built in via cscript.exe and wscript.exe. By using VBS, we can get access to a process list which includes not only each process name and PID, but also the PID of its parent process. This turns out to be very important when we want to find the PID of a batch process.

As a side note, I would have used PowerShell, but it is not guaranteed to be available on XP.

The process list could contain many copies of any individual command at a time. To differentiate them, we can examine each process's calling command line. To identify which PID is the specific script we are running, we can pass a unique identifier as an ignored command line parameter. In practice, the %time% environment variable is usually unique enough.

In a batch file, if you want to record the output of a command as a variable so you can perform logic on it, you usually do so via a for command.

|h Batch for Store Command Result Example
for /f "delims=" %%a in ('echo test') do set ECHO_RESULT=%%a

The interpreter will spawn the for in a new subprocess, which will then run the command passed as a new process under it. So if we have a VBS that can identify PIDs, and we call that script from a batch for command, the batch PID will be the grandparent PID relative to the VBS process. (whew)

The only other piece we're missing is a separate command to count all child processes of a given PID and compare that to a specified max value. If we limit the count to all child processes of a specific name, we can get pretty accurate parallelization.

Implementation

Please be understanding of mistakes in the VB below - it is not a language I'm very familiar with and it has been a while since I wrote it.

First we need a VBS to identify the PID of the calling process. It should optionally return the grandparent PID for use in batch files.

|h pid.vbs
' pid.vbs
' Returns the PID of the process that launched this script (the script's parent).
' If the first argument is "/parent", this script returns the PID of that process' parent (the script's grandparent).
' This is useful for determining the PID of a bat file, as the for /f ('') command generates a child and grandchild process.
' Based on code from expertsexchange user DanBenway: 01.11.2010 at 11:10AM PST, ID: 26286222
' Extra arguments can be used to guarantee the PID found is correct.  Passing %time% as an argument is usually adequate.
'
' 20100804 murrayjw New.
' 20100804 murrayjw PID is sometimes invalid - return known string instead of garbage.

Option Explicit
Dim str_processID
Dim str_parentprocessID
Dim str_grandparentprocessID
 
Dim str_parentFlag
str_parentFlag = "/parent"
 
Dim bool_useGrandParent
bool_useGrandParent = False
 
If WScript.Arguments.Count >= 1 then
	' Compare the whole argument; a substring test would also accept fragments such as "p".
	If StrComp(WScript.Arguments.Item(0), str_parentFlag, vbTextCompare) = 0 then
		bool_useGrandParent=True
	end If
end If
 
Call sub_getPIDAndOwnerInfoForThisScript (str_processID)
Call sub_getParentPID (str_processID, str_parentprocessID)
 
If bool_useGrandParent=True then
	Call sub_getParentPID (str_parentprocessID, str_grandparentprocessID)
	wscript.echo str_grandparentprocessID
Else
	wscript.echo str_parentprocessID
end If
 
wscript.Quit
 
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub sub_getPIDAndOwnerInfoForThisScript (str_PID)
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim obj_process
Dim col_processList
Dim obj_WMIService
Dim str_commandLine
Dim str_scriptName
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

str_commandLine = "INVALIDCOMMANDLINE"
str_PID = "INVALIDPID"
str_scriptName = WScript.ScriptName
 
Set obj_WMIService = GetObject ("winmgmts:\\.\root\cimv2")
Set col_processList = obj_WMIService.ExecQuery ("Select * from Win32_Process Where Name = ""WScript.exe"" OR Name = ""CScript.exe"" ")
 
For Each obj_process In col_processList
	str_commandLine = obj_process.CommandLine
	If (InStr(1,str_commandLine, str_scriptName, vbTextCompare)<>0) Then
		str_PID	= obj_process.processID
	End If
	str_commandLine = "~!@#$%^&*()_+"
Next
 
Set obj_WMIService = Nothing
Set col_processList = Nothing
Set obj_process = Nothing
 
End Sub
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub sub_getParentPID (str_PID, str_ParentPID)
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim obj_process
Dim col_processList
Dim obj_WMIService
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

str_ParentPID = "INVALIDPID"
 
Set obj_WMIService = GetObject ("winmgmts:\\.\root\cimv2")
Set col_processList = obj_WMIService.ExecQuery ("Select * from Win32_Process")
 
For Each obj_process In col_processList
	' Compare whole PIDs; a substring test would let PID 12 match process 4123.
	If CStr(obj_process.processID) = CStr(str_PID) Then
		str_ParentPID	= obj_process.parentprocessID
	End If
Next
 
Set obj_WMIService = Nothing
Set col_processList = Nothing
Set obj_process = Nothing
 
End Sub
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Occasionally, querying the WMI Service will produce an odd return and no PID will be identified. In this case, the script will return INVALIDPID.

Next we need a script to count the child processes for a given PID, limited by process name.

|h childprocesscount.vbs
' childprocesscount.vbs
' Count the number of child (not grandchild or lower) processes.
' Based on code from expertsexchange user DanBenway: 01.11.2010 at 11:10AM PST, ID: 26286222
' If an argument is passed, only children with that name will be counted.
' If the argument is numeric, the children of that process will be counted.
'
' 20100804 murrayjw New.
' 20100806 murrayjw Use clng in place of cint to account for full range of possible PID values.
' 20100804 murrayjw PID is sometimes invalid - return known string instead of garbage.
' 20110616 murrayjw Ignore the PID of the running script and its parent when counting.

 
Option Explicit
Dim int_processID
Dim int_parentprocessID
Dim str_name
Dim int_ChildCount
 
Call sub_getPIDAndParentPIDInfoForThisScript (int_processID, int_parentprocessID)
 
If WScript.Arguments.Count = 0 then
	Call sub_getChildProcessCount (int_parentprocessID, int_ChildCount, int_processID, int_parentprocessID)
end If
 
If WScript.Arguments.Count = 1 then
	If IsNumeric(WScript.Arguments.Item(0)) then
		Call sub_getChildProcessCount (clng (WScript.Arguments.Item(0)), int_ChildCount, int_processID, int_parentprocessID)
	Else
		Call sub_getChildProcessCountByName (int_parentprocessID, WScript.Arguments.Item(0), int_ChildCount, int_processID, int_parentprocessID)
	end If
end If
 
If WScript.Arguments.Count = 2 then
	If IsNumeric(WScript.Arguments.Item(0)) then
		Call sub_getChildProcessCountByName (clng (WScript.Arguments.Item(0)), WScript.Arguments.Item(1), int_ChildCount, int_processID, int_parentprocessID)
	Else
		wscript.echo "First argument is not a PID."
		wscript.Quit
	end If
end If
 
If WScript.Arguments.Count > 2 then
	wscript.echo "Too many arguments."
	wscript.Quit
end If
 
 
wscript.echo int_ChildCount
wscript.Quit
 
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub sub_getChildProcessCount (int_PID, int_ChildCount, int_IgnorePID, int_IgnorePID2)
' Counts the child (not grandchildren or lower) processes of the current process.
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim obj_process
Dim col_processList
Dim obj_WMIService
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

int_ChildCount = 0
 
Set obj_WMIService = GetObject ("winmgmts:\\.\root\cimv2")
Set col_processList = obj_WMIService.ExecQuery ("Select * from Win32_Process")
 
For Each obj_process In col_processList
	If obj_process.parentprocessID = int_PID Then
		If obj_process.processID <> int_IgnorePID And obj_process.processID <> int_IgnorePID2 Then
			int_ChildCount = int_ChildCount + 1
		End If
	End If
Next
 
Set obj_WMIService = Nothing
Set col_processList = Nothing
Set obj_process = Nothing
 
End Sub
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub sub_getChildProcessCountByName (int_PID, str_name, int_ChildCount, int_IgnorePID, int_IgnorePID2)
' Counts the child (not grandchildren or lower) processes of the current process.
' Restrict to processes of the passed name.
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim obj_process
Dim col_processList
Dim obj_WMIService
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

int_ChildCount = 0
 
Set obj_WMIService = GetObject ("winmgmts:\\.\root\cimv2")
Set col_processList = obj_WMIService.ExecQuery ("Select * from Win32_Process where Name =""" & str_name & """")
 
For Each obj_process In col_processList
	If obj_process.parentprocessID = int_PID Then
		If obj_process.processID <> int_IgnorePID  And obj_process.processID <> int_IgnorePID2 Then
			int_ChildCount = int_ChildCount + 1
		End If
	End If
Next
 
Set obj_WMIService = Nothing
Set col_processList = Nothing
Set obj_process = Nothing
 
End Sub
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub sub_getPIDAndParentPIDInfoForThisScript (int_PID, int_ParentPID)
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim obj_process
Dim col_processList
Dim obj_WMIService
Dim str_commandLine
Dim str_scriptName
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

str_commandLine = "INVALIDCOMMANDLINE"
int_PID = -1
int_ParentPID = -1
str_scriptName = WScript.ScriptName
 
Set obj_WMIService = GetObject ("winmgmts:\\.\root\cimv2")
Set col_processList = obj_WMIService.ExecQuery ("Select * from Win32_Process Where Name = ""WScript.exe"" OR Name = ""CScript.exe"" ")
 
For Each obj_process In col_processList
	str_commandLine = obj_process.CommandLine
	If (InStr(1,str_commandLine, str_scriptName, vbTextCompare)<>0) Then
		int_PID	= obj_process.processID
		int_ParentPID = obj_process.parentprocessID
	End If
	str_commandLine = "INVALIDCOMMANDLINE"
Next
 
Set obj_WMIService = Nothing
Set col_processList = Nothing
Set obj_process = Nothing
 
End Sub
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Now to put these to good use, let's write a wrapper batch file to run a series of sub-commands read in from a text file.

|h parallelexecute.bat
@setlocal
@echo off
REM parallelexecute.bat
REM Execute individual commands from a text file in parallel.
REM usage: parallelexecute.bat commandfile.txt [NUMBER_OF_PROCESSES]
REM e.g. parallelexecute.bat commandfile.txt 4
 
call :FINDPID
 
if ("%1")==("") goto :eof
set COMMANDFILE=%1
if not exist %COMMANDFILE% goto :eof
 
set MAXTHREADS=4
if not ("%2")==("") set /a MAXTHREADS=%2
if not %MAXTHREADS% gtr 0 goto :eof

REM To allow batfiles in the commandfile, wrap all commands in a call statement.
for /f "delims=" %%a in (%COMMANDFILE%) do echo Starting: %%a&&start /b cmd.exe /c call %%a >nul 2>nul &&call :CHECKTHREADCOUNT
call :WAITFORTHREADSTOFINISH
 
endlocal
@GOTO :eof
 
:FINDPID
rem Add the time to the PID search to be sure the right PID is found.
rem %~dp0 is used here to indicate the directory of the calling batfile.
rem %~dp$PATH:1 would search the PATH directories for a file - it could be slower.
for /F "tokens=1" %%A IN ('cscript /nologo %~dp0\pid.vbs /parent %time%') DO (set TEMPPID=%%A)
if ("%TEMPPID%")==("INVALIDPID") GOTO FINDPID
set PID=%TEMPPID%
GOTO :eof
 
:CHECKTHREADCOUNT
for /F "tokens=1" %%A IN ('cscript /nologo %~dp0\childprocesscount.vbs %PID%') DO (SET CHILDCMDCOUNT=%%A)
if ("%CHILDCMDCOUNT%")==("INVALIDPID") GOTO CHECKTHREADCOUNT
REM If for whatever reason we are over the limit, busy stall until we are not.
if %CHILDCMDCOUNT% geq %MAXTHREADS% GOTO CHECKTHREADCOUNT
set CHILDCMDCOUNT=
set TEMPPID=
GOTO :eof
 
:WAITFORTHREADSTOFINISH
for /F "tokens=1" %%A IN ('cscript /nologo %~dp0\childprocesscount.vbs %PID%') DO (SET CHILDCMDCOUNT=%%A)
if %CHILDCMDCOUNT% gtr 0 GOTO WAITFORTHREADSTOFINISH
set CHILDCMDCOUNT=
set TEMPPID=
GOTO :eof

This will run each line of the input file in a new cmd.exe process, then immediately count the child processes of the batch file. If this number is greater than or equal to MAXTHREADS, it will stall until the number is lower. Once all commands have been run, it waits for the number to reach zero before exiting.

Results

I filled a commands.txt text file with 10 copies of ping google.com. Here's the output:

Parallel Batch Results
D:\temp>timerun.bat ping gooogle.com
 
Pinging gooogle.com [74.125.93.104] with 32 bytes of data
Reply from 74.125.93.104: bytes=32 time=81ms TTL=41
Reply from 74.125.93.104: bytes=32 time=77ms TTL=41
Reply from 74.125.93.104: bytes=32 time=78ms TTL=41
Reply from 74.125.93.104: bytes=32 time=81ms TTL=41
 
Ping statistics for 74.125.93.104:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 77ms, Maximum = 81ms, Average = 79ms
Elapsed msec: 3470
 
D:\temp>timerun.bat parallelexecute.bat commands.txt 1
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Elapsed msec: 36740
 
D:\temp>timerun.bat parallelexecute.bat commands.txt 2
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Elapsed msec: 18580
 
D:\temp>timerun.bat parallelexecute.bat commands.txt 4
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Elapsed msec: 11550
 
D:\temp>timerun.bat parallelexecute.bat commands.txt 10
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Elapsed msec: 6400

# Threads   Time (ms)
1           36740
2           18580
4           11550
10          6400

So not a linear speed up, but most definitely parallel execution. The amount of speedup is very application and machine dependent. For my original version control issue, we saw linear speed up of checkouts to around 8 threads locally and 4 threads when remote.

Conclusion

Parallel execution in batch files is possible, even without additional libraries. For wrapping high-latency or single threaded applications, this can definitely provide some surprising speedups.

At the same time, though, you can achieve a much cleaner solution by writing a small utility in C#. Batch is commonly available and remarkably capable when you try hard, but try to use the best available tools when you can.
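To give a rough idea of what such a utility could look like, here is a minimal sketch, assuming .NET 4 or later for Parallel.For. It is not something I have run against the original checkout problem. The class and method names are invented for illustration, and routing each line through cmd.exe /c is my assumption, carried over from how the batch version launches commands.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical sketch of a C# replacement for parallelexecute.bat.
static class ParallelRunner
{
    // Runs every command through runOne with at most maxParallel running at once.
    // maxParallel must be 1 or greater. Does not return until all commands have
    // finished. Exit codes come back in the same order as the input commands.
    public static int[] RunAll(IList<string> commands, int maxParallel, Func<string, int> runOne)
    {
        var exitCodes = new int[commands.Count];
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        Parallel.For(0, commands.Count, options, i =>
        {
            // Each iteration writes only its own slot, so no locking is needed.
            exitCodes[i] = runOne(commands[i]);
        });
        return exitCodes;
    }

    // Runs one command line through cmd.exe without shell execute and waits for it.
    public static int RunWithCmd(string commandLine)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "cmd.exe",
            Arguments = "/c " + commandLine,
            UseShellExecute = false,
            CreateNoWindow = true,
        };
        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A caller could read commands.txt with File.ReadAllLines and pass the result to RunAll along with a limit of 4 and ParallelRunner.RunWithCmd. Taking the launcher as a Func keeps the concurrency limit testable without starting real processes. MaxDegreeOfParallelism is an upper bound only, so the thread pool may take a moment to reach the limit.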

Future Work

My VB code is very hacky, especially concerning which PID to ignore. There is also the case where the return is occasionally invalid, which is a pain to handle. Cleaning this up would be a good step.

Notice that the batch file is redirecting STDOUT and STDERR to nul? If it didn't, the output of all the running commands would be piped to the screen simultaneously. Redirecting the output to a temp file or the like would be a good plan.

The return code of the commands run under start is lost. Wrapping the call in a separate script to record the return would be helpful.

Busy waiting in this way is wasteful, but there isn't a good sleep command in Windows by default. A small sleep script would make the parent bat churn less.

Credits

When I originally started looking into this, I had a hard time finding any example code for PID identification that did not require running a custom .net app. Thanks to expertsexchange user DanBenway for providing some great examples.

2011/06/16 21:10

Jenkins Logo Contest Submission

So for those who have not been keeping track and as far as I'm aware, Oracle acquired Sun and trademarked Hudson. The community decided to fork and rename their version Jenkins to avoid issues with Oracle. As part of this, they also need a new logo and they've asked for submissions.

Guidelines: vector only, large to favicon, English butler theme

So I made one!

My Jenkins Logo Attempt

Here's my attempt. I am not an artist or designer, but everyone has aspirations, right?

The basic idea was a butler, but I don't like the idea of putting a face on them. I figure the best butlers are so good that they are basically invisible. I also wanted to mask the identity/race/gender of the butler so it would be more universal.

I looked at pictures of different butlers and saw their arms look somewhat like a J while carrying trays. I tried to stylize the graphic to represent that shape. I also intentionally left the skin out to imply the invisibility I mentioned. The empty tray looked a little odd, so I added a name card to the top. The font is Baskerville semi-bold; the J looks very close to what I wanted. The card could easily be blanked and replaced with a version specific to the installation, which would keep more of the feel of Jenkins for a customized site than if the graphic were replaced entirely. The bit of white shirt showing also left an option for a tiny bit of personalization in the form of a cufflink.

This would fit fairly well in place of the current logo on the front page of Jenkins. The arm will seem to lead in from off the left side of the page.

This is a bit tall, though. It still looks pretty good with the arm cut out.

I cheated on the favicon, as so few colors didn't make a very discernible 16×16 icon. The J looked great, though, so I just used a 14pt version for the icon.

Anyway, that's my try. Maybe the community will like the idea enough to get someone to remake it properly. :)

Here's a zip of the files above plus the svg to make them: jenkins-logo-jeremy_murray.zip

2011/02/27 22:36

Hudson Plugin Modifications

After learning a bit about groovy, it has become a bit of a catch-all for Hudson related tasks mid-build. Most of my investigation with it previously was with the groovy-postbuild plugin, but I wanted to be able to do some updates to the job immediately after the build started. This began a bit of a dive into the groovy plugin and into the Hudson plugin authoring setup, but it wasn't that hard in the end.

Note that the install steps are for Windows, but the tools are available for Windows, Linux, and OSX and the basic steps are all the same (it is Java and Maven, after all).

Missing Functionality

  • The groovy plugin can run scripts on either the slave or the master. On the master, it has access to the Hudson instance in code, but it has no handle to the current build. The wiki shows ways of figuring it out via requesting the latest build by string name, but that is not reliable in a concurrent build job.
    • Expose current build to groovy script.
  • Most of my builds are initiated manually. I would like to send an e-mail to the initiator at the build start and at the end, along with notification of the build's success or failure. The submitter's e-mail address is currently an optional parameter, but it can be found via LDAP.
    • Expose submitter's e-mail to build.
  • There seems to be a lot of trouble getting the parameters of parameterized builds in a groovy script.
    • Verify that build parameters can be accessed by groovy script.
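
To make the concurrency problem in the first item concrete, here is a minimal Java sketch. FakeJob and FakeBuild are hypothetical stand-ins invented for illustration, not the Hudson classes; the only assumption is that "latest build" means the most recently started build of the job.

```java
import java.util.ArrayList;
import java.util.List;

class ConcurrentLookupDemo {
    // Hypothetical stand-in for a build; not the Hudson class.
    static class FakeBuild {
        final int number;
        FakeBuild(int number) { this.number = number; }
    }

    // Hypothetical stand-in for a job that allows concurrent builds.
    static class FakeJob {
        private final List<FakeBuild> builds = new ArrayList<FakeBuild>();

        // Starts a new build and returns the handle that belongs to it.
        FakeBuild startBuild() {
            FakeBuild build = new FakeBuild(builds.size() + 1);
            builds.add(build);
            return build;
        }

        // Mimics asking the job for its latest build after finding the job by name.
        FakeBuild latestBuild() {
            return builds.get(builds.size() - 1);
        }
    }

    // True when the "latest build" lookup returns the build that is asking.
    static boolean lookupMatches(FakeJob job, FakeBuild self) {
        return job.latestBuild().number == self.number;
    }

    public static void main(String[] args) {
        FakeJob job = new FakeJob();
        FakeBuild first = job.startBuild();
        System.out.println("first, running alone: " + lookupMatches(job, first));

        // A second build of the same job starts before the first one finishes.
        FakeBuild second = job.startBuild();
        System.out.println("first, after second started: " + lookupMatches(job, first));
        System.out.println("second: " + lookupMatches(job, second));
    }
}
```

Once the second build starts, a script running inside the first build gets the wrong build back from the lookup, which is why a direct handle to the current build is needed.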

Getting the current build is a prerequisite for the other functionality, so I decided to start by getting the source for the groovy and groovy-postbuild plugins to see how the build could be exposed.

Plugin Source

Install:

  • Git
    • For Windows, grab the latest msysgit (Git-1.7.3.1-preview20101002.exe) from msysgit downloads

Get the source:

Build A Plugin

Install:

  • Java 6 SDK (JDK)
    • For Windows, grab the latest Java 6 JDK (jdk-6u23-windows-x64.exe) from Java Downloads
      • If you're having trouble finding it, search for “Java Platform, Standard Edition”, then look for the button labeled “Download JDK”
    • On Windows, this installs by default to C:\Program Files\Java\jdk1.6.0_23
  • Maven 2
    • Grab the latest Maven 2 binary zip (apache-maven-2.2.1-bin.zip) from Maven Downloads
    • For Windows, unzip it to C:\Program Files\Apache Software Foundation\apache-maven-2.2.1
    • Maven connects to the internet to download dependencies. If you have trouble running Maven, especially if you are behind a firewall, refer to the reference section for help.
  • Update environment variables either permanently or by running each line below prefixed with set on Windows or export on most Linux systems, adjusting the paths as needed
    • JAVA_6_HOME=C:\Program Files\Java\jdk1.6.0_23
    • M2_HOME=C:\Program Files\Apache Software Foundation\apache-maven-2.2.1
    • M2=%M2_HOME%\bin
    • MAVEN_OPTS=-Xms256m -Xmx512m
    • JAVA_HOME=%JAVA_6_HOME%
    • PATH=%PATH%;%JAVA_HOME%\bin;%M2%
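
Before running Maven, it can help to confirm that the variables above are actually set in the current shell. The following is a small Java sketch of such a check. The class and method names are my own invention, and it only tests that JAVA_HOME, M2_HOME, and M2 are non-empty; it does not verify that the paths exist or that PATH was updated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class BuildEnvCheck {
    // Variable names taken from the list above.
    static final String[] REQUIRED = { "JAVA_HOME", "M2_HOME", "M2" };

    // Returns the required variables that are unset or blank in the given environment.
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<String>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> unset = missing(System.getenv());
        if (unset.isEmpty()) {
            System.out.println("Build environment looks complete.");
        } else {
            System.out.println("Missing variables: " + unset);
        }
    }
}
```

Passing the environment in as a map keeps the check easy to try with sample values instead of the real shell.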

Build the plugin:

  • cd into a plugin directory - it will have a pom.xml file
  • If you have made any modifications, update the version in the pom.xml file
    • e.g. <version>1.6-SNAPSHOT-PLUS_BUILD</version>
  • mvn install -Dmaven.test.skip=true
  • If all goes well, the build will eventually report: [INFO] BUILD SUCCESSFUL and produce a new plugin.hpi file in a target subdirectory.

Install a Plugin

Updating Hudson's groovy Plugin

Searching through the groovy plugin source, it looks like the perform method is the starting point for execution. It even takes a build as a parameter, so we probably just need to pass that through to the groovy script.

|h hudson-plugins/groovy/src/main/java/hudson/plugins/groovy/SystemGroovy.java
...
public class SystemGroovy extends AbstractGroovy {
...
    public boolean perform(AbstractBuild<?, ?> build, Launcher launcher, BuildListener listener) throws InterruptedException, IOException {
...
        shell.setVariable("out", listener.getLogger());
        shell.setVariable("build", build);              // Add this line to expose this specific build to the script.
...

This should now give us access to the currently executing build via the build variable in a groovy script.
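
As a rough illustration of why that one line is enough, here is a self-contained Java sketch of name-based variable binding. FakeShell and FakeBuild are hypothetical stand-ins written for this example and are not the Groovy or Hudson classes; the sketch only assumes that a script resolves a name like build by looking it up among the variables set on the shell.

```java
import java.util.HashMap;
import java.util.Map;

class BindingSketch {
    // Hypothetical stand-in for a script shell's variable table; not the Groovy API.
    static class FakeShell {
        private final Map<String, Object> variables = new HashMap<String, Object>();

        void setVariable(String name, Object value) {
            variables.put(name, value);
        }

        // An unbound name fails; a bound name resolves to the object that was set.
        Object resolve(String name) {
            if (!variables.containsKey(name)) {
                throw new IllegalStateException("No such variable: " + name);
            }
            return variables.get(name);
        }
    }

    // Hypothetical stand-in for the build object passed to perform().
    static class FakeBuild {
        final int number;
        FakeBuild(int number) { this.number = number; }
    }

    // Plays the role of a one-line script that reads build.number.
    static String runScript(FakeShell shell) {
        FakeBuild build = (FakeBuild) shell.resolve("build");
        return "This is build number: " + build.number;
    }

    public static void main(String[] args) {
        FakeShell shell = new FakeShell();
        shell.setVariable("out", System.out);
        shell.setVariable("build", new FakeBuild(1)); // mirrors the added line
        System.out.println(runScript(shell));
    }
}
```

Without the second setVariable call, the lookup of build fails, which corresponds to the situation before the plugin change.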

Testing the New groovy Plugin

|h Test Groovy Job - Build Number
println ("This is build number: " + build.number)
  • Save the changes and select “Build Now”
Test Groovy Job - Build Number - Results
Started by user murrayjw
This is build number: 1
Finished: SUCCESS

With the current build exposed to groovy, we should be able to get the rest of the information we were looking for.

Accessing the Submitter E-Mail Address

For the submitter's e-mail address, we first need to be sure the build was started by a user selecting Build Now. If it was, the build causes will include a UserCause entry. From that entry we can get the username, ask the Hudson instance to resolve it to a user object, and then query that object for its e-mail address, if one is available.

|h Test Groovy Job - Submitter E-Mail Address
for (cause in build.getCauses())
{
  if (cause instanceof hudson.model.Cause.UserCause)
  {
    def username = ((hudson.model.Cause.UserCause)cause).getUserName()
    def user = hudson.model.Hudson.instance.getUser(username)
    if (user != null)
    {
      hudson.tasks.Mailer.UserProperty mailProperty = user.getProperty(hudson.tasks.Mailer.UserProperty)
      if (mailProperty != null)
      {
        println("Submitter address: " + mailProperty.getAddress())
      }
      else
      {
        println("No available address for " + username)
      }
    }
  }
}
Test Groovy Job - Submitter E-Mail Address - Results
Started by user murrayjw
Submitter address: Jeremy.Murray@example.com
Finished: SUCCESS

Accessing the Build Parameters

Build parameters are exposed in two ways: as environment variables and in the build's buildVariables collection. The script below reads both. For illustration, I created a few dummy parameters of different types.

|h Test Groovy Job - Build Parameters
println("Parameters:")
build.buildVariables.each{println("  " + it.key +': '+ it.value) }
 
def envVars = build.properties.get("envVars")
println("Environment variables:")
envVars.each{ println("  " + it.key +': '+ it.value) }
Test Groovy Job - Build Parameters - Results
Started by user murrayjw
Parameters:
  TestBooleanValue: true
  TestStringParam: myDefaultValue
  TestChoice: myTestChoice1
Environment variables:
  ... bunch of variables
  BUILD_ID: 2011-01-12_17-18-47
  BUILD_NUMBER: 30
  BUILD_TAG: hudson-Test Groovy Job-30
  BUILD_URL: http://hudson/job/Test%20Groovy%20Job/30/
  ... bunch of variables
  TestBooleanValue: true
  TestChoice: myTestChoice1
  TestStringParam: myDefaultValue
  ... bunch of variables
Finished: SUCCESS
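
Since the parameters show up in both collections, one way to pick them out of the much longer environment list is to keep only the environment entries whose names also appear among the build variables. The Java sketch below does this with plain maps filled with sample values copied from the results above. The class and method names are hypothetical, and it assumes both collections behave as string-to-string maps.

```java
import java.util.Map;
import java.util.TreeMap;

class ParameterFilter {
    // Keeps only the environment entries whose names are also build parameters.
    static Map<String, String> parametersInEnv(Map<String, String> buildVariables,
                                               Map<String, String> envVars) {
        Map<String, String> result = new TreeMap<String, String>();
        for (Map.Entry<String, String> entry : envVars.entrySet()) {
            if (buildVariables.containsKey(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample values copied from the job results; not read from a live build.
        Map<String, String> buildVariables = new TreeMap<String, String>();
        buildVariables.put("TestBooleanValue", "true");
        buildVariables.put("TestStringParam", "myDefaultValue");
        buildVariables.put("TestChoice", "myTestChoice1");

        Map<String, String> envVars = new TreeMap<String, String>(buildVariables);
        envVars.put("BUILD_NUMBER", "30");
        envVars.put("BUILD_TAG", "hudson-Test Groovy Job-30");

        System.out.println(parametersInEnv(buildVariables, envVars));
    }
}
```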

Conclusion

Modifying an existing Hudson plugin wasn't hard at all. The tool install went smoothly on multiple systems, and the get/modify/build/install cycle had no issues. I was very surprised, actually, and I think this has encouraged me to forge on with the other plugins I had been thinking of coding.

Now to figure out github to make a branch of this one plugin. If you would like a copy of this small modification, I've put up a copy of my modified groovy.hpi.

Next up: the groovy-postbuild plugin has a bunch of built-in functionality for adding badges and such - it would be great to have that in the groovy plugin, too, or to have a script version that can be run.

Much later: better StarTeam support, or even just a more generic approach to version control than the current plugins.

References

2011/01/17 15:12


 
blog.txt · Last modified: 2010/04/07 15:36 by Jeremy Murray