
Parallel Batch

Due to limitations in the threading of a source control system I was using, it became apparent that checkout times could be improved by checking out individual subfolders at the same time in separate command windows. This exposed a need to run tasks in parallel from a batch file while limiting the number of jobs running simultaneously.

Parallel batch files?!? Oh yeah!

Specification

  • Run multiple sub-commands simultaneously
  • Limit the number of simultaneous commands
  • Do not return control until all commands have completed
  • Must run on a bare install of Win XP or Win 7 - no extra libraries

Investigation

We will assume that our batch script is running as a process. We want to limit the number of subprocesses of a specific name running under the batch to a maximum. To do this, we need a way to identify both our own PID (Process Identifier) and all other processes which are children of our PID. (Probably not grandchildren or lower, in case a command splits off copies of itself.)

You can spawn a new process that runs independently of a batch file via start. Adding the /b flag prevents start from opening a new command window. When a process is created this way, the originating batch file has no way to capture its output or even to tell whether it has finished.
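As a rough illustration of start /b (a sketch only; the ping commands against the loopback address are stand-ins for real work and are not part of the original scripts):

```bat
@echo off
REM Launch two commands in the background without waiting for either one.
REM ping against the loopback address is just a stand-in for real work.
start /b cmd.exe /c ping -n 3 127.0.0.1 >nul
start /b cmd.exe /c ping -n 3 127.0.0.1 >nul
REM This line runs immediately, while both pings are still going.
echo Both commands have been launched.
```

The batch file reaches the final echo right away, which is exactly the problem: nothing here tells it when the two pings are done.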

When a process is begun through start, it will generate a new process as a child of the originating batch command process. Unfortunately, there is no good way that I can find to give you the PID of the currently running process from the available batch commands. You can get a list of all processes pretty easily, but not your specific PID or the PID of your parent process.

Windows XP and 7 both have Visual Basic Script (VBS) support built in via cscript.exe and wscript.exe. By using VBS, we can get access to a process list which includes not only each process' name and PID, but also the PID of its parent process. This turns out to be very important when we want to find the PID of a batch process.

As a side note, I would have used PowerShell, but it is not guaranteed to be available on XP.

The process list could contain many copies of any individual command at a time. To differentiate them, we can examine each process' calling commandline. To identify which PID is the specific script we are running, we can pass a unique identifier as an ignored commandline parameter. In practice, using the %time% environment variable is usually unique enough.

In a batch file, if you want to record the output of a command as a variable so you can perform logic on it, you usually do so via a for command.

|h Batch for Store Command Result Example
for /f "delims=" %%a in ('echo test') do set ECHO_RESULT=%%a

The interpreter runs the command inside the for in a new cmd.exe subprocess, which in turn starts the command itself as a new process under it. So if we have a VBS that can identify PIDs, and we call that script from a batch for command, the batch PID will be the grandparent PID relative to the VBS process. (whew)

The only other piece we're missing is a separate command to count all child processes of a given PID and compare that to a specified max value. If we limit the count to all child processes of a specific name, we can get pretty accurate parallelization.

Implementation

Please be understanding of mistakes in the VB below - it is not a language I'm very familiar with and it has been a while since I wrote it.

First we need a VBS to identify the PID of the calling process. It should optionally return the grandparent PID for use in batch files.

|h pid.vbs
' pid.vbs
' Returns the PID of the process that launched this script (the script's parent).
' If the first argument is "/parent", returns the PID of that process' parent instead (the script's grandparent).
' This is useful for determining the PID of a bat file, as the for /f ('') command generates a child and grandchild process.
' Based on code from expertsexchange user DanBenway: 01.11.2010 at 11:10AM PST, ID: 26286222
' Extra arguments can be used to guarantee the PID found is correct.  Passing %time% as an argument is usually adequate.
'
' 20100804 murrayjw New.
' 20100804 murrayjw PID is sometimes invalid - return known string instead of garbage.

Option Explicit
Dim str_processID
Dim str_parentprocessID
Dim str_grandparentprocessID
 
Dim str_parentFlag
str_parentFlag = "/parent"
 
Dim bool_useGrandParent
bool_useGrandParent = False
 
If WScript.Arguments.Count >= 1 then
	If StrComp(WScript.Arguments.Item(0), str_parentFlag, vbTextCompare) = 0 then
		bool_useGrandParent=True
	end If
end If
 
Call sub_getPIDAndOwnerInfoForThisScript (str_processID)
Call sub_getParentPID (str_processID, str_parentprocessID)
 
If bool_useGrandParent=True then
	Call sub_getParentPID (str_parentprocessID, str_grandparentprocessID)
	wscript.echo str_grandparentprocessID
Else
	wscript.echo str_parentprocessID
end If
 
wscript.Quit
 
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub sub_getPIDAndOwnerInfoForThisScript (str_PID)
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim obj_process
Dim col_processList
Dim obj_WMIService
Dim str_commandLine
Dim str_scriptName
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

str_commandLine = "INVALIDCOMMANDLINE"
str_PID = "INVALIDPID"
str_scriptName = WScript.ScriptName
 
Set obj_WMIService = GetObject ("winmgmts:\\.\root\cimv2")
Set col_processList = obj_WMIService.ExecQuery ("Select * from Win32_Process Where Name = ""WScript.exe"" OR Name = ""CScript.exe"" ")
 
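Dim bool_match
Dim str_arg

For Each obj_process In col_processList
	str_commandLine = obj_process.CommandLine
	' CommandLine is Null for processes we are not allowed to inspect - skip them.
	If Not IsNull(str_commandLine) Then
		If (InStr(1,str_commandLine, str_scriptName, vbTextCompare)<>0) Then
			' Require every argument (e.g. the %time% stamp) to appear too, so that
			' another copy of this script running at the same time is not mistaken for us.
			bool_match = True
			For Each str_arg In WScript.Arguments
				If InStr(1,str_commandLine, str_arg, vbTextCompare)=0 Then bool_match = False
			Next
			If bool_match Then str_PID = obj_process.processID
		End If
	End If
Next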
 
Set obj_WMIService = Nothing
Set col_processList = Nothing
Set obj_process = Nothing
 
End Sub
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub sub_getParentPID (str_PID, str_ParentPID)
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim obj_process
Dim col_processList
Dim obj_WMIService
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

str_ParentPID = "INVALIDPID"
 
Set obj_WMIService = GetObject ("winmgmts:\\.\root\cimv2")
Set col_processList = obj_WMIService.ExecQuery ("Select * from Win32_Process")
 
For Each obj_process In col_processList
	' Compare the whole PID; a substring match would let PID 12 match 123 or 4120.
	If CStr(obj_process.processID) = CStr(str_PID) Then
		str_ParentPID	= obj_process.parentprocessID
	End If
Next
 
Set obj_WMIService = Nothing
Set col_processList = Nothing
Set obj_process = Nothing
 
End Sub
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Occasionally, querying the WMI Service will produce an odd return and no PID will be identified. In this case, the script will return INVALIDPID.

Next we need a script to count the child processes for a given PID, limited by process name.

|h childprocesscount.vbs
' childprocesscount.vbs
' Count the number of child (not grandchild or lower) processes.
' Based on code from expertsexchange user DanBenway: 01.11.2010 at 11:10AM PST, ID: 26286222
' If an argument is passed, only children with that name will be counted.
' If the argument is numeric, the children of that process will be counted.
'
' 20100804 murrayjw New.
' 20100806 murrayjw Use clng in place of cint to account for full range of possible PID values.
' 20100804 murrayjw PID is sometimes invalid - return known string instead of garbage.
' 20110616 murrayjw Ignore the PID of the running script and its parent when counting.

 
Option Explicit
Dim int_processID
Dim int_parentprocessID
Dim str_name
Dim int_ChildCount
 
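Call sub_getPIDAndParentPIDInfoForThisScript (int_processID, int_parentprocessID)

' If this script's own process could not be found, return a known string so the caller can retry.
If int_processID = -1 then
	wscript.echo "INVALIDPID"
	wscript.Quit
end If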
 
If WScript.Arguments.Count = 0 then
	Call sub_getChildProcessCount (int_parentprocessID, int_ChildCount, int_processID, int_parentprocessID)
end If
 
If WScript.Arguments.Count = 1 then
	If IsNumeric(WScript.Arguments.Item(0)) then
		Call sub_getChildProcessCount (clng (WScript.Arguments.Item(0)), int_ChildCount, int_processID, int_parentprocessID)
	Else
		Call sub_getChildProcessCountByName (int_parentprocessID, WScript.Arguments.Item(0), int_ChildCount, int_processID, int_parentprocessID)
	end If
end If
 
If WScript.Arguments.Count = 2 then
	If IsNumeric(WScript.Arguments.Item(0)) then
		Call sub_getChildProcessCountByName (clng (WScript.Arguments.Item(0)), WScript.Arguments.Item(1), int_ChildCount, int_processID, int_parentprocessID)
	Else
		wscript.echo "First argument is not a PID."
		wscript.Quit
	end If
end If
 
If WScript.Arguments.Count > 2 then
	wscript.echo "Too many arguments."
	wscript.Quit
end If
 
 
wscript.echo int_ChildCount
wscript.Quit
 
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub sub_getChildProcessCount (int_PID, int_ChildCount, int_IgnorePID, int_IgnorePID2)
' Counts the child (not grandchildren or lower) processes of the current process.
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim obj_process
Dim col_processList
Dim obj_WMIService
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

int_ChildCount = 0
 
Set obj_WMIService = GetObject ("winmgmts:\\.\root\cimv2")
Set col_processList = obj_WMIService.ExecQuery ("Select * from Win32_Process")
 
For Each obj_process In col_processList
	If obj_process.parentprocessID = int_PID Then
		If obj_process.processID <> int_IgnorePID And obj_process.processID <> int_IgnorePID2 Then
			int_ChildCount = int_ChildCount + 1
		End If
	End If
Next
 
Set obj_WMIService = Nothing
Set col_processList = Nothing
Set obj_process = Nothing
 
End Sub
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub sub_getChildProcessCountByName (int_PID, str_name, int_ChildCount, int_IgnorePID, int_IgnorePID2)
' Counts the child (not grandchildren or lower) processes of the current process.
' Restrict to processes of the passed name.
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim obj_process
Dim col_processList
Dim obj_WMIService
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

int_ChildCount = 0
 
Set obj_WMIService = GetObject ("winmgmts:\\.\root\cimv2")
Set col_processList = obj_WMIService.ExecQuery ("Select * from Win32_Process where Name =""" & str_name & """")
 
For Each obj_process In col_processList
	If obj_process.parentprocessID = int_PID Then
		If obj_process.processID <> int_IgnorePID  And obj_process.processID <> int_IgnorePID2 Then
			int_ChildCount = int_ChildCount + 1
		End If
	End If
Next
 
Set obj_WMIService = Nothing
Set col_processList = Nothing
Set obj_process = Nothing
 
End Sub
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub sub_getPIDAndParentPIDInfoForThisScript (int_PID, int_ParentPID)
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim obj_process
Dim col_processList
Dim obj_WMIService
Dim str_commandLine
Dim str_scriptName
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

str_commandLine = "INVALIDCOMMANDLINE"
int_PID = -1
int_ParentPID = -1
str_scriptName = WScript.ScriptName
 
Set obj_WMIService = GetObject ("winmgmts:\\.\root\cimv2")
Set col_processList = obj_WMIService.ExecQuery ("Select * from Win32_Process Where Name = ""WScript.exe"" OR Name = ""CScript.exe"" ")
 
For Each obj_process In col_processList
	str_commandLine = obj_process.CommandLine
	If (InStr(1,str_commandLine, str_scriptName, vbTextCompare)<>0) Then
		int_PID	= obj_process.processID
		int_ParentPID = obj_process.parentprocessID
	End If
	str_commandLine = "INVALIDCOMMANDLINE"
Next
 
Set obj_WMIService = Nothing
Set col_processList = Nothing
Set obj_process = Nothing
 
End Sub
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Now to put these to good use, let's write a wrapper batch file to run a series of sub-commands read in from a text file.

|h parallelexecute.bat
@setlocal
@echo off
REM parallelexecute.bat
REM Execute individual commands from a text file in parallel.
REM usage: parallelexecute.bat commandfile.txt [NUMBER_OF_PROCESSES]
REM e.g. parallelexecute.bat commandfile.txt 4
 
call :FINDPID
 
if ("%1")==("") goto :eof
set COMMANDFILE=%1
if not exist %COMMANDFILE% goto :eof
 
set MAXTHREADS=4
if not ("%2")==("") set /a MAXTHREADS=%2
if not %MAXTHREADS% gtr 0 goto :eof

REM To allow batfiles in the commandfile, wrap all commands in a call statement.
for /f "delims=" %%a in (%COMMANDFILE%) do echo Starting: %%a&&start /b cmd.exe /c call %%a >nul 2>nul &&call :CHECKTHREADCOUNT
call :WAITFORTHREADSTOFINISH
 
endlocal
@GOTO :eof
 
:FINDPID
rem Add the time to the PID search to be sure the right PID is found.
rem %~dp0 is used here to indicate the directory of the calling batfile.
rem %~dp$PATH:1 would search the PATH directories for a file - it could be slower.
for /F "tokens=1" %%A IN ('cscript /nologo %~dp0\pid.vbs /parent %time%') DO (set TEMPPID=%%A)
if ("%TEMPPID%")==("INVALIDPID") GOTO FINDPID
set PID=%TEMPPID%
GOTO :eof
 
:CHECKTHREADCOUNT
for /F "tokens=1" %%A IN ('cscript /nologo %~dp0\childprocesscount.vbs %PID%') DO (SET CHILDCMDCOUNT=%%A)
if ("%CHILDCMDCOUNT%")==("INVALIDPID") GOTO CHECKTHREADCOUNT
REM If for whatever reason we are over the limit, busy stall until we are not.
if %CHILDCMDCOUNT% geq %MAXTHREADS% GOTO CHECKTHREADCOUNT
set CHILDCMDCOUNT=
set TEMPPID=
GOTO :eof
 
:WAITFORTHREADSTOFINISH
for /F "tokens=1" %%A IN ('cscript /nologo %~dp0\childprocesscount.vbs %PID%') DO (SET CHILDCMDCOUNT=%%A)
if %CHILDCMDCOUNT% gtr 0 GOTO WAITFORTHREADSTOFINISH
set CHILDCMDCOUNT=
set TEMPPID=
GOTO :eof

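This will run each line of the input file in a new cmd.exe process, then immediately check the number of child processes of the batch file. (As written, the count is not restricted by name; passing a process name such as cmd.exe as a second argument to childprocesscount.vbs would restrict it.) If this number is greater than or equal to MAXTHREADS, it will stall until the number is lower. Once all commands have been started, it waits for the number to reach zero before exiting.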
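A minimal usage sketch, assuming parallelexecute.bat, pid.vbs and childprocesscount.vbs all sit in the current directory. The command file name and the ping commands are placeholders:

```bat
@echo off
REM Build a small command file, one command per line.
REM The file name and the ping commands are placeholders.
>commands.txt echo ping -n 4 127.0.0.1
>>commands.txt echo ping -n 4 127.0.0.1
>>commands.txt echo ping -n 4 127.0.0.1

REM Run the three commands, at most two at a time.
REM Control returns here only after all of them have finished.
call parallelexecute.bat commands.txt 2
echo All commands have completed.
```

Leaving off the second argument falls back to the default of 4 simultaneous commands.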

Results

I filled a commands.txt text file with 10 copies of ping google.com. Here's the output:

Parallel Batch Results
D:\temp>timerun.bat ping gooogle.com
 
Pinging gooogle.com [74.125.93.104] with 32 bytes of data
Reply from 74.125.93.104: bytes=32 time=81ms TTL=41
Reply from 74.125.93.104: bytes=32 time=77ms TTL=41
Reply from 74.125.93.104: bytes=32 time=78ms TTL=41
Reply from 74.125.93.104: bytes=32 time=81ms TTL=41
 
Ping statistics for 74.125.93.104:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 77ms, Maximum = 81ms, Average = 79ms
Elapsed msec: 3470
 
D:\temp>timerun.bat parallelexecute.bat commands.txt 1
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Elapsed msec: 36740
 
D:\temp>timerun.bat parallelexecute.bat commands.txt 2
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Elapsed msec: 18580
 
D:\temp>timerun.bat parallelexecute.bat commands.txt 4
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Elapsed msec: 11550
 
D:\temp>timerun.bat parallelexecute.bat commands.txt 10
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Starting: ping google.com
Elapsed msec: 6400

# Threads   Time (ms)
1           36740
2           18580
4           11550
10          6400

So not a linear speedup (about 2.0x with 2 threads, 3.2x with 4 and 5.7x with 10), but most definitely parallel execution. The amount of speedup depends heavily on the application and the machine. For my original version control issue, we saw linear speedup of checkouts up to around 8 threads locally and 4 threads when remote.

Conclusion

Parallel execution in batch files is possible, even without additional libraries. For wrapping high-latency or single threaded applications, this can definitely provide some surprising speedups.

At the same time, though, you can achieve a much cleaner solution by writing a small utility in C#. Batch is commonly available and remarkably capable when you try hard, but try to use the best available tools when you can.

Future Work

My VB code is very hacky, especially concerning which PID to ignore. There is also the case where the return is occasionally invalid, which is a pain to handle. Cleaning this up would be a good step.

Notice that the batch file is redirecting STDOUT and STDERR to nul? If it didn't, the output of all the running commands would be written to the screen at the same time, interleaved. Redirecting each command's output to a temp file or the like would be a good plan.
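One possible shape for this, as an untested sketch rather than part of the original scripts: number the lines of the command file with findstr /n and give each job its own log file. The file name commands.txt and the log file names are assumptions, and the thread limiting is left out to keep the example short.

```bat
@echo off
REM Sketch only: commands.txt and the log file names are assumptions.
REM findstr /n prefixes every line with "N:", so %%n is the job number
REM and %%o is the command itself. Blank lines are assumed not to occur.
for /f "tokens=1* delims=:" %%n in ('findstr /n "^" commands.txt') do (
    echo Starting job %%n: %%o
    start /b cmd.exe /c call %%o >"%TEMP%\parallel_job_%%n.log" 2>&1
)
```

In parallelexecute.bat the start line would still be followed by call :CHECKTHREADCOUNT so that the limit keeps working.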

The return code of the commands run under start is lost. Wrapping the call in a separate script that records the return code would be helpful.
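A sketch of one way to keep the return codes without a separate script, under the same assumptions as before (commands.txt as the command file, result file names made up for the example). The child cmd.exe is started with /v:on so that !errorlevel! is read after the command has finished, and the value is written to a per-job file:

```bat
@echo off
REM Sketch only: commands.txt and the result file names are assumptions.
REM The parent batch must NOT have delayed expansion enabled, otherwise
REM it would expand !errorlevel! itself before the child ever runs.
for /f "tokens=1* delims=:" %%n in ('findstr /n "^" commands.txt') do (
    start /b cmd.exe /v:on /c "call %%o & >result_%%n.txt echo !errorlevel!" >nul 2>nul
)
```

Once all jobs have finished, each result_N.txt holds the return code of the command on line N of the command file.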

Busy waiting in this way is wasteful, but there isn't a good sleep command in Windows XP by default. A small sleep script would make the parent bat churn less.
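A common workaround, shown here as a sketch and not something the original scripts do, is to use ping against the loopback address as a crude timer. ping waits about a second between echo requests, so two requests take roughly one second. The :SLEEP label name is made up for the example.

```bat
@echo off
echo Waiting...
call :SLEEP
echo Done.
goto :eof

:SLEEP
REM Two echo requests to the loopback address take roughly one second.
ping -n 2 127.0.0.1 >nul
goto :eof
```

Calling :SLEEP just before each GOTO CHECKTHREADCOUNT and GOTO WAITFORTHREADSTOFINISH would cut the number of cscript launches to about one per second. The ping has already exited by the time the child processes are counted, so it should not disturb the count.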

Credits

When I originally started looking into this, I had a hard time finding any example code for PID identification that did not require running a custom .NET app. Thanks to expertsexchange user DanBenway for providing some great examples.

 
blog/20110616_parallel_batch.txt · Last modified: 2011/06/16 21:17 by Jeremy Murray