Clean Dated Directories in Batch

Log files, CI build artifacts, daily reports - I seem to have an ever-growing list of files and folders that are important for a week or so and then need to be purged. Seems like a good job for a batch file.

Step 1 was making sure that the side effects were all grouped into files or folders by date.

Step 2 took a little bit of batch date math.

Requirements

  • Perform an operation on a set of files or folders that are named by date. (e.g., Log-2011-07-01)
  • Limit the operation to a range of dates between now and X days ago, possibly omitting the most recent Y days. (e.g., delete folders with dates older than a week ago)

Investigation

Assuming our target files or directories use a fixed date format, a list of dates relative to the current one can serve as input to another batch file that moves, zips, or erases those targets individually. There doesn't seem to be any built-in mechanism for date math in batch without calling into another executable or VBScript.

Date math is a pretty complex subject, but if we limit the output to dates only and ignore time and days of the week, the only special cases we need to handle are the differing month lengths and, for February, leap years.
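
As a rough illustration of how small that special-casing is, here is a minimal sketch of a days-in-month helper. The name DAYS_IN_MONTH and its argument order are hypothetical and not part of recentdates.bat. It folds the Gregorian leap-year rule into a single set /a expression rather than a chain of if statements.

```batch
@echo off
setlocal
REM Sketch only: DAYS_IN_MONTH is a hypothetical helper, not part of recentdates.bat.
call :DAYS_IN_MONTH 2012 2 RESULT
echo 2012-02 has %RESULT% days
call :DAYS_IN_MONTH 1900 2 RESULT
echo 1900-02 has %RESULT% days
call :DAYS_IN_MONTH 2011 6 RESULT
echo 2011-06 has %RESULT% days
endlocal
goto :eof

:DAYS_IN_MONTH
REM Arguments: year, month (1-12 without a leading zero), name of the output variable.
set /a DIM=31
if %2 equ 4 set /a DIM=30
if %2 equ 6 set /a DIM=30
if %2 equ 9 set /a DIM=30
if %2 equ 11 set /a DIM=30
if %2 neq 2 goto DAYS_IN_MONTH_DONE
REM The unary ! operator in set /a yields 1 for 0 and 0 otherwise, so LEAP is
REM 1 when the year is divisible by 4 but not by 100, or is divisible by 400.
set /a "LEAP=(!(%1 %% 4) & !!(%1 %% 100)) | !(%1 %% 400)"
set /a DIM=28 + LEAP
:DAYS_IN_MONTH_DONE
set "%3=%DIM%"
goto :eof
```

Run as-is, this should report 29 days for 2012-02, 28 days for 1900-02, and 30 days for 2011-06.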

Implementation

To get a list of recent dates, we start by pulling the current date from %date% and parsing it into individual values for day, month, and year. We then count backwards by subtracting days. When we roll over into the previous month, we check how many days that month has. If the month is February, we do some extra checking to see whether it is a leap year.
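
One wrinkle in the parsing step is that set /a treats a number with a leading zero as octal, so a month or day of 08 or 09 is rejected as invalid. A minimal sketch of the usual workaround, using a hypothetical MONTHTEXT variable for illustration:

```batch
@echo off
setlocal
REM MONTHTEXT is a hypothetical stand-in for the month text parsed from the date.
set MONTHTEXT=08
REM Prefix a 1 to get 108, which parses as decimal, then subtract 100.
set /a MONTH=1%MONTHTEXT% - 100
echo %MONTH%
endlocal
```

This prints 8. The same prefix-and-subtract trick works for any two-digit field.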

Since the primary use case is to keep the Y most recent days and delete the X older ones, we optionally take a second argument: the number of most recent days to skip.

recentdates.bat
@setlocal
@echo off
REM recentdates.bat [DAYSTOPRINT] [DAYSTOSKIP]
REM Prints a list of dates, one per line, starting with today and going backwards.
REM By default, it will print the most recent 7 days, including today.
REM
REM murrayjw 2010-10-04 New.
 
set /a DAYSTOPRINT=7
set /a DAYSTOSKIP=0
 
if not ("%1") == ("") set /a DAYSTOPRINT=%1
if not ("%2") == ("") set /a DAYSTOSKIP=%2
 
if %DAYSTOPRINT% lss 1 goto END
 
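REM Parse the current date, assumed to look like "Tue 07/05/2011":
REM token 2 is the month, token 3 the day, and token 4 the year.
REM Prefixing 1 and subtracting 100 strips a leading zero so that set /a
REM does not read 08 or 09 as invalid octal.
for /f "tokens=2-4 delims=/ " %%i in ("%date%") do set /a CURRENTYEAR=%%k&&set /a CURRENTMONTH=1%%i - 100&&set /a CURRENTDAY=1%%j - 100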
goto PRINTDATE
 
:MINUSDAY
if %CURRENTDAY% equ 1 goto MINUSMONTH
set /a CURRENTDAY=%CURRENTDAY% - 1
goto PRINTDATE
 
:MINUSMONTH
if %CURRENTMONTH% equ 1 goto MINUSYEAR
set /a CURRENTMONTH=%CURRENTMONTH% - 1

REM Default to 31 days, check 30 day months, and special case February.
set /a CURRENTDAY=31
if %CURRENTMONTH% equ 4 set /a CURRENTDAY=30 && goto MINUSMONTHEND
if %CURRENTMONTH% equ 6 set /a CURRENTDAY=30 && goto MINUSMONTHEND
if %CURRENTMONTH% equ 9 set /a CURRENTDAY=30 && goto MINUSMONTHEND
if %CURRENTMONTH% equ 11 set /a CURRENTDAY=30 && goto MINUSMONTHEND
if %CURRENTMONTH% neq 2 goto MINUSMONTHEND

REM Calculate leap years for Feb 29.
REM if year modulo 400 is 0 then is_leap_year
set /a CURRENTYEARMOD400=%CURRENTYEAR% %% 400
if %CURRENTYEARMOD400% equ 0 set /a CURRENTDAY=29 && goto MINUSMONTHEND
REM else if year modulo 100 is 0 then not_leap_year
set /a CURRENTYEARMOD100=%CURRENTYEAR% %% 100
if %CURRENTYEARMOD100% equ 0 set /a CURRENTDAY=28 && goto MINUSMONTHEND
REM else if year modulo 4 is 0 then is_leap_year
set /a CURRENTYEARMOD4=%CURRENTYEAR% %% 4
if %CURRENTYEARMOD4% equ 0 set /a CURRENTDAY=29 && goto MINUSMONTHEND
REM else not_leap_year
set /a CURRENTDAY=28
goto MINUSMONTHEND
 
:MINUSMONTHEND
goto PRINTDATE
 
:MINUSYEAR
set /a CURRENTYEAR=%CURRENTYEAR% - 1
set /a CURRENTMONTH=12
set /a CURRENTDAY=31
goto PRINTDATE
 
:PRINTDATE
if %DAYSTOSKIP% gtr 0 set /a DAYSTOSKIP=%DAYSTOSKIP%-1 && goto PRINTDATEEND
 
set /a DAYSTOPRINT=%DAYSTOPRINT%-1
set CURRENTMONTHSTR=%CURRENTMONTH%
set CURRENTDAYSTR=%CURRENTDAY%
call :ZERO_PAD_WIDTH_2 CURRENTMONTHSTR %CURRENTMONTHSTR%
call :ZERO_PAD_WIDTH_2 CURRENTDAYSTR %CURRENTDAYSTR%
echo %CURRENTYEAR%-%CURRENTMONTHSTR%-%CURRENTDAYSTR%
if %DAYSTOPRINT% equ 0 goto END
goto PRINTDATEEND
 
:PRINTDATEEND
goto :MINUSDAY
 
:ZERO_PAD_WIDTH_2
if %~2 LSS 10 set "%~1=0%~2"
goto :eof
 
:END
endlocal

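This could be customized for other date layouts (MM-DD-YYYY, for example) by changing only the echo line in :PRINTDATE. Also look out for localization issues when reading %date%: the parsing above assumes a layout like Tue 07/05/2011, and other regional settings order or separate the fields differently.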
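
One possible way around the %date% localization issue, at the cost of calling another executable, is to ask PowerShell for the date in a fixed layout. This is a sketch under assumptions rather than part of the original script: it assumes powershell.exe is on the PATH and that the current culture uses the Gregorian calendar.

```batch
@echo off
setlocal
REM Sketch only: assumes powershell.exe is available and a Gregorian calendar culture.
REM Get-Date -Format yyyy-MM-dd always emits year, month, day in that order.
for /f "tokens=1-3 delims=-" %%i in ('powershell -NoProfile -Command "Get-Date -Format yyyy-MM-dd"') do (
    set /a CURRENTYEAR=%%i
    set /a CURRENTMONTH=1%%j - 100
    set /a CURRENTDAY=1%%k - 100
)
echo Year=%CURRENTYEAR% Month=%CURRENTMONTH% Day=%CURRENTDAY%
endlocal
```

The three variables use the same names as in recentdates.bat, so a loop like this could stand in for the %date% parsing line there.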

We can wrap a call to recentdates.bat in another batch to delete the folders with those date names. By default, I like to keep only the last week. Since we're running this fairly often, we can restrict the number of dates it returns to the last month, just to keep it small and still allow for some accidental script downtime.

removeoldbuilds.bat
@echo off
@setlocal

REM removeoldbuilds.bat [BASEFOLDER]
REM Finds a list of recent dates and removes directories with those names.
 
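REM Strip any surrounding quotes from the argument so paths with spaces can be re-quoted below.
if "%~1" == "" goto ERROR_USAGE
set "BASEPATH=%~1"
 
if not exist "%BASEPATH%" goto ERROR_BASE_MISSING
 
REM Skip today and the 7 days before it, then check the 30 dates before those.
for /f %%a in ('call recentdates.bat 30 8') do if exist "%BASEPATH%\%%a" rmdir /s /q "%BASEPATH%\%%a" && echo %BASEPATH%\%%a (deleted)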
goto END_SUCCESS
 
:ERROR_BASE_MISSING
echo %BASEPATH% does not exist.
goto END_FAIL
 
:ERROR_USAGE
echo removeoldbuilds.bat [BASEFOLDER]
goto END_FAIL
 
:END_FAIL
endlocal
exit /b 1
 
:END_SUCCESS
endlocal
exit /b 0
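
The same loop can drive the other operations mentioned under Investigation. As a sketch, here is a variant that moves dated folders into an archive instead of deleting them. The C:\logs and C:\logs\archive paths are hypothetical, the Log- prefix follows the naming example under Requirements, and recentdates.bat is assumed to be in the current directory or on the PATH.

```batch
@echo off
setlocal
REM Sketch only: SOURCE and ARCHIVE are hypothetical locations.
set "SOURCE=C:\logs"
set "ARCHIVE=C:\logs\archive"
if not exist "%ARCHIVE%" mkdir "%ARCHIVE%"

REM Skip today and the 7 days before it, then look at the 30 dates before those.
for /f %%a in ('call recentdates.bat 30 8') do if exist "%SOURCE%\Log-%%a" move "%SOURCE%\Log-%%a" "%ARCHIVE%" >nul && echo %SOURCE%\Log-%%a archived
endlocal
```

Swapping the move command for a call to a zip tool would follow the same pattern.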

Results

recentdates.bat output
C:\>recentdates
2011-07-05
2011-07-04
2011-07-03
2011-07-02
2011-07-01
2011-06-30
2011-06-29
 
C:\>recentdates 2
2011-07-05
2011-07-04
 
C:\>recentdates 10 8
2011-06-27
2011-06-26
2011-06-25
2011-06-24
2011-06-23
2011-06-22
2011-06-21
2011-06-20
2011-06-19
2011-06-18

removeoldbuilds.bat Jenkins output
A timer trigger started this job
Building remotely on buildmachine1
 
d:\hudsonslave\workspace\Cluster - Clean Output Folders>call removeoldbuilds.bat \\fileshare\builds\cluster
\\fileshare\builds\cluster\2011-06-27 (deleted)
Finished: SUCCESS

Conclusion

This works great. I put removeoldbuilds.bat in a Jenkins job for each of the temporary destinations. It outputs which folders it deleted and lets me know of any issues. Having a simple script to do this cleanup has saved me some recurring headaches over disk space. It also lets me publish data retention policies and avoid misunderstandings.

It has also proved surprisingly versatile - it only took a couple of small edits to use recentdates.bat to mass-rename a bunch of dated files from MMDDYYYY to YYYYMMDD and then zip them by month.
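
For a sense of what such a rename can look like, here is a minimal sketch that leaves recentdates.bat unedited and splits its YYYY-MM-DD output instead. It is not the edit described above; the Report- prefix, the .txt extension, and the 365-day window are hypothetical, and recentdates.bat is assumed to be in the current directory or on the PATH.

```batch
@echo off
setlocal
REM Sketch only: file naming and the 365-day window are hypothetical.
REM Each output line is split on "-" into year (%%a), month (%%b), and day (%%c).
for /f "tokens=1-3 delims=-" %%a in ('call recentdates.bat 365') do (
    if exist "Report-%%b%%c%%a.txt" ren "Report-%%b%%c%%a.txt" "Report-%%a%%b%%c.txt"
)
endlocal
```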

 
blog/20110705_clean_dated_directories_in_batch.txt · Last modified: 2011/07/05 20:11 by Jeremy Murray