Wednesday 31 August 2016

BASH Script: Compare File Sizes Within Given Date Range

Following on from the previous post, let's look at another important task: comparing the sizes of files and folders between one location and another. There are many ways to do this in Linux, and many commands that can help. We will discuss one specific scenario, as follows.




Let's assume that you want to compare files and folders between a source and a destination, but only where the destination already contains the specified files or folders. The comparison spans a range of dates, from a start date to an end date.

One more thing to keep in mind is the directory structure: how you lay out the directories in the destination location matters a lot. In our case, the directories are named after dates in YYYYMMDD format (e.g. 20160101). You can use any directory naming format that suits your requirements, but I am going to discuss only the scenario above. If you use a different format, change the code accordingly so that it still performs the task you want.

Summary of the scenario:
1. Compare file sizes only if the destination location has the specified files. Here we will use the "ls" and "awk" commands.
2. The directory structure in the destination location must match that of the source location.
3. The date range should be in YYYYMMDD format.
4. The script should accept the date range as command-line arguments.
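
To make points 2 and 3 concrete, here is a minimal sketch of how a source path can be mapped to its destination counterpart when both trees share the same YYYYMMDD layout. The function name map_to_dest and the /data/... paths are hypothetical and used only for illustration.

```bash
#!/bin/bash
# Hypothetical helper: map a source file path to the matching destination path,
# assuming both trees share the same YYYYMMDD directory layout.
map_to_dest() {
    local src_root=$1 dest_root=$2 file=$3
    # Strip the source root from the front of the path,
    # then put the destination root in its place.
    printf '%s\n' "$dest_root${file#"$src_root"}"
}

map_to_dest /data/src /data/dest /data/src/20160101/abc.txt
# prints: /data/dest/20160101/abc.txt
```

Because only the root is swapped, the date directory and file name are carried over unchanged, which is what keeps the two directory structures in step.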

The following is sample code for the implementation. You can also get it from my GitHub repository.



#!/bin/bash
# This script compares file sizes within the given date range.
# It creates up to 3 output text files in the working directory:
# 1. OutMatchFiles.txt    (matched files)
# 2. OutMismatchFiles.txt (mismatched files)
# 3. OutMissingFiles.txt  (missing files)
# Please check the src & dest paths before you run the code.

# Exit codes used below (any non-zero values would do).
E_BADARGS=85
E_NOFILE=86

# Please change src and dest location as per your need.
src=/home/yogesh/bash-tp
dest=/home/yogesh/bash-tp/desti

# Check the argument count before $1 and $2 are used.
if [ $# -ne 2 ]
then
    echo "Usage: $(basename "$0") Start_Date End_Date"
    echo "e.g. : bash CompareFiles.bash 20160621 20160625"
    exit $E_BADARGS
fi

StartDate=$(date +"%Y%m%d" -d "$1")  # e.g. 20160121
EndDate=$(date +"%Y%m%d" -d "$2")    # e.g. 20160123

if [[ ! -d $src || ! -d $dest ]]
then
    echo "Given source or destination path doesn't exist."
    exit $E_NOFILE
fi

echo "Now: $StartDate"
echo "End: $EndDate"
echo "src: $src"
echo "dest: $dest"

# Delete the output files if they already exist.
rm -f OutMatchFiles.txt OutMismatchFiles.txt OutMissingFiles.txt

function CompareFiles {
    # Loop over everything inside the source directory for the current date.
    for srcfile in "$src/$StartDate"/*
    do
        # Skip the unexpanded pattern when the date directory is missing or empty.
        [ -e "$srcfile" ] || continue

        # Same relative path, but under the destination root.
        destfile=$dest${srcfile#"$src"}

        echo "Srcfilepath: $srcfile"
        echo "Destfilepath: $destfile"

        if [ -f "$destfile" ]
        then
            # The 5th column of "ls -l" output is the file size in bytes.
            filesize=$(ls -l "$srcfile" | awk '{print $5}')
            destfilesize=$(ls -l "$destfile" | awk '{print $5}')

            echo "SrcFilesize: $filesize"
            echo "DestFilesize: $destfilesize"

            if [ "$filesize" = "$destfilesize" ]
            then
                # Files that match in size are written to the following file.
                echo "$destfile" >>OutMatchFiles.txt
            else
                # Files that do not match in size are written to the following file.
                echo "$destfile" >>OutMismatchFiles.txt
            fi
        else
            # Files that do not exist in the destination path are written to the following file.
            echo "$destfile" >>OutMissingFiles.txt
        fi
    done
}

while [ "$StartDate" -le "$EndDate" ]
do
    echo "Date being Processed: $StartDate"

    CompareFiles

    # Move on to the next day.
    StartDate=$(date +"%Y%m%d" -d "$StartDate + 1 day")
done
echo "All Done"




A sample command to run the above script is as follows.


$ bash CompareFiles.bash 20160621 20160625


Let's discuss the code above. The BASH script takes two command-line arguments: the start date first and the end date second. It checks that both arguments were provided when the script was run; otherwise it prints a "Usage" message and exits without processing further. It also checks that the source and destination paths exist.
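
The script checks only how many arguments it receives, not what they contain. As an optional extra, the sketch below shows one way to check that an argument really is a date in YYYYMMDD format. The helper name is_valid_date is hypothetical and not part of the original script, and the sketch assumes GNU date, which the script already relies on for the -d option.

```bash
#!/bin/bash
# Hypothetical helper: succeed only if the argument is a real date in YYYYMMDD form.
# Assumes GNU date (for the -d option).
is_valid_date() {
    local d=$1
    # The value must be exactly eight digits.
    [[ $d =~ ^[0-9]{8}$ ]] || return 1
    # GNU date rejects impossible dates such as 20160231;
    # a valid date must also come back unchanged.
    [ "$(date +%Y%m%d -d "$d" 2>/dev/null)" = "$d" ]
}

if is_valid_date 20160621; then echo "20160621 is valid"; fi
if ! is_valid_date 20160231; then echo "20160231 is not valid"; fi
```

A check like this could be run on both arguments right after the argument count test, before the date loop starts.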

If both IF conditions are satisfied, the script checks, for each source file, whether the destination file exists and then compares the sizes of the source and destination files. Please note that we have used the "ls" and "awk" commands to get the file size. Linux offers various ways to get a file's size; we have used just one of them.
After the script has run, up to 3 text files will be created in the working directory, listing the matched, mismatched and missing files.
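
For comparison, here is a small sketch of two other common ways to read a file's size in bytes. It is not what the script above uses; it assumes GNU coreutils for "stat -c", and the temporary file is created only for the demonstration.

```bash
#!/bin/bash
# Two other common ways to read a file's size in bytes.
# Assumes GNU coreutils for "stat -c"; "wc -c" is available on any POSIX system.
tmpfile=$(mktemp)
printf 'abc' > "$tmpfile"            # a 3-byte sample file

size_stat=$(stat -c %s "$tmpfile")   # GNU stat: %s is the size in bytes
size_wc=$(wc -c < "$tmpfile")        # reading from stdin keeps the file name out of the output

echo "stat: $size_stat"
echo "wc:   $size_wc"

rm -f "$tmpfile"
```

Both avoid parsing the columns of "ls -l", which can be handy when file names contain spaces or when the listing format differs between systems.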

The following is sample output of the script, for quick reference.


$ bash Compare-File.bash 20160101 20160103
Now: 20160101
End: 20160103
src: /home/yogesh/bash-tp
dest: /home/yogesh/bash-tp/desti
Date being Processed: 20160101
Srcfilepath: /home/yogesh/bash-tp/20160101/abc.txt
Destfilepath: /home/yogesh/bash-tp/desti/20160101/abc.txt
SrcFilesize: 3
DestFilesize: 3
Srcfilepath: /home/yogesh/bash-tp/20160101/func.bash
Destfilepath: /home/yogesh/bash-tp/desti/20160101/func.bash
SrcFilesize: 397
DestFilesize: 397
Date being Processed: 20160102
Srcfilepath: /home/yogesh/bash-tp/20160102/abcd.txt
Destfilepath: /home/yogesh/bash-tp/desti/20160102/abcd.txt
SrcFilesize: 3
DestFilesize: 3
Srcfilepath: /home/yogesh/bash-tp/20160102/func.bash
Destfilepath: /home/yogesh/bash-tp/desti/20160102/func.bash
SrcFilesize: 397
DestFilesize: 397
Date being Processed: 20160103
Srcfilepath: /home/yogesh/bash-tp/20160103/abcde.txt
Destfilepath: /home/yogesh/bash-tp/desti/20160103/abcde.txt
Srcfilepath: /home/yogesh/bash-tp/20160103/func.bash
Destfilepath: /home/yogesh/bash-tp/desti/20160103/func.bash
All Done



I hope you understood the discussion so far and liked the post.
Thank you for visiting the website and going through the post. Stay tuned for more interesting stuff.

==>Posted By Yogesh B. Desai
