2008-12-19

Asmodian's Workbench: The archive sorter

In the past, when I did backups for home computers, I used a CD-ROM burner and dumped everything in a tarball. The problem is that the archive is full of stuff I don't need, want or remember anything about.

So to aid in figuring out what's what, I turned to the wonderful Unix command file.

The file command spits out what file format it thinks a given file is. It does this through magic numbers: telltale byte sequences, usually at the very start of a file, that identify its format.
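
To make the magic-number idea concrete, here is a minimal sketch that peeks at the first bytes of a file by hand. It uses the PNG signature (which begins with the bytes 89 50 4e 47) as the example. The helper name magic_hex and the file name notes.txt are made up for illustration, and the real file command consults a much larger database of signatures than this.

```shell
#!/bin/bash
# magic_hex is a hypothetical helper: print the first four bytes of a
# file as hex, which is where many formats keep their magic number.
magic_hex() {
    od -An -tx1 -N 4 "$1" | tr -d ' \n'
}

# Build a file that starts with the PNG signature but has a misleading name.
demo=$(mktemp -d /tmp/magic.XXXXXX)
printf '\211PNG\r\n\032\n' > "$demo/notes.txt"

# Show the leading bytes; for the PNG signature this prints 89504e47.
magic_hex "$demo/notes.txt"; echo

# file goes by content rather than the .txt extension (skipped if file is absent).
if command -v file >/dev/null 2>&1; then
    file -b "$demo/notes.txt"
fi
rm -rf "$demo"
```

The point of the misleading .txt name is that the extension never enters into it; only the bytes do.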


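#!/bin/bash
# Unpack a gzipped tarball and sort its files into directories named
# after the type that file(1) reports for each one.
ARC_FILE=$1
if [ ! -f "$ARC_FILE" ]; then
    echo "usage: $0 archive.tgz" >&2
    exit 1
fi
TEMP_DIR1=$(mktemp -d -q /tmp/TMP1.XXXXXX) || exit 1   # extraction area
TEMP_DIR2=$(mktemp -d -q /tmp/TMP2.XXXXXX) || exit 1   # sorted output
DATE=$(date "+%m_%d_%y")
# OS X ships md5; most Linux systems ship md5sum instead.
if command -v md5 >/dev/null 2>&1; then MD5=md5; else MD5=md5sum; fi
tar -zx -C "$TEMP_DIR1" -f "$ARC_FILE"
NUM1=1
# NUL-delimited names survive spaces and newlines in file names.
while IFS= read -r -d '' FILE; do
    # Turn the description into a directory-safe label.
    TYPE=$(file -b "$FILE" | tr '[:space:][:cntrl:]/' '_')
    FILTERED=$(basename "$FILE" | tr '[:space:][:cntrl:][:punct:]' '.')
    # The counter prefix keeps same-named files from colliding.
    FILEFILTERED=$NUM1$FILTERED
    mkdir -p "$TEMP_DIR2/$TYPE"
    cp "$FILE" "$TEMP_DIR2/$TYPE/$FILEFILTERED"
    echo "$FILE" >> "$TEMP_DIR2/md5_file_list.txt"
    "$MD5" "$FILE" >> "$TEMP_DIR2/md5_file_list.txt"
    NUM1=$((NUM1 + 1))
done < <(find "$TEMP_DIR1" -type f -print0)
mv "$TEMP_DIR2" "$HOME/recovered$DATE"
rm -rf "$TEMP_DIR1"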



This code has been tested on Mac OS X 10.5. Linux handles the find and file commands differently. On OS X, file either classifies something as just a file (well gee, now I know it's a normal file... Thanks OS X, you were very helpful there...) or it responds with everything up to and including the picture size. On Linux, file responds with some basics about the file or with the whole MIME type, which comes in handy if you want to sub-categorize. I also made use of the kick-ass tr command (which Ax0n wrote a previous article about). The find command was useful too, but once again there is a syntax difference between OS X and Linux.
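
As a rough illustration of the sub-categorizing idea, the sketch below turns both kinds of file output into directory names using tr and shell parameter expansion. The function names type_label and mime_dir are hypothetical, and the sample strings are typical-looking output rather than captured results.

```shell
#!/bin/bash
# type_label (hypothetical): squeeze every run of non-alphanumeric
# characters in a file(1) description into a single underscore.
type_label() {
    printf '%s' "$1" | tr -cs '[:alnum:]' '_'
}

# mime_dir (hypothetical): drop any "; charset=..." suffix so that
# "image/png" can be used directly as a type/subtype directory path.
mime_dir() {
    printf '%s\n' "${1%%;*}"
}

type_label "PNG image data, 16 x 16"; echo    # PNG_image_data_16_x_16
mime_dir "image/png; charset=binary"          # image/png
```

With a real file you would feed these something like the output of file -b --mime-type, assuming your build of file supports that flag; older versions may only offer a short option for MIME output, and the letter differs between platforms. Because a MIME type already has the shape type/subtype, passing it to mkdir -p gives you the two-level sorting for free.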



Interesting Facts:
Wikipedia: Magic Numbers in files
Apple Man pages
