I recently had to reinstall SUSE Linux on my 64-bit PC, which meant setting up the tools I use often again, including my REXX scripts. Previously I used a REXX script that read a text file and downloaded a series of files based on a filter, such as files x001.jpg to x030.jpg. The script worked but was not flexible, so I decided to write a next-generation version, one that can handle more complex download sequences without modifying the code.
I took a modular approach, separating the program into two parts: the core download script and a higher-level, simple file-filtering script that is as easy to use as my old script.
The new core script is flexible but uses a more complex, very Linux-like syntax:
dlfiles --start n1 --stop n2 --lz n3 --dest s1 --src s2
where:
n1 says where to start its numeric counter for downloading
n2 says where to stop its numeric counter for downloading
n3 says to pad the counter with leading zeros and gives the total number of digits, e.g. 3 gives 001 002 003 … 099
s1 is the directory in which to store the downloads
s2 is the source URL of the downloads, with the counting section replaced by a symbolic “~~”
Thus, to download files x0001.txt to x0040.txt from a server, the command would be:
dlfiles --start 1 --stop 40 --lz 2 --src http://server/x00~~.txt
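The way the “~~” placeholder gets expanded can be sketched in a few lines of REXX. This is only an illustration using the sample URL from the command above; the variable names are my own for this sketch, and the loop stops at 3 to keep the output short.

```rexx
/* Sketch: expand the "~~" placeholder for counter values 1 to 3 */
src    = "http://server/x00~~.txt"   /* sample URL from the example */
lzeros = 2                            /* the value given to --lz    */

/* split the URL into the parts before and after the placeholder */
parse var src prefix "~~" postfix

do c = 1 to 3
  if lzeros > 0 then num = right(c, lzeros, "0")  /* pad to a fixed width */
  else num = c                                     /* no padding           */
  say prefix || num || postfix
end
```

This should print http://server/x0001.txt, http://server/x0002.txt and http://server/x0003.txt. Because RIGHT() keeps only the rightmost characters, a counter with more digits than the --lz width would be truncated, so the width needs to cover the largest counter value.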
The second script, called “getem”, reads a specified text file, parses its contents, and calls dlfiles with the correct parameters. The file contains records such as:
Docs 1, http://server1/doc~~.doc
Docs 2, http://server2/text~~.odt
It uses the dlfiles default settings (starting at 1 and ending at 30, with no leading zeros) but tells it to store the files in the Docs 1 and Docs 2 folders. If these folders do not exist, they are created by wget, the download program that dlfiles depends on.
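How one record of the index file breaks down into a folder and a URL can be sketched as follows. The record is the first sample line from above, and the variable names are only illustrative.

```rexx
/* Sketch: split one index record into its folder and URL parts */
record = "Docs 1, http://server1/doc~~.doc"

/* everything before the first ", " is the folder, the rest is the URL */
parse var record storepath ", " fileurl

say "folder =" storepath
say "url    =" fileurl
```

Because the split happens on the literal ", " and not on blanks, the space inside "Docs 1" survives intact. A record that uses a different separator would leave the URL part empty.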
The main problem I ran into was REXX's limited argument parsing: it could not handle parameters containing spaces, even when they were delimited by quotation marks, so I had to write my own parsing code.
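The idea behind the hand-written parsing is to cut the argument string at each "--" instead of at each blank, so a value may contain spaces. A minimal sketch of that idea, with a made-up argument string and illustrative variable names:

```rexx
/* Sketch: split an argument string on "--" so values may contain blanks */
params = "--dest Docs 1 --src http://server1/doc~~.doc"

here = pos("--", params)
do while here > 0
  nxt = pos("--", params, here + 2)            /* start of the next option */
  if nxt = 0 then chunk = substr(params, here)
  else chunk = substr(params, here, nxt - here)
  parse var chunk option data                  /* first word, then the rest */
  say option "=>" strip(data)
  here = nxt
end
```

This should print "--dest => Docs 1" and "--src => http://server1/doc~~.doc". One limitation of cutting on "--" is that a value which itself contains "--", for example inside a URL, would be split in the wrong place.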
The end result is the two programs listed below. Of course, the usual copyright notice applies: give me credit if you use any of it for your legal purposes. If your purposes are illegal, please do not include me in your illicit activities.
By the way, why do I use REXX instead of sh, Perl, or Python? Simple: I am an old OS/2 user, and a lot of the REXX programming I learned then is still my mainstay. As a result, I install REXX on any computer I use, including Windows and Linux machines. I currently use and support Open Object Rexx, as it is the direct descendant of the IBM REXX used on OS/2 PCs and OS/400 minicomputers.
-- dlfiles --
#!/usr/bin/rexx
/* dlfiles (c) 2007 Dion Mohammed */
parse arg params
DEBUG = 0
v. = ""
call proc_args

/* base wget command; the expanded URL is appended for each counter value */
s = 'wget --tries=10 --continue --timeout=10 --directory-prefix="'v.dest'/"'
c = v.start
do until c > v.stop
  /* split the source URL around the "~~" placeholder */
  parse value v.src with v.prefix "~~"v.postfix
  if v.lzeros\=0 then do
    fspec=v.prefix""right(c, v.lzeros,"0")""v.postfix
  end
  else do
    fspec=v.prefix""format(c)""v.postfix
  end
  dncmd=s' "'fspec'"'
  IF DEBUG=1 then do
    say "DLFILES ==> "dncmd
    say "V.SRC: "v.src
    say "PREFIX: "v.prefix
    say "POSTFIX:"v.postfix
  end
  else dncmd   /* hand the command to the shell */
  c = c + 1
end
exit

proc_args: procedure expose params v.
  SE = "Syntax Error: "
  x=1
  i=1
  /* defaults */
  v.dest = "."
  v.start=1
  v.stop=30
  v.lzeros=0
  v.src=""
  /* cut params into chunks, each starting at a "--" */
  do while x\=0
    y=pos("--", params, x)
    if y=0 then leave   /* no options at all: avoid substr() with position 0 */
    x=pos("--", params, y+1)
    if x=0 then do
      w.i = substr(params,y)
    end
    else do
      w.i = substr(params,y,(x-y))
    end
    i = i + 1
  end
  i = i - 1
  do c=1 to i
    parse value w.c with operand data
    data = strip(data)   /* drop the blank left before the next option */
    select
      when operand="--start" then v.start=data
      when operand="--stop" then v.stop=data
      when operand="--lz" then v.lzeros=data
      when operand="--dest" then v.dest=data
      when operand="--src" then v.src=data
      otherwise nop
    end
  end
  /* Validating */
  if v.src="" then do
    say SE "Error processing Source"
    exit
  end
  if v.dest="" then do
    say SE "Error processing Destination"
    exit
  end
  if datatype(v.start)\="NUM" then do
    say SE "Start value must be numeric"
    exit
  end
  if datatype(v.stop)\="NUM" then do
    say SE "Stop value must be numeric"
    exit
  end
  if datatype(v.lzeros)\="NUM" then do
    say SE "Leading zeros must be numeric"
    exit
  end
return
-- getem --
#!/usr/bin/rexx
/* GETEM (c) 2007 Dion Mohammed
   process download index file */
parse arg indexfile .
do while lines(indexfile)>0
  currline = linein(indexfile)
  parse value currline with storepath", "fileurl
  cmdline = 'dlfiles --dest "'storepath'" --src "'fileurl'"'
  say "GETEM ==> "cmdline
  cmdline
end
exit
These are released for free use, with respect to the original owner of the source code. You are free to use and modify for your use, with acknowledgment given to me.