[Data-Mongers] web access tools
Nan Galbraith
ngalbraith at whoi.edu
Fri Apr 5 10:34:20 EDT 2013
Hi all -
I'm not sure which (if either) of these lists is active, so I'm
trying both - I hope mailman takes care of duplicates, but
if not, my apologies for cross-posting.
A project I'm working on needs to pull resources from a lot
of different web servers at various agencies and research centers.
These will initially be images, but may include data files and
mixed-format metadata later. These downloads will be done
on different schedules, and need to be automated - they'll be
run via cron.
Some downloads will need to be set up to generate a custom
URL, for sites where the resource name changes (e.g. to include
a date or some other index) and each download will be followed
by some post-processing to either modify images, add them
to time series, shuffle them into a file structure, or ... something
else, TBD.
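To make the custom-URL and filing steps concrete, here is a
minimal sketch in Python. The names build_url and archive_path,
the example.org address, the file naming and the YYYY/MM layout
are all placeholders for illustration, not anything decided for
the project.

```python
from datetime import date
from pathlib import Path


def build_url(template, day):
    """Fill a date into a URL template such as '.../sst_{d:%Y%m%d}.png'."""
    return template.format(d=day)


def archive_path(root, day, filename):
    """Return root/YYYY/MM/filename, one possible file structure."""
    return Path(root) / f"{day:%Y}" / f"{day:%m}" / filename


if __name__ == "__main__":
    today = date.today()
    # example.org and the file naming are placeholders, not a real site.
    url = build_url("https://example.org/images/sst_{d:%Y%m%d}.png", today)
    print(url, "->", archive_path("archive", today, "sst.png"))
```

Keeping the template as a plain string would let each site's
pattern live in a small config file, one line per site.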
When I've set up similar projects in the past, I've used wget or
lynx - they're easy to set up to be called from a shell script and
they send the output wherever you want. This time I'm thinking
of using Python; I think it handles redirects and logins (which
may be required at some of my sites) more smoothly than the
others.
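As a rough sketch of the Python route, using only the standard
library: urllib.request follows HTTP redirects by default, and
an opener can carry credentials. This assumes HTTP Basic
authentication, which may not be what every site uses (a
form-based login would need a different approach). The names
make_opener and fetch are hypothetical helpers.

```python
import shutil
import urllib.request
from pathlib import Path


def make_opener(url=None, user=None, password=None):
    """Build an opener; redirects are followed by default."""
    handlers = []
    if user is not None:
        # Assumption: the site uses HTTP Basic authentication.
        mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        mgr.add_password(None, url, user, password)
        handlers.append(urllib.request.HTTPBasicAuthHandler(mgr))
    return urllib.request.build_opener(*handlers)


def fetch(url, dest, opener=None, timeout=60):
    """Download url to dest; return the final URL after any redirects."""
    opener = opener or urllib.request.build_opener()
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so a failed download run from
    # cron never leaves a truncated file under the real name.
    tmp = dest.with_name(dest.name + ".part")
    with opener.open(url, timeout=timeout) as resp, open(tmp, "wb") as out:
        shutil.copyfileobj(resp, out)
        final_url = resp.geturl()
    tmp.replace(dest)
    return final_url
```

A cron job could then call fetch(url, path) for each site and
hand the saved file to whatever post-processing that site needs.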
I just wanted to ask if anyone has a different solution for
this kind of project. It will be running under Mac OS X 10.8.
Thanks in advance for any suggestions -
Nan
--
*******************************************************
* Nan Galbraith (508) 289-2444 *
* Upper Ocean Processes Group Mail Stop 29 *
* Woods Hole Oceanographic Institution *
* Woods Hole, MA 02543 *
*******************************************************