Quick oneliner to get lots of linked files (e.g. mp3s) from a webpage

This works for any type of file, e.g. ogg files, as long as the extension in the sed pattern is changed to match. The need came from me trying to find great resources on Chinese language sites and stumbling upon clearchinese.com, which had a ton of mp3s buried in links. Here is the oneliner, which I ran on my laptop running Ubuntu Hardy (8.04):
for f in $(elinks -dump http://www.clearchinese.com/mp3-lessons/newbie/index.htm | sed -n 's%.*\(http:.*\.mp3\)$%\1%p' | sort -u ); do wget -N "$f" ; done


When looking for the linked URLs in a page, I nearly always use elinks or lynx to dump the page as text, so I can see the links and any unusual formatting. I then use standard scripting commands (here sed and sort -u) to pick out just the links into a de-duplicated list, and wrap a loop around that list so wget downloads the 150 or so mp3s automatically. The -N option tells wget to skip files that are already downloaded and up to date, so the oneliner can safely be re-run. Using this technique, an automated download oneliner can be put together very quickly (about 5 mins for this one, including the research).
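The same steps can be split into two small shell functions, which makes it easier to test the link extraction on its own before downloading anything. This is only a rough sketch: the names extract_links and fetch_links are made up for illustration, and it assumes the dump puts each URL at the end of a line with no spaces in it, as the elinks reference list did for this page.

```shell
#!/bin/sh
# Hypothetical helper: read a text dump (e.g. from `elinks -dump URL`) on stdin
# and print each unique http URL that ends in the given extension (default: mp3).
# Assumes the URL sits at the end of its line and contains no spaces.
extract_links() {
    ext="${1:-mp3}"
    sed -n "s%.*\(http:[^ ]*\.${ext}\)\$%\1%p" | sort -u
}

# Hypothetical helper: dump a page, extract the matching links, and download
# each one. wget -N skips files that are already present and up to date.
fetch_links() {
    page="$1"
    ext="${2:-mp3}"
    elinks -dump "$page" | extract_links "$ext" |
    while IFS= read -r url; do
        wget -N "$url"
    done
}

# Example (not run here; it would hit the network):
#   fetch_links http://www.clearchinese.com/mp3-lessons/newbie/index.htm mp3
```

Piping a saved dump through extract_links first is a cheap way to check the list of URLs before letting wget loose on it.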
