wget tips and tricks for command-line downloads
Some time ago I was looking for a command-line download tool that could download in the background. Going through the wget man page, I noticed a lot of interesting features that most of us are not very familiar with. Nowadays I use those features in my day-to-day Linux life.
wget is a widely used tool for managing internet downloads from the command line, but many of us don't use it to its full capacity. wget makes command-line life easier and saves both resources and time. I believe you need not run a browser just to download a file, since a browser consumes considerable CPU time and memory. Apart from this, the download managers built into browsers have fewer features than a command-line download manager like wget. Here I am going to present a few less used but highly useful and handy features.
- Background download (wget -b)
- Continuing the broken partial download (wget -c)
- Retry attempts setting (wget --tries=<n>)
- Rename the file downloaded (wget -O <filename>)
- Bulk download (wget -i <input-file>)
- Mimic wget as your browser (wget --user-agent=<string>)
- Limit the download speed (wget --limit-rate=<speed>)
- Downloading behind a Proxy
A typical use of wget is:
$ wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-184.108.40.206.tar.bz2
This command simply downloads the file that the given URL points to.
Background download (wget -b)
$ wget -b http://www.kernel.org/pub/linux/kernel/v2.6/linux-220.127.116.11.tar.bz2
This command is highly useful when you start a download on a remote machine through an SSH terminal. The download runs in the background, so you can disconnect the terminal once the command has been issued.
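When wget runs in the background it writes its messages to a log file instead of the terminal (wget-log in the current directory unless told otherwise). Here is a minimal sketch, assuming you would rather choose the log name yourself with the -o option. The helper name bg_download and the default log name download.log are made up for illustration.

```shell
# Start a download in the background and send wget's messages to a
# log file of our choosing. bg_download is a hypothetical helper name.
bg_download() {
    url=$1
    log_file=${2:-download.log}   # assumed default log name
    wget -b -o "$log_file" "$url"
}

# Example (not run here):
#   bg_download http://www.kernel.org/pub/linux/kernel/v3.0/testing/patch-3.0-rc4.bz2
#   tail -f download.log    # watch progress; Ctrl-C stops tail, not wget
```

Keeping a named log per download makes it easier to check later, from a fresh SSH session, whether the transfer finished.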
Continuing the broken partial download (wget -c)
$ wget -c http://www.kernel.org/pub/linux/kernel/v2.6/linux-18.104.22.168.tar.bz2
This switch is useful for big downloads, which may get interrupted before completion. If a file with the same name already exists, wget checks its size and downloads only the remaining portion instead of fetching the whole file again.
Retry attempts setting (wget --tries=<n>)
$ wget --tries=10 http://www.kernel.org/pub/linux/kernel/v2.6/linux-22.214.171.124.tar.bz2
As the switch name suggests, wget retries the download up to the given number of times, which helps when the server is busy or slow.
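Retries are often paired with a pause between attempts and a timeout, so that a stalled connection does not hang forever. The sketch below combines --tries with wget's --waitretry and --timeout options and with -c from the previous section. The helper name retry_download and the values 10, 5 and 30 are assumptions chosen for illustration, not recommendations.

```shell
# Retry a flaky download. retry_download is a hypothetical helper and
# the numbers below are illustrative assumptions.
retry_download() {
    url=$1
    tries=${2:-10}
    # -c           resume a partial file instead of starting over
    # --tries      maximum number of attempts
    # --waitretry  wait up to this many seconds between retries
    # --timeout    give up on a stalled connection after this many seconds
    wget -c --tries="$tries" --waitretry=5 --timeout=30 "$url"
}

# Example (not run here):
#   retry_download http://www.kernel.org/pub/linux/kernel/v3.0/testing/patch-3.0-rc4.bz2 20
```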
Rename the file downloaded (wget -O <filename>)
$ wget -O latest-kernel.tar.bz2 http://www.kernel.org/pub/linux/kernel/v2.6/linux-126.96.36.199.tar.bz2
This switch downloads the file and saves it under the file name given on the command line.
Bulk download (wget -i <input-file>)
$ wget -i to_download.txt
Content of to_download.txt
The above example downloads the files listed in an input file called to_download.txt. You can store a list of URLs in this file, one per line.
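As a rough sketch of how such a list might be built and used, the helpers below write one URL per line into a file and then hand it to wget -i. The names make_url_list and bulk_download, and the example.com URLs in the usage comment, are placeholders invented for illustration.

```shell
# Write one URL per line into the list file given as the first argument.
# make_url_list is a hypothetical helper name.
make_url_list() {
    list_file=$1
    shift
    printf '%s\n' "$@" > "$list_file"
}

# Download every URL in the list, resuming any partial files.
bulk_download() {
    wget -c -i "$1"
}

# Example (not run here; the URLs are placeholders):
#   make_url_list to_download.txt \
#       http://example.com/file1.tar.bz2 \
#       http://example.com/file2.tar.bz2
#   bulk_download to_download.txt
```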
Mimic wget as your browser (wget --user-agent=<string>)
Some web sites check the download client and deliver content accordingly. For example, when you visit the Firefox site from a Linux machine, it automatically offers the Firefox for Linux download link. In some cases, though, the web server rejects less common download tools such as wget. You can then make wget identify itself as a browser such as Firefox. Here is how I do it.
Get your browser’s user agent by visiting “http://getright.com/useragent.html”. Mine is “Mozilla/5.0 (X11; Linux i686; rv:5.0) Gecko/20100101 Firefox/5.0”.
$ wget --user-agent="Mozilla/5.0 (X11; Linux i686; rv:5.0) Gecko/20100101 Firefox/5.0" http://www.kernel.org/pub/linux/kernel/v3.0/testing/patch-3.0-rc4.bz2
Limit the download speed (wget --limit-rate=<speed>)
$ wget --limit-rate=1k http://www.kernel.org/pub/linux/kernel/v3.0/testing/patch-3.0-rc4.bz2
On my home network, my 24×7 NAS server downloads files. While it does, internet access from my laptop can become slow if the server's download takes up the entire bandwidth. In that case I restrict the server's download speed with the --limit-rate switch. In the example above the limit is 1 kB/s.
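On an unattended machine like the NAS, rate limiting combines naturally with background mode. Here is a small sketch, assuming a helper called throttled_download (a made-up name). The rate takes a k suffix for kilobytes or an m suffix for megabytes per second.

```shell
# Start a rate-limited download in the background.
# throttled_download is a hypothetical helper name.
throttled_download() {
    rate=$1    # for example 200k or 1m
    url=$2
    wget -b --limit-rate="$rate" "$url"
}

# Example (not run here):
#   throttled_download 200k http://www.kernel.org/pub/linux/kernel/v3.0/testing/patch-3.0-rc4.bz2
```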
Downloading behind a Proxy
You need to set the proxy server through the http_proxy environment variable, as shown in the following example, which sets the proxy server to “myproxyserver” and the proxy port to “8080”.
$ export http_proxy="http://myproxyserver:8080"
If the proxy server requires authentication, use the following form.
$ export http_proxy="http://username:password@myproxyserver:8080"
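wget also reads the https_proxy and no_proxy environment variables, so HTTPS downloads and local hosts can be handled in the same way. The sketch below wraps the exports in a helper. The name set_proxy, its two arguments and the localhost exclusion list are assumptions for illustration.

```shell
# Export proxy settings for the current shell session.
# set_proxy and the no_proxy list are illustrative assumptions.
set_proxy() {
    proxy_host=$1
    proxy_port=$2
    export http_proxy="http://$proxy_host:$proxy_port"
    export https_proxy="$http_proxy"          # same proxy for HTTPS URLs
    export no_proxy="localhost,127.0.0.1"     # hosts reached directly
}

# Example (not run here):
#   set_proxy myproxyserver 8080
#   wget http://www.kernel.org/pub/linux/kernel/v3.0/testing/patch-3.0-rc4.bz2
```

To bypass the proxy for a single download without unsetting the variables, wget accepts the --no-proxy option.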