Download web pages recursively under a URL [1]

 wget \
 --recursive \
 --no-clobber \
 --page-requisites \
 --html-extension \
 --convert-links \
 --restrict-file-names=windows \
 --domains example.com \
 --no-parent \
     www.example.com/subdirectory/
  • Replace example.com and www.example.com/subdirectory/ with the domain and starting URL of the site you want to download.
  • --recursive: follow links and download the pages they point to (by default up to 5 levels deep).
  • --domains example.com: don't follow links outside example.com.
  • --no-parent: don't follow links outside the directory subdirectory/.
  • --page-requisites: get all the elements that compose the page (images, CSS and so on).
  • --html-extension: append the .html extension to downloaded HTML files that lack it (newer wget versions call this option --adjust-extension).
  • --convert-links: convert links so that they work locally, off-line.
  • --restrict-file-names=windows: escape characters in file names that Windows does not allow, so the files work there as well.
  • --no-clobber: don't overwrite any existing files (used in case the download is interrupted and resumed).
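
If you run this command often, it may help to wrap it in a small shell function so that only the starting URL has to be typed. The following is a minimal sketch, not part of the original recipe: the names url_domain and mirror_subtree are hypothetical, and it assumes the value for --domains can be derived from the starting URL by dropping the scheme, the path and a leading "www." (URLs with a port number or user info are not handled).

```shell
#!/bin/sh
# Sketch only: url_domain and mirror_subtree are hypothetical helper names.

# Derive the value for --domains from a start URL (assumption: the host
# without a leading "www." is the domain you want to stay inside).
url_domain() {
    host=${1#*://}      # drop "http://" or "https://" if present
    host=${host%%/*}    # drop the path after the host
    host=${host#www.}   # drop a leading "www."
    printf '%s\n' "$host"
}

# Run the wget command shown above against a single start URL.
mirror_subtree() {
    start_url=$1
    wget \
        --recursive \
        --no-clobber \
        --page-requisites \
        --html-extension \
        --convert-links \
        --restrict-file-names=windows \
        --domains "$(url_domain "$start_url")" \
        --no-parent \
        "$start_url"
}

# Usage (starts a real download, so it is left commented out here):
# mirror_subtree www.example.com/subdirectory/
```

Deriving the domain this way keeps --domains and the starting URL in sync, which avoids the easy mistake of changing one and forgetting the other.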

References


  1. linuxjournal.com. Downloading an Entire Web Site with wget. 2008. https://www.linuxjournal.com/content/downloading-entire-web-site-wget