Tyler Smith

Downloading a webpage and all of its assets with wget

A friend of mine has a single-page website that he hasn't updated in over a year. He'd like to keep the website, but he'd also like to save $20 a month. I told him I could probably help him get it on Netlify since he never changes it.

I needed a way to download the page with all of its assets. Chrome's "Save as..." menu option wasn't working: it wouldn't download content from the CDN because it was on a different domain. I thought wget might be a good option.

Here is the command I ended up using:

wget --page-requisites --convert-links --span-hosts --no-directories https://www.example.com

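To go through the arguments one by one:

--page-requisites downloads everything the page needs to display properly, such as images, stylesheets, and scripts.

--convert-links rewrites the links in the downloaded HTML so they are suitable for local viewing.

--span-hosts lets wget fetch files from other hosts, which is what allows it to pick up the assets on the CDN.

--no-directories saves every file into the current directory instead of recreating the site's directory hierarchy.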
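If you expect to re-run the download, the same command can be wrapped in a small script. This is only a sketch: build_wget_command and download_page are names I made up for illustration, and it assumes wget is installed and on your PATH. The comments note the short form of each flag.

```python
import subprocess

# The four flags from the command above, with their short forms noted.
WGET_FLAGS = [
    "--page-requisites",  # short form: -p
    "--convert-links",    # short form: -k
    "--span-hosts",       # short form: -H
    "--no-directories",   # short form: -nd
]


def build_wget_command(url):
    """Return the wget invocation as an argument list (hypothetical helper)."""
    return ["wget", *WGET_FLAGS, url]


def download_page(url, target_dir="."):
    """Run wget inside target_dir; needs wget on PATH and network access."""
    # cwd matters: --no-directories saves every file into the current directory.
    # check=False because wget can exit non-zero when a single asset fails,
    # even though the rest of the page downloaded fine.
    return subprocess.run(build_wget_command(url), cwd=target_dir, check=False)


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try offline.
    print(" ".join(build_wget_command("https://www.example.com")))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if the URL contains characters like & or ?.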

If you open index.html directly in a browser, the assets will be broken: --convert-links doesn't seem to make the asset paths relative to the root directory. So to view the page, you'll need to start a web server in the download directory. You can use the following command:

python3 -m http.server
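By default that command serves the current directory on port 8000. If you'd rather start the server from a script, or point it at a directory other than the current one, something like the following should work. Treat it as a rough sketch: make_server is a hypothetical helper name, and binding to 127.0.0.1 only is my own choice, not something the one-liner does.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory, port=8000):
    """Build an HTTP server rooted at `directory` (hypothetical helper).

    Pass port=0 to let the operating system pick any free port.
    """
    # SimpleHTTPRequestHandler accepts a `directory` argument (Python 3.7+),
    # so the files are served from there instead of the current directory.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    # Assumption: listen on localhost only, since this is just for previewing.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling make_server("path/to/download").serve_forever() then blocks until you stop it with Ctrl+C, much like the one-liner. The path here is a placeholder for wherever you ran wget.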

The output is pretty messy and it might be quicker to just build something with Tailwind than clean this download up, but at least I know how to do this now.