Downloading files from a webserver, and failing.
Recently I wanted to download all the transcripts of a podcast (600+ episodes). The transcripts are plain txt files, so in a way I am not even ‘web’ scraping, just reading in 600 or so text files, which should not be a big deal. Or so I thought.
This post shows you where I went wrong.
Also here is a picture I found of scraping.
Webscraping in general
For every download you ask the server for a file and it returns that file. This is also how you normally browse the web, btw: your browser requests the pages and the server sends them back.
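To make that request-and-response idea concrete, here is a minimal sketch in Python using only the standard library. It is an illustration, not necessarily the approach the rest of this post takes, and the function name `download` is made up for the example. It asks for one URL, waits for the server to answer, and writes whatever comes back to disk.

```python
import urllib.request
from pathlib import Path


def download(url: str, dest: Path) -> int:
    """Request one URL and save the response body to dest.

    Returns the number of bytes written. `download` is a hypothetical
    helper for this example, not part of any library.
    """
    # urlopen sends the request and waits for the server to respond;
    # the timeout keeps a silent server from blocking us forever.
    with urllib.request.urlopen(url, timeout=30) as response:
        body = response.read()
    # The body is just bytes; for a txt transcript that is the whole file.
    dest.write_bytes(body)
    return len(body)
```

Calling it would look something like `download("https://example.com/transcripts/episode-001.txt", Path("episode-001.txt"))`, where the URL is a placeholder and not the real podcast site. Doing this 600 times means 600 separate requests to the same server.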