diff --git a/subjects/wget/README.md b/subjects/wget/README.md
index aa6b49b5..be844ffb 100644
--- a/subjects/wget/README.md
+++ b/subjects/wget/README.md
@@ -4,7 +4,7 @@
 
 This project objective consists on recreating some functionalities of [`wget`](https://www.gnu.org/software/wget/manual/wget.html) using **Go**
 
-This functionalities will include:
+These functionalities will include:
 
 - The normal usage of `wget`, downloading a file given an URL, example: `wget https://some_url.ogr/file.zip`
 - Downloading a single file and saving it under a different name
@@ -12,8 +12,8 @@ This functionalities will include:
 - Set the download speed, limiting the rate speed of a download
 - Continue interrupted downloads
 - Downloading a file in background
-- Downloading multiple files at the same time, by reading a file containing multiple download links. All this asynchronously
-- Main feature, will be to download an entire website, [mirroring a website](https://en.wikipedia.org/wiki/Mirror_site).
+- Downloading multiple files at the same time, by reading a file containing multiple download links asynchronously
+- Main feature, will be to download an entire website, [mirroring a website](https://en.wikipedia.org/wiki/Mirror_site)
 
 ### Introduction
 
@@ -23,7 +23,7 @@ To see more about wget you can visit the manual by using the command `man wget`,
 
 #### Usage
 
-Your program must have as arguments the link from were you want to download the file, for instance:
+Your program must have as arguments the link from where you want to download the file, for instance:
 
 ```console
 student@student$ ./wget https://pbs.twimg.com/media/EMtmPFLWkAA8CIS.jpg
@@ -33,13 +33,13 @@ The program should be able to give feedback, displaying the:
 
 - Time that the program started, this must include the following format **yyyy-mm-dd hh:mm:ss**
 - Status of the request. For the program to proceed to the download it must present a response to the request as status OK (`200 OK`) if not it should say which status it got and finish the operation with an error warning
-- Size of the content downloaded, The content length can be presented as raw (bytes) and rounded to Mb or Gb depending on the size of the file downloaded
+- Size of the content downloaded, the content length can be presented as raw (bytes) and rounded to Mb or Gb depending on the size of the file downloaded
 - Name and path of the file that is about to be saved
 - A progress bar, having the following:
-  - A amount of `KiB` that was downloaded
+  - A amount of `KiB` or `MiB` (depending on the download size) that was downloaded
   - A percentage of how much was downloaded
   - Time that remains to finish the download
-- Time the download finished respecting the previous format
+- Time that the download finished respecting the previous format
 
 It should look something like this
 
@@ -78,7 +78,7 @@ student@student$ ls -l
 
 ---
 
-2. It should also handle the path to were your file is going to be saved using the flag `-P` followed by the path to where you want to save the file, example
+2. It should also handle the path to where your file is going to be saved using the flag `-P` followed by the path to where you want to save the file, example:
 
 ```console
 student@student$ go run main.go -P=~/Downloads/ -O=meme.jpg https://pbs.twimg.com/media/EMtmPFLWkAA8CIS.jpg
@@ -102,9 +102,11 @@ student@student$ ls -l ~/Downloads/meme.jpg
 student@student$ go run main.go --rate-limit=400k https://pbs.twimg.com/media/EMtmPFLWkAA8CIS.jpg
 ```
 
+This flag should accept different value types, example: k and M. So you can put the rate limit as `rate-limit=200k` or `rate-limit=2M`
+
 ---
 
-4. Downloading different files should be possible, for this the program will receive `-i` flag followed by a file name that will contain all links that are to be downloaded. Example:
+4. Downloading different files should be possible. For this the program will receive `-i` flag followed by a file name that will contain all links that are to be downloaded. Example:
 
 ```console
 student@student$ ls
@@ -125,7 +127,18 @@ The Downloads should work asynchronously, it should download both files at the s
 
 ---
 
-5. [**Mirror a website**](https://en.wikipedia.org/wiki/Mirror_site), this option should download the entire website being possible to use "part" of the website offline and for other useful [reasons](https://www.quora.com/How-exactly-does-Mirror-Site-works-and-how-it-is-done). For this you will have to download the websites file system and save it into a folder that will have the domain name. Example: `http://www.example.com`, the folder name will be `www.example.com` containing every file from the mirrored website.
+5. [**Mirror a website**](https://en.wikipedia.org/wiki/Mirror_site), this option should download the entire website being possible to use "part" of the website offline and for other useful [reasons](https://www.quora.com/How-exactly-does-Mirror-Site-works-and-how-it-is-done). For this you will have to download the websites file system and save it into a folder that will have the domain name. Example: `http://www.example.com`, the folder name will be `www.example.com` containing every file from the mirrored website. The flag should be `--mirror`.
+
+To mirror a website you will have to implement the following `wget` flags so that the web mirror is complete (you do not need to do the literal flags, but just the theory behind it, so your flag `--mirror` need to behave like the following wget flags combined):
+
+- [`--mirror`](https://www.gnu.org/software/wget/manual/wget.html) download recursive
+- [`--convert-links`](https://www.gnu.org/software/wget/manual/wget.html), after the download is complete it will convert all links in the document to make them suitable for local viewing
+- [`--page-requisites`](https://www.gnu.org/software/wget/manual/wget.html), downloads all files that are necessary to properly display a given HTML page
+- [`--no-parent`](https://www.gnu.org/software/wget/manual/wget.html), this will not let the program ascend to the parent directory when retrieving
+
+### Hint
+
+You can take a look into the [html package](https://godoc.org/golang.org/x/net/html) for some help
 
 ---
 
diff --git a/subjects/wget/audit/README.md b/subjects/wget/audit/README.md
index e69de29b..d57a6054 100644
--- a/subjects/wget/audit/README.md
+++ b/subjects/wget/audit/README.md
@@ -0,0 +1,89 @@
+#### Functional
+
+##### Try to run the following command "`./wget https://pbs.twimg.com/media/EMtmPFLWkAA8CIS.jpg`"
+
+###### Did the program download the file "`EMtmPFLWkAA8CIS.jpg`"?
+
+##### Try to run the following command with a link at your choice "`./wget `"
+
+###### Did the program download the expected file?
+
+##### Try to run the following command "`./wget https://golang.org/dl/go1.15.linux-amd64.tar.gz`"
+
+###### Did the program download the file "`go1.15.linux-amd64.tar.gz`"?
+
+###### Did the program displayed the start time?
+
+###### Did the start time and the end time respected the format? (yyyy-mm-dd hh:mm:ss)
+
+###### Did the program displayed the status of the response? (200 OK)
+
+###### Did the Program displayed the content length of the download?
+
+###### Is the content length displayed as raw (bytes) and rounded (Mb or Gb)?
+
+###### Did the program displayed the name and path of the file that was saved?
+
+##### Try to download a big file, for example: "`./wget http://ipv4.download.thinkbroadband.com/100MB.zip`"
+
+###### Did the program download the expected file?
+
+###### While downloading, did the progress bar show the amount that is being downloaded? (KiB or MiB)
+
+###### While downloading, did the progress bar show the percentage that is being downloaded?
+
+###### While downloading, did the progress bar show the time that remains to finish the download?
+
+###### While downloading, did the progress bar progressed smoothly (kept up with the time that the download took to finish)?
+
+##### Try to run the following command, "`./wget -O=test_20MB.zip http://ipv4.download.thinkbroadband.com/20MB.zip`"
+
+###### Did the program downloaded the file with the name "`test_20MB.zip`"?
+
+##### Try to run the following command, "`./wget -O=test_20MB.zip -P=~/Downloads/ http://ipv4.download.thinkbroadband.com/20MB.zip`", then go to the folder "`~/Downloads/`"
+
+###### Can you see the file downloaded?
+
+##### Try to run the following command, "`./wget --rate-limit=300k http://ipv4.download.thinkbroadband.com/20MB.zip`"
+
+###### Was the download speed always lower than 300KB/s?
+
+##### Try to run the following command, "`./wget --rate-limit=700k http://ipv4.download.thinkbroadband.com/20MB.zip`"
+
+###### Was the download speed always lower than 700KB/s?
+
+##### Try to run the following command, "`./wget --rate-limit=2M http://ipv4.download.thinkbroadband.com/20MB.zip`"
+
+###### Was the download speed always lower than 2MB/s?
+
+##### Try to create a text file with the name "`downloads.txt`" and save into it the links below. Then run the command "`./wget -i=downloads.txt`"
+
+```
+https://pbs.twimg.com/media/EMtmPFLWkAA8CIS.jpg
+http://ipv4.download.thinkbroadband.com/20MB.zip
+http://ipv4.download.thinkbroadband.com/10MB.zip
+```
+
+###### Did the program download all the files from the downloads.txt file? (EMtmPFLWkAA8CIS.jpg, 20MB.zip, 10MB.zip)
+
+###### Did the downloads occurred in an asynchronous way? (tip: look to the download order)
+
+#### Mirror
+
+##### Try to run the following command "`./wget --mirror http://corndog.io/`", then try to open the "`index.html`" with a browser
+
+###### Is the site working?
+
+##### Try to run the following command "`./wget --mirror https://theuselessweb.com/`"
+
+###### Is the site working?
+
+##### Try to run the following command to mirror a website at your choice "`./wget --mirror `"
+
+###### Did the program mirror the website?
+
+#### Bonus
+
+###### +Does the project runs quickly and effectively? (Favoring recursive, no unnecessary data requests, etc)
+
+###### +Does the code obey the [good practices](https://public.01-edu.org/subjects/good-practices/README.md)?
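
Review note: the patch says `--rate-limit` should accept values such as `200k` or `2M`, and the audit exercises `300k`, `700k` and `2M`. A minimal Go sketch of parsing those values into bytes per second might look like the following. The function name `parseRateLimit` is hypothetical, and the choice of `k` = 1024 bytes and `M` = 1024 * 1024 bytes is an assumption, since the subject does not say whether the units are decimal or binary.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseRateLimit converts a value such as "300k" or "2M" into bytes per second.
// Hypothetical helper, not part of the subject.
// Assumption: k means 1024 bytes and M means 1024*1024 bytes.
// A value without a suffix is read as plain bytes per second.
func parseRateLimit(s string) (int64, error) {
	s = strings.TrimSpace(s)
	if s == "" {
		return 0, fmt.Errorf("empty rate limit")
	}

	multiplier := int64(1)
	switch s[len(s)-1] {
	case 'k', 'K':
		multiplier = 1024
		s = s[:len(s)-1]
	case 'm', 'M':
		multiplier = 1024 * 1024
		s = s[:len(s)-1]
	}

	// What remains must be a positive whole number.
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid rate limit %q: %w", s, err)
	}
	if n <= 0 {
		return 0, fmt.Errorf("rate limit must be positive, got %d", n)
	}
	return n * multiplier, nil
}

func main() {
	// Values taken from the subject and audit, plus one invalid input.
	for _, v := range []string{"300k", "700k", "2M", "abc"} {
		limit, err := parseRateLimit(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %d bytes/s\n", v, limit)
	}
}
```

Rejecting zero, negative and non-numeric values up front means the download loop only ever sees a usable limit. How the limit is then enforced (for example by sleeping between reads) is left open here.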