As it stands, our crawler is only useful on fairly small sites. Sites with thousands or hundreds of thousands of pages take too long to crawl.
Google's crawlers run concurrently on fleets of servers to crawl the majority of the internet. You might not have that kind of budget for this project.
Let's add a maxPages setting so that we can crawl even gigantic websites and have our tool automatically stop when it's done a reasonable amount of work.
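One way to wire this in is to track the limit alongside the rest of the crawler's shared state and check it before crawling each page. The sketch below assumes a `config` struct with a `pages` map guarded by a mutex (a common setup for the concurrent version of this project); your field and function names may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// Assumed shape of the crawler's shared state; your field names may differ.
type config struct {
	pages    map[string]int // normalized URL -> visit count
	mu       *sync.Mutex    // guards pages
	maxPages int            // stop once this many pages have been recorded
}

// reachedMaxPages reports whether the crawl limit has been hit.
// Call it at the top of crawlPage and return early when it's true.
func (cfg *config) reachedMaxPages() bool {
	cfg.mu.Lock()
	defer cfg.mu.Unlock()
	return len(cfg.pages) >= cfg.maxPages
}

func main() {
	cfg := &config{
		pages:    map[string]int{"example.com": 1, "example.com/path": 1},
		mu:       &sync.Mutex{},
		maxPages: 2,
	}
	fmt.Println(cfg.reachedMaxPages()) // true: 2 pages recorded, limit is 2
}
```

Checking the limit under the mutex matters here: multiple goroutines may be adding pages at the same time, so reading `len(cfg.pages)` without locking would be a data race.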
go build -o crawler
# usage: ./crawler URL maxConcurrency maxPages
./crawler "https://example.com" 3 10
The go build and ./crawler steps can also be replaced with a single go run . from your main package directory.
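If you haven't already extended your argument parsing for the new setting, here's a rough sketch of how main might validate it. The argument order matches the usage above, but the variable names and error messages are placeholders, not required output.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

func main() {
	// Expect exactly three arguments: URL, maxConcurrency, maxPages.
	args := os.Args[1:]
	if len(args) != 3 {
		fmt.Println("usage: ./crawler URL maxConcurrency maxPages")
		os.Exit(1)
	}

	rawBaseURL := args[0]

	maxConcurrency, err := strconv.Atoi(args[1])
	if err != nil || maxConcurrency < 1 {
		fmt.Println("maxConcurrency must be a positive integer")
		os.Exit(1)
	}

	maxPages, err := strconv.Atoi(args[2])
	if err != nil || maxPages < 1 {
		fmt.Println("maxPages must be a positive integer")
		os.Exit(1)
	}

	fmt.Printf("starting crawl of: %s (maxConcurrency=%d, maxPages=%d)\n",
		rawBaseURL, maxConcurrency, maxPages)
	// ... build the config and start the crawl from here ...
}
```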
Make sure that your program prints a line to the console each time you crawl a page, as well as once per page in the report. This will help you see what your crawler is doing, and ensure you can kill it with ctrl+c if it's stuck in a loop or spamming requests.
Run and submit the CLI tests.