As it stands, our crawler is only useful on fairly small sites. Sites with thousands or hundreds of thousands of pages take too long to crawl.
Google's crawlers run concurrently on fleets of servers to crawl the majority of the internet. You might not have that kind of budget for this project.
Let's add a maxPages setting so that we can crawl even gigantic websites and have our tool automatically stop when it's done a reasonable amount of work.
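One way to wire this in is to track the limit alongside the rest of the crawler's shared state and check it before crawling each page. The sketch below assumes a `config` struct with a `pages` map guarded by a mutex (a common setup for the concurrent version of this project); your field and function names may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// Assumed shape of the crawler's shared state; your field names may differ.
type config struct {
	pages    map[string]int // normalized URL -> visit count
	mu       *sync.Mutex    // guards pages
	maxPages int            // stop once this many pages have been recorded
}

// reachedMaxPages reports whether the crawl limit has been hit.
// Call it at the top of crawlPage and return early when it's true.
func (cfg *config) reachedMaxPages() bool {
	cfg.mu.Lock()
	defer cfg.mu.Unlock()
	return len(cfg.pages) >= cfg.maxPages
}

func main() {
	cfg := &config{
		pages:    map[string]int{"example.com": 1, "example.com/path": 1},
		mu:       &sync.Mutex{},
		maxPages: 2,
	}
	fmt.Println(cfg.reachedMaxPages()) // true: 2 pages recorded, limit is 2
}
```

Checking the limit under the mutex matters here: multiple goroutines may be adding pages at the same time, so reading `len(cfg.pages)` without locking would be a data race.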
go build -o crawler
# usage: ./crawler URL maxConcurrency maxPages
./crawler "https://example.com" 3 10
The go build and ./crawler steps can also be replaced with a single go run . from your main package directory.
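If you haven't already extended your argument parsing for the new setting, here's a rough sketch of how main might validate it. The argument order matches the usage above, but the variable names and error messages are placeholders, not required output.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

func main() {
	// Expect exactly three arguments: URL, maxConcurrency, maxPages.
	args := os.Args[1:]
	if len(args) != 3 {
		fmt.Println("usage: ./crawler URL maxConcurrency maxPages")
		os.Exit(1)
	}

	rawBaseURL := args[0]

	maxConcurrency, err := strconv.Atoi(args[1])
	if err != nil || maxConcurrency < 1 {
		fmt.Println("maxConcurrency must be a positive integer")
		os.Exit(1)
	}

	maxPages, err := strconv.Atoi(args[2])
	if err != nil || maxPages < 1 {
		fmt.Println("maxPages must be a positive integer")
		os.Exit(1)
	}

	fmt.Printf("starting crawl of: %s (maxConcurrency=%d, maxPages=%d)\n",
		rawBaseURL, maxConcurrency, maxPages)
	// ... build the config and start the crawl from here ...
}
```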
Make sure that your program prints a line to the console each time you crawl a page, as well as once per page in the report. This will help you see what your crawler is doing, and ensure you can kill it with ctrl+c if it's stuck in a loop or spamming requests.
Run and submit the CLI tests.