Packages
crwlr / crawler
The main package of this collection. It provides a framework and many ready-to-use building blocks, called steps, that you can use to build your own web crawlers and scrapers.
crwlr / url
The Swiss Army knife for URLs. Parses URLs into components (scheme, host, domain, path, ...). You can access and modify URL components, compare components of different URLs, and resolve relative URLs to absolute ones. Also supports internationalized domain names.
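To make those operations concrete, here is a minimal sketch of parsing, modifying, comparing, and resolving URLs. It does not use the crwlr / url API. It uses Python's standard `urllib.parse` module purely to illustrate the concepts, and the example.com URLs are made up for the demonstration.

```python
from urllib.parse import urlsplit, urljoin

# Parse a URL into its components (hypothetical example URL).
base = urlsplit("https://www.example.com/articles/index.html?page=2")
print(base.scheme, base.hostname, base.path, base.query)

# Modify a single component and rebuild the URL string.
changed = base._replace(scheme="http").geturl()

# Compare a component of two different URLs.
other = urlsplit("https://www.example.com/contact")
same_host = base.hostname == other.hostname

# Resolve a relative reference against the base URL.
absolute = urljoin(base.geturl(), "../about")
print(changed, same_host, absolute)
```

The package itself exposes these operations through its own PHP classes, so check its documentation for the actual method names.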
crwlr / robots-txt
Use this library within crawler and scraper programs to parse robots.txt files and check whether your crawler's user agent is allowed to load certain paths.
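As an illustration of what such a check involves, the sketch below parses a small robots.txt document and asks whether given user agents may load given paths. It does not use the crwlr / robots-txt API. It relies on Python's standard `urllib.robotparser` module, and the rules and the names `ExampleBot` and `OtherBot` are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: one group for all agents,
# one group specifically for "ExampleBot".
robots_txt = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# ExampleBot has its own group, so only that group's rules apply to it.
example_drafts = parser.can_fetch("ExampleBot", "/drafts/post-1")
example_private = parser.can_fetch("ExampleBot", "/private/report")

# Any other agent falls back to the "*" group.
other_private = parser.can_fetch("OtherBot", "/private/report")
other_blog = parser.can_fetch("OtherBot", "/blog/")

print(example_drafts, example_private, other_private, other_blog)
```

Note that a user agent with its own group is matched against that group only, which is why `ExampleBot` is not bound by the `/private/` rule in this sketch.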