Documentation for crwlr / crawler (v0.2)

Loggers

You can add a logger to your crawler to get log output and see what it is (or was) doing. The crawler takes any implementation of the PSR-3 LoggerInterface.

use Crwlr\Crawler\HttpCrawler;
use Psr\Log\LoggerInterface;

class MyCrawler extends HttpCrawler
{
    protected function logger(): LoggerInterface
    {
        return new MyLogger(); // any implementation of the PSR-3 LoggerInterface
    }

    // user agent...
}

As you can see, you add a logger to the crawler via the protected logger() method. It is called only once, in the constructor of the Crawler class, and the logger instance is then automatically handed over to every step that you add to the crawler.

By default, the Crawler class uses the CliLogger shipped with the package, which simply echoes the log lines. So you only need to provide your own if you want to use a different logger.
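
If you want to write your logs somewhere else, return any other PSR-3 logger from that method. As an illustration (assuming you have installed the monolog/monolog package, which is not part of this package, and using an example log file path), a crawler writing its log to a file could look like this:

use Crwlr\Crawler\HttpCrawler;
use Monolog\Handler\StreamHandler;
use Monolog\Logger;
use Psr\Log\LoggerInterface;

class MyFileLoggingCrawler extends HttpCrawler
{
    protected function logger(): LoggerInterface
    {
        // Monolog implements the PSR-3 LoggerInterface, so any Monolog logger
        // can be handed to the crawler. The channel name and file path here
        // are just examples.
        $logger = new Logger('crawler');
        $logger->pushHandler(new StreamHandler(__DIR__ . '/crawler.log'));

        return $logger;
    }

    // user agent...
}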

The included steps always log some information about what they are doing. In your custom steps you can use the logger via $this->logger. The same applies to all callbacks that are bound to a step, like the withInput() hook in loops and updateInputUsingOutput() in groups.
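
To illustrate, here is a minimal sketch of a custom step that logs something via $this->logger. The exact signature of the invoke() method and the Input/Output handling shown here are assumptions for illustration only; please refer to the documentation page about steps for the details in your version.

use Crwlr\Crawler\Input;
use Crwlr\Crawler\Output;
use Crwlr\Crawler\Steps\Step;

class MyStep extends Step
{
    protected function invoke(Input $input): array
    {
        // $this->logger is the PSR-3 logger instance that the crawler
        // automatically handed over to this step.
        $this->logger->info('MyStep was invoked');

        return [new Output($input->get())];
    }
}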