Json
The Json step has three static methods:
- Json::all() to extract the whole JSON object
- Json::get() to cherry-pick properties from the JSON object
- Json::each() to extract multiple items from the JSON object
Json::all()
use Crwlr\Crawler\HttpCrawler;
use Crwlr\Crawler\Steps\Json;
use Crwlr\Crawler\Steps\Loading\Http;
$crawler = HttpCrawler::make()->withUserAgent('MyCrawler');
$crawler
    ->input('https://www.example.com/json')
    ->addStep(Http::get())
    ->addStep(Json::all());

Json::get()
The Json::get() method works much like the extract method of the Html and Xml steps. It relies on adbario/php-dot-notation, so nested properties can be addressed with dot-notation paths such as data.target.foo. Suppose the URL https://www.example.com/json responds with the following JSON:
{
    "data": {
        "something": "yolo",
        "target": {
            "foo": "Lorem ipsum",
            "bar": "dolor sit",
            "array": [
                { "baz": "zero" },
                { "baz": "one" },
                { "baz": "two" }
            ]
        }
    }
}

Cherry-pick your desired properties like this:
use Crwlr\Crawler\HttpCrawler;
use Crwlr\Crawler\Steps\Json;
use Crwlr\Crawler\Steps\Loading\Http;
$crawler = HttpCrawler::make()->withUserAgent('MyCrawler');
$crawler
    ->input('https://www.example.com/json')
    ->addStep(Http::get())
    ->addStep(
        Json::get([
            'foo' => 'data.target.foo',
            'bar' => 'data.target.array.1.baz',
        ])
    );

The output of the Json step is then:
array(2) {
  ["foo"]=>
  string(11) "Lorem ipsum"
  ["bar"]=>
  string(3) "one"
}

Json::each()
You can also extract multiple items from an array in the JSON object by using the each() method. Let's say the JSON looks like this:
{
    "list": {
        "people": [
            { "name": "Hans Zimmer", "age": { "years": 66 }, "home": "US" },
            { "name": "John Williams", "age": { "years": 92 }, "home": "US" },
            { "name": "Alan Silvestri", "age": { "years": 73 }, "home": "US" }
        ]
    }
}

You can get the names and ages like this:
use Crwlr\Crawler\HttpCrawler;
use Crwlr\Crawler\Steps\Json;
use Crwlr\Crawler\Steps\Loading\Http;
$crawler = HttpCrawler::make()->withUserAgent('MyCrawler');
$crawler
    ->input('https://www.example.com/json')
    ->addStep(Http::get())
    ->addStep(
        Json::each(
            'list.people',
            [ // The data mapping is the second argument of the each() method.
                'name' => 'name',
                'age' => 'age.years',
            ]
        )
    );

This yields 3 separate outputs:
array(2) {
  ["name"]=>
  string(11) "Hans Zimmer"
  ["age"]=>
  int(66)
}
array(2) {
  ["name"]=>
  string(13) "John Williams"
  ["age"]=>
  int(92)
}
array(2) {
  ["name"]=>
  string(14) "Alan Silvestri"
  ["age"]=>
  int(73)
}
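
To make the dot-notation paths and the per-item mapping used above more concrete, here is a minimal plain-PHP sketch of the idea. It is not how the Json step or adbario/php-dot-notation is implemented. The helper names dotGet() and mapEach() are hypothetical and exist only for this illustration, and the sketch assumes PHP 8.0 or newer and a shortened version of the sample JSON.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the crawler library or php-dot-notation):
 * resolves a dot-notation path such as "list.people.1.name" against a
 * decoded JSON array. Returns null when any path segment is missing.
 */
function dotGet(array $data, string $path): mixed
{
    $current = $data;

    foreach (explode('.', $path) as $segment) {
        // Numeric segments like "1" also match integer array keys.
        if (!is_array($current) || !array_key_exists($segment, $current)) {
            return null;
        }

        $current = $current[$segment];
    }

    return $current;
}

/**
 * Hypothetical helper: looks up the list at $listPath and applies the
 * key => path mapping to every item, yielding one output array per item.
 */
function mapEach(array $data, string $listPath, array $mapping): Generator
{
    $items = dotGet($data, $listPath);

    if (!is_array($items)) {
        return; // Nothing to iterate over.
    }

    foreach ($items as $item) {
        if (!is_array($item)) {
            continue; // Skip scalar entries; the mapping needs an object.
        }

        $output = [];

        foreach ($mapping as $key => $path) {
            $output[$key] = dotGet($item, $path);
        }

        yield $output;
    }
}

// Shortened sample document (two of the three people from the example).
$json = '{"list":{"people":['
    . '{"name":"Hans Zimmer","age":{"years":66},"home":"US"},'
    . '{"name":"John Williams","age":{"years":92},"home":"US"}'
    . ']}}';

$data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

$people = iterator_to_array(
    mapEach($data, 'list.people', ['name' => 'name', 'age' => 'age.years']),
    false
);

print_r($people);
```

Each yielded array corresponds to one of the separate outputs shown above. Returning null for a missing path is a choice made for this sketch only and says nothing about how the Json step handles missing properties.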