Combine pagination and a multi-page crawling strategy

A powerful new feature for crawling data at scale lets your kimono API crawl a specific set of URLs (sourced from another kimono API, generated, or entered manually) and then follow the next or more links from each of those pages.
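To make the "generating the URLs" option concrete, here is a minimal Python sketch of building a URL list from a template. It is not kimono's own URL generator; the `generate_urls` helper, the `example.com` address, and the `category` and `year` placeholders are all hypothetical and only illustrate the idea.

```python
from itertools import product


def generate_urls(template, **values):
    """Fill a URL template with every combination of the given values.

    `template` uses str.format placeholders such as {category}.
    This helper is hypothetical and only mimics what a URL generator does.
    """
    keys = list(values)
    combos = product(*(values[key] for key in keys))
    return [template.format(**dict(zip(keys, combo))) for combo in combos]


# Hypothetical site: one listing page per category and year.
urls = generate_urls(
    "https://example.com/{category}/{year}",
    category=["books", "music"],
    year=[2014, 2015],
)

for url in urls:
    print(url)
```

A list like this can then be used as the set of URLs to crawl, with pagination followed from each one.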

To crawl and then paginate through pages:

1. Create a kimono API that scrapes the details you want to gather from each URL, with the pagination element selected. You can read more about how to set up a paginated API here.

2. From the detail page for your desired data API, select your crawl strategy.

Once you hit start crawl, your kimono API will crawl every page you have specified, and then follow the pagination link for each of those pages.
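The order of operations described above (visit each specified URL, then keep following its pagination link) can be sketched in Python. This is an illustration of the general technique under stated assumptions, not kimono's implementation: `crawl_and_paginate`, the `fetch_page` callback, and the in-memory `SITE` dictionary are hypothetical, and the limit here is assumed to count pages per starting URL, including the first page.

```python
def crawl_and_paginate(start_urls, fetch_page, pagination_limit=3):
    """Visit each start URL, then follow its 'next' link up to a page limit.

    `fetch_page(url)` is assumed to return a dict with the scraped "rows"
    and a "next" URL (or None when there is no further page).
    """
    results = []
    for start in start_urls:
        url = start
        pages_seen = 0
        visited = set()  # guards against pagination loops
        while url is not None and pages_seen < pagination_limit and url not in visited:
            visited.add(url)
            page = fetch_page(url)
            results.extend(page["rows"])
            url = page.get("next")
            pages_seen += 1
    return results


# Hypothetical in-memory "site" standing in for real pages.
SITE = {
    "/a": {"rows": ["a1"], "next": "/a?p=2"},
    "/a?p=2": {"rows": ["a2"], "next": "/a?p=3"},
    "/a?p=3": {"rows": ["a3"], "next": None},
    "/b": {"rows": ["b1"], "next": None},
}

rows = crawl_and_paginate(["/a", "/b"], SITE.__getitem__, pagination_limit=2)
print(rows)
```

With a limit of 2, the third page under `/a` is never fetched, which is the kind of cap a pagination limit is meant to provide.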

You can now use your API's crawl settings to set the crawl frequency (how often kimono automatically re-crawls the URLs) and the pagination limit. With multi-page crawling, you can schedule your crawls, trigger them manually, or trigger them with the kimono RESTful API.

Note that the pages you crawl must have the same structure. This is because kimono finds the element you want to extract on a page based on the CSS selectors for that element. If you try to crawl a page that does not have that CSS selector, your crawl will fail.
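If you want to check your URLs for a shared structure before crawling, a rough pre-check might look like the Python sketch below. It is only an assumption-laden illustration: it handles a simple `tag.class` pattern using the standard library, not full CSS selectors, and it does not reproduce how kimono itself matches elements. The helper names and the sample pages are hypothetical.

```python
from html.parser import HTMLParser


class _ClassFinder(HTMLParser):
    """Records whether any <tag class="... css_class ..."> element appears."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.found = False

    def handle_starttag(self, tag, attrs):
        # The class attribute may be absent or valueless, hence the fallback.
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.found = True


def has_element(html, tag, css_class):
    finder = _ClassFinder(tag, css_class)
    finder.feed(html)
    return finder.found


def pages_missing_selector(pages, tag, css_class):
    """Return the URLs whose HTML lacks the expected element."""
    return [url for url, html in pages.items() if not has_element(html, tag, css_class)]


# Hypothetical pages: the second one uses a different structure.
pages = {
    "/item/1": '<div><span class="price">$5</span></div>',
    "/item/2": '<div><p class="cost">$7</p></div>',
}

missing = pages_missing_selector(pages, "span", "price")
print(missing)
```

Any URL reported as missing the element is a candidate to drop from the crawl list or to handle with a separate API built for that page layout.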

