Manually input a list of URLs to crawl

One crawl strategy is to crawl a specific set of URLs by manually inputting the URLs you want to crawl. This is useful when the URLs are specific or unique, and as a fallback for edge cases where more automated crawling options (like sourcing URLs from another kimono API or generating them) fail to meet your needs.

To set up a multi-page crawl and manually input the URLs to crawl:

1. Create a kimono API that scrapes the detail you want to gather from each URL.

2. From the detail page for your Desired Data API, select 'Manual URL List' from the crawl strategy drop-down.

3. Enter or paste the URLs you want to crawl into the text box.
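Before pasting a long list into the text box, it can help to tidy it up first. The following is a minimal sketch, not part of kimono itself: the helper name `prepare_url_list` is hypothetical, and it assumes one URL per line and that only `http` and `https` URLs should be kept.

```python
from urllib.parse import urlsplit


def prepare_url_list(raw_text):
    """Return unique, well-formed http(s) URLs from raw_text, in original order.

    Hypothetical helper for illustration; assumes one URL per line.
    """
    seen = set()
    cleaned = []
    for line in raw_text.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines
        parts = urlsplit(url)
        # Keep only entries that have an http(s) scheme and a host.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned


if __name__ == "__main__":
    raw = "https://example.com/a\n\n https://example.com/b \nhttps://example.com/a\n"
    # Print one URL per line, ready to paste into the text box.
    print("\n".join(prepare_url_list(raw)))
```

The printed output can then be pasted into the 'Manual URL List' text box.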

You can now configure your API's crawl settings to crawl the URLs on a schedule. With multi-page crawling, you can schedule your crawls, trigger them manually, or trigger them with the kimono RESTful API.

Note that the pages you crawl must have the same HTML structure as the source URL. This is because kimono finds the element you want to extract on a page based on the CSS selectors for that element. If you try to crawl a page that does not have that CSS selector, your crawl will fail.
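To get a rough sense of whether a page shares the source page's structure before adding it to the list, you could check its HTML for the element you expect. This is only a sketch under stated assumptions: it handles a simple tag-plus-class selector such as `div.price` (a made-up example), it does not reproduce how kimono itself matches selectors, and the names `SimpleSelectorFinder` and `has_element` are hypothetical.

```python
from html.parser import HTMLParser


class SimpleSelectorFinder(HTMLParser):
    """Records whether any start tag matches a given tag name and class.

    Illustrative only: covers selectors of the form 'tag.class', nothing more.
    """

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.found = False

    def handle_starttag(self, tag, attrs):
        # The class attribute may be missing or valueless, so default to "".
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.found = True


def has_element(html, tag, css_class):
    """Return True if html contains <tag class="... css_class ...">."""
    finder = SimpleSelectorFinder(tag, css_class)
    finder.feed(html)
    return finder.found


if __name__ == "__main__":
    page = '<div class="item price">9.99</div>'
    print(has_element(page, "div", "price"))
```

A page for which this check returns False is a likely candidate for a failed crawl and may be worth leaving out of the list.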

If you want to crawl multiple pages, you can also source URLs from a kimono API or generate them by toggling path and query parameters.
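If you prefer to stay with a manual list but your URLs differ only in one query parameter, you can also build the list yourself and paste it in. The sketch below is an illustration, not kimono's URL generator: `generate_urls`, the base URL, and the `page` parameter are all assumptions chosen for the example.

```python
from urllib.parse import urlencode


def generate_urls(base_url, param, values):
    """Return one URL per value, varying a single query parameter.

    Hypothetical helper; assumes base_url has no existing query string.
    """
    return [f"{base_url}?{urlencode({param: value})}" for value in values]


if __name__ == "__main__":
    # Example values only: pages 1 through 3 of a made-up search listing.
    for url in generate_urls("https://example.com/search", "page", range(1, 4)):
        print(url)
```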
