Whether you are scraping the web or working with an API, you should know the limits of your data source. APIs are often subscription-based with daily or monthly limits on requests, and websites often have features that protect them from being overloaded by enthusiastic web scrapers.
To that end, keep three main tools close by when you scrape with Alteryx:
- Select Records: while you build and debug your flow, limit the number of records you request from the remote site. As a bonus, your workflow will finish sooner.
- Cache it! Use “Cache and Run Workflow” from the right-click menu on your Download tool to avoid repeating requests you have already made.
- Save it! Put an output tool right after the Download and save your data. Use, for instance, the Alteryx database format (.yxdb) to save it together with its metadata, so you can continue where you left off. Putting it right after the Download comes in handy if you make a mistake in one of the following tools: you won’t need to download the external data again.
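
If you ever do the same job in code rather than in Alteryx, the three habits carry over. Here is a minimal Python sketch of the idea, not anything Alteryx runs internally: `fetch_records`, `fake_fetch`, and the JSON cache file are all hypothetical names made up for illustration, and `fake_fetch` stands in for a real request so the example runs without touching any site.

```python
import json
import tempfile
from pathlib import Path


def fetch_records(ids, fetch, cache_path, limit=None):
    """Fetch one record per id, reusing anything already saved in cache_path."""
    cache_path = Path(cache_path)
    # Cache it: load whatever earlier runs already downloaded.
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}

    # Select Records: while debugging, only ask for the first `limit` ids.
    wanted = list(ids)[:limit]

    for record_id in wanted:
        key = str(record_id)  # JSON object keys are always strings
        if key not in cache:
            cache[key] = fetch(record_id)
            # Save it: write to disk right after each new download,
            # so a mistake later in the script costs no extra requests.
            cache_path.write_text(json.dumps(cache))

    return [cache[str(record_id)] for record_id in wanted]


# Hypothetical stand-in for a real download; it records every call it gets.
calls = []


def fake_fetch(record_id):
    calls.append(record_id)
    return {"id": record_id, "title": f"Title {record_id}"}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache_file = Path(tmp) / "downloads.json"
        fetch_records(range(1, 101), fake_fetch, cache_file, limit=5)
        # The second run is served from the saved file, with no new requests.
        fetch_records(range(1, 101), fake_fetch, cache_file, limit=5)
        print(f"requests made: {len(calls)}")
```

Raising `limit` later only downloads the records that are not in the file yet, which is the same reason for placing the output tool right after the Download.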
Anyway, I downloaded some data from the IMDb API to prepare this viz for our #dashboardweek, and I used all three of these tools. Enjoy!