Choosing between accessing data through an API and downloading a snapshot of the data to store locally can be a hard decision.
A dataset snapshot is a point-in-time copy of a given dataset. So, unlike data accessed via an API, a snapshot downloaded to your local system will not reflect later refreshes of the data. It's important to understand that this doesn't mean data accessed via an API is a live feed from the data creator: many of the datasets on data.wa.gov.au are refreshed every night, but others are only updated weekly, monthly or less frequently. Using an API can be beneficial in either case, either to avoid re-downloading a frequently updated dataset, or to avoid missing a rare update to a mostly-static dataset. However, for many users of data the ease of access to snapshots (i.e. simply downloading a file) is a worthwhile trade-off.
Which option is best for you will vary from case to case -
APIs
Pros -
- Are always up-to-date with the latest data uploaded by data custodians.
- Can be simple to use in supported software. For example, many common GIS and web-based mapping tools integrate with geospatial data APIs without you needing to download, extract, and re-host the data yourself.
- Often support a range of advanced query and filter options to enable fast and easy access to larger datasets without the need to download the whole dataset.
- Datasets which are completely open and public may be accessed without needing to register for an account.
Cons -
- Retrieving any moderately sized dataset may require many consecutive HTTP requests, after which the results must be merged into a single dataset before being saved to your local system.
- Are more prone to failure and slower response times because of the technical complexity involved in receiving, parsing, querying, and translating data directly from the database.
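The "many consecutive requests" pattern mentioned above can be sketched roughly as follows. This is a minimal illustration only, not the interface of any particular API on data.wa.gov.au: the function names `fetch_all` and `http_fetch_page`, the `offset` and `limit` query parameters, and the assumption that each response is a JSON list of records are all hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, List

Page = List[Dict]


def fetch_all(fetch_page: Callable[[int, int], Page], page_size: int = 1000) -> Page:
    """Request consecutive pages and merge them into a single list.

    Stops when a page comes back shorter than page_size, which is taken
    to mean there are no more records (an assumption about the API).
    """
    records: Page = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        records.extend(page)
        if len(page) < page_size:
            break
        offset += page_size
    return records


def http_fetch_page(base_url: str, offset: int, limit: int) -> Page:
    """Fetch one page over HTTP.

    Hypothetical: assumes the API accepts 'offset' and 'limit' query
    parameters and returns a JSON list of records.
    """
    query = urllib.parse.urlencode({"offset": offset, "limit": limit})
    with urllib.request.urlopen(f"{base_url}?{query}", timeout=30) as response:
        return json.load(response)


# Usage (placeholder URL, not a real endpoint):
# records = fetch_all(lambda o, n: http_fetch_page("https://example.org/api/records", o, n))
```

Keeping the paging loop separate from the HTTP call makes it easier to add retries or swap in a different API's paging scheme, which matters given that any one of the requests may fail or respond slowly.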
Snapshots
Pros -
- Available in a variety of formats suitable for a wide range of desktop and web-based software.
- Are less prone to failure due to the lack of technical complexity - snapshots are simply hosted on a fileserver with none of the technical overhead of accessing databases or providing APIs.
- Are stored as ZIP files to reduce their file sizes (where relevant).
Cons -
- A local copy (e.g. a download) is static. You'll need to re-download the snapshot to get the latest data.
- It's not possible to filter snapshots before downloading them, except by clipping to pre-set areas. Accessing larger datasets may involve downloading several gigabytes in order to extract only a few megabytes of data.
For more information on clipping to an area when downloading snapshots, please see SLIP Data Download by Area of Interest.
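Because a snapshot can only be filtered after it has been downloaded, the filtering step happens on your own system. The sketch below shows one way that might look, assuming the snapshot is a ZIP file containing a CSV. The function name `filter_snapshot_rows`, the file names, and the column name in the usage comment are hypothetical, and snapshots in other formats would need a different reader.

```python
import csv
import io
import zipfile
from typing import Dict, Iterator


def filter_snapshot_rows(zip_source, member: str, column: str, wanted: str) -> Iterator[Dict[str, str]]:
    """Yield rows from a CSV inside a ZIP snapshot where row[column] == wanted.

    zip_source may be a file path or a binary file-like object.
    Rows are streamed, so the CSV is never fully loaded into memory.
    """
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            # Wrap the binary stream so the csv module receives text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            for row in csv.DictReader(text):
                if row.get(column) == wanted:
                    yield row


# Usage (hypothetical file and column names):
# for row in filter_snapshot_rows("snapshot.zip", "roads.csv", "region", "Perth"):
#     print(row)
```

Note that the whole archive still has to be downloaded first; the sketch only avoids extracting and loading everything at once.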