Much of the geospatial data you'll find on data.wa.gov.au is provided through SLIP - the Shared Location Information Platform - which offers a range of ways to access geospatial data, including APIs and dataset snapshots.
Deciding whether you should use an API or a dataset snapshot can be difficult, and will vary from case to case. See our article Should I use APIs or should I download data directly? for further guidance before continuing with the rest of this article.
This article will focus on how data professionals and developers can access data snapshots - with an emphasis on automating the downloading of data snapshots. For more information about accessing data and the APIs available on data.wa.gov.au please see Available APIs & data formats.
At the time of writing, data snapshots are available in two officially supported formats, with several other formats available as an unsupported pre-release for early adopters to test -
- Shapefile: An older but still very popular format for spatial datasets. Easy to read, manipulate, and convert to other data formats using spatial software packages.
- Service Map Package: A proprietary format created by the spatial software company, ESRI, for storing spatial data and their associated styles. Only accessible to users with ArcGIS Desktop or ArcGIS Pro.
Pre-release formats are not yet discoverable on data.wa.gov.au, but may be accessed by browsing to the folder for each dataset, e.g. https://maps.slip.wa.gov.au/datadownloads/SLIP_Public_Services/People_and_Society/ShipwrecksWAM_002/
Accessing Data Snapshots
Data snapshots for geospatial data are discoverable via data.wa.gov.au alongside the other resources associated with each dataset.
For example, the Shipwrecks dataset has both Shapefile and Service Map Package data snapshots available -
You can browse all of the geospatial datasets available as data snapshots through the SLIP Data Snapshots group on data.wa.gov.au.
How do I know if I'm accessing a SLIP data snapshot?
While the SLIP platform is the main provider of geospatial data on data.wa.gov.au, it is by no means the only provider - so you may encounter data provided by other platforms, or data that has been directly uploaded to data.wa.gov.au itself by an agency.
The technical guidance in this article refers specifically to the use of data snapshots from SLIP, but the principles are broadly applicable to data provided from other sources.
You can easily tell if a data snapshot comes from SLIP by clicking on the snapshot (e.g. Shapefile or Service Map Package) and inspecting its URL.
If the URL begins with https://maps.slip.wa.gov.au/datadownloads/ then you can be sure you're accessing a data snapshot from the SLIP platform.
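If you are checking many resources in a script rather than by eye, the same test can be sketched in Python. The function name is hypothetical; the prefix is the one given above.

```python
# Prefix quoted in this article for snapshots served by SLIP.
SLIP_SNAPSHOT_PREFIX = "https://maps.slip.wa.gov.au/datadownloads/"


def is_slip_snapshot(url):
    """Return True if the URL points at a data snapshot hosted by SLIP."""
    return url.strip().startswith(SLIP_SNAPSHOT_PREFIX)
```

The check is deliberately strict about the scheme and host, so an http:// link or a different host is not treated as a SLIP snapshot.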
How do I know if the data snapshot has been updated?
Knowing when a data snapshot was last refreshed is a key piece of information for most users of data. For most datasets on data.wa.gov.au this information is easily available on the dataset itself.
At a glance these three fields can tell you when the dataset was first published, when the agency last updated it, and a rough measure of how often they intend to update it.
In most cases the Data last updated on field is automatically updated when the custodian supplies a new version of a dataset, but this is not always the case.
In the case of SLIP's data snapshots you have one further means of determining the currency of a snapshot. Each data snapshot ZIP file created by SLIP comes with an Info.txt file that contains the date and time at which the snapshot was last refreshed. The same Info.txt file is also available as a plain text file alongside the snapshots themselves, e.g. https://maps.slip.wa.gov.au/datadownloads/SLIP_Public_Services/People_and_Society/ShipwrecksWAM_002/Info.txt
Note that this is when the snapshot was created, not the time at which the dataset itself was last updated. It's possible a dataset may only be updated a few times a month, but its snapshot may be recreated every day.
For this reason it's important to compare all of the information provided about when a dataset has been updated. If you need further confirmation, it may be best to ask the data custodian directly.
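As a rough illustration of using Info.txt in a script, the sketch below derives the Info.txt URL from a snapshot URL and compares the file's text against a copy saved on the previous run. It assumes nothing about the layout of Info.txt and simply compares the raw text. The function names and the cache file are hypothetical, and fetching the text itself requires an authenticated request, which is left out here.

```python
from pathlib import Path
from urllib.parse import urljoin


def info_url_for(snapshot_url):
    """Return the URL of the Info.txt that sits in the same folder as the snapshot."""
    return urljoin(snapshot_url, "Info.txt")


def snapshot_refreshed(info_text, cache_file):
    """Return True if info_text differs from the copy saved on the previous run.

    The cache file is updated so the next call compares against this text.
    """
    cache = Path(cache_file)
    previous = cache.read_text(encoding="utf-8") if cache.exists() else None
    current = info_text.strip()
    if previous == current:
        return False
    cache.write_text(current, encoding="utf-8")
    return True
```

Remember that a changed Info.txt only tells you the snapshot was recreated, not that the underlying data changed.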
All dataset snapshots created by SLIP - even those for completely open and public datasets - require you to register an account with SLIP in order to access them. For more information about why we do this see Why am I being asked to login to access data?.
If you're simply accessing a data snapshot from data.wa.gov.au in your web browser on an as-needed basis, you'll be asked to log in to your SLIP account. Once you've done so, your download will begin automatically.
If you are looking to automate or script access to data snapshots then it pays to know a little about how SLIP's authentication system works. Anyone wishing to automate downloads should refer to our article How does data.wa.gov.au’s authentication system work? for more in-depth information that is intended to be read alongside the rest of this guide.
Automating data snapshot downloads
How you'll go about automating downloads of data snapshots will vary depending on your background, skillset, and the tools at your disposal. We've put together examples spanning three different pieces of software that are relevant to data professionals and developers.
- cURL - A command-line tool for transferring data over HTTPS, FTP, et al. Free, simple to install, and can be easily automated using cron jobs/Windows Task Scheduler.
- FME - A powerful ETL (Extract, Transform, and Load) tool for working with geospatial datasets and APIs.
- Python - The popular programming language for data professionals.
Option 1: cURL
cURL is a popular command-line tool for making web requests (HTTPS, FTP, et al.). It is beyond the scope of this article to provide instruction on installing cURL - please see cURL's Releases and Downloads page for packages and installers for your operating system.
This example has been tested on cURL 7.49.0.
Our cURL example is perhaps the simplest of the three options presented here in terms of what it takes to get it up and running. cURL itself takes care of much of the technical complexity of performing the authentication handshake for us - to the point that all we need to authenticate and download a file is a single command -
curl --location-trusted --cookie GET https://maps.slip.wa.gov.au/datadownloads/SLIP_Public_Services/People_and_Society/ShipwrecksWAM_002/ShipwrecksWAM_002.zip -H "User-Agent: SLIPAppUser" --user "username:password" --remote-name
So what's going on here?
- --location-trusted - Instructs cURL to trust all URLs/domains encountered during the authentication handshake, thus allowing it to send your username and password for each request made during the handshake.
- --cookie - Instructs cURL to store and pass cookies from one request to the next during the authentication handshake.
- -H User-Agent - Allows cURL to opt in to HTTP Basic-like authentication.
- --user - Your SLIP username and password in the format "username:password".
- --remote-name - Instructs cURL to save the file with the same name as it has on the server. In this case, as ShipwrecksWAM_002.zip.
Running this command, with your credentials substituted, should create a ZIP file called ShipwrecksWAM_002.zip in your working directory.
While we will leave automating downloads with cURL as an exercise for the reader, doing so is a fairly straightforward matter of using your operating system's job scheduler (e.g. Cron, Windows Task Scheduler).
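If you would rather have a job scheduler call a script than a raw command, one possible approach is a thin Python wrapper around the same cURL command. This is a minimal sketch: it assumes curl is on your PATH, and the function names and the SLIP_USERNAME and SLIP_PASSWORD environment variables are hypothetical names chosen for illustration.

```python
import os
import subprocess


def build_curl_command(url, username, password):
    """Return the article's cURL command as an argument list (no shell quoting needed)."""
    return [
        "curl", "--location-trusted", "--cookie", "GET", url,
        "-H", "User-Agent: SLIPAppUser",
        "--user", f"{username}:{password}",
        "--remote-name",
    ]


def download_with_curl(url, working_dir="."):
    """Run cURL in working_dir; raises CalledProcessError on a non-zero exit code."""
    # SLIP_USERNAME and SLIP_PASSWORD are hypothetical variable names.
    command = build_curl_command(
        url, os.environ["SLIP_USERNAME"], os.environ["SLIP_PASSWORD"]
    )
    subprocess.run(command, cwd=working_dir, check=True)
```

Reading the credentials from the environment keeps them out of the script itself, although they will still be visible in the process list while cURL is running.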
For more information about cURL see the cURL FAQs page.
Option 2: FME
On the surface our FME example looks more complex, but it does have some additional smarts that our simple cURL example didn't have.
This workbench has been tested on FME 2016.1 and 2017.0.
In short, our workbench is -
- Accepting three Published Parameters -
  - SOURCE_URL: The URL of the data snapshot to download.
  - SLIP_EMAIL: The email address you signed up to SLIP with.
  - SLIP_PASSWORD: Your password for SLIP.
- Extracting the date and time the snapshot was created at from its Info.txt file.
- Downloading the shipwrecks snapshot and saving it to a file with the date and time of its creation embedded in the filename. e.g. ShipwrecksWAM_002_20170801180131.zip
We'll leave it as an exercise for the reader to make use of the date and timestamp for more advanced purposes, such as archiving copies of data or checking that the snapshot has been updated before downloading it.
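For readers working outside FME, the filename convention can be sketched in Python. The timestamp layout (year, month, day, hour, minute, second) is inferred from the example filename above rather than from any SLIP documentation, and the function name is hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath
from urllib.parse import urlparse


def timestamped_name(source_url, created):
    """Build a filename such as ShipwrecksWAM_002_20170801180131.zip.

    created is the snapshot's creation time as a datetime object.
    """
    path = PurePosixPath(urlparse(source_url).path)
    # Insert the timestamp between the original name and its extension.
    return f"{path.stem}_{created:%Y%m%d%H%M%S}{path.suffix}"
```

Because the timestamp sorts lexicographically, archived copies named this way will list in chronological order.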
A note on authentication
The astute reader will notice that we're using a TextEncoder transformer to manually encode your credentials and then sending them directly as part of the Authorization header in our HTTPCallers. FME best practice would usually see us relying on FME's inbuilt Web Connections, or at least the HTTPCaller's authentication parameters.
Unfortunately, FME doesn't meet one of the requirements of SLIP's Basic auth-like authentication system, which requires your credentials to be passed for every request in the handshake. HTTPCallers will only pass credentials for the first request in a series.
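To make the manual encoding step concrete, here is roughly what it looks like in Python. This sketch assumes SLIP accepts the standard HTTP Basic scheme (Base64 of "email:password" prefixed with "Basic "), which this article implies but does not spell out. The function name is hypothetical.

```python
import base64


def basic_auth_header(email, password):
    """Return an Authorization header dict using the standard HTTP Basic scheme."""
    raw = f"{email}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

Note that Base64 is an encoding, not encryption, so a header built this way should only ever be sent over HTTPS.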
Option 3: Python
Lastly, our Python example uses the popular requests library to fetch our ZIP file.
A note on authentication
As with FME, the requests library doesn't fully meet the requirements of SLIP's Basic auth-like authentication system. It can follow redirects, but will only pass authentication credentials to redirects that are on the same domain as the original request. To download our data snapshots we must traverse both maps.slip.wa.gov.au and sso.slip.wa.gov.au, and must send our credentials to both domains.
We can work around this by manually implementing our authentication handshake in our fetchDownloadSnapshot() function. Refer to the function's documentation for further details.
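The general shape of such a workaround might look like the following. This is a sketch under stated assumptions and not the fetchDownloadSnapshot() function itself: the helper names are hypothetical, the User-Agent value is borrowed from the cURL example, and it assumes the handshake is a chain of ordinary HTTP redirects between the two domains named above.

```python
from urllib.parse import urljoin, urlparse

# Hosts named in this article; credentials are only ever sent to these.
TRUSTED_HOSTS = {"maps.slip.wa.gov.au", "sso.slip.wa.gov.au"}
REDIRECT_CODES = {301, 302, 303, 307, 308}


def get_following_redirects(session, url, username, password, max_redirects=10):
    """Hypothetical helper: GET url, re-sending Basic credentials on every hop."""
    headers = {"User-Agent": "SLIPAppUser"}  # value borrowed from the cURL example
    for _ in range(max_redirects + 1):
        host = urlparse(url).hostname
        auth = (username, password) if host in TRUSTED_HOSTS else None
        response = session.get(url, auth=auth, headers=headers,
                               allow_redirects=False, stream=True)
        location = response.headers.get("Location")
        if response.status_code in REDIRECT_CODES and location:
            response.close()              # finished with this intermediate hop
            url = urljoin(url, location)  # Location may be relative
            continue
        return response
    raise RuntimeError("Too many redirects")


def download_snapshot(url, username, password, destination):
    """Download a snapshot to destination using a cookie-preserving session."""
    import requests  # third-party; imported here so the helper above has no dependencies

    with requests.Session() as session:
        response = get_following_redirects(session, url, username, password)
        response.raise_for_status()
        with open(destination, "wb") as handle:
            for chunk in response.iter_content(chunk_size=65536):
                handle.write(chunk)
```

Turning off automatic redirects and following each hop by hand is what lets the credentials be re-sent across domains, while the session object carries cookies from one request to the next. Restricting credentials to a fixed list of hosts is a design choice made here so that an unexpected redirect cannot leak your password.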