CASDA User Guide
1. CASDA Overview
The CSIRO ASKAP Science Data Archive (CASDA) stores science-ready data products from the Australian Square Kilometre Array Pathfinder (ASKAP), and provides the services to make these products available to astronomers. ASKAP’s high data rate (~75 PB per year) means storage of raw data is not feasible – instead, data is automatically processed by the ASKAPsoft software pipeline into science-ready data products, producing what is expected to be about 5 PB per year in full operations.
The main features of CASDA are currently:
- Long term storage of ASKAP science data products
- Access via web interface (CSIRO Data Access Portal) and Virtual Observatory (VO) services
- Interactive Skymap search
- Image cutout generation
- Quantitative data quality metrics and validation of data products
- Ability for science teams to upload value-added science catalogues and image cubes
- Digital Object Identifiers (DOIs) for all datasets, providing a persistent identifier that can be cited in publications
CASDA stores and makes available the following types of data products, described in full in the Data Products section:
- Images/image cubes. Images in FITS format, including 2D continuum images, 3D continuum image cubes, and 3D spectral line image cubes
- Catalogues. Source catalogues created by running CASDA images through Selavy, the ASKAP source-finder.
- Measurement sets. Calibrated visibilities as CASA-format measurement sets.
- Evaluation/validation files. Ancillary data describing data quality or information about the processing.
These include standard “observational” data produced with ASKAPsoft and enhanced “derived” data products (e.g. cross-matched catalogues, cubes from multi-day observations) produced by ASKAP survey teams. Data is validated by the respective science team(s) before public release. Released data is available to all users (public, i.e. non-proprietary), and users assigned to a science project can access unreleased data for that project.
CASDA is regularly updated and developed. Major items are announced as news items on the CASDA website, and release notes are published for individual production releases.
1.1. Projects
ASKAP was designed as a survey science instrument, and its primary use during the first five years of operation is for large Survey Science Projects (SSPs), each requiring over 1,500 hours of observing time to complete. These SSPs are listed and described on the ATNF ASKAP Science page. Each semester, ASKAP also observes up to 100 hours for Guest Science Projects (GSPs): smaller projects awarded through the usual time allocation process.
All data in CASDA are assigned to an ASKAP Science Project. ASKAP was designed with commensality in mind, so in some cases there is a primary project, which oversees data validation, and a secondary commensal project tagged to commensal data. This ensures that commensal data is returned in a search of all data associated with a project.
1.2. Authentication
Accessing CASDA catalogues and viewing the metadata of released data products is allowed without authentication. However, to download image-based data or visibilities, user authentication is required. This can be done with either an ATNF Online Proposal Access & Links (OPAL) account, or a CSIRO Nexus account. OPAL accounts are recommended for general use, and registration can be done at the OPAL home page.
1.3. Technical support
Please contact atnf-datasup@csiro.au for issues and help with CASDA. We will get back to you as soon as we can. Please include any information that might help us solve your issue, for example the Project ID you were attempting to download, what search you attempted in the web-form, what browser you were using, etc.
1.4. Acknowledging ASKAP data and CASDA in publications
Publications making use of data obtained through CASDA must follow the ATNF Data Policies and CSIRO Space & Astronomy Publications and Acknowledgements guidelines. In summary, publications should:
- Acknowledge the ATNF and any instrument used to collect data used in the publication. For example, publications using ASKAP data should include the text:
This scientific work uses data obtained from Inyarrimanha Ilgari Bundara, the CSIRO Murchison Radio-astronomy Observatory. We acknowledge the Wajarri Yamaji People as the Traditional Owners and native title holders of the Observatory site. CSIRO’s ASKAP radio telescope is part of the Australia Telescope National Facility (https://ror.org/05qajvd42). Operation of ASKAP is funded by the Australian Government with support from the National Collaborative Research Infrastructure Strategy. ASKAP uses the resources of the Pawsey Supercomputing Research Centre. Establishment of ASKAP, Inyarrimanha Ilgari Bundara, the CSIRO Murchison Radio-astronomy Observatory and the Pawsey Supercomputing Research Centre are initiatives of the Australian Government, with support from the Government of Western Australia and the Science and Industry Endowment Fund.
This paper includes archived data obtained through the CSIRO ASKAP Science Data Archive, CASDA (http://data.csiro.au).
- Include the name of any instrument used (e.g. “ASKAP”) in the abstract.
- Ensure appropriate citations are included for any project the data used is assigned to. Recommended citations are included on each project’s Data Access Portal collection page.
2. Data Products
2.1. Images and image cubes
There are three main image types stored in CASDA: two-dimensional continuum images, three-dimensional coarse-channel image cubes, and three-dimensional fine-channel spectral line image cubes. In addition to the primary images, ancillary images produced by the software pipelines are also made available. All images and image cubes are in FITS format, and the different types are described below.
2.1.1. 2D continuum
- cont_restored_t0 – The primary image type. This is the restored total-intensity (or Taylor-0) image (summed over the entire bandwidth being processed). Example: i.SB1234.cont.taylor.0.restored.fits
- cont_restored_t1 – The “Taylor-1” image, produced in multi-frequency synthesis. This shows the spectral index at each point multiplied by the total intensity. Used to derive spectral indices for catalogued components. Example: i.SB1234.cont.taylor.1.restored.fits
- cont_residual_t0 – The residual flux remaining from deconvolving & cleaning the total-intensity image (i.e. shows the un-cleaned emission). Example: i.SB1234.cont.taylor.0.fits
- cont_residual_t1 – The residual flux remaining from deconvolving & cleaning the Taylor-1 image. Only produced by multi-frequency synthesis. Example: i.SB1234.cont.taylor.1.fits
- cont_cleanmodel_t0 – The image of the deconvolved model resulting from cleaning the total intensity image. Example: i.SB1234.cont.taylor.0.fits
- cont_cleanmodel_t1 – The image of the deconvolved model resulting from cleaning the Taylor-1 image. Example: i.SB1234.cont.taylor.1.fits
- cont_weight_t0 – The relative sensitivity across the field. This typically is a product of the linear-mosaicking approach, and shows the effect of the primary beam attenuation, coupled with the number of visibilities contributing to a given PAF beam. This version goes with the total-intensity Taylor-0 image. Example: i.SB1234.cont.taylor.0.fits
- cont_weight_t1 – The relative sensitivity image that goes with the Taylor-1 image. Example: i.SB1234.cont.taylor.1.fits
- cont_components_t0 – A map of fitted components, that are identified as part of the cataloguing process and fitted to the total intensity image. These are Gaussian components that appear in the component catalogue. Example: i.SB1234.cont.taylor.0.restored.fits
- cont_fitresidual_t0 – A map of the residual emission after the component map is subtracted from the total intensity image. Example: i.SB1234.cont.taylor.0.restored.fits
- cont_noise_t0 – A map of the noise in the total-intensity image as a function of position across the field. This is typically created during the source-finding process. Example: image.i.SB1234.cont.taylor.0.restored.fits
2.1.2. 3D continuum
- cont_restored_3d – The primary continuum cube type. This is the restored total-intensity image in each coarse (continuum) channel. Example: restored.i.SB1234.contcube.fits
- cont_residual_3d – The residual flux remaining in each coarse channel after deconvolution. Example: i.SB1234.contcube.fits
- cont_cleanmodel_3d – The image of the deconvolved model resulting from cleaning each coarse channel. Example: i.SB1234.contcube.fits
- cont_weight_3d – The relative sensitivity in each coarse channel. Example: i.SB1234.contcube.fits
2.1.3. 3D spectral line
- spectral_restored_3d – The primary spectral-line cube type. This is the restored total-intensity image in each fine spectral channel. This cube may have had the continuum emission removed. Examples: restored.i.SB1234.cube.fits or image.restored.i.SB1234.cube.contsub.fits
- spectral_residual_3d – The residual flux remaining in each fine channel after deconvolution. Examples: i.SB1234.cube.fits
- spectral_cleanmodel_3d – The image of the deconvolved model resulting from cleaning each fine channel. Examples: i.SB1234.cube.fits
- spectral_weight_3d – The relative sensitivity in each fine channel. Examples: i.SB1234.cube.fits
- spectral_restored_mom0 – A two-dimensional image showing the moment-0 (or summed intensity over a frequency range) image of a spectral cube.
- spectral_restored_mom1 – A two-dimensional image showing the moment-1 (or average spectral value) image of a spectral cube
2.1.4. File name patterns
The image products produced by the ASKAPsoft pipeline conform to patterns that allow one to discern the type of file that it is. The general form of an image name is {prefix}{imagekind}.{polarisation}.{survey}_{pointing}.SBxxxxx.{datatype}.{suffix}.fits where the individual elements are:
- {prefix} – this is used for derived images that are created after the imaging (by the source-finding software, for instance).
- {imagekind} – the kind of image, usually one of “image”, “residual”, “weights”. For cubes, this could also be “image.restored” (but continuum images have the “restored” tag in the suffix).
- {polarisation} – the lower-case version of the Stokes parameter: usually one of “i”, “q”, “u” or “v”
- {survey} – the short name of the survey the image is from (not always present for guest survey projects)
- {pointing} – coordinates or object name describing the pointing of the image
- SBxxxxx – the scheduling block (SB) ID number
- {datatype} – this indicates whether it is a continuum image (“cont”), a continuum cube (coarse-resolution, typically 1MHz channels – “contcube”), or a spectral cube (fine resolution, usually 18.5kHz channels or finer – “cube”).
- {suffix} – this can indicate things like the multi-frequency synthesis information (“taylor.0”), whether it is a restored continuum image (“restored”), or whether the cube has been continuum-subtracted (“contsub”). There are several types of restored continuum images that may be presented:
- “restored.conv.fits” – these are mosaics that have had all constituent beams convolved to a common resolution prior to mosaicking. The PSF in the header will accurately reflect the resolution across the field and fluxes will be accurate.
- “restored.raw.fits” – a mosaic created from beam images prior to the convolution. The resolution will be better, but beam-dependent, and the PSF in the header will not necessarily reflect the resolution at an arbitrary point in the image (leading to potential flux errors when making measurements).
- “highres.restored.conv.fits” or “alt.restored.conv.fits” – indicates an alternative preconditioning parameterisation was used to restore the image (in addition to the regular preconditioning). The name varies according to processing template, but this is often done at a lower robustness value, hence the indication of high resolution (“highres”).
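As a worked illustration, the short Python snippet below splits a hypothetical filename into the elements described above. The filename and regular expression are illustrative only; real names vary and may omit some elements (such as the prefix or survey name).

import re

# A hypothetical filename following the general pattern above (the survey,
# pointing and SBID here are made up for illustration; no {prefix} is present
# because this is a pipeline image rather than a derived product)
name = "image.restored.i.EMU_2052-54.SB12345.cube.contsub.fits"

# Simplified, illustrative pattern only – real names may omit some elements
pattern = re.compile(
    r"(?P<imagekind>image(?:\.restored)?|residual|weights)\."
    r"(?P<polarisation>[iquv])\."
    r"(?P<survey>[A-Za-z0-9]+)_(?P<pointing>[^.]+)\."
    r"SB(?P<sbid>\d+)\."
    r"(?P<datatype>contcube|cont|cube)"
    r"(?:\.(?P<suffix>.+))?\.fits"
)

match = pattern.match(name)
if match:
    for element, value in match.groupdict().items():
        print(f"{element:13s}{value}")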
2.2. Catalogues
Several types of catalogues are available in CASDA. These are made available through the DAP Observation Search, but the use of Virtual Observatory tools (e.g. Astroquery or TOPCAT) is generally recommended for accessing, searching, and filtering catalogues. A set of “observational” catalogues is automatically created for each image using ASKAP’s source-finding software Selavy as part of standard data reduction processing:
- Continuum Island: a catalogue of the “islands” in the total-intensity image. An island is a group of contiguous pixels that are above some detection threshold.
- Continuum Component: a catalogue of the components that make up each island. A component is a two-dimensional Gaussian, parameterised by a location, flux, size, and orientation. Each island has some number of components fitted to it, so there is a one-to-many relationship between the island and component catalogues.
- Polarisation Component: a catalogue of polarisation and rotation measure properties for each component in the component catalogue.
2.3. Measurement sets (visibilities)
The visibility data stored in CASDA is in the form of Measurement Sets, that have been packaged as tar files. A Measurement Set (MS) has a directory structure and will contain a series of CASA-format tables that describe a lot of the observation metadata, as well as the main table with the visibility data. Each MS will have data just for a single PAF beam.
To supplement the MS metadata, there is an additional directory, called “ASKAP_METADATA”, containing the metadata files used by the pipeline processing.
2.4. Evaluation files
The evaluation files are a collection of ancillary data that either provide information about the quality of the data or information about the processing. These are typically available via download links from the information tab for each image product. There are several distinct types:
- Calibration-metadata-processing-logs: this is a tar file containing a large amount of information from the processing. Its filename will have a timestamp (of when the processing was begun). This contains:
- Calibration tables, for bandpass, self-calibration, and leakage (if done). These tables are in CASA format and are understood by the ASKAPsoft processing software.
- A diagnostics directory, containing plots and flagging summaries for each of the MSs
- The pipeline metadata directory (that is also copied to the MSs)
- Tarred directories of logs and parameter set inputs for each of the processing jobs
- Validation results: these are tar files containing the directory of science validation results. There may be one or more sets of results, corresponding to validation scripts developed by Survey Science Teams:
- EMU – cross matches the component catalogue with RACS-low, and assesses position & flux accuracy
- POSSUM – detailed validation looking at the polarisation spectra, including assessing wide-field leakage correction
- GASKAP-HI – quality validation of the visibility data and calibration
- WALLABY, DINGO, FLASH – closely-related validation reports looking at the spectral properties of noise and other statistics
- Extracted spectra: when 1D spectra are extracted from the larger spectral cubes, these are combined into tar files. Each file will contain spectra of a common type (source spectra, noise spectra, Faraday spectra for the case of RM synthesis).
- An “encaps” file: an “encapsulation” of the pipeline configuration files used and the continuum validation quality metrics (as an XML file).
2.5. Derived data products
ASKAP science teams may generate data products beyond what is automatically produced by the ASKAPsoft pipeline and upload them to CASDA. These derived data products are accessible in the same ways as all other data in CASDA and are tagged as belonging to “Derived” collections.
3. Using CASDA with the Data Access Portal
The CSIRO Data Access Portal (DAP) provides several online services for accessing CASDA with user-friendly interfaces. The following services are currently available:
3.1. ASKAP Observation Search
The ASKAP Observation Search provides access to all data products stored in CASDA. The search form provides the following parameters for the search, all of them optional:
- Object name. Enter the name of an object here and click “Resolve” to automatically fill out the Right ascension and Declination fields with the position of that object. This queries VizieR for the object name, and is not limited to object names as listed within CASDA’s catalogues.
- Right ascension, Declination, and search radius – Single position. Entering a position in these fields, either manually or by resolving an object name, will perform a cone search around this position with the provided radius (default 2 arcmin). This will return all data products overlapping with this region of the sky.
Note that the returned data products will be the complete data, i.e. full files which cover the region of the cone search. Images will not be cropped, and catalogues will not be filtered to the search region. For these, you will respectively need to request a cutout, or filter the catalogue with Virtual Observatory tools.
- Right ascension, Declination, and default radius – Multiple positions. A cone search can also be performed on multiple positions at once by clicking on the “Multiple” tab at the top of the form:
Click “Browse” to select a file to upload. This must be a text file, with one J2000 RA and Dec per line and an optional radius in arcmin, each space separated. If no radius is provided, the default radius will be used for that position. The accepted position formats are (an example upload file is shown after this parameter list):
HH:MM:SS.SSS DD:MM:SS.SSS [radius/arcmin]
HH MM SS.SSS DD MM SS.SSS [radius/arcmin]
DDD.DDDD DDD.DDDD [radius/arcmin]
Tip: the multiple cone search can be used with the observation search cutout generation service to produce cutouts of multiple positions/objects at once.
- Project. Filter by ASKAP science project. This is a type-ahead field, so you can see the list of projects matching your current entry as you type the project code or name. By default this will only match data for which the specified project is the primary project. Ticking the “Commensal Observations” checkbox will include commensal data for the specified project.
- Filename. Filter by filename (not case sensitive), with Unix wildcard support.
- Released status. Include or exclude results by their status. By default, only released data is included.
- Scheduling block ID (SBID) or Observation date. Filter for data from observations from a particular scheduling block or within a particular date and time range.
- Frequency, redshift, velocity. Filter for data within the provided frequency range. This can be done directly with frequency in MHz, or by redshift or velocity relative to one of the two available reference rest frequencies:
HI – 1420.405751786 MHz
OH – 1665.4018 MHz
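As noted above for the multiple-position search, an example upload file might contain the following lines (the positions and radii here are purely illustrative; the second line omits the radius, so the default radius would be used for that position):

12:34:56.789 -45:12:34.567 5
12 34 56.789 -45 12 34.567
188.736621 -45.850114 2.5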
3.1.1. Search results
Search results are returned in a table as shown, divided into the data product categories via tabs at the top of the table. The currently applied search parameters are shown in blue boxes above the table and can be removed by clicking ×. The results can be further filtered with the “Refine results” section on the left. Note that to reapply or modify a cone search you will need to return to the search form. The “Refine Search” button will return to the search form, preserving the previously entered fields.
The “Configure Columns” button allows you to modify the columns shown in the table (each tab has a different set of available columns).
The box on the right lists the currently active columns, and the box on the left lists the inactive columns. Select an item in either list and use the buttons in the centre to move the selection between them:
- » : Move all columns to active list
- › : Move selection from inactive to active list
- ‹ : Move selection from active to inactive list
- « : Move all columns to inactive list
The active columns can be reordered with the ↑ and ↓ buttons below the active list. Default column configuration can be restored with the “Restore Defaults” button in the bottom left. Click “Save” to apply the new configuration, or “Cancel” to discard changes.
3.1.2. Download data
Select the file(s) to be downloaded with the checkboxes in the leftmost column of the table and click “Download” in the top right. Selections can include as many files as desired and include products from multiple tabs (e.g. you can select an image cube and ancillary files simultaneously).
To download images, spectra, or visibilities, you must be logged in with an OPAL, CSIRO Nexus, or AAO Data Central (Lens) account. Unreleased data is only available to users assigned to the project that data is associated with.
There are multiple methods to download data:
- Download from a public web site. This method will redirect to a unique public URL of a Data Access Job page that presents the status of your data request. Full file requests will usually be available almost immediately upon reloading the page, and cutouts will take time to process depending on the size of the cutout and the cube it is taken from. The page will automatically refresh every 30 seconds until the data is ready for retrieval. You will receive emails notifying the creation and completion of the request at the email associated with your OPAL/Nexus account.
Once the data is ready, the table in the page will update to include links to directly download each of the requested files and respective checksums. If you have requested multiple files, it may be convenient to save a text file with each of the download URLs, using the button in the bottom left, and use command line tools to download all the files at once. For example:
Mac OSX:
xargs wget --content-disposition '{}' < links.txt
Unix:
xargs -i wget --content-disposition '{}' < links.txt
This page and the file links will expire 7 days after creation.
- Access data with a Pawsey HPC account. This method requires a current Pawsey account. Upon selecting this option, note the statement regarding data accessibility, accept the statement by checking the box, click “Request files”, and review & accept the licence information. You will be redirected to a page detailing the status of the request and will also receive a link to this page via email.
Follow the instructions under “Next Steps”:
- Log in to a Pawsey server.
- Navigate to the directory where you would like to download the data to.
- Obtain the download URLs for the files by clicking “Save links as text file” on the status page. Either upload this file to Pawsey or copy its contents into a text file on Pawsey.
- Perform the download with the wget command:
xargs wget --content-disposition '{}' < links.txt
- View in CARTA (beta). Request a CARTA session for viewing the selected data remotely, without needing to download the full file. See 3.4 – CARTA for more details.
3.1.3. Use checksums to validate downloaded data
Every data product has an associated checksum file which can be used to validate that a file was downloaded correctly without errors. A checksum file contains a hash generated by the MD5 algorithm, which can be compared to the hash produced by applying MD5 to the respective data file. This can be done in a shell terminal:
$ md5 filename
MD5 (filename) = 0f1d2d40db0e037cf8154ab11df9786e
$ cat filename.checksum
0f1d2d40db0e037cf8154ab11df9786e
If the output hashes match exactly (as they do here), the file has been downloaded without any corruption.
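For scripted workflows, the same check can be performed in Python with the standard hashlib module. This is a minimal sketch, assuming (as in the example above) that the .checksum file contains the MD5 hash as its first whitespace-separated token:

import hashlib

filename = "filename"  # path to the downloaded data product

# Compute the MD5 hash of the downloaded file in chunks
md5 = hashlib.md5()
with open(filename, "rb") as f:
    for chunk in iter(lambda: f.read(1024 * 1024), b""):
        md5.update(chunk)

# Read the expected hash from the accompanying checksum file
with open(filename + ".checksum") as f:
    expected = f.read().split()[0]

print("OK" if md5.hexdigest() == expected else "MISMATCH")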
3.1.4. View metadata
Metadata for any file in CASDA is viewable via the DAP by clicking on the ⓘ icon under “Actions” in the search results form. The precise metadata varies by data type, as listed below.
Observational catalogues
- Project description
- Filename
- Scheduling block ID(s)
- Collection + link to collection page
- Observation start + end
- Link to download evaluation files
- Validation notes
Derived catalogues
- Project description
- Filename
- Collection + link to collection page
Spectra
- Project description
- File name
- Spectrum type
- Schedblock ID(s)
- Image cube the spectrum is taken from
- ASKAP object name of the source
- Project principal investigator
- Collection + link to collection page
- Observation start + end
- Centre frequency
- Polarisation
- Collection type (observational or derived)
- Links to download evaluation files + validation reports
- FITS header
- Validation notes
Image cubes
- Project description
- Filename
- Image type
- Schedblock ID(s)
- Project principal investigator
- Collection + link to collection page
- Observation start + end
- Link to cutout generator
- Links to download evaluation files + validation reports
- Image preview
- FITS header
- Validation notes
Visibilities
- Project description
- Filename
- Schedblock ID(s)
- Project principal investigator
- Collection + link to collection page
- Observation start + end
- Collection type (observational or derived)
- Links to download evaluation files + validation reports
- Validation notes
3.1.5. Collection page
All data in CASDA is assigned to at least one CSIRO DAP collection. To view the collection page, click on the link on the data’s metadata page or the project code in the search results list. Note that there is often more than one collection for a given project, for instance there will often be separate collection pages for catalogues and images from the same project.
The “Description” tab includes:
- Collection description. Information about the collection, including contributors, a short description (often including required citations), start date (and end date if applicable), and the access status of the collection’s data.
- Project description. Information about the project the collection is a part of.
- VO Resource. The IVO identifier is a unique Virtual Observatory (VO) URI that can be used to obtain collection data.
- Project coverage map. A map showing the regions of the sky covered by the project in blue.
- Publication date.
- Contact.
- Copyright licence.
- Persistent collection link.
- Citation. Note that the citation here is for the specific collection itself, and projects may require other citations as noted in the collection description.
The “Files” tab lists all the data products included in the collection in a table similar to that on the Search Results page, with the same functionality for refining the listed data products and downloading.
3.1.6. Generate image cutout
CASDA provides an Image Generation Service for generating “cutouts” of image cubes. A cutout is a subset of the data contained in a larger image cube, restricted in spatial extent (for 2D and 3D cubes) and/or spectral extent (for 3D cubes). A Curated Cutout Service is available for generating cutouts of selected datasets, but cutouts can be generated from any image cube.
There are two ways to launch generation of a cutout of a cube:
- Clicking the “Generate cutout” link on the cube’s metadata page
- Clicking the scissors ✄ icon under “Actions” in the search results form
The Image Generation Service page lists some basic metadata of the base image cube, and a preview of the entire image. The cutout request form appears as follows:
- Aladin-lite window. This is an interactive plot of an image of the whole sky, by default the RACS-low Epoch 1 Stokes I image. The sky region of the main image cube is outlined in green. By clicking on the image, you can draw a box which can then be used to define the requested cutout. The cutout region will then be drawn in green outline, and the whole image region in red outline. By drawing a box, the spatial parameters (centre and radius) of the cutout region will be automatically filled into the relevant fields below. Additional features of the Aladin-lite window are described in the Skymap section.
- Position centre and radius. Enter the centre of the desired cutout image and a radius in arcmin. Note that these fields will be automatically filled out by using the interactive Aladin-lite window to draw a cutout region. Also note that although the term “radius” is used, generated cutouts will be square with a side length twice the specified radius.
- Spectral selection. If the image cube is 3D, a selection can be made along the spectral axis in this section. This can be done directly with frequency or wavelength, by selecting channels via their index (“pixels” in the drop-down), or by redshift or velocity relative to one of the two available reference rest frequencies:
HI – 1420.405751786 MHz
OH – 1665.4018 MHz
Use the “Spectral selection” dropdown to select between these options.
A selection can be made for either a single channel corresponding to a value entered, or for a range of channels corresponding to a range entered as a space-separated pair of values.
For each option, hovering over the ⓘ icon next to the field for entering the desired value/range will show you the valid range for the image cube in the relevant units.
If a cone search had been performed, the form will not include the Aladin-lite window or fields for specifying a cutout central position and radius. Instead, the parameters of the cone search (or cone searches if multiple positions were entered in the search form) will be listed as shown below. The spectral selection will still be editable.
Clicking on the “Single cutouts” button will switch the form over to the mode that includes the Aladin-lite window and allows for specification of the spatial parameters of the cutout.
Click “Generate cutout” to submit the request. You will be redirected to a download page detailing the status of your request. This page will refresh automatically every 30 seconds until the cutouts are available.
Once complete, the cutout(s) will be available for download from this page (along with respective checksums).
Note on cutouts: the time taken to generate cutouts and make them available will depend on the size of the requested cutout and the size of the cube it is taken from. Large cutouts from large cubes can take several hours to complete. It is recommended that arcminute-scale cutouts are taken from spectral cubes. Where degree-scale cutouts are required, it may be faster to download the full cube to Pawsey or your home institute and process it there.
3.1.7. Generate integrated spectrum of image region
Also available via the Image Generation Service is functionality to produce a one-dimensional spectrum integrated over a sky region in a 3D image cube. To produce such a spectrum, follow the same steps as to create an image cutout as described in the previous section, selecting a region of the sky over which to integrate the flux. Then, click “Create spectrum” to generate the spectrum and be redirected to its download status page.
Note: this should only be done on small sky regions, e.g. on regions containing a single source.
3.2. Skymap
The CASDA Skymap Search service provides an interactive Aladin-Lite viewer showing an image of the sky with CASDA catalogue sources overlaid as red squares.
The viewer can be targeted to precise coordinates or a particular object by typing coordinates or the object name (resolved by Vizier) in the “Target” field and hitting Enter or clicking on the crosshair button to its right. Click and drag in the viewer to move around the sky and scroll up/down to zoom in/out (or use the +/- buttons on the right-hand side). Purple crosshairs mark the centre of the view, and the coordinates of this point are shown in the top left of the viewer. The current field of view (FoV) is shown in the bottom left. The default position is of the object 2MASX J08161181-7039447.
By default, the image shown in the viewer is the RACS-low Epoch 1 Stokes I image. The background image can be changed by opening the “Manage layers” menu in the top-left of the viewer and selecting a survey (sourced from the HiPS registry) in the drop-down menu under “Base layer”. In the “Manage layers” menu, you can also enable display of the HEALPix grid.
Also in the top-left of the viewer, below “Manage layers”, is the “Coordinates grid” menu, which can be used to display and configure a coordinate grid over the background image.
The source catalogue overlaid can be selected using the radio buttons. A reference and link to the project page for each catalogue are available by clicking on the respective ⓘ icon.
Click on a source to select it. Below the Aladin-Lite viewer, a summary of the basic properties of the source will be listed, along with a preview image if the selected catalogue is one of the observational catalogues (continuum component, continuum island, polarisation component). The ellipse drawn on the preview represents the fitted Gaussian of the source.
The precise information varies between each catalogue:
Continuum Component, Continuum Island, RACS DR1 Galactic/Non Galactic Region
- Source name
- RA, Dec (J2000)
- Peak Flux Density
- Integrated Flux Density
- Major Axis
- Minor Axis
- Position Angle
Polarisation component
- Source name
- RA, Dec (J2000)
- Stokes-I Flux
- Peak Polarised Intensity
- Faraday Depth
- Polarisation Angle at Zero Wavelength
- Polarisation Fraction
SPICE-RACS DR1 cubelets
- Source Name
- Source ID
- RA, Dec (J2000)
- Peak Flux Density
- Integrated Flux Density
- Major Axis
- Minor Axis
- Position Angle
WALLABY Pilot DR1
- Source name
- RA, Dec (J2000)
- Centroid Frequency
- Integrated Flux
- Spectral Line Width
SPICE-RACS DR1 spectra
- Source Catalog ID
- RA, Dec (ICRS)
- Stokes I Flux Density*
- Polarized Intensity*
- Fractional Polarization
- SNR Pol Intensity
- Rotation Measure*
- De-rotated Polarization*
- RM Width
- Sigma_add Complexity*
- Instrumental Leakage Estimate
*error in quantity also included
Click on the “View item summary” button to open a page listing the full details for the selected source and a link to download source files (you must be logged in). If a preview image is available, there will also be a link to “View parent image”, which will open the metadata page for the image from which the preview is taken.
3.3. Curated Cutout Service
The DAP provides a curated cutout service for selected survey data in CASDA. Currently, this service gives convenient access to cutouts from the Rapid ASKAP Continuum Survey (RACS) data release 1 (DR1), in the low and mid bands.
The cutout generation form has the following required fields:
- Object name. Enter the name of an object here and click “Resolve” to automatically fill out the Right ascension and Declination fields with the position of that object. This queries VizieR for the object name, and is not limited to object names as listed within CASDA’s catalogues.
- Right ascension/Declination. Coordinates of the centre of the desired cutout, either entered manually or by resolving an object name.
- Survey. Survey(s) from which to generate cutouts.
- Cutout radius. Radius of desired cutout.
Because survey images often overlap, there may be more than one image from which the requested cutout can be taken. The results page lists all available options, with a tab for each survey selected. To download a cutout, click the download icon in the leftmost column of the corresponding row. In the “Options” dropdown, click “Refine cutout” to return to the cutout generation form.
3.4. CARTA
Cube Analysis and Rendering Tool for Astronomy (CARTA) is an image visualisation and analysis tool for image cubes that uses a client-server architecture for efficient use of large data products. CARTA can be used as an interface for viewing image cubes stored in CASDA without needing to download the image cubes directly onto a local machine.
3.4.1. Starting a CASDA CARTA session
To request a CARTA session with CASDA:
- Log in to the DAP. Viewing and downloading image-based data products requires authentication with an OPAL, CSIRO Nexus, or AAO Data Central (Lens) account.
- Search for the desired image cube(s) using the ASKAP Observation Search.
- Select the data to view with CARTA.
- For a single file: Click on the CARTA icon in the “Actions” column next to the desired file.
Then click “Submit” on the pop-up window that appears.
- For multiple files: Select the files with the checkboxes in the leftmost column of the results list and click the “Download” button. Select “View in CARTA” in the method dropdown menu and click “Request files”.
You will be redirected to a CARTA Download Job Status page:
You will also receive a link to this page via email. This page will automatically refresh every 30 seconds until the data has been retrieved and is ready for visualisation on CASDA’s CARTA server. Once this is complete, you will receive an email notification that the requested data is ready for CARTA visualisation and the status page will update:
Now that the data is ready in CASDA’s CARTA server, a CARTA Session must be requested with the “Request CASDA CARTA Session” button on the status page. This will enter your request into the session queue until a session is available, with the current status given in the “CASDA CARTA Session Status” section:
There can only be 10 sessions active at a time, and each session lasts for 3 hours. Once your session is available, the session status section will update with a button to connect to the session, and will give its expiry time:
Click “Connect to CASDA CARTA Session” to connect to the session. Once the session is ended, the status page will remain available for 7 days from the initial request.
3.4.2. Using CARTA
Upon connection, you will be presented with a File Browser window, listing the data you requested:
Selecting a file will list metadata in the panel on the right, which also has a tab for viewing the header of the file itself. Click “Load” to load the selected files. The selected data will then be displayed in the primary CARTA window:
We recommend reference to the Quick Start section of CARTA’s user manual for a description of CARTA’s basic functionality.
4. Virtual Observatory
The Virtual Observatory (VO) is a system that allows access to a wide variety of astronomical data with a standardised and systematic framework. Data in CASDA is available for access via the VO through standard VO Tools. These tools can be used to access and filter catalogues, perform cone searches for images and catalogues containing a sky position or object, and use CASDA’s image cutout generation service. The VO tools described here are TOPCAT, a GUI-based application for searching catalogues, and Astroquery, a Python package that enables scripted and automated access to CASDA.
4.1. Virtual Observatory Tools: Catalogue searches with TOPCAT
TOPCAT (Tool for OPerations on Catalogues And Tables) is a graphical viewer and editor for astronomical tables. It has functionality for interfacing with the VO to download tables from services in VO registries, CASDA included.
There are two methods to load a table of sources from CASDA catalogues: cone search, and TAP Query.
4.1.1. Cone search with TOPCAT
To perform a cone search (return all sources in catalogue within a certain distance of a specified sky position) on CASDA catalogues, select the Cone search option from the VO dropdown list to open the Cone Search window:
In the pop-up window, type “casda” into the keyword search field (1), hit Enter or click “Find Services”, and select the “CSIRO ASKAP Science Data Archive Cone Search Service” (2). The available catalogues are listed in (3) (you may need to expand the window to see all of the options). These will include curated catalogues (e.g. the RACS catalogues in the above screenshot) and the “Continuum…” catalogues, which include the respective source types (see Data Products – Catalogues) taken from across all observational catalogues in CASDA.
To set the parameters of the cone search, either:
- Enter the name of an object in the “Object Name” field (4) and click “Resolve” to obtain the position via lookup in VizieR
- Enter a position RA and Dec directly (5)
Finally, enter the search radius in the “Radius” field (6) and hit “OK” at the bottom of the window to execute the cone search. The “Load New Table” window will open, listing the status of all in-progress requests. Note that while a table is being loaded, you can submit more requests by entering the parameters for a new cone search and executing from the “Cone Search” window.
Once the table is loaded, it will be listed in the main TOPCAT window.
4.1.2. Catalogue search via Table Access Protocol (TAP) Query
The Table Access Protocol (TAP) allows for more sophisticated searches than a simple cone search, which is based only on sky position. TAP queries can be used to select sources from catalogues conditional on any of their attributes, and to obtain tables of metadata of image-based data products in CASDA.
To perform a TAP query in TOPCAT, select “Table Access Protocol (TAP) Query” from the VO dropdown menu to open the TAP Query menu:
Select “CSIRO ASKAP TAP” from the list of available TAP services (enter “casda” in the Keywords field to filter for it) and click “Use Service”:
The available tables are listed in the left panel of the “Metadata” section (1). Selecting a table will allow viewing of its metadata, including of the table itself (2) and of the columns it contains (3).
To run a TAP query, enter it into the textbox in the “ADQL Text” section of the TAP Query window (4) and click “Run Query” (5). The Load New Table window will appear, displaying the status of the query. Once completed, the table resulting from the query will be included in the table list in the main TOPCAT window.
Tables are queried with the Astronomical Data Query Language (ADQL). The basic syntax of an ADQL query follows this pattern:
SELECT {columns} FROM {table} WHERE {condition}
A query will filter the specified table for rows for which condition is true, returning these rows with the specified columns. The full specification and functionality of ADQL is described in its documentation. A number of examples are also included in TOPCAT itself and can be accessed by clicking on the “Examples” button in the TAP Query window (6).
The three main types of tables most likely to be of use to CASDA users are:
- Observational catalogues. These are the catalogues produced by the source-finding stage of the standard ASKAP data reduction pipeline, combined across all observations included in CASDA, corresponding to the catalogue types described in 2.2 – Catalogues. These are under the “casda” folder, and include:
- casda.continuum_component
- casda.continuum_island
- casda.polarisation_component
- Science-team derived catalogues. These are derived catalogues uploaded to CASDA by science teams. The specific data contained in each of them will vary between catalogues, so referencing table metadata and project pages is recommended. Derived catalogues are organised into folders labelled with the respective project code.
- IVOA obscore metadata table. The table ivoa.obscore under “ivoa” contains metadata for all data products accessible with CASDA, including image cubes and visibilities.
The information contained in catalogue tables is not always the same, and the names of columns containing similar information are also often different in different tables. For example, in some derived catalogues, the RA and Dec in degrees are stored in columns named “ra” and “dec”. But in the observational catalogues RA and Dec are given in both degrees and as HMS/DMS strings: “ra_deg_cont”, “dec_deg_cont” & “ra_hms_cont”, “dec_dms_cont”. The column metadata tab (3) can be referenced for information about the columns contained in a table, including their names, data types, units, and descriptions.
Example: cone search of RACS DR1 gaussian components
SELECT TOP 1000 *
FROM AS110.racs_dr1_gaussians_galacticcut_v2021_08_v01
WHERE 1=CONTAINS(POINT('ICRS', ra, dec),
CIRCLE('ICRS', 8.2699400, -70.6625389, 0.25))
This query performs a cone search within the RACS DR1 Gaussian components (outside Galactic plane) table, with a radius of 0.25 degrees (15 arcminutes) around the position RA=8.2699400 degrees, Dec=-70.6625389 degrees. The components of this query are:
- SELECT TOP 1000 * – return all columns (*) and limit the returned table to the first 1000 rows (TOP 1000)
- FROM AS110.racs_dr1_gaussians_galacticcut_v2021_08_v01 – search the RACS DR1 Gaussian components (outside Galactic plane) table
- WHERE 1=CONTAINS(POINT('ICRS', ra, dec), CIRCLE('ICRS', 8.2699400, -70.6625389, 0.25)) – include only rows where the source position is within 0.25 degrees of the position RA=8.2699400 degrees, Dec=-70.6625389 degrees.
- POINT(…) – defines a point on the sky based on a specified reference frame ('ICRS') and an RA+Dec combination, here specified using ra and dec, which are the fields of the table respectively specifying the RA and Dec of each source. POINT must be given RA and Dec as values or fields in units of decimal degrees.
- CIRCLE(…) – defines a circular region on the sky based on a specified reference frame ('ICRS'), an RA+Dec combination for the centre of the region (here specified directly with the coordinates of the desired cone search), and a radius. CIRCLE must be given RA, Dec, and radius as values or fields in units of decimal degrees.
- CONTAINS(…) – takes in a point and a region and returns 1 if the point is contained within the region.
Example: Using obscore to plot the sky coverage of a survey
The ivoa.obscore table contains metadata for all data products in CASDA. This includes metadata of the observation from which the data was produced. Querying this table can be used to obtain and view the properties of a set of observations, the available information being listed in the Columns panel of the TAP Query window. This example will plot the sky coverage of all data in CASDA related to the EMU survey.
SELECT TOP 1000 *
FROM ivoa.obscore
WHERE(obs_collection LIKE '%EMU%'
AND dataproduct_subtype = 'cont.restored.t0'
AND pol_states = '/I/')
- SELECT TOP 1000 * – return all columns (*) and limit the returned table to the first 1000 rows (TOP 1000)
- FROM ivoa.obscore – query the obscore metadata table
- WHERE(…) – apply all of the following conditions:
- obs_collection LIKE '%EMU%' – include only rows where the obs_collection column value contains the substring “EMU” ('%' acts as a wildcard)
- dataproduct_subtype = 'cont.restored.t0' – include only rows corresponding to primary images
- pol_states = '/I/' – include only rows corresponding to Stokes I data
Once the resulting table from this query is loaded into the TOPCAT main window, select it and click the Sky Plot button in the top row:
This will open a new Sky Plot window, by default plotting the central RA and Dec of each of the observations in the table on an interactive sky projection (click and drag to navigate):
To plot the coverage of each observation as an area, click the “Add a new area plot control” button (indicated on the above screenshot). In the new layer added to the plotting stack, select the EMU observation table from the Table dropdown. By default, the s_region column will be selected as the plotted area, which represents the sky region contained in each image in the survey:
4.2. Virtual Observatory Tools: Accessing images in the HiPS registry via Aladin
The VO includes the Hierarchical Progressive Surveys (HiPS) service, which allows for seamless access and visualisation of large images and cubes, such as those from large surveys, at wide ranges of spatial resolutions. CASDA has made the low- and mid-band RACS all-sky images available via HiPS, which can be viewed using Aladin. This section describes how to view ASKAP surveys in Aladin, and how Aladin can be used to examine data across multiple wavelength domains.
4.2.1. View all-sky ASKAP images in Aladin
To see what ASKAP data is available in Aladin, type “ASKAP” in the search box below the “Available data” on the left side of the main Aladin window:
In the screenshot above, the two HiPS images available are the RACS Mid and Low Stokes I surveys. Double click on the desired survey to view it in the Aladin viewer. The image can be navigated by clicking and dragging in the viewer, and zoomed in/out by scrolling up/down. You can also navigate directly to a particular position or object by typing its coordinates or name in the “Command” field above the viewer and hitting “Enter”.
4.2.2. View multiwavelength data: plot ASKAP data as contours over an image
Aladin can open multiple HiPS datasets at once as separate layers in the viewer. This can be applied to examine the sky in multiple wavelength regimes at once, for example by plotting radio data as contours over an optical image.
To do this, open an optical dataset by double-clicking it in the “Available data” list, then open a radio dataset in the same way. These will both appear in the “planes” list on the right side of the main Aladin window:
The checkbox selected indicates which dataset is currently shown as an image in the viewer. To create contours from the radio dataset, select it in the list, then select Overlay > contour plot from the top menu. This will display the “Contours plotting menu”, where the parameters of the contours can be configured:
Clicking “Get Contours” will generate and plot the contours as a new plane. Selecting the optical image plane for display will plot the contours over the optical data:
The process can be repeated with another radio dataset and contours plotted in another colour. For example, plotting RACS Mid contours in blue together with the above example which plots RACS Low contours in red:
To configure the parameters of the contours after they have been generated, right-click the contours plane in the list and select “Properties…” to bring up the contours properties window:
4.2.3. Overlay catalogue sources
Source catalogues are also accessible within Aladin. For example, RACS Gaussian component catalogues are available, and can be plotted over images by double-clicking them in the Available data list:
Selecting a source by clicking it will display its properties in a table below the viewer.
4.2.4. Broadcast catalogue sources from TOPCAT to overlay on images in Aladin
TOPCAT is able to broadcast the contents of a table (e.g. resulting from a cone search or TAP Query) to Aladin, where they can be plotted over images. To do this, make sure Aladin is open, select the desired table in the main TOPCAT window, and click the “Transmit” button in the top row. The selected table will appear as a new plane in Aladin.
4.3. Virtual Observatory Tools: Scripted/automated access using Astroquery
Astroquery is a Python package that provides tools for querying astronomical web forms and databases and includes a module specifically for accessing CASDA. This module replicates the search functionalities of the DAP web form, and further allows for targeted queries and filtering of specific catalogues with the use of the Astronomical Data Query Language (ADQL). The main features of the CASDA Astroquery module are:
- Cone search
- Downloading data products
- Generating image cutouts
- Querying within specific catalogues, including observational and derived catalogues
Using Astroquery is a useful solution for automating queries to CASDA. For example, it can be used to generate cutouts of many different positions within an image automatically, or to automate queries to catalogues within a larger piece of software.
4.3.1. Installing Astroquery
Instructions to install Astroquery are available in the Astroquery documentation. Note that Astroquery operates with a continuous deployment model: changes are uploaded to PyPI and become available as soon as they are implemented by the developers. These changes are not automatically pushed to conda, which instead receives regular tagged versions, so the versions available via pip and conda may differ and have different functionality. This user guide assumes the latest version available via pip, and so we recommend installing Astroquery with pip.
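For example, the latest release can be installed or upgraded with:

python -m pip install --upgrade astroquery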
4.3.2. Authentication
As with the DAP observation search, CASDA catalogues are publicly accessible with Astroquery, but to download images user authentication is required. Authentication is done with an OPAL account:
from astroquery.casda import Casda
OPAL_USER = "..."  # set to your OPAL login username
casda = Casda()
casda.login(username=OPAL_USER, store_password=True)
This will prompt the user for their OPAL password. Setting store_password=True in the casda.login() call will save the password, and subsequent logins will not require re-entry of the password.
4.3.3. Cone search with query_region()
The simplest way to query CASDA with a cone search is using the Casda.query_region() function. This will return an astropy Table with the metadata of all image cubes and visibility measurement sets in CASDA overlapping with a specified region. The basic usage is described below, and full details are available in the Astroquery documentation.

query_region() takes in a sky region to query, either circular (by providing a radius) or rectangular (by providing a height and width).
query_region(coordinates, radius=radius)
query_region(coordinates, height=height, width=width)
coordinates – centre of the region to query [str or astropy.coordinates]
radius – radius of the cone search [str or astropy.units.Quantity, optional]
height – height of a box region [str or astropy.units.Quantity, optional]
width – width of a box region [str or astropy.units.Quantity, optional]
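For example, a rectangular region can be requested by providing a height and width instead of a radius (a minimal sketch; the position below is the same illustrative coordinate used in the cone search example later in this section):

from astroquery.casda import Casda

result_table = Casda.query_region("22h15m38.020s -45d51m00.410s", height="30 arcmin", width="1 deg")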
The returned table has the following columns:
column | unit | description
---|---|---
distance | deg | The angular distance of the centre of the parent image from the requested position
dataproduct_type | None | Logical data product type from the IVOA controlled list. Catalogues will be null but described in the dataproduct_subtype field.
calib_level | None | Calibration level {0, 1, 2, 3}
obs_collection | None | Name of the data collection
obs_id | None | Observation ID
obs_publisher_did | None | Dataset identifier given by the publisher
access_url | None | URL used to access (download) dataset
access_format | None | File content format
access_estsize | kbyte | Estimated size of dataset in kilobytes
target_name | None | Astronomical object observed, if any
s_ra | deg | Central right ascension, ICRS
s_dec | deg | Central declination, ICRS
s_fov | deg | Diameter (bounds) of the covered region
s_region | None | Region covered as specified in STC or ADQL
s_resolution | arcsec | Spatial resolution of data as FWHM
s_xel1 | None | Number of elements along the first coordinate of the spatial axis
s_xel2 | None | Number of elements along the second coordinate of the spatial axis
t_min | d | Start time in MJD
t_max | d | Stop time in MJD
t_exptime | s | Total exposure time
t_resolution | s | Temporal resolution FWHM
t_xel | None | Number of elements along the time axis
em_min | m | Start in spectral coordinates
em_max | m | Stop in spectral coordinates
em_res_power | None | Spectral resolving power
em_xel | None | Number of elements along the spectral axis
o_ucd | None | UCD of observable (e.g. phot.flux.density)
pol_states | None | List of polarization states or NULL if not applicable
pol_xel | None | Number of polarization samples
facility_name | None | Name of the facility used for this observation
instrument_name | None | Name of the instrument used for this observation
dataproduct_subtype | None | Further description of the type of data product, including where the dataproduct_type is null.
em_ucd | None | Nature of the spectral axis
em_unit | None | Units along the spectral axis
em_resolution | m | Value of resolution along the spectral axis
s_resolution_min | arcsec | Resolution min value on spatial axis (FWHM of PSF)
s_resolution_max | arcsec | Resolution max value on spatial axis
s_ucd | None | UCD for the nature of the spatial axis (pos or u,v data)
s_unit | None | Unit used for spatial axis
obs_release_date | None | The date that this data product was released
quality_level | None | Indicator of quality level, updated by validators
thumbnail_id | None | The id of the thumbnail or preview image for this data product
filename | None | The original filename of the dataset
Example: Cone search with coordinates
from astroquery.casda import Casda
result_table = Casda.query_region("22h15m38.020s -45d51m00.410s", radius="30 arcmin")
public_data = Casda.filter_out_unreleased(result_table)
Because the metadata for unreleased data products is publicly available, these will be included in the returned table. Casda.filter_out_unreleased() filters out these items so that only data that are released, i.e. publicly available for download, are included (this step is not necessary if your user account has access to a project’s unreleased data). The table contents can be previewed by printing the table:
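For example (assuming the public_data table from the code above):
print(public_data)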
The table is initially sorted by the first column, the angular distance from the image centre to the requested position. The table’s columns can be used to filter the results further. For example, to filter for only primary images:
mask = (public_data['dataproduct_subtype']=='cont.restored.t0')
Multiple filters can be combined with Python Boolean operators. For example, to filter for only primary images in Stokes I from VAST marked with a “good” quality:
mask = (public_data['dataproduct_subtype']=='cont.restored.t0') \
& (public_data['pol_states']=='/I/') \
& (public_data['obs_collection']=='VAST') \
& (public_data['quality_level']=='GOOD')
Rows in the table can also be directly indexed:
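For example, a minimal sketch (assuming the public_data table from above; the indices are arbitrary):
first_row = public_data[0]      # a single row of the table
first_three = public_data[0:3]  # a new table containing the first three rows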
Downloading data requires authentication:
Casda.login(username='your_OPAL_username')
Once you have constructed/filtered a table with the desired data, the data must be staged (i.e. requested for download). This is done with Casda.stage_data(), which returns a list of URLs. Passing this list to Casda.download_files() will perform the download and return a list of the downloaded filenames:
to_download = public_data[mask]
url_list = Casda.stage_data(to_download)
file_list = Casda.download_files(url_list, savedir='.')
The savedir keyword can be used to specify the directory in which to download the files. Its use is recommended, as the default download directory is an astropy cache directory. The returned file_list is a list of strings of the full paths to each of the downloaded files.
Each data product downloaded will be accompanied by a checksum file, which can be used to verify the validity of the download (see 3.1.3 for instructions).
Example: Using SkyCoord to search via object name
Astropy’s coordinates module provides the SkyCoord class, which can be used to obtain the coordinates of an object via its name using the Sesame name resolver. For example, to perform a cone search centred on NGC 7232:
from astroquery.casda import Casda
from astropy.coordinates import SkyCoord
from astropy import units as u
centre = SkyCoord.from_name('NGC 7232')
result_table = Casda.query_region(centre, radius=30*u.arcmin)
public_data = Casda.filter_out_unreleased(result_table)
4.3.4. Generate image cutout
CASDA’s image cube cutout generation service is accessible via the Casda.cutout() function, which takes a table containing metadata of one or more image cubes and the parameters for the cutout in the spatial and/or spectral domain(s). The documentation provides the full specification of the function and its arguments, and the following examples demonstrate the functionality.
If a cutout fails, the cutout generation service returns a text file containing an error message. The most likely cause of this is that the parameters of the cutout include spatial or spectral regions outside the limits of the image cube.
Example: multiple spatial cutouts from a cone search
This example will obtain a table of image cubes using the cone search and filter given in the first example in the previous section, generate cutouts of the cone search region, then download the cutouts. First, performing the cone search and filtering the table:
from astroquery.casda import Casda
result_table = Casda.query_region("22h15m38.020s -45d51m00.410s", radius="30 arcmin")
public_data = Casda.filter_out_unreleased(result_table)
mask = (public_data['dataproduct_subtype']=='cont.restored.t0') \
& (public_data['pol_states']=='/I/') \
& (public_data['obs_collection']=='VAST') \
& (public_data['quality_level']=='GOOD')
to_cutout = public_data[mask]
Here, mask filters the initial cone search result for primary images in Stokes I from VAST with a “good” quality level.
Cutout generation requires authentication:
Casda.login(username='your_OPAL_username')
Now we request the cutouts from CASDA’s cutout generation service:
url_list = Casda.cutout(to_cutout, coordinates="22h15m38.020s -45d51m00.410s", radius="30 arcmin")
There is no need to stage the data, as the cutout request directly returns a list of URLs from which the cutouts can be downloaded. Now we download the cutouts (and their checksums):
file_list = Casda.download_files(url_list, savedir='.')
Example: spatial and spectral cutout
This example will make a spatial and spectral cutout around NGC 1371 from the WALLABY Early Science user derived cube from For et al. 2021 (CASDA project code AS035, CASDA collection DOI https://doi.org/10.25919/kqrt-pv24).
First, getting the table containing metadata for the image cube:
from astropy import units as u
from astropy.coordinates import SkyCoord
import numpy as np
from astroquery.casda import Casda
centre = SkyCoord.from_name('NGC 1371')
result = Casda.query_region(centre, radius=1.5*u.arcmin)
eridanus_cube = result[result['filename'] == 'Eridanus_full_image_V3.fits']
To make a cutout in the spectral domain, we need a list containing the lower and upper bounds of the desired frequency range. This can be constructed in several ways:
- Directly, e.g.:
freq = [1411*u.MHz, 1416*u.MHz]
- Via radial velocity measured relative to a specific spectral line, e.g. using HI:
vel = np.array([1000, 1850])*u.km/u.s
freq = vel.to(u.Hz, equivalencies=u.doppler_radio(1.420405751786*u.GHz))
- Via redshift, e.g. using HI as the rest frequency:
z = np.array([0.02, 0.05])*u.dimensionless_unscaled
vel = z.to(u.km/u.s, equivalencies=u.doppler_redshift())
freq = vel.to(u.Hz, equivalencies=u.doppler_radio(1.420405751786*u.GHz))
- By channel index, e.g.:
chans = [2000, 5000]
The spectral bounds are then provided to Casda.cutout via the band keyword argument (if defining with a frequency/velocity/redshift range) or the channel keyword argument (if defining by channel indices). For any method of defining a spectral range, None can be provided for either bound to make it open (i.e. continue all the way to the lower/upper bound of the data itself).
Continuing from the code block above, and defining a spectral cutout via radial velocity:
Casda.login(username='OPAL_USERNAME')
# set velocity range relative to HI
vel = np.array([1000, 1850])*u.km/u.s
freq = vel.to(u.Hz, equivalencies=u.doppler_radio(1.420405751786*u.GHz))
# make cutout using radius and freq params
url_list = Casda.cutout(eridanus_cube, coordinates=centre, radius=10*u.arcmin, band=freq)
# download
filelist = Casda.download_files(url_list, savedir='.')
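As a variant (a minimal sketch, assuming the same eridanus_cube table and login from above), the spectral range can instead be given by channel index, with None leaving the upper bound open:
# cutout from channel 2000 to the end of the spectral axis
url_list = Casda.cutout(eridanus_cube, coordinates=centre, radius=10*u.arcmin, channel=[2000, None])
file_list = Casda.download_files(url_list, savedir='.')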
4.3.5. Query catalogues and tables with TAP and ADQL
The Astroquery CASDA module can be used to make Table Access Protocol (TAP) queries of CASDA catalogues and metadata tables using the Astronomical Data Query Language (ADQL). For more details on ADQL and for generic instructions on constructing an ADQL query, see 4.1.2. Catalogue search via Table Access Protocol (TAP) Query. There you will also find descriptions of the available CASDA tables. TOPCAT can also be used to view the available tables and their respective metadata to facilitate their access via Astroquery.
To perform a TAP Query with Astroquery, a connection to CASDA’s TAP service must first be established:
from astroquery.casda import Casda
from astroquery.utils.tap.core import TapPlus
casdatap = TapPlus(url="https://casda.csiro.au/casda_vo_tools/tap")
casdatap can then be used to launch TAP queries and obtain the returned tables. This is done by passing an ADQL query as a string to the launch_job_async function:
query = {ADQL Query}
job = casdatap.launch_job_async(query)
res = job.get_results()
res is then a Table containing the results of the submitted query.
All the examples given in 4.1.2. Catalogue search via Table Access Protocol (TAP) Query are applicable to making TAP queries using Astroquery. A few more that are particularly suited to the Astroquery context are given below.
The TAP query functionality can be combined with the examples and features described in previous sections, including obtaining sky positions via object name with SkyCoord, applying masks to returned tables, generating image cutouts using a table of image metadata, and downloading data. The TAP query is simply a method of obtaining a table with rows meeting certain criteria.
Example: Search for images containing a point on the sky meeting a set of filter conditions
This query returns primary observational images in Stokes I from VAST with a “GOOD” quality level containing the specified sky position. This replicates the mask given as an example in 4.3.3, but could be considered more direct or efficient, as only the entries matching the criteria are returned, rather than returning all images and then filtering them down.
ra, dec = 8.2699400, -70.6625389
query = "SELECT * "\
"FROM ivoa.obscore "\
f"WHERE (1=CONTAINS(POINT('ICRS', {ra}, {dec}), s_region) "\
" AND (dataproduct_subtype = 'cont.restored.t0') "\
" AND (pol_states = '/I/') "\
" AND (obs_collection = 'VAST') "\
" AND (quality_level = 'GOOD'))"
Note the use of an f-string and string concatenation over multiple lines with \ to improve the readability of the query.
Example: Search for RACS DR1 images containing a position, selecting images by filename
ra, dec = 8.2699400, -70.6625389
query = "SELECT * "\
"FROM ivoa.obscore "\
"WHERE (filename LIKE 'RACS-DR1%' "\
" AND filename LIKE '%A.fits' "\
f" AND 1=CONTAINS(POINT('ICRS', {ra}, {dec}), s_region))"
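Either query can then be executed using the casdatap connection established above; a minimal sketch:
job = casdatap.launch_job_async(query)
images = job.get_results()
The resulting table can then be used with the staging, download, or cutout steps shown in the previous sections.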
4.4. VOEvent URL Endpoint
CASDA provides a lightweight URL endpoint for users to discover CASDA-related events. The endpoint provides a list of VOEvent XML files describing when data is deposited, validated, released, and updated in CASDA, and it can be filtered by event time, event type, project code, and scheduling block ID. Users can use it to identify when their project has new data or when an SBID has been deposited into CASDA, and machines can query it to detect new deposits and automatically start a workflow.
The VOEvent mechanism, including the format and recommended methods for programmatically making queries, is documented here.
Possible events are:
- DEPOSITED. Data for the observation was first deposited in CASDA.
- VALIDATED. The observation data was validated by the science team.
- RELEASED. The observation data was released by the observatory.
- UPDATED. Data was added to the observation, either for a new or existing project.
The endpoint is used by constructing a full URL from the endpoint base and a set of filters. The available filters are listed in the table below:
Filter | Description | Example |
---|---|---|
project | Select by CASDA project code | project=AS108 |
sbid | Select by scheduling block ID | sbid=10944 |
telescope | Select by observing telescope | telescope=ASKAP |
from | Earliest date and time to include (ISO 8601 format, UTC time) | from=2021-01-26T12:00:00.000Z |
to | Latest date and time to include (ISO 8601 format, UTC time) | to=2023-12-31T23:59:59.999Z |
event | Type of event [DEPOSITED, UPDATED, VALIDATED, REJECTED or RELEASED] | event=RELEASED |
maxrec | Include to limit the number of returned results | maxrec=10 |
Queries are constructed as:
https://casda.csiro.au/casda_data_access/observations/events?[filter]&[filter]&...
As many or as few filters may be applied as necessary. The URL endpoint can be accessed via any web browser or by downloading the VOEvent XMLs directly. Accessing via browser can be a useful way to quickly check what data has been released or deposited recently.
Some examples of making filtered queries follow.
Example: Query for all events for project code AS101
https://casda.csiro.au/casda_data_access/observations/events?project=AS101
Example: Query for all events for project code AS101 between July 31, 2019, 12pm and January 1, 2020, 12am UTC time
https://casda.csiro.au/casda_data_access/observations/events?from=2019-07-31T12:00:00.000Z&to=2020-01-01T00:00:00.000Z&project=AS101
Example: Query for RELEASED events for SBID 9325
https://casda.csiro.au/casda_data_access/observations/events?event=RELEASED&sbid=9325
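Example: Programmatic query from Python
A rough sketch of programmatic access (the requests library and the filter values here are illustrative assumptions; the format of the returned VOEvent list is described in the VOEvent documentation linked above):
import requests
# fetch RELEASED events for project AS101
url = "https://casda.csiro.au/casda_data_access/observations/events"
params = {"project": "AS101", "event": "RELEASED"}
response = requests.get(url, params=params)
response.raise_for_status()
# the response lists VOEvent XML entries; parse according to the VOEvent documentation
print(response.text)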
5. Admin/Science team CASDA tools
5.1. Validate data
When ASKAP observational data is deposited into CASDA it is unreleased but available to members of the respective science team. The expectation is that survey science teams, with the guidance of the ASKAP observatory team, will perform an assessment of the quality of the data. This is called data validation. Data will then be released or rejected by the observatory staff (CASDA Administrator) depending on the quality of the data.
Validation is not intended to be an exhaustive assessment of the data, rather a set of metrics used to determine if data products are fit enough for purpose for public release. The question being asked is not “is the data perfect?” but rather “is the data usable?”.
Science teams should validate each scheduling block’s data as it comes in, not wait for an entire survey to be completed, and aim to meet the full survey requirement of a two-week maximum turnaround from data deposit to validation. Implementation of automated or semi-automated pipelines that provide metrics to quickly validate data is encouraged.
The basic steps of validation, per project and SBID, are:
- Set Quality Flags, e.g. Strong RFI, Baselines flagged, Beams flagged, etc. “No quality indicators required” is also an option.
- Set a quality level (“Bad”, “Uncertain”, or “Good”), either applied globally or applied per data product.
- Add validation notes describing the data. These do not have to be overly detailed, but should contain any known issues with the data and have enough information for the user to determine whether they should use that data.
Some examples of suitable validation notes:
- SBID: 10083 (EMU released dataset)
These data are EMU pilot survey observations taken with ASKAP in preparation for the full EMU survey. Although the image is scientifically valuable, ASKAP is still being commissioned and the data are not yet at the quality expected from the full EMU survey. Therefore, these data have been released with the validation flag set to “uncertain” — meaning user discretion is necessary in the interpretation for science purposes. Additionally, please note that the quantitative validation metrics associated with these data are still being developed and should not be used as a representative assessment of the data quality. Further details will be available on askap.pbworks.com/PilotSurvey as we fully evaluate these data. The ASKAP operations and EMU data validation teams would be glad to receive feedback on any issues discovered. Known issues include (a) artefacts exist near bright sources of a few hundred mJy and above. (b) the cataloged spectral indices for weak sources are too steep and the errors on several catalog parameters are underestimated. (c) approximately 20% of the coarse cube channels are heavily or entirely flagged due to a known correlator error. (d) the restoring beam varies both spatially and spectrally from the beam given in the image header.
- SBID: 10626 (WALLABY released dataset)
These are pilot survey observations. There are known issues in the data cubes (e.g. continuum residuals, imaging artifacts) that can be mitigated if/when these observations are re-processed using the newer flagging and uv-based continuum subtraction tools in ASKAPsoft.
- SBID: 11269 (VAST rejected dataset)
These data are part of a pilot survey for an ASKAP Survey for Variables and Slow Transients (VAST). This particular observation had issues associated with the bandpass that severely affected imaging. This observation is not reliable and it is not recommended for science.
Validators can see what data is available for them to validate via the “Tasks” page of the DAP, accessible in the top bar or user drop-down menu:
Once data has been validated, CASDA administrators are responsible for its release. The CASDA administrator will check the validation quality flags, quality level and notes for consistency. In general, the CASDA administrator will release all data which is marked “Good” or “Uncertain” and reject all data which is “Bad”. Occasionally, “Bad” data may still be released if it is of use to the astronomical community, but with the appropriate note for users to use with caution.
5.2. Managing science team roles and assignments
Project administrators can manage the permissions and roles of users that are part of their science projects. This is done in the DAP under Select project > Set roles in the user drop-down menu. There are three roles that can be assigned to users:
- Access Project Data. Enables a user to access project data, including unreleased data.
- Project Validator. Allows a user to perform validation of project data, as well as access all project data, including unreleased data.
- Project Administrator. Has all the permissions of a Project Validator and can access all project data, as well as being able to see and edit science team lists.
5.3. Deposit Level 7 data products
Science teams may deposit Level 7 “derived” data products that have been further processed than the standard ASKAPsoft pipeline outputs (Level 6 “observational” data products) to make them accessible via CASDA.
To be able to deposit data, your DAP user account must be given depositor access. For more information on receiving these permissions, please contact Minh Huynh.
5.3.1. Accepted data products and requirements
The following table describes the accepted data products, and their respective requirements.
Data type | Requirements |
---|---|
Image Cubes | FITS format. FITS header must include correct and readable WCS information. |
Cubelets and moment maps | FITS format. Included in a separate tab from image cubes. |
Catalogues | Must conform to the VOTABLE v1.3 format. All fields must be named; field names must not be a reserved SQL or ADQL word and may contain alphanumeric characters and underscores only. Unified Content Descriptors (UCDs) must be provided for a minimum of three fields: meta.id;meta.main (main name/ID field), pos.eq.ra;meta.main (main RA field) and pos.eq.dec;meta.main (main Dec field). Providing UCDs and descriptions for each field is encouraged but not required; where provided, UCDs must be valid and descriptions must be 255 characters or less. All fields should have correct units, data types, and precisions. Three additional parameters must be provided: Catalogue Name (the name used to query the catalogue with Table Access Protocol (TAP)), Indexed Fields (which fields should have an index), and Principal Fields (which fields should be recommended for returning in each query). |
We strongly recommend reference and adherence to the International Astronomical Union’s conventions for naming astronomical sources and Best Practices for Data Publication in the Astronomical Literature (Chen et al., 2021).
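As a rough illustration of meeting the catalogue requirements, the sketch below builds a small VOTable with astropy and attaches the three mandatory UCDs. The column names and values are hypothetical, and the VOTable version handling should be checked against your astropy version:
from astropy import units as u
from astropy.io.votable import from_table, writeto
from astropy.table import Table
# hypothetical catalogue with the three mandatory fields
cat = Table({
    "source_name": ["J221538-455100"],
    "ra_deg": [333.908],
    "dec_deg": [-45.850],
})
cat["ra_deg"].unit = u.deg
cat["dec_deg"].unit = u.deg
votable = from_table(cat)
fields = votable.get_first_table().fields
# attach the minimum required UCDs (order matches the columns above)
for field, ucd in zip(fields, ["meta.id;meta.main", "pos.eq.ra;meta.main", "pos.eq.dec;meta.main"]):
    field.ucd = ucd
# CASDA requires VOTable v1.3; set it explicitly in case your astropy default differs
votable.version = "1.3"
writeto(votable, "my_catalogue.votable.xml")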
5.3.2. Creating a new deposit
To create a new deposit, select “New Deposit” in the user drop-down menu on the DAP:
The next page provides information regarding the requirements for conducting a deposit to the DAP that should be read and understood before continuing.
You will also be prompted to select a method for performing the deposit. All data intended for CASDA should be deposited via the option “Upload CSIRO ASKAP Science Data Archive (CASDA) data, and manually create a record to describe it”.
The next page requires acknowledgement that CASDA data is required to become publicly accessible, and that you have clearance to deposit your data. This page also prompts for the Project Code to assign the data to.
The next page contains the full data deposit form, made up of four parts accessible by the menu on the left side of the page:
A draft of the deposit can be saved at any time with the “Save Draft” button, and a preview of the resulting collection viewed with “Preview”. When the form is completed, click “Publish” to go through with the deposit.
Description
This part is where metadata about the data being deposited is entered. Each field is described by an info block on its right. This part is made up of several sections:
- About this Collection. The key metadata fields, including collection title, description, relevant Schedblock IDs, credits, start & end dates, and keywords.
- Field(s) of Research. A set of codes describing the research field(s) the data is relevant to.
- Location Details. Used to specify if the data is relevant to a geographic location (typically irrelevant for astronomical data).
- Related Links. Gives the option to provide a set of links to associated resources.
- Share With Other Systems. Gives the option to share the deposited data’s metadata with external systems (typically irrelevant for astronomical data).
- Supporting Documentation. Optionally provide supporting documentation, e.g. a README, for your data. As CASDA places strict requirements on the type and format of the uploaded data products, this should typically not be necessary.
- More about this Collection. Allows for specification of a metadata schema. Data deposited to CASDA should select “VO Resource” from the dropdown (selected by default) and ensure a valid IVO Identifier is entered in the relevant field (one is generated automatically).
- Project Details. N/A to CASDA data (project code already given).
- Organisation Details. Information about the organisation through which the data was acquired. For CASDA data, no fields in this section should require editing.
- Collaborating Organisations. If another organisation is a collaborator and holds rights to the data, it should be entered here. Alternatively, the credit field (under “About this Collection”) can be used to acknowledge collaborators that are not rights holders.
- Funding Sources. Provide information about sources of funding for the project.
- Activity Details. N/A to CASDA data.
Citation
This part is where information required to cite the deposited data must be entered, including the lead researcher, other contributors, and year of publication. These fields will already be filled out by default, with the Principal Investigator of the provided project code entered as the lead researcher and all other users assigned to that project code as contributors.
Files
All data in CASDA is stored at the Pawsey Supercomputing Centre. In order to deposit data into CASDA, it must first be uploaded to a directory on a Pawsey machine. The fields in the “Files” section of the deposit form take paths to directories containing the files to be deposited.
It is strongly recommended that a unique directory be created for each new deposit, and that each data product type be further separated into sub-directories. For example, data to be deposited should be organised into directories such as “new_deposit/catalogues/”, “new_deposit/images/”, and so on.
Fill in at least one of these fields with a valid directory and click “Resolve” to scan the entered directories and return a list of files contained within them. These are the files that will be included in the new deposit.
Access
This part specifies the accessibility of the deposited data. As mentioned previously, all data in CASDA must become publicly available. If an embargo period is required, its end date can be entered here. Contact information for the deposited data can also be optionally provided, and a Creative Commons Attribution 4.0 International Licence must be selected as the End-User Licence.
Appendix: Example Astroquery Python scripts
Below are some example Python scripts applying Astroquery functionality to some common use cases.
Get a cutout from RACS by coordinates
"""Script to get a cutout of RACS-DR1 images of a given coordinate.
"""
import sys
from astropy import units as u
from astropy.coordinates import SkyCoord
from astroquery.casda import Casda
from astroquery.utils.tap.core import TapPlus
# set these to your own values
OPAL_USER = "danica.scott@csiro.au"
SAVEDIR = "/Users/sco320/sandbox"
# validate args
if len(sys.argv) != 4:
print("Need 3 args")
print(f"Usage: python {sys.argv[0]} ra_deg dec_deg radius_arcmin")
sys.exit(1)
# get args
centre = SkyCoord(sys.argv[1], sys.argv[2], unit=(u.deg, u.deg))
ra = centre.ra.degree
dec = centre.dec.degree
radius = float(sys.argv[3]) * u.arcmin
# construct query for catalogued sources in RACS-DR1
# ivoa.obscore is the table name
# filters for RACS-DR1 fits images containing sources overlapping with
# the given coordinates
query = "select * from ivoa.obscore "\
"where filename LIKE 'RACS-DR1%' "\
"AND filename LIKE '%A.fits' "\
f"AND 1 = CONTAINS(POINT('ICRS',{ra},{dec}),s_region)"
# open connection to TAP service and run query
casdatap = TapPlus(url="https://casda.csiro.au/casda_vo_tools/tap")
job = casdatap.launch_job_async(query)
table = job.get_results()
print(table)
# login to CASDA to be able to download images
casda = Casda()
casda.login(username=OPAL_USER, store_password=True)
# request a cutout of the images returned by the query and download
url_list = casda.cutout(table, coordinates=centre, radius=radius)
file_list = casda.download_files(url_list, savedir=SAVEDIR)
print("Downloaded:")
for file in file_list:
print(f" {file}")
Get a cutout from RACS by object name
"""Script to get a cutout of RACS-DR1 images containing a given object.
"""
import sys
from astropy import units as u
from astropy.coordinates import SkyCoord
from astroquery.casda import Casda
from astroquery.utils.tap.core import TapPlus
# set these to your own values
OPAL_USER = "danica.scott@csiro.au"
SAVEDIR = "/Users/sco320/sandbox"
# validate args
if len(sys.argv) != 3:
print("Need 2 args")
print(f"Usage: python {sys.argv[0]} obj_name radius_arcmin")
sys.exit(1)
# get args
obj_name = sys.argv[1]
radius = float(sys.argv[2]) * u.arcmin
# lookup object coords from name
centre = SkyCoord.from_name(obj_name)
ra = centre.ra.degree
dec = centre.dec.degree
# construct query for RACS-DR1 images containing the given object
# ivoa.obscore is the table name
# filters for RACS-DR1 fits images containing sources overlapping with
# the given coordinates
query = "select * from ivoa.obscore "\
"where filename LIKE 'RACS-DR1%' "\
"AND filename LIKE '%A.fits' "\
f"AND 1 = CONTAINS(POINT('ICRS',{ra},{dec}),s_region)"
# open connection to TAP service and run query
casdatap = TapPlus(url="https://casda.csiro.au/casda_vo_tools/tap")
job = casdatap.launch_job_async(query)
table = job.get_results()
print(table)
# login to CASDA to be able to download images
casda = Casda()
casda.login(username=OPAL_USER, store_password=True)
# request a cutout of the images returned by the query and download
url_list = casda.cutout(table, coordinates=centre, radius=radius)
file_list = casda.download_files(url_list, savedir=SAVEDIR)
print("Downloaded:")
for file in file_list:
print(f" {file}")
Search for data containing an object and download images from a specific scheduling block (schedblock/SBID)
import sys
from astropy import units as u
from astropy.coordinates import SkyCoord
from astroquery.casda import Casda
from astroquery.utils.tap.core import TapPlus
# set these to your own values
OPAL_USER = "danica.scott@csiro.au"
SAVEDIR = "/Users/sco320/sandbox"
# validate args
if len(sys.argv) != 3:
print("Need 2 args")
print(f"Usage: python {sys.argv[0]} obj_name radius_arcmin")
sys.exit(1)
# get args
obj_name = sys.argv[1]
radius = float(sys.argv[2]) * u.arcmin
# lookup object coords from name
centre = SkyCoord.from_name(obj_name)
print(centre)
# do the cone search
result_table = Casda.query_region(centre, radius=radius)
print(result_table["obs_id","s_ra","s_dec","dataproduct_subtype",
"obs_release_date"])
# example: get images from SB2338
# filter for public data
public_data = Casda.filter_out_unreleased(result_table)
print(public_data["obs_id","s_ra","s_dec","dataproduct_subtype",
"obs_release_date"])
# filter for data from SB2338 and filter out auxiliary products
subset = public_data[(public_data["obs_id"]=="2338")
&(public_data["dataproduct_subtype"]=="cont.restored.t0")]
print(subset["obs_id","s_ra","s_dec","dataproduct_subtype",
"obs_release_date", "filename"])
# login to CASDA to be able to download images
casda = Casda()
casda.login(username=OPAL_USER)
# stage and download full images - can be quite large!
url_list = casda.stage_data(subset)
file_list = casda.download_files(url_list, savedir=SAVEDIR)
print("Downloaded:")
for file in file_list:
print(f" {file}")
Document versions
Version | Date | Author | Notes |
---|---|---|---|
2.0 | 2024-05-31 | Danica Scott | New user guide to replace the old one |