Overview

psi-data-utils centers on a single module-level pooch fetcher, FETCHER, and a family of fetch_* helper functions. Each helper resolves the registry keys for the requested data, downloads any files that are not already cached, verifies them against the checksums recorded in the packaged registry, and returns their on-disk paths.

Because every download is checksum-verified and cached, a second call to a fetch_* function reuses the cached file instead of downloading it again. This makes psi-data-utils well suited to documentation examples, tutorials, and test fixtures that need real PSI model output without bundling large binary files.
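The check-then-reuse behaviour can be illustrated with the standard library alone. The sketch below is not the psi-data-utils or pooch implementation: the helper names sha256_of and needs_download are hypothetical, and SHA-256 is assumed as the checksum algorithm purely for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(path: Path, expected_sha256: str) -> bool:
    """True when the file is missing or its checksum does not match."""
    return not path.exists() or sha256_of(path) != expected_sha256


# Demonstration with a throwaway file standing in for a cached download.
with tempfile.TemporaryDirectory() as tmp:
    cached = Path(tmp) / "example.h5"
    missing_before = needs_download(cached, "0" * 64)  # no file yet
    cached.write_bytes(b"stand-in payload")
    recorded = sha256_of(cached)                       # plays the registry entry
    reused = not needs_download(cached, recorded)      # matching checksum
    stale = needs_download(cached, "0" * 64)           # mismatching checksum
```

Only the missing and mismatching cases would trigger a transfer; a file whose checksum matches the recorded value is returned as-is.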

All public objects are importable directly from the top-level package:

import psi_data

psi_data.fetch_mas_data         # MAS coronal / heliospheric MHD fields
psi_data.fetch_mas_quantities   # MAS quantities at the inner radial boundary
psi_data.fetch_pot3d_data       # POT3D potential-field components
psi_data.fetch_all              # every file in the registry
psi_data.clear_psi_cache        # remove the local download cache

Available Data

The bundled data come from a thermodynamic MAS standard run for Carrington rotation 2309, driven by an HMI photospheric magnetogram, together with a POT3D potential-field source-surface (PFSS) solution. Fields are defined on a structured spherical grid \((r, \theta, \varphi)\) and follow PSI’s HDF storage conventions.

The variable names accepted by each fetcher are summarized below. For the physical meaning, units, coordinate conventions, and mesh staggering of each quantity, refer to the psi-io overview, which documents the PSI model quantities in detail.

| Fetcher | Domain key(s) | Available variables |
| --- | --- | --- |
| fetch_mas_data() | cor, hel | br, bt, bp (\(\mathbf{B}\)); vr, vt, vp (\(\mathbf{v}\)); jr, jt, jp (\(\mathbf{J}\)); t, rho, p; and, for the coronal domain only, the wave/heating quantities ep, em, zp, zm, heat |
| fetch_pot3d_data() | pot3d | br, bt, bp (\(\mathbf{B}\)) |
| fetch_mas_quantities() | quantities | ch_pm |

The coronal (cor) domain spans the low corona out to the source surface, while the heliospheric (hel) domain extends from the source surface into the inner heliosphere. When variables is omitted for a multi-domain MAS request, only the variables common to all requested domains are fetched.
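The default selection for a multi-domain request amounts to a set intersection. The following is a minimal sketch of that rule, assuming the variable lists from the table above; MAS_VARIABLES and default_variables are hypothetical names and do not reflect how the package stores or computes this internally.

```python
# Variable names per MAS domain, copied from the table above.
COMMON = ["br", "bt", "bp", "vr", "vt", "vp", "jr", "jt", "jp", "t", "rho", "p"]
MAS_VARIABLES = {
    "cor": COMMON + ["ep", "em", "zp", "zm", "heat"],  # coronal-only extras
    "hel": list(COMMON),
}


def default_variables(domains):
    """Return the variables available in every requested domain."""
    first, *rest = domains
    shared = set(MAS_VARIABLES[first]).intersection(
        *(MAS_VARIABLES[name] for name in rest)
    )
    # Preserve the listing order of the first domain for readability.
    return [name for name in MAS_VARIABLES[first] if name in shared]


both = default_variables(["cor", "hel"])
coronal_only = default_variables(["cor"])
```

Under these assumptions, a request for both domains drops ep, em, zp, zm, and heat, while a request for cor alone keeps them.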

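In addition to the model fields, three standalone example files are provided for use in tutorials and tests, each with its own helper: fetch_example_fieldline(), fetch_example_radial_scale(), and fetch_example_chmapdb().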

Quick Start

Every fetcher accepts its target variables as a comma-separated string, as any iterable of names, or as None to select a sensible default set. The MAS and POT3D helpers return a Filepaths named tuple whose fields identify each file and whose values are the cached Path objects:

import psi_data

# A single domain / variable pair
paths = psi_data.fetch_mas_data(domains="cor", variables="br")
paths.cor_br                      # -> PosixPath('.../cor/mhd/br002.h5')

# Multiple domains and variables at once
paths = psi_data.fetch_mas_data(domains="cor,hel", variables="br,vr")
paths._fields                     # -> ('cor_br', 'cor_vr', 'hel_br', 'hel_vr')

# POT3D potential-field components (all three when variables is None)
pot3d = psi_data.fetch_pot3d_data(variables=None)
pot3d.br, pot3d.bt, pot3d.bp
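To make the argument handling and the field naming concrete, here is a rough sketch of how a comma-separated string, an iterable, or None could be normalized and how the domain_variable field names of a MAS result could be assembled. The helpers normalize_names and build_filepaths and the cache/<domain>/<variable>.h5 layout are hypothetical; they mirror only the behaviour shown in the example above, not the package's actual code or registry paths.

```python
from collections import namedtuple
from pathlib import Path


def normalize_names(names, default):
    """Accept a comma-separated string, an iterable of names, or None."""
    if names is None:
        return list(default)
    if isinstance(names, str):
        names = names.split(",")
    return [name.strip() for name in names if name.strip()]


def build_filepaths(domains, variables, root=Path("cache")):
    """Return a named tuple with one '<domain>_<variable>' field per file."""
    keys = [(d, v) for d in domains for v in variables]
    Filepaths = namedtuple("Filepaths", [f"{d}_{v}" for d, v in keys])
    # Hypothetical layout: the real registry paths are not reproduced here.
    return Filepaths(*(root / d / f"{v}.h5" for d, v in keys))


paths = build_filepaths(
    normalize_names("cor,hel", default=[]),
    normalize_names(["br", "vr"], default=[]),
)
```

Iterating domains in the outer loop and variables in the inner loop reproduces the field order shown for the two-domain request. POT3D results use bare variable names (br, bt, bp) instead, which this sketch does not cover.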

The single-file example helpers return a lone Path:

import psi_data

fieldline = psi_data.fetch_example_fieldline()
rscale = psi_data.fetch_example_radial_scale()

Note

The file served by fetch_example_chmapdb() is only available in HDF5. Calling the helper with hdf=4 emits a RegistryWarning and returns None rather than raising.

Selecting the HDF format

Every fetcher accepts an hdf keyword selecting the file format to download: 5 for HDF5 (.h5, the default) or 4 for HDF4 (.hdf). An unsupported value raises ValueError.

import psi_data

h5_paths = psi_data.fetch_mas_data(domains="cor", variables="br", hdf=5)
h4_paths = psi_data.fetch_mas_data(domains="cor", variables="br", hdf=4)

Note

The HDF format is also inferred from the file extension by downstream PSI tools such as psi-io (.hdf for HDF4, .h5 for HDF5), so the files returned by psi-data-utils can be passed straight through without specifying the format again.
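The relationship between the hdf keyword and the file extension can be sketched as a small lookup in both directions. This is an illustration only: HDF_EXTENSIONS, extension_for, and hdf_version_of are hypothetical names, and neither psi-data-utils nor psi-io is claimed to implement the mapping this way.

```python
from pathlib import Path

# Format number -> file extension, as described in this section.
HDF_EXTENSIONS = {4: ".hdf", 5: ".h5"}


def extension_for(hdf=5):
    """Return the extension for an hdf value; reject anything else."""
    try:
        return HDF_EXTENSIONS[hdf]
    except KeyError:
        raise ValueError(f"hdf must be 4 or 5, got {hdf!r}") from None


def hdf_version_of(path):
    """Infer the HDF version from a file name, as downstream tools might."""
    suffix = Path(path).suffix.lower()
    for version, extension in HDF_EXTENSIONS.items():
        if extension == suffix:
            return version
    raise ValueError(f"unrecognized HDF extension: {suffix!r}")


default_extension = extension_for()
inferred = hdf_version_of("cor/mhd/br002.h5")
```

Keeping a single table for both directions means a returned path always round-trips to the format that was requested.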

Caching and offline use

Downloaded files are stored in the platform-specific pooch cache, resolved by pooch.os_cache() under the psi project name (for example, ~/Library/Caches/psi on macOS or ~/.cache/psi on Linux). Set the PSI_DATA_CACHE environment variable to override this location:

export PSI_DATA_CACHE=/path/to/my/cache
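The precedence between the override and the default location can be sketched as follows. The function resolve_cache_dir is hypothetical, the default is passed in rather than computed (in practice it would come from pooch.os_cache("psi")), and treating an empty value as unset is an assumption of this sketch.

```python
import os
from pathlib import Path


def resolve_cache_dir(default, environ=None):
    """Prefer PSI_DATA_CACHE when it is set and non-empty, else the default."""
    environ = os.environ if environ is None else environ
    override = environ.get("PSI_DATA_CACHE")
    if override:
        return Path(override).expanduser()
    return Path(default)


# Plain dictionaries stand in for the process environment.
without_override = resolve_cache_dir("/home/user/.cache/psi", environ={})
with_override = resolve_cache_dir(
    "/home/user/.cache/psi",
    environ={"PSI_DATA_CACHE": "/path/to/my/cache"},
)
```

Passing the environment as an argument keeps the sketch easy to exercise without modifying the real process environment.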

To populate the cache up front (for example, before working offline), fetch the entire registry for a given format with fetch_all():

import psi_data

all_h5 = psi_data.fetch_all(hdf=5)   # download every HDF5 file

Warning

fetch_all() downloads the complete data collection for the chosen format and may transfer a large volume of data on its first invocation.

To reclaim disk space, clear the cache with clear_psi_cache(). By default it performs a dry run that only reports what would be removed. With dry_run=False it deletes the cached files, asking for confirmation first unless prompt=False is also passed. Both the default OS cache and the PSI_DATA_CACHE override are considered:

import psi_data

psi_data.clear_psi_cache()                            # dry run — nothing deleted
psi_data.clear_psi_cache(dry_run=False)               # delete, with confirmation
psi_data.clear_psi_cache(dry_run=False, prompt=False) # delete without prompting
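The interplay of the two flags can be sketched with a stand-in function operating on a temporary directory. This is not the implementation of clear_psi_cache(): the name clear_cache, its return value, and the injectable confirm callback are assumptions made so the behaviour can be shown without touching a real cache.

```python
import shutil
import tempfile
from pathlib import Path


def clear_cache(cache_dir, dry_run=True, prompt=True, confirm=input):
    """Return (files found, whether the directory was removed)."""
    cache_dir = Path(cache_dir)
    files = sorted(p for p in cache_dir.rglob("*") if p.is_file())
    if dry_run or not files:
        return files, False          # dry run: report only
    if prompt:
        answer = confirm(f"Delete {len(files)} file(s) in {cache_dir}? [y/N] ")
        if answer.strip().lower() != "y":
            return files, False      # user declined
    shutil.rmtree(cache_dir)
    return files, True


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "psi"
    (cache / "cor").mkdir(parents=True)
    (cache / "cor" / "br002.h5").write_bytes(b"")
    listed, removed_dry = clear_cache(cache)
    _, removed_declined = clear_cache(cache, dry_run=False, confirm=lambda _: "n")
    _, removed_forced = clear_cache(cache, dry_run=False, prompt=False)
    exists_after = cache.exists()
```

Making the dry run the default means an accidental call cannot delete anything; deletion requires an explicit dry_run=False.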

Development mode

Setting the DEVELOPMENT environment variable before importing psi_data redirects the fetcher to a local development server (http://localhost:8000) and selects the development registry (registry-dev.txt) instead of the published PSI asset host. This is intended for maintainers regenerating or testing the registry against a local mirror of the asset tree, and is not required for normal use.
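The switch described here can be pictured as a choice made once from the environment. In the sketch below, select_source is a hypothetical helper, the published host URL and registry file name are placeholders (neither is stated in this overview), and it is assumed that the mere presence of DEVELOPMENT, with any value, enables development mode.

```python
import os

# Values stated in this section.
DEV_BASE_URL = "http://localhost:8000"
DEV_REGISTRY = "registry-dev.txt"

# Hypothetical placeholders: the real published host and registry
# file name are not given in this overview.
PUBLISHED_BASE_URL = "https://assets.example.invalid"
PUBLISHED_REGISTRY = "registry.txt"


def select_source(environ=None):
    """Return (base URL, registry file name) for the current environment."""
    environ = os.environ if environ is None else environ
    if "DEVELOPMENT" in environ:  # assumption: any value counts as enabled
        return DEV_BASE_URL, DEV_REGISTRY
    return PUBLISHED_BASE_URL, PUBLISHED_REGISTRY


dev_source = select_source({"DEVELOPMENT": "1"})
normal_source = select_source({})
```

Because the choice is made at import time in the package, the variable has to be set before psi_data is imported for it to take effect.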