Overview#
psi-io is a Python package developed by Predictive Science Inc. for
interacting with the PSI HDF data ecosystem. It provides a unified interface for reading and writing
HDF4 and HDF5 files, as well as tools for reading portions of datasets and interpolating data to
arbitrary positions.
Quick Start#
The following example opens a MAS HDF5 file containing a radial magnetic field, reads the full 3-D dataset remeshed to the main (cell-center) grid in CGS units (Gauss), and returns the data array alongside its three spherical coordinate scales (\(r\), \(\theta\), \(\varphi\)).
import psi_io
with psi_io.PsiData('br002.h5') as mas_reader:
br, r_scale, t_scale, p_scale = mas_reader.read(mesh='main', unit='cgs')
PSI Conventions#
PSI follows certain conventions for storing data in HDF files. While these conventions are not strictly enforced, adhering to them ensures compatibility with PSI tools and libraries. Much of the scientific computing done through PSI relies on tools developed in Fortran. As such, the conventions used within PSI’s python ecosystem mirror those used in Fortran-based PSI tools. However, due to the fact that Python uses C-style (row-major) array ordering, while Fortran uses Fortran-style (column-major) ordering, there are some differences in how dimension scales are stored between HDF4 and HDF5 files.
To illustrate, a 3D dataset stored in Fortran order would have its dimensions ordered as (X, Y, Z) in Fortran. In Python, this same dataset would be represented as (Z, Y, X) due to the difference in array ordering conventions. For 3D datasets which are based in spherical coordinates – e.g. the “3D cubes” generated through MAS – datasets are represented in python as arrays with shape (phi, theta, r) and dimension scales r (radius), theta (colatitude), and phi (longitude).
Note
The difference in array ordering conventions between Fortran and Python can lead to confusion when working with dimension scales in HDF files. It is important to be aware of these conventions when reading or writing HDF files to ensure that data is interpreted correctly.
To further complicate things, HDF4 and HDF5 handle dimension scales differently. HDF5 supports Fortran ordering of dimension scales, while HDF4 does not i.e. the first dimension of the aforementioned 3D dataset would have its dimension scale corresponding to r in HDF5, but phi in HDF4. Below are the specific conventions used in HDF4 and HDF5 files within PSI.
HDF4 Conventions#
Datasets are stored in the root group (“/”) and named “Data-Set-2”.
Dimension scales are named “fakeDim0”, “fakeDim1”, …, “fakeDimN” for N-dimensional datasets.
Datasets are Fortran ordered (column-major). However, due to limitations in HDF4, the dimension scales are stored in C-order (row-major).
HDF5 Conventions#
Datasets are stored in the root group (“/”) and named “Data”.
Dimension scales are named “dim0”, “dim1”, …, “dimN” for N-dimensional datasets.
Datasets are Fortran ordered (column-major), and dimension scales are also stored in Fortran order.
psi-io Conventions#
To mitigate these idiosyncrasies, psi-io provides a unified interface for reading and writing HDF4 and HDF5 files
that abstracts away the differences in conventions. When data is read from an HDF (whether HDF4 or HDF5), psi-io
data is always returned in Fortran order with dimension scales correctly associated with their respective dimensions.
When writing data to an HDF file, psi-io ensures that the appropriate conventions for HDF4 or HDF5 are followed based
on the file extension.
When reading a 3D dataset (either HDF4 or HDF5) with psi-io, the returned array will have shape \((\phi, \theta, r)\) and
the associated dimension scales will be returned as \((r, \theta, \phi)\). This ensures that the data is always represented
in a consistent manner, regardless of the underlying HDF format.
PSI Model Quantities & Coordinate Scales#
PSI’s modeling codes — MAS and POT3D — write output as HDF files containing 3-D
physical fields defined on a structured spherical grid
\((r, \theta, \varphi)\). Each file also stores three 1-D coordinate arrays
that describe the grid. The tables below summarize every quantity that
psi-io recognises, together with its physical meaning and units.
Coordinate scales#
Every MAS and POT3D HDF file includes three 1-D coordinate arrays that define the spherical grid. These scale identifiers are shared across PSI codes:
Key |
Symbol |
Physical quantity |
Units |
Range |
NumPy axis |
|---|---|---|---|---|---|
|
\(r\) |
Radial distance |
solar radii |
\(r \geq 1\,R_\odot\) |
last ( |
|
\(\theta\) |
Co-latitude (pole = 0) |
radians |
\([0,\,\pi]\) |
middle ( |
|
\(\varphi\) |
Longitude |
radians |
\([0,\,2\pi]\) |
first ( |
Because PSI HDF files are Fortran-ordered (column-major), the r dimension
varies fastest in memory. When psi-io loads a 3-D dataset into NumPy
(row-major), the resulting array shape is \((N_\varphi,\,N_\theta,\,N_r)\):
r is the last axis, t is the middle axis, and p is the first axis.
The three returned scale arrays always correspond to those axes in that order.
MAS model quantities#
MAS outputs 19 3-D physical fields. Code-unit values are converted to physical (CGS) units by multiplying by MAS normalization constants; approximate physical scales are noted in the unit column.
Key |
Symbol |
Physical quantity |
Physical unit (CGS) |
Type |
Mesh code |
|---|---|---|---|---|---|
|
\(v_r\) |
Radial velocity |
km s−1 (≈ 481 km s−1 per code unit) |
vector |
|
|
\(v_\theta\) |
Co-latitude velocity |
km s−1 |
vector |
|
|
\(v_\varphi\) |
Longitude velocity |
km s−1 |
vector |
|
|
\(B_r\) |
Radial magnetic field |
Gauss (≈ 2.2 G per code unit) |
vector |
|
|
\(B_\theta\) |
Co-latitude magnetic field |
Gauss |
vector |
|
|
\(B_\varphi\) |
Longitude magnetic field |
Gauss |
vector |
|
|
\(J_r\) |
Radial current density |
A m−2 |
vector |
|
|
\(J_\theta\) |
Co-latitude current density |
A m−2 |
vector |
|
|
\(J_\varphi\) |
Longitude current density |
A m−2 |
vector |
|
|
\(T\) |
Single-fluid temperature |
MK (≈ 28 MK per code unit) |
scalar |
|
|
\(T_e\) |
Electron temperature |
MK |
scalar |
|
|
\(T_p\) |
Proton temperature |
MK |
scalar |
|
|
\(\rho\) |
Plasma density |
cm−3 (108 cm−3 per code unit) |
scalar |
|
|
\(p\) |
Plasma pressure |
erg cm−3 |
scalar |
|
|
\(e^+\) |
Forward Alfvén wave energy density |
erg cm−3 |
scalar |
|
|
\(e^-\) |
Backward Alfvén wave energy density |
erg cm−3 |
scalar |
|
|
\(z^+\) |
Outward Elsässer amplitude |
km s−1 |
scalar |
|
|
\(z^-\) |
Inward Elsässer amplitude |
km s−1 |
scalar |
|
|
\(Q\) |
Volumetric heating rate |
erg cm−3 s−1 |
scalar |
|
POT3D model quantities#
POT3D solves for the potential magnetic field \(\mathbf{B} = -\nabla\Psi\) driven by a photospheric boundary magnetogram. It outputs three spherical field components:
Key |
Symbol |
Physical quantity |
Physical unit |
Mesh code |
|---|---|---|---|---|
|
\(B_r\) |
Radial magnetic field |
input magnetogram units (typically G) |
|
|
\(B_\theta\) |
Co-latitude magnetic field |
input magnetogram units |
|
|
\(B_\varphi\) |
Longitude magnetic field |
input magnetogram units |
|
Note
POT3D mesh codes are the bitwise complement of the corresponding MAS magnetic field
codes (POT3D_mesh = 0b111 ^ MAS_mesh). Where MAS places each component on the
face through which it is the outward normal (face-centred), POT3D places the same
component on the opposite pair of edges (edge-centred):
br: MAS0b100(r-face) → POT3D0b011(r-edge)bt: MAS0b010(θ-face) → POT3D0b101(θ-edge)bp: MAS0b001(φ-face) → POT3D0b110(φ-edge)
Note
POT3D does not apply a normalization: values are stored in the same physical
units as the input boundary magnetogram (most commonly Gauss, but this is
run-dependent). When reading POT3D output, always specify the correct unit
explicitly; read(units='physical') without a unit override will return
dimensionless values.
Using psi-io#
- Reading Full Datasets & Scales:
- Writing Full Datasets & Scales:
- Reading Subsets of Datasets:
- Reading & Writing File Metadata:
- Interpolating Data to Arbitrary Positions:
- Reading Coordinate/Mesh-Aware MHD Model Output:
Note
The HDF type (HDF4 or HDF5) is automatically determined by the file extension
(“.hdf” for HDF4 and “.h5” for HDF5) when using psi-io functions.
Note
Not all PSI FORTRAN tools can read HDF4 files written by the pyhdf.SD interface.
If you have a problem, use the PSI tool hdfsd2hdf to convert.