Source code for psi_io.mhd_io

   1r"""Lazy HDF readers for PSI MAS and POT3D magnetohydrodynamic model output.
   2
   3This module provides a unified, unit-aware interface for reading three-dimensional
   4field variables from Predictive Science Inc.'s MAS and POT3D solvers.  Both HDF4
   5(``.hdf``) and HDF5 (``.h5``) files are supported through a common API.  The sole
   6public symbol exported by this module is :func:`PsiData`.
   7
   8.. rubric:: Entry Point
   9
  10:func:`PsiData` is a factory function that opens a PSI HDF file and returns a
  11lazy reader object.  The file extension determines the I/O backend — ``.h5``
  12files are read via `h5py <https://www.h5py.org/>`_ and ``.hdf`` files via
  13`pyhdf <https://fhs.github.io/pyhdf/>`_.  The *model* argument (``'mas'`` or
  14``'pot3d'``) selects the quantity-property and mesh-stagger metadata tables
  15applied during metadata resolution:
  16
  17.. code-block:: python
  18
  19    from psi_io.mhd_io import PsiData
  20
  21    reader = PsiData('br001001.h5')                    # MAS HDF5 (default)
  22    reader = PsiData('br001001.hdf', model='mas')      # MAS HDF4
  23    reader = PsiData('br001.h5', model='pot3d')        # POT3D HDF5
  24
  25.. rubric:: Lazy Loading and Caching
  26
  27No data are read from disk at construction time.  Metadata — quantity name,
  28sequence number, physical unit, and mesh stagger — are parsed from the filename
  29stem and/or HDF file attributes.  Data are transferred from disk only when
  30:meth:`read` or :meth:`vslice` is called.
  31
  32When a read or vslice call covers the *entire* dataset with no spatial
  33restrictions, the result is cached on the reader object.  Subsequent full-array
  34calls return the cached copy without a second disk read.  Partial reads — any
  35call with at least one non-full-axis argument — are never cached.  The cache
  36state is exposed via the ``is_cached`` property.
  37
  38.. rubric:: Attributes
  39
  40The following attributes are available on every object returned by
  41:func:`PsiData`:
  42
  43``quantity`` : :class:`str`
  44    Canonical lower-case quantity identifier (e.g. ``'br'``, ``'vr'``,
  45    ``'t'``).  Resolved from the filename stem, HDF file attributes, or the
  46    ``quantity`` keyword argument passed to :func:`PsiData`.
  47
  48``sequence`` : :class:`int`
  49    Time-step sequence number extracted from the filename
  50    (e.g. ``br001001.h5`` → ``1001``) or from HDF file attributes.
  51
  52``unit`` : :class:`~astropy.units.Unit`
  53    Conversion factor from one code unit to physical units.  For MAS
  54    quantities these are the custom normalization units defined in
  55    :mod:`psi_io._units` (e.g. :data:`~psi_io._units.MAS_b` ≈ 2.2 G for
  56    magnetic field, :data:`~psi_io._units.MAS_v` ≈ 481 km s⁻¹ for velocity).
  57    For POT3D the default is dimensionless — see the :func:`PsiData` warning.
  58
  59``mesh`` : :class:`tuple` of :class:`~psi_io._mesh.Mesh`
  60    Yee-grid stagger position for each spatial axis in physical ``(r, θ, φ)``
  61    order.  Each element is either :attr:`~psi_io._mesh.Mesh.HALF` (offset by
  62    half a cell spacing) or :attr:`~psi_io._mesh.Mesh.MAIN` (cell-centred).
  63    The integer encoding and per-quantity stagger codes are defined in
  64    :mod:`psi_io._mesh`; the canonical default for each quantity is stored in
  65    its :class:`~psi_io._models.Props` descriptor (see :mod:`psi_io._models`).
  66
  67``props`` : :class:`~psi_io._models.Props`
  68    Complete property descriptor for the quantity, bundling its name,
  69    description, unit, and mesh code.  Looked up from the appropriate
  70    quantity-properties mapping in :mod:`psi_io._models`.
  71
  72``description`` : :class:`str`
  73    Human-readable description of the physical quantity
  74    (e.g. ``'MAS Magnetic Field (Radial Component)'``).  Derived from
  75    :attr:`~psi_io._models.Props.desc`.
  76
  77``scales`` : ``Scales(r, t, p)``
  78    Named tuple of coordinate scale readers.  Each element wraps the
  79    one-dimensional coordinate array stored in the HDF file.  The radial
  80    coordinate uses :data:`~psi_io._units.PSI_rsun` (solar radii); the
  81    co-latitude and longitude coordinates use
  82    :data:`~psi_io._units.PSI_angle` (radians).  Each scale reader exposes
  83    the same :meth:`read` interface as the main data reader.
  84
  85``shape`` : :class:`tuple` of :class:`int`
  86    Array dimensions in HDF storage order ``(Nφ, Nθ, Nr)`` — the radial
  87    axis is *last* due to Fortran column-major convention.  All slicing APIs
  88    accept arguments in physical ``(r, θ, φ)`` order and reverse the indexing
  89    internally.
  90
  91``ndim`` : :class:`int`
  92    Number of spatial dimensions; always ``3`` for MAS and POT3D field
  93    variables.
  94
  95``size`` : :class:`int`
  96    Total element count (``Nφ × Nθ × Nr``).
  97
  98``nbytes`` : :class:`int`
  99    Dataset size in bytes.
 100
 101``dtype`` : :class:`numpy.dtype`
 102    Element type of the stored dataset (typically ``float32``).
 103
 104``attrs`` : :class:`dict`
 105    HDF file-level attributes as a plain Python dictionary.
 106
 107``is_cached`` : :class:`bool`
 108    ``True`` once a full-array read has populated the in-memory cache;
 109    ``False`` otherwise.
 110
 111.. rubric:: Reading Data — ``read``
 112
 113.. code-block:: python
 114
 115    odata[, r, t, p] = reader.read(*args, unit=None, mesh=None, scales=True)
 116
 117Each positional argument restricts one spatial axis in physical ``(r, θ, φ)``
 118order:
 119
 120- Omitted / ``None`` — full axis.
 121- ``int`` — single index; axis retained as a length-1 dimension.
 122- ``slice`` — standard Python slice.
 123- ``(start, stop)`` or ``(start, stop, step)`` — converted to a slice.
 124- ``Ellipsis`` — expands to ``None`` for all remaining axes.
 125
 126**unit**
 127    Output unit.  String aliases ``'native'`` / ``'code'`` / ``'model'`` /
 128    ``'psi'`` return values in MAS code units (the units defined in
 129    :mod:`psi_io._units`).  Aliases ``'real'`` / ``'phys'`` / ``'physical'``
 130    / ``'cgs'`` call :func:`~psi_io._units.decompose_mas_units` to express
 131    values in CGS base units.  Any other string or
 132    :class:`~astropy.units.Unit` instance is forwarded to
 133    :meth:`astropy.units.Quantity.to`.
 134
 135**mesh**
 136    Target mesh stagger (:data:`~psi_io._mesh.MeshCodeType`).  Axes that are
 137    on the half mesh in the stored data but on the main mesh in *mesh* are
 138    averaged via :func:`~psi_io._mesh.remesh_array`.  Up-sampling
 139    (main → half) raises :exc:`ValueError`.
 140
 141**scales**
 142    If ``True`` (default), return the corresponding coordinate slice for each
 143    axis as additional :class:`~astropy.units.Quantity` values ``(r, t, p)``
 144    after the data array.
 145
 146.. note::
 147    PSI HDF files are written in Fortran column-major order so that numpy
 148    reads them with shape ``(Nφ, Nθ, Nr)`` — the radial axis is *last*.  All
 149    positional arguments to :meth:`read` and the returned coordinate scales
 150    are in physical ``(r, θ, φ)`` order regardless of on-disk layout.
 151
 152.. rubric:: Value-Space Slicing — ``vslice``
 153
 154.. code-block:: python
 155
 156    odata[, r, t, p] = reader.vslice(*args, unit=None, mesh=None, scales=True,
 157                                     bounds_error=True, fill_value=None)
 158
 159``vslice`` is a superset of :meth:`read` that additionally accepts physical
 160coordinate values as positional arguments.  Passing a
 161:class:`~astropy.units.Quantity` or a bare scalar for an axis locates the two
 162nearest grid points, loads a 2-element window, and linearly interpolates to the
 163target value; the resulting axis has size 1.  Index-space arguments (``slice``,
 164``int``, ``None``, ``Ellipsis``) are handled identically to :meth:`read`.
 165
 166**bounds_error**
 167    If ``True`` (default), raise :exc:`ValueError` when a value is outside the
 168    coordinate range.  Set to ``False`` and supply a **fill_value** to replace
 169    out-of-bounds points; pass ``fill_value=None`` to extrapolate silently.
 170
 171When ``scales=True``, value-interpolated axes return the target coordinate as
 172a length-1 :class:`~astropy.units.Quantity`; index-space axes return the
 173corresponding coordinate slice as usual.
 174
 175.. rubric:: File Lifecycle
 176
 177Readers hold an open HDF file handle.  Use the context-manager protocol to
 178guarantee cleanup:
 179
 180.. code-block:: python
 181
 182    with PsiData('br001001.h5') as reader:
 183        data, r, t, p = reader.read(unit='Gauss')
 184
 185Alternatively, call ``reader.close()`` and ``reader.open()`` to manage the
 186handle explicitly.  The handle is also released automatically when the reader
 187is garbage-collected.
 188
 189.. rubric:: Supported Quantities
 190
 191**MAS** (19 quantities):
 192
 193.. list-table::
 194   :widths: 12 32 24 32
 195   :header-rows: 1
 196
 197   * - Key(s)
 198     - Description
 199     - Code unit
 200     - Stagger ``(r, θ, φ)``
 201   * - ``br``, ``bt``, ``bp``
 202     - Magnetic field components
 203     - :data:`~psi_io._units.MAS_b` (≈ 2.2 G)
 204     - ``HALF/MAIN/MAIN``, ``MAIN/HALF/MAIN``, ``MAIN/MAIN/HALF``
 205   * - ``vr``, ``vt``, ``vp``
 206     - Velocity components
 207     - :data:`~psi_io._units.MAS_v` (≈ 481 km s⁻¹)
 208     - ``MAIN/HALF/HALF``, ``HALF/MAIN/HALF``, ``HALF/HALF/MAIN``
 209   * - ``jr``, ``jt``, ``jp``
 210     - Current density components
 211     - :data:`~psi_io._units.MAS_j`
 212     - ``MAIN/HALF/HALF``, ``HALF/MAIN/HALF``, ``HALF/HALF/MAIN``
 213   * - ``t``, ``te``, ``tp``
 214     - Temperature (single / electron / proton)
 215     - :data:`~psi_io._units.MAS_t` (≈ 28 MK)
 216     - ``HALF/HALF/HALF``
 217   * - ``rho``
 218     - Mass density
 219     - :data:`~psi_io._units.MAS_n` (10⁸ cm⁻³)
 220     - ``HALF/HALF/HALF``
 221   * - ``p``
 222     - Thermal pressure
 223     - :data:`~psi_io._units.MAS_p`
 224     - ``HALF/HALF/HALF``
 225   * - ``ep``, ``em``
 226     - Alfvén wave energy density (outward / inward)
 227     - :data:`~psi_io._units.MAS_p`
 228     - ``HALF/HALF/HALF``
 229   * - ``zp``, ``zm``
 230     - Elsässer wave amplitudes (outward / inward)
 231     - :data:`~psi_io._units.MAS_v`
 232     - ``HALF/HALF/HALF``
 233   * - ``heat``
 234     - Volumetric coronal heating rate
 235     - :data:`~psi_io._units.MAS_heat`
 236     - ``HALF/HALF/HALF``
 237
 238**POT3D** (3 quantities): ``br``, ``bt``, ``bp`` — magnetic field components,
 239unit :data:`~psi_io._units.POT3D_b` (dimensionless by default; see the
 240:func:`PsiData` warning regarding unit declaration).
 241
 242.. rubric:: Examples
 243
 244Full MAS radial field read with coordinate scales:
 245
 246.. code-block:: python
 247
 248    from psi_io.mhd_io import PsiData
 249
 250    reader = PsiData('br001001.h5')
 251    data, r, t, p = reader.read()              # code units (MAS_b)
 252    data, r, t, p = reader.read(unit='Gauss')  # convert to Gauss on the fly
 253
 254Partial read — inner radial shell in CGS base units, coordinates suppressed:
 255
 256.. code-block:: python
 257
 258    data = reader.read(slice(0, 10), unit='physical', scales=False)
 259
 260Value-space slice — extract the r = 2.5 R☉ surface:
 261
 262.. code-block:: python
 263
 264    data, r, t, p = reader.vslice(2.5, unit='Gauss')  # bare scalar → native coord unit
 265
 266POT3D file — unit must be declared at construction because it is not encoded in
 267the file:
 268
 269.. code-block:: python
 270
 271    reader = PsiData('br001.hdf', model='pot3d', unit='Gauss')
 272    data, r, t, p = reader.read()
 273
 274Inspect metadata without loading data:
 275
 276.. code-block:: python
 277
 278    reader = PsiData('rho001001.h5')
 279    reader.quantity      # 'rho'
 280    reader.description   # 'MAS Density'
 281    reader.unit          # MAS_n
 282    reader.mesh          # (Mesh.HALF, Mesh.HALF, Mesh.HALF)
 283    reader.is_cached     # False
 284    reader.shape         # (Nφ, Nθ, Nr) — numpy storage order
 285"""
 286
 287from __future__ import annotations
 288
 289__all__ = ['PsiData',]
 290
 291from abc import abstractmethod, ABC
 292from collections import namedtuple
 293from collections.abc import Sequence
 294from itertools import repeat
 295from pathlib import Path
 296from types import MappingProxyType
 297from typing import TYPE_CHECKING, Optional, Literal, ClassVar
 298import numpy as np
 299import h5py as h5
 300import astropy.units as u
 301
 302if TYPE_CHECKING:
 303    from astropy.units.typing import UnitLike, QuantityLike
 304
 305try:
 306    import pyhdf.SD as h4
 307except ImportError:
 308    h4 = None
 309
 310from psi_io._mesh import (Mesh,
 311                          MeshCodeType,
 312                          _normalize_mesh_code,
 313                          _remesh_array,
 314                          _parse_remesh, ArrayOrdering,
 315                          )
 316from psi_io._models import (Props,
 317                            PsiScales,
 318                            ModelType,
 319                            extract_quantity_from_filepath,
 320                            extract_sequence_from_filepath, get_model_prop_caller, MODEL_TYPE)
 321from psi_io._units import decompose_mas_units
 322from psi_io.psi_io import (PathLike,
 323                           PSI_DATA_ID,
 324                           SDC_TYPE_CONVERSIONS,
 325                           _except_no_pyhdf,)
 326
 327
 328HdfVersionType = Literal[4, 5]
 329"""Literal type alias for the two supported HDF file format versions.
 330
 331``4`` — HDF4, accessed via pyhdf (optional dependency).
 332``5`` — HDF5, accessed via h5py.
 333"""
 334
 335
 336_HDF_EXT_MAPPING = MappingProxyType({'h4': '.hdf', 'h5': '.h5', })
 337"""Mapping from HDF version string to file extension.
 338
 339Used by :class:`_HdfData.__init__` to validate that the supplied file has an
 340extension consistent with the concrete class's format mixin.
 341
 342``'h5'`` → ``'.h5'`` (HDF5 files, read via h5py)
 343``'h4'`` → ``'.hdf'`` (HDF4 files, read via pyhdf)
 344"""
 345
 346
 347_DATA_SLOTS = ('_model', '_fileref', '_filepath', '_datalabel', '_quantity', '_sequence', '_unit', '_mesh', '_scales', '_values')
 348"""Slot names shared by all concrete :class:`_HdfData` subclasses.
 349
 350Stored as a module-level tuple so it can be referenced in ``__slots__`` declarations
 351of both the abstract base and the concrete classes without repetition.
 352"""
 353
 354_SCALE_SLOTS = ('_model', '_dataref', '_datalabel', '_quantity', '_unit', '_mesh', '_values')
 355"""Slot names shared by all concrete :class:`_HdfScale` subclasses.
 356
 357Stored as a module-level tuple so it can be referenced in ``__slots__`` declarations
 358of both :class:`H5Scale` and :class:`H4Scale` without repetition.
 359"""
 360
 361
 362_CODE_UNIT_ALIASES = {'native', 'code', 'model', 'psi'}
 363"""Set of strings that request code-unit output from :meth:`_HdfInterface.read`.
 364
 365When the ``units`` argument to ``read()`` is one of these strings, the data are
 366returned in MAS code units (dimensionless ratios) without any physical conversion.
 367"""
 368
 369
 370_REAL_UNIT_ALIASES = {'real', 'phys', 'physical', 'cgs'}
 371"""Set of strings that request decomposed CGS output from :meth:`_HdfInterface.read`.
 372
 373When the ``units`` argument to ``read()`` is one of these strings, the data are
 374converted to physical CGS units via :func:`~psi_io._units.decompose_mas_units`.
 375"""
 376
 377
 378METADATA_SCHEMA = dict.fromkeys(['quantity', 'sequence', 'unit', 'scalar', 'mesh'])
 379"""Template dict defining the five recognised metadata fields for PSI HDF datasets.
 380
 381Keys
 382----
 383``'quantity'``
 384    Canonical lower-case quantity identifier extracted from the filename or file
 385    attributes (e.g. ``'br'``, ``'vr'``).
 386``'sequence'``
 387    Integer time-step sequence number extracted from the filename or file attributes.
 388``'unit'``
 389    Code-to-physical unit for this quantity, as an :class:`~astropy.units.Unit`
 390    or a string parseable by it.
 391``'scalar'``
 392    ``True`` if the quantity is a scalar field; ``False`` for vector components.
 393``'mesh'``
 394    Staggered-grid mesh code (:data:`~psi_io._mesh.MeshCodeType`) describing the
 395    Yee-grid position of the quantity on each spatial axis.
 396
 397Used by :meth:`_HdfData._parse_properties` to filter keyword overrides and HDF file
 398attributes against the set of known metadata keys.
 399"""
 400
 401
 402Scales = namedtuple("Scales", ['r', 't', 'p'])
 403"""Named tuple holding the three coordinate scale readers for a data object.
 404
 405Fields
 406------
 407r : H5Scale or H4Scale
 408    Radial coordinate scale in solar radii.
 409t : H5Scale or H4Scale
 410    Co-latitude scale in radians.
 411p : H5Scale or H4Scale
 412    Longitude scale in radians.
 413
 414Created by :meth:`_HdfData._set_scales` during instantiation and exposed as
 415:attr:`_HdfData.scales`.
 416"""
 417
 418
 419def _interpolate_dim(arr: QuantityLike,
 420                     axis: int,
 421                     value: QuantityLike,
 422                     scale: QuantityLike,
 423                     fill_value: Optional[QuantityLike]
 424                     ) -> QuantityLike:
 425    """Linearly interpolate *arr* along *axis* to *value*, collapsing that axis to size 1.
 426
 427    Both *value* and *scale* must carry the same unit (or both be dimensionless).
 428    The result is a fill array when *value* is outside the 2-element *scale* window
 429    and *fill_value* is not ``None``; otherwise a linear interpolation is performed.
 430    Extrapolation occurs silently when *fill_value* is ``None``.
 431    """
 432    if arr.shape[axis] != 2 or len(scale) != 2:
 433        raise ValueError("Interpolation is only supported for 2-element arrays and scales.")
 434    if (value < scale[0] or value > scale[1]) and fill_value is not None:
 435
 436        return np.full(arr.shape[:axis] + (1,) + arr.shape[axis + 1:], fill_value, dtype=arr.dtype)
 437    t = (value - scale[0]) / (scale[1] - scale[0])
 438    slc_lo = [slice(None)] * arr.ndim
 439    slc_hi = [slice(None)] * arr.ndim
 440    slc_lo[axis] = slice(None, -1)
 441    slc_hi[axis] = slice(1, None)
 442    return (1.0 - t) * arr[tuple(slc_lo)] +  t * arr[tuple(slc_hi)]
 443
 444
 445def _slice_array(data: QuantityLike,
 446                 values: Sequence[Optional[QuantityLike]],
 447                 scales: Sequence[Optional[QuantityLike]],
 448                 fill_value: Optional[QuantityLike],
 449                 order: ArrayOrdering = 'F') -> QuantityLike:
 450    """Interpolate *data* to physical values along each axis that has a non-``None`` entry.
 451
 452    Iterates over axes in storage order (reversed from physical order when
 453    ``order='F'``) and calls :func:`_interpolate_dim` for each axis whose
 454    entry in *values* is not ``None``.  Axes with ``None`` entries are left
 455    unchanged.  *scales* must be in the same order as *values* and provide the
 456    two-element coordinate window for each interpolated axis.
 457    """
 458    if order == 'F':
 459        values, scales = reversed(values), reversed(scales)
 460    for i, (v, s) in enumerate(zip(values, scales)):
 461        if v is not None:
 462            if s is None:
 463                raise ValueError("Cannot interpolate to a value when the corresponding scale is not provided.")
 464            data = _interpolate_dim(data, i, v, s, fill_value)
 465    return data
 466
 467
 468def _expand_args(*args, ndim: int) -> tuple:
 469    """Expand *args* to a length-*ndim* tuple, replacing ``Ellipsis`` and padding with ``None``.
 470
 471    Parameters
 472    ----------
 473    *args : object
 474        User-supplied dimension arguments.  An ``Ellipsis`` is replaced by the
 475        appropriate number of ``None`` values so that the total length equals *ndim*.
 476        If fewer than *ndim* arguments are provided, trailing ``None``\\ s are appended.
 477    ndim : int
 478        Target length of the returned tuple.
 479
 480    Returns
 481    -------
 482    out : tuple
 483        Length-*ndim* tuple of dimension arguments.
 484    """
 485    if Ellipsis in args:
 486        n_missing = ndim - (len(args) - 1)
 487        idx = args.index(Ellipsis)
 488        args = args[:idx] + (None,) * n_missing + args[idx + 1:]
 489    if len(args) < ndim:
 490        args += (None,) * (ndim - len(args))
 491    return args
 492
 493
 494def _parse_islice_args(*args, shape: tuple[int, ...],):
 495    """Normalize index-space slice arguments to a tuple of :class:`slice` objects.
 496
 497    Accepts a mix of ``None``, ``int``, ``slice``, ``(start, stop[, step])`` tuples,
 498    and ``Ellipsis``, and yields one slice per spatial axis.
 499
 500    Parameters
 501    ----------
 502    *args : None, int, slice, tuple, or Ellipsis
 503        Index arguments in physical ``(r, θ, φ)`` user order.  Fewer arguments than
 504        dimensions are padded with ``None`` (full-axis slices).
 505    shape : tuple[int, ...]
 506        Physical ``(r, θ, φ)`` shape (i.e. ``reversed(self.shape)`` from storage).
 507
 508    Yields
 509    ------
 510    s : slice
 511        Normalized slice for each axis.
 512
 513    Raises
 514    ------
 515    ValueError
 516        If a slice argument produces an empty dimension (``stop <= start``).
 517    TypeError
 518        If an argument cannot be converted to a slice (via :func:`_cast_to_slice`).
 519    """
 520    sargs = _expand_args(*args, ndim=len(shape))
 521
 522    for arg, dim_size in zip(sargs, shape):
 523        slice_ = _cast_to_slice(arg)
 524        start, stop, step = slice_.indices(dim_size)
 525        if stop <= start:
 526            raise ValueError(f"Slice argument {arg!r} yields an empty dimension.")
 527        yield slice_
 528
 529
 530def _parse_vslice_args(*args,
 531                       scales: Scales,
 532                       bounds_error: bool):
 533    """Convert value-space dimension arguments to ``(value, index_slice)`` pairs.
 534
 535    For each axis, a bare scalar or :class:`~u.Quantity` triggers a
 536    value-space lookup: the coordinate scale is searched to find the two bracketing
 537    indices, and a 2-element slice is returned for later linear interpolation via
 538    :func:`_interpolate_dim`.  All other argument types (``None``, ``int``,
 539    ``slice``, tuple) are treated as index-space and passed through to
 540    :func:`_cast_to_slice` unchanged, with ``None`` returned as the value.
 541
 542    Parameters
 543    ----------
 544    *args : scalar, u.Quantity, int, slice, tuple, None, or Ellipsis
 545        One argument per spatial axis in physical ``(r, θ, φ)`` order.  Fewer
 546        arguments than dimensions are padded with ``None`` via :func:`_expand_args`.
 547    scales : Scales
 548        Coordinate scale readers in physical ``(r, θ, φ)`` order.  Used to look
 549        up the native unit and perform bounds checking.
 550    bounds_error : bool
 551        If ``True``, raise :class:`ValueError` when a value lies outside the
 552        range of its coordinate scale.
 553
 554    Yields
 555    ------
 556    value : u.Quantity or None
 557        The physical target value (in the scale's native unit) for value-space
 558        axes; ``None`` for index-space axes.
 559    s : slice
 560        Corresponding index-space slice.  For value-space axes this is the
 561        2-element bracketing window ``slice(i-1, i+1)``; for index-space axes
 562        it is the result of :func:`_cast_to_slice`.
 563
 564    Raises
 565    ------
 566    ValueError
 567        If *bounds_error* is ``True`` and a value lies outside its scale range.
 568    """
 569    sargs = _expand_args(*args, ndim=len(scales))
 570    for arg, scale in zip(sargs, scales):
 571        value = None
 572        if np.isscalar(arg):
 573            arg = arg * scale.unit
 574        if isinstance(arg, u.Quantity):
 575            value = arg.to(scale.unit)
 576            raw_value = value.value
 577            if bounds_error and (raw_value < scale[0] or raw_value > scale[-1]):
 578                raise ValueError(f"Value {arg} is out of bounds for scale '{scale.quantity}' "
 579                                 f"with range {scale.data[0]:.3f} to {scale.data[-1]:.3f} {scale.unit}.")
 580            index = int(np.clip(np.searchsorted(scale[...], raw_value), 1, scale.size - 1))
 581            arg = (index - 1, index + 1)
 582        slice_ = _cast_to_slice(arg)
 583        start, stop, step = slice_.indices(scale.size)
 584        if stop <= start:
 585            raise ValueError(f"Slice argument {arg!r} yields an empty dimension.")
 586        yield value, slice_
 587
 588
 589def _apply_units(data: u.Quantity, unit: Optional[str | UnitLike]) -> u.Quantity:
 590    """Apply a unit conversion to *data*, returning a :class:`~u.Quantity`.
 591
 592    Parameters
 593    ----------
 594    data : u.Quantity
 595        Data in code units.
 596    unit : str, u.Unit, or None
 597        Requested output unit.  ``None`` is a no-op.  Special string aliases:
 598        ``'native'`` / ``'code'`` / ``'model'`` / ``'psi'`` — return *data*
 599        unchanged; ``'real'`` / ``'phys'`` / ``'physical'`` / ``'cgs'`` —
 600        decompose to CGS base unit via
 601        :func:`~psi_io._units.decompose_mas_units`.  Any other value is
 602        forwarded to :meth:`~u.Quantity.to`.
 603
 604    Returns
 605    -------
 606    out : u.Quantity
 607        *data* in the requested unit.
 608    """
 609    if unit is None:
 610        return data
 611    ounit = str(unit).lower()
 612    if ounit in _CODE_UNIT_ALIASES:
 613        return data
 614    if ounit in _REAL_UNIT_ALIASES:
 615        return decompose_mas_units(data)
 616    return data.to(unit)
 617
 618
 619def _cast_to_slice(input: None | int | slice | Sequence) -> slice:
 620    """Convert a dimension index argument to a :class:`slice` object.
 621
 622    Parameters
 623    ----------
 624    input : None, int, slice, list, or tuple
 625        - ``None`` → ``slice(None)`` (full axis).
 626        - :class:`int` → ``slice(input, input + 1)`` (single element, axis retained).
 627        - :class:`slice` → returned unchanged.
 628        - :class:`list` or :class:`tuple` → ``slice(*input)`` (unpacked as
 629          ``(start, stop)`` or ``(start, stop, step)``).
 630
 631    Returns
 632    -------
 633    out : slice
 634
 635    Raises
 636    ------
 637    TypeError
 638        If *input* is not one of the recognized types.
 639    """
 640    if input is None:
 641        return slice(None)
 642    elif isinstance(input, int):
 643        return slice(input, input + 1)
 644    elif isinstance(input, slice):
 645        return input
 646    elif isinstance(input, (list, tuple)):
 647        return slice(*input)
 648    else:
 649        raise TypeError(f"Invalid slice argument: {input!r}. Expected int, slice, or sequence.")
 650
 651
 652# =============================================================================
 653# Abstract interface
 654# =============================================================================
 655
 656class _HdfInterface(ABC):
 657    """Abstract base class defining the full public interface for PSI HDF data objects.
 658
 659    All concrete readers (MAS, POT3D, coordinate scales) and their format mixins
 660    satisfy this interface.  Consumers should program against this class rather than
 661    against concrete subclasses.
 662
 663    Subclasses must implement all abstract properties and the :meth:`read` and
 664    :meth:`_read` methods.  The ``read`` method in this base class provides shared
 665    unit-conversion and remeshing logic that concrete implementations invoke via
 666    ``super().read(*args, **kwargs)``.
 667
 668    Notes
 669    -----
 670    The ``__slots__ = ()`` declaration in this class is intentional: it ensures that
 671    concrete subclasses using ``__slots__`` do not incur a ``__dict__`` overhead
 672    from the abstract base.
 673    """
 674
 675    __slots__ = ()
 676
 677    _HDFN: ClassVar[HdfVersionType]                        # provided by format mixin
 678
 679    def __getitem__(self, args):
 680        """Index the underlying HDF dataset with physical ``(r, θ, φ)`` axis ordering.
 681
 682        PSI HDF files are written in Fortran column-major order so that the radial
 683        index ``r`` varies fastest.  When read into numpy (row-major), this results
 684        in storage shape ``(Nφ, Nθ, Nr)``.  This method re-reverses the user-supplied
 685        index tuple so that callers can use natural physical ordering ``(r, θ, φ)``:
 686
 687        .. code-block:: python
 688
 689            obj[r_slice, t_slice, p_slice]   # user order
 690            # internally becomes:
 691            obj.data[p_slice, t_slice, r_slice]  # numpy storage order
 692
 693        Parameters
 694        ----------
 695        args : int, slice, or tuple
 696            Index argument(s) in physical ``(r, θ, φ)`` order.  A single non-tuple
 697            argument is treated as indexing along the first physical axis (``r``).
 698
 699        Returns
 700        -------
 701        out : numpy.ndarray
 702            Slice of the dataset in numpy storage order ``(Nφ, Nθ, Nr)``.
 703        """
 704        if not isinstance(args, tuple):
 705            args = (args,)
 706
 707        if self._values is not None:
 708            return self._values[args[::-1]]
 709        else:
 710            odata = self.data[args[::-1]]
 711            if odata.shape == self.shape:
 712                self._values = odata
 713            return odata
 714
 715    @property
 716    @abstractmethod
 717    def shape(self) -> tuple[int, ...]:
 718        """Shape of the underlying numpy array in storage order ``(Nφ, Nθ, Nr)``."""
 719        ...
 720
 721    @property
 722    @abstractmethod
 723    def dtype(self) -> np.dtype:
 724        """NumPy dtype of the stored array."""
 725        ...
 726
 727    @property
 728    @abstractmethod
 729    def size(self) -> int:
 730        """Total number of elements in the array."""
 731        ...
 732
 733    @property
 734    @abstractmethod
 735    def nbytes(self) -> int:
 736        """Total memory occupied by the array in bytes."""
 737        ...
 738
 739    @property
 740    @abstractmethod
 741    def ndim(self) -> int:
 742        """Number of array dimensions (always 3 for data objects, 1 for scales)."""
 743        ...
 744
 745    @property
 746    @abstractmethod
 747    def attrs(self) -> dict:
 748        """HDF attributes attached to this dataset as a plain Python dict."""
 749        ...
 750
 751    @property
 752    @abstractmethod
 753    def unit(self) -> u.Unit:
 754        """Astropy unit that converts one code-unit value to physical unit."""
 755        ...
 756
 757    @property
 758    @abstractmethod
 759    def mesh(self) -> tuple[Mesh, ...]:
 760        """Normalized mesh-stagger tuple, one :class:`~psi_io._mesh.Mesh` per axis."""
 761        ...
 762
 763    @property
 764    @abstractmethod
 765    def quantity(self) -> str:
 766        """Canonical lower-case quantity identifier (e.g. ``'br'``)."""
 767        ...
 768
 769    @property
 770    @abstractmethod
 771    def data(self) -> np.ndarray:
 772        """Raw array view into the HDF dataset (not yet loaded into RAM).
 773
 774        For HDF5 objects this is an h5py :class:`~h5py.Dataset`; for HDF4 objects
 775        it is a pyhdf ``SDS`` object.  Actual data transfer occurs only when the
 776        returned object is indexed or converted to a numpy array.
 777        """
 778        ...
 779
 780    @property
 781    def props(self) -> Props:
 782        """The :class:`~psi_io._models.Props` descriptor for this quantity.
 783
 784        Returns the full property bundle (name, description, unit, mesh code) from
 785        the appropriate mapping for this reader's model type and quantity.
 786
 787        Returns
 788        -------
 789        out : Props
 790            Immutable property descriptor for :attr:`quantity`.
 791        """
 792        prop_getter = get_model_prop_caller(self._model)
 793        return prop_getter(self._quantity)
 794
 795    @property
 796    def description(self) -> str:
 797        """Human-readable description of the physical quantity.
 798
 799        Looked up from the appropriate property mapping via :data:`_PROP_GETTER_MAPPING`
 800        using :attr:`_MODEL` and the stored :attr:`quantity` name.
 801
 802        Returns
 803        -------
 804        out : str
 805            Description string (e.g. ``'Magnetic Field (Radial Component)'``).
 806        """
 807        return self.props.desc
 808
 809    @property
 810    def is_scalar(self) -> bool:
 811        """``True`` if the quantity is a scalar field; ``False`` for vector components."""
 812        return self.props.scalar
 813
 814    @property
 815    def is_cached(self) -> bool:
 816        """Flag indicating whether the data have been loaded into memory."""
 817        return self._values is not None
 818
 819    def load(self):
 820        """Read the full dataset into memory and cache it in ``_values``."""
 821        self._values = self.data[...]
 822
 823    @abstractmethod
 824    def read(self,
 825             *args,
 826             unit: Optional[str | UnitLike] = None,
 827             mesh: Optional[MeshCodeType] = None,
 828             ) -> tuple[u.Quantity, tuple[slice, ...], tuple[bool, ...]]:
 829        """Read a slice of data, applying optional unit conversion and remeshing.
 830
 831        This base implementation handles unit conversion and remesh flag computation.
 832        Concrete subclasses call ``super().read(...)`` and then attach coordinate
 833        scales or perform additional post-processing.
 834
 835        Parameters
 836        ----------
 837        *args : int, slice, tuple, or None
 838            Index arguments in physical ``(r, θ, φ)`` axis order.  Each positional
 839            argument selects elements along one spatial axis in the order
 840            ``(r, θ, φ)``.  Accepted forms per axis:
 841
 842            - ``None`` or omitted — full axis (equivalent to ``slice(None)``).
 843            - ``int`` — single index (output retains that axis as a length-1
 844              dimension).
 845            - ``slice`` — standard Python slice.
 846            - ``(start, stop)`` or ``(start, stop, step)`` — converted to a slice.
 847            - ``Ellipsis`` — expands to ``None`` for all remaining axes.
 848
 849        unit : str or u.Unit, optional
 850            Requested output unit.  Special string aliases are accepted:
 851
 852            - ``'native'`` / ``'code'`` / ``'model'`` — return raw code-unit
 853              values (an :class:`~u.Quantity` whose unit is the custom
 854              MAS unit, e.g. ``MAS_b``).
 855            - ``'real'`` / ``'phys'`` / ``'physical'`` — decompose to CGS base
 856              unit via :func:`~psi_io._units.decompose_mas_units`.
 857            - Any other string or :class:`~astropy.units.Unit` — passed directly
 858              to :meth:`~u.Quantity.to`.
 859
 860            If ``None``, the data are returned in the native code units without
 861            conversion.
 862
 863        mesh : MeshCodeType, optional
 864            Target mesh stagger.  If provided, half-mesh axes in the stored data
 865            that are on the main mesh in *mesh* are averaged to the main mesh before
 866            return (via :func:`~psi_io._mesh.remesh_array`).  Attempting to up-sample
 867            from main to half mesh raises :class:`ValueError`.  If ``None``, no
 868            remeshing is performed.
 869
 870        Returns
 871        -------
 872        odata : u.Quantity
 873            Data array with requested unit and remeshing applied.
 874        sargs : tuple[slice, ...]
 875            Slice tuple in physical ``(r, θ, φ)`` order, suitable for applying to
 876            coordinate scale arrays.
 877        remesh : tuple[bool, ...]
 878            Boolean flags in physical ``(r, θ, φ)`` order indicating which axes were
 879            remeshed from half to main mesh.
 880        """
 881
 882        sargs = _parse_islice_args(*args, shape=self.shape[::-1])
 883        if mesh is None:
 884            remesh = repeat(False, self.ndim)
 885        else:
 886            omesh_norm = _normalize_mesh_code(mesh, self.ndim)
 887            remesh = _parse_remesh(self.mesh, omesh_norm, 'C')
 888
 889        sargs = tuple(sargs)
 890        remesh = tuple(remesh)
 891
 892        odata = _apply_units(self._read(*sargs, remesh=remesh), unit)
 893        return odata, sargs, remesh
 894
 895    @abstractmethod
 896    def _read(self,
 897              *sargs,
 898              remesh) -> u.Quantity:
 899        """Internal read using pre-validated slice args and remesh flags.
 900
 901        Applies *sargs* directly to the dataset via :meth:`__getitem__` (which
 902        performs the axis reversal) and then remeshes the result.  Called by
 903        :meth:`_HdfData.read` when reading coordinate scales to avoid re-parsing
 904        slice arguments.
 905
 906        Parameters
 907        ----------
 908        *sargs : slice
 909            Pre-validated slices in physical ``(r, θ, φ)`` order.
 910        remesh : tuple[bool, ...]
 911            Remesh flags in physical ``(r, θ, φ)`` order.
 912
 913        Returns
 914        -------
 915        out : u.Quantity
 916            Sliced and optionally remeshed data in code units.
 917        """
 918
 919        return _remesh_array(self[sargs], remesh=remesh, order='F') * self.unit
 920
 921
 922# =============================================================================
 923# Scale classes
 924# =============================================================================
 925
 926class _HdfScale(_HdfInterface, ABC):
 927    """Abstract base for HDF coordinate scale variables (r, t, p).
 928
 929    A scale object wraps a one-dimensional coordinate array stored alongside the
 930    main data in an HDF file.  It exposes the same :class:`_HdfInterface` API as
 931    data objects so that coordinate arrays can be sliced and unit-converted with the
 932    same :meth:`read` call.
 933
 934    Subclasses supply the format-specific :attr:`data` property and the array
 935    introspection properties (``shape``, ``dtype``, etc.).
 936    """
 937
 938    __slots__ = ()
 939
 940    def __init__(self,
 941                 parent,
 942                 dim_label: str,
 943                 data_label: str,
 944                 scale_model: str = 'scale'):
 945        """Initialize a scale from a parent data reader and dimension metadata.
 946
 947        Parameters
 948        ----------
 949        parent : _HdfData
 950            The data reader to which this scale belongs.  The scale holds a reference
 951            to *parent* so it can share the open file handle.
 952        dim_label : str
 953            Coordinate axis label — one of ``'r'``, ``'t'``, ``'p'``.  Used to look
 954            up the physical unit via :func:`get_psi_scale_properties`.
 955        data_label : str
 956            Dataset or SDS name within the HDF file where the coordinate values are
 957            stored.
 958
 959        Raises
 960        ------
 961        ValueError
 962            If the underlying coordinate dataset is not one-dimensional.
 963        """
 964        self._dataref = parent
 965        self._datalabel: str = data_label
 966        self._model: str = scale_model
 967        self._values = None
 968
 969        self._set_properties(dim_label)
 970
 971    def _set_properties(self, scale: str):
 972        """Look up and cache the unit for this coordinate axis."""
 973        prop_getter = get_model_prop_caller(self._model)
 974        qprops = prop_getter(scale)
 975        if self.ndim != qprops.ndim:
 976            raise ValueError(f"Expected {qprops.ndim}D coordinate variable for scale '{scale}', "
 977                             f"found {self.ndim}D dataset with shape {self.shape}.")
 978        self._quantity: PsiScales = qprops.name
 979        self._unit: u.Unit = qprops.unit
 980        self._mesh: Mesh = self._dataref.mesh['rtp'.index(self._quantity)],
 981
 982    @property
 983    def unit(self) -> u.Unit:
 984        """Astropy unit for this coordinate axis (``PSI_rsun`` or ``PSI_angle``)."""
 985        return self._unit
 986
 987    @property
 988    def quantity(self) -> PsiScales:
 989        """Coordinate axis identifier: ``'r'``, ``'t'``, or ``'p'``."""
 990        return self._quantity
 991
 992    @property
 993    def mesh(self) -> tuple[Mesh, ...]:
 994        """Single-element mesh-stagger tuple for this coordinate axis."""
 995        return self._mesh
 996
 997    def read(self,
 998             *args,
 999             **kwargs,
1000             ) -> u.Quantity:
1001        """Read the coordinate array, returning an astropy Quantity.
1002
1003        Delegates to :meth:`_HdfInterface.read` and returns only the data
1004        (discarding the ``sargs`` and ``remesh`` bookkeeping values).
1005
1006        Parameters
1007        ----------
1008        *args
1009            Slice arguments; see :meth:`_HdfInterface.read`.
1010        **kwargs
1011            Keyword arguments forwarded to :meth:`_HdfInterface.read`.
1012
1013        Returns
1014        -------
1015        out : u.Quantity
1016            Coordinate values with the appropriate PSI unit attached.
1017        """
1018        odata, *_ = super().read(*args, **kwargs)
1019        return odata
1020
1021    def _read(self,
1022              *sargs,
1023              remesh) -> u.Quantity:
1024        """Internal read using pre-validated slice args."""
1025        return super()._read(*sargs, remesh=remesh)
1026
1027
1028class H5Scale(_HdfScale):
1029    """HDF5 coordinate scale variable backed by an h5py dimension scale dataset."""
1030
1031    __slots__ = _SCALE_SLOTS
1032
1033    @property
1034    def shape(self) -> tuple[int, ...]:
1035        """Shape of the coordinate array (always a length-1 tuple for 1-D scales)."""
1036        return self.data.shape
1037
1038    @property
1039    def dtype(self) -> np.dtype:
1040        """NumPy dtype of the coordinate array."""
1041        return self.data.dtype
1042
1043    @property
1044    def size(self) -> int:
1045        """Total number of coordinate points."""
1046        return self.data.size
1047
1048    @property
1049    def nbytes(self) -> int:
1050        """Total memory occupied by the coordinate array in bytes."""
1051        return self.data.nbytes
1052
1053    @property
1054    def ndim(self) -> int:
1055        """Number of dimensions (always 1 for coordinate scale arrays)."""
1056        return self.data.ndim
1057
1058    @property
1059    def attrs(self) -> dict:
1060        """HDF5 attributes attached to this dimension scale dataset."""
1061        return dict(self.data.attrs)
1062
1063    @property
1064    def data(self) -> np.ndarray:
1065        """h5py Dataset object for this coordinate dimension."""
1066        return self._dataref._fileref[self._datalabel]
1067
1068
1069class H4Scale(_HdfScale):
1070    """HDF4 coordinate scale variable backed by a pyhdf SDS dimension."""
1071
1072    __slots__ = _SCALE_SLOTS
1073
1074    @property
1075    def shape(self) -> tuple[int, ...]:
1076        """Shape of the coordinate array (always a length-1 tuple for 1-D scales)."""
1077        return self.data.info()[2],
1078
1079    @property
1080    def dtype(self) -> np.dtype:
1081        """NumPy dtype of the coordinate array."""
1082        return SDC_TYPE_CONVERSIONS[self.data.info()[3]]
1083
1084    @property
1085    def size(self) -> int:
1086        """Total number of coordinate points."""
1087        return int(np.prod(self.shape))
1088
1089    @property
1090    def nbytes(self) -> int:
1091        """Total memory occupied by the coordinate array in bytes."""
1092        return self.size * self.dtype.itemsize
1093
1094    @property
1095    def ndim(self) -> int:
1096        """Number of dimensions (always 1 for coordinate scale arrays)."""
1097        return self.data.info()[1]
1098
1099    @property
1100    def attrs(self) -> dict:
1101        """HDF4 attributes attached to this SDS dimension."""
1102        return self.data.attributes()
1103
1104    @property
1105    def data(self) -> np.ndarray:
1106        """pyhdf SDS object for this coordinate dimension."""
1107        return self._dataref._fileref.select(self._datalabel)
1108
1109
1110# =============================================================================
1111# Format mixins (HDF5 and HDF4 file I/O + raw array access)
1112# =============================================================================
1113
1114class _H5DataMixin:
1115    """Mixin providing HDF5 file I/O and raw array access via h5py.
1116
1117    Concrete data classes inherit from both this mixin and :class:`_HdfData`.
1118    The mixin supplies the ``_HDFN = 'h5'`` class variable, the
1119    :meth:`read_file` class method, and properties that delegate to the open
1120    :class:`h5py.File` handle.
1121    """
1122
1123    __slots__ = ()
1124    _HDFN = 5
1125
1126    @classmethod
1127    def read_file(cls, ifile: PathLike):
1128        """Open an HDF5 file for reading and return the :class:`h5py.File` handle."""
1129        return h5.File(ifile, 'r')
1130
1131    def open(self):
1132        """Re-open the HDF5 file if it was previously closed.  Returns ``self``."""
1133        if not self._fileref:
1134            self._fileref = self.read_file(self._filepath)
1135        return self
1136
1137    def close(self):
1138        """Close the HDF5 file handle.  Returns ``self``."""
1139        if self._fileref is not None:
1140            self._fileref.close()
1141            self._fileref = None
1142        return self
1143
1144    def delete(self):
1145        """Close the file handle during garbage collection (called by ``__del__``)."""
1146        fileref = getattr(self, '_fileref', None)
1147        if fileref is not None:
1148            fileref.close()
1149            self._fileref = None
1150
1151    @property
1152    def shape(self) -> tuple[int, ...]:
1153        """Array shape in storage order ``(Nφ, Nθ, Nr)``."""
1154        return self.data.shape
1155
1156    @property
1157    def dtype(self) -> np.dtype:
1158        """NumPy dtype of the stored array."""
1159        return self.data.dtype
1160
1161    @property
1162    def size(self) -> int:
1163        """Total number of elements in the array."""
1164        return self.data.size
1165
1166    @property
1167    def nbytes(self) -> int:
1168        """Total memory occupied by the array in bytes."""
1169        return self.data.nbytes
1170
1171    @property
1172    def ndim(self) -> int:
1173        """Number of array dimensions."""
1174        return self.data.ndim
1175
1176    @property
1177    def attrs(self) -> dict:
1178        """HDF5 attributes attached to this dataset as a plain Python dict."""
1179        return dict(self.data.attrs)
1180
1181    @property
1182    def data(self) -> np.ndarray:
1183        """h5py Dataset object providing lazy access to the array."""
1184        return self._fileref[self._datalabel]
1185
1186    def _set_scales(self):
1187        """Construct :class:`H5Scale` objects from h5py dimension scales."""
1188        self._scales = Scales(*(H5Scale(self, scale, label.label)
1189                                     for scale, label in zip('rtp', self.data.dims)))
1190
1191
1192class _H4DataMixin:
1193    """Mixin providing HDF4 file I/O and raw array access via pyhdf.
1194
1195    Analogous to :class:`_H5DataMixin` but for HDF4 files.  Raises an informative
1196    error at import time if pyhdf is not installed, via :func:`_except_no_pyhdf`.
1197    """
1198
1199    __slots__ = ()
1200    _HDFN = 4
1201
1202    @classmethod
1203    def read_file(cls, ifile: PathLike):
1204        """Open an HDF4 file for reading and return the pyhdf ``SD`` object."""
1205        _except_no_pyhdf()
1206        return h4.SD(str(ifile), h4.SDC.READ)
1207
1208    def open(self):
1209        """Re-open the HDF4 file if it was previously closed.  Returns ``self``."""
1210        if not self._fileref:
1211            self._fileref = self.read_file(self._filepath)
1212        return self
1213
1214    def close(self):
1215        """Close the HDF4 file handle via ``end()``."""
1216        if self._fileref is not None:
1217            self._fileref.end()
1218            self._fileref = None
1219
1220    def delete(self):
1221        """Close the HDF4 file handle during garbage collection."""
1222        fileref = getattr(self, '_fileref', None)
1223        if fileref is not None:
1224            fileref.end()
1225            self._fileref = None
1226
1227    @property
1228    def shape(self) -> tuple[int, ...]:
1229        """Array shape in storage order ``(Nφ, Nθ, Nr)``."""
1230        return tuple(self.data.info()[2])
1231
1232    @property
1233    def dtype(self) -> np.dtype:
1234        """NumPy dtype of the stored array."""
1235        return SDC_TYPE_CONVERSIONS[self.data.info()[3]]
1236
1237    @property
1238    def size(self) -> int:
1239        """Total number of elements in the array."""
1240        return int(np.prod(self.shape))
1241
1242    @property
1243    def nbytes(self) -> int:
1244        """Total memory occupied by the array in bytes."""
1245        return self.size * self.dtype.itemsize
1246
1247    @property
1248    def ndim(self) -> int:
1249        """Number of array dimensions."""
1250        return self.data.info()[1]
1251
1252    @property
1253    def attrs(self) -> dict:
1254        """HDF4 attributes attached to this SDS dataset as a plain Python dict."""
1255        return self.data.attributes()
1256
1257    @property
1258    def data(self) -> np.ndarray:
1259        """pyhdf SDS object providing lazy access to the array."""
1260        return self._fileref.select(self._datalabel)
1261
1262    def _set_scales(self):
1263        """Construct :class:`H4Scale` objects from pyhdf SDS dimensions.
1264
1265        HDF4 dimension order is reversed relative to HDF5 (Fortran vs. C order),
1266        so the dimension list is reversed before zipping with ``'rtp'``.
1267        """
1268        sds = self.data
1269        dims = list(reversed(list(sds.dimensions(full=1).items())))
1270        self._scales = Scales(*tuple(H4Scale(self, scale, k_)
1271                                     for scale, (k_, v_) in zip('rtp', dims)))
1272
1273
1274# =============================================================================
1275# Abstract data base
1276# =============================================================================
1277
1278class _HdfData(_HdfInterface, ABC):
1279    """Abstract base for a single PSI HDF dataset (data fields, not coordinate scales).
1280
1281    Handles file opening, metadata resolution, and scale construction.  Concrete
1282    subclasses are produced by combining this class with a format mixin
1283    (:class:`_H5DataMixin` or :class:`_H4DataMixin`) to form :class:`H5Data` and
1284    :class:`H4Data`.  Use :func:`PsiData` rather than instantiating these directly.
1285    """
1286
1287    __slots__ = ()
1288
1289    def __init__(self,
1290                 ifile: PathLike,
1291                 model: ModelType, /,
1292                 dataset_id: Optional[str] = None,
1293                 **kwargs):
1294        """Open an HDF file and parse metadata for one PSI output quantity.
1295
1296        Parameters
1297        ----------
1298        ifile : PathLike
1299            Path to the HDF4 or HDF5 file.  Must exist and have the extension
1300            expected by the format mixin (``'.h5'`` or ``'.hdf'``).
1301        dataset_id : str, optional
1302            Name of the dataset (SDS in HDF4, group key in HDF5) to open.  Defaults
1303            to the PSI standard dataset identifier for this format
1304            (:data:`~psi_io.psi_io.PSI_DATA_ID`).
1305        **kwargs
1306            Optional metadata overrides.  Accepted keys (from
1307            :data:`METADATA_SCHEMA`): ``'quantity'``, ``'sequence'``, ``'unit'``,
1308            ``'scalar'``, ``'mesh'``.  Caller-supplied values take precedence over
1309            both file attributes and filename inference.
1310
1311        Raises
1312        ------
1313        FileNotFoundError
1314            If *ifile* does not exist on disk.
1315        ValueError
1316            If *ifile* has the wrong extension for this format mixin, if the
1317            dataset is not three-dimensional, or if any required metadata field
1318            cannot be resolved.
1319        """
1320        ifile = Path(ifile)
1321        hdfv = f'h{self._HDFN}'
1322        if not ifile.is_file():
1323            raise FileNotFoundError(f"File '{ifile}' does not exist.")
1324        if ifile.suffix != _HDF_EXT_MAPPING[hdfv]:
1325            raise ValueError(f"File '{ifile}' does not have the correct extension for "
1326                             f"{self._HDFN} files (expected '{_HDF_EXT_MAPPING[hdfv]}' extension).")
1327        if model not in MODEL_TYPE:
1328            raise ValueError(f"Invalid model type {model!r}. Expected one of: {', '.join(MODEL_TYPE)}.")
1329
1330        self._filepath: Path = ifile
1331        self._datalabel: str = dataset_id or PSI_DATA_ID[hdfv]
1332        self._fileref = self.read_file(ifile)
1333        self._model: str = model
1334        self._values = None
1335
1336        self._set_properties(**self._parse_properties(**kwargs))
1337        self._set_scales()
1338
1339    def __enter__(self):
1340        """Open (or re-open) the file and return ``self`` for use as a context manager."""
1341        self.open()
1342        return self
1343
1344    def __exit__(self, *args):
1345        """Close the file handle when exiting the context manager."""
1346        self.close()
1347
1348    def __del__(self):
1349        """Close the file handle when the object is garbage-collected."""
1350        self.delete()
1351
1352    @classmethod
1353    @abstractmethod
1354    def read_file(cls, ifile: PathLike):
1355        """Open the HDF file at *ifile* and return the format-specific file handle."""
1356        ...
1357
1358    @abstractmethod
1359    def open(self):
1360        """Re-open the file handle if it was previously closed."""
1361        ...
1362
1363    @abstractmethod
1364    def close(self):
1365        """Close the open file handle and set the internal reference to ``None``."""
1366        ...
1367
1368    @abstractmethod
1369    def delete(self):
1370        """Release the file handle during garbage collection (called by ``__del__``)."""
1371        ...
1372
1373    @abstractmethod
1374    def _set_scales(self):
1375        """Construct and cache the :class:`Scales` named tuple of coordinate readers."""
1376        ...
1377
1378    def _parse_properties(self, **kwargs):
1379        """Resolve metadata fields by merging caller kwargs, file attrs, and filename.
1380
1381        Merges three sources (highest to lowest priority):
1382
1383        1. Keyword arguments in *kwargs* that match :data:`METADATA_SCHEMA` keys.
1384        2. HDF file-level attributes that match :data:`METADATA_SCHEMA` keys.
1385        3. Quantity name and sequence number inferred from the filename stem.
1386        4. The canonical :class:`~psi_io._models.Props` defaults for the resolved
1387           quantity (unit and mesh).
1388
1389        Parameters
1390        ----------
1391        **kwargs
1392            Caller-supplied metadata overrides.
1393
1394        Returns
1395        -------
1396        out : dict
1397            Fully populated metadata dict with keys
1398            ``{'quantity', 'sequence', 'unit', 'mesh'}``; the ``'scalar'`` key from
1399            :data:`METADATA_SCHEMA` is resolved internally but not forwarded to
1400            :meth:`_set_properties`.
1401
1402        Raises
1403        ------
1404        ValueError
1405            If any metadata field is still ``None`` after merging all sources.
1406        """
1407        prop_getter = get_model_prop_caller(self._model)
1408
1409        input_attrs = {k: v for k, v in kwargs.items() if k in METADATA_SCHEMA}
1410        file_attrs = {k: v for k, v in self.attrs.items() if k in METADATA_SCHEMA}
1411
1412        quantity = input_attrs.get('quantity',
1413                                   file_attrs.get('quantity',
1414                                                  extract_quantity_from_filepath(self._filepath, '')))
1415        sequence = input_attrs.get('sequence',
1416                                   file_attrs.get('sequence',
1417                                                  extract_sequence_from_filepath(self._filepath, 0)))
1418
1419        native_props = prop_getter(quantity)
1420        if self.ndim != native_props.ndim:
1421            raise ValueError(f"Expected {native_props.ndim}D dataset for quantity '{quantity}', "
1422                             f"found {self.ndim}D dataset with shape {self.shape}.")
1423
1424        native_attrs = dict(quantity=native_props.name,
1425                            sequence=sequence,
1426                            unit=native_props.unit,
1427                            mesh=native_props.mesh)
1428
1429        attributes = {**native_attrs, **file_attrs, **input_attrs}
1430        if any(v is None for v in attributes.values()):
1431            missing_meta = ', '.join(k for k, v in attributes.items() if v is None)
1432            raise ValueError(f"Malformed metadata: {missing_meta} is missing. "
1433                             f"Provide these as keyword arguments or ensure they "
1434                             f"are present in the file attributes.")
1435
1436        return attributes
1437
1438    def _set_properties(self,
1439                        quantity: str,
1440                        sequence: int,
1441                        unit: str,
1442                        mesh: MeshCodeType):
1443        """Store validated and type-coerced metadata on the instance.
1444
1445        Parameters
1446        ----------
1447        quantity : str
1448            Quantity name; stored as a string.
1449        sequence : int
1450            Sequence number; coerced to int.
1451        unit : str or u.Unit
1452            Unit; converted to :class:`~astropy.units.Unit` via ``u.Unit(str(unit))``.
1453        mesh : MeshCodeType
1454            Mesh stagger; normalized to ``tuple[Mesh, ...]`` via
1455            :func:`~psi_io._mesh._normalize_mesh_code`.
1456
1457        Raises
1458        ------
1459        ValueError
1460            If type coercion of any field fails.
1461        """
1462        try:
1463            self._quantity: str = str(quantity)
1464            self._sequence: int = int(sequence)
1465            self._unit: u.Unit = u.Unit(str(unit))
1466            self._mesh: tuple[Mesh, ...] = _normalize_mesh_code(mesh, self.ndim)
1467        except (ValueError, TypeError) as e:
1468            raise ValueError(f"Metadata type coercion failed: {e}") from e
1469
1470    @property
1471    def unit(self) -> u.Unit:
1472        """Astropy unit for converting code-unit values to physical unit."""
1473        return self._unit
1474
1475    @property
1476    def mesh(self) -> tuple[Mesh, ...]:
1477        """Normalized mesh-stagger tuple, one :class:`~psi_io._mesh.Mesh` per axis."""
1478        return self._mesh
1479
1480    @property
1481    def quantity(self) -> str:
1482        """Canonical lower-case quantity identifier (e.g. ``'br'``)."""
1483        return self._quantity
1484
1485    @property
1486    def sequence(self) -> int:
1487        """Integer time-step sequence number from the filename or file attributes."""
1488        return self._sequence
1489
1490    @property
1491    def scales(self) -> Scales:
1492        """Named tuple of coordinate scale readers ``(r, t, p)``.
1493
1494        Each element is an :class:`H5Scale` or :class:`H4Scale` that exposes the
1495        same :meth:`read` interface as the main data object, so coordinate arrays
1496        can be sliced in sync with the data.
1497        """
1498        return self._scales
1499
1500    def read(self,
1501             *args,
1502             scales: bool = True,
1503             **kwargs
1504             ) -> u.Quantity | tuple[u.Quantity, ...]:
1505        """Read a slice of data and optionally the corresponding coordinate arrays.
1506
1507        Delegates slice parsing, remeshing, and unit conversion to
1508        :meth:`_HdfInterface.read`, then optionally reads the matching slice of
1509        each coordinate scale.
1510
1511        Parameters
1512        ----------
1513        *args : int, slice, tuple, None, or Ellipsis
1514            Index arguments in physical ``(r, θ, φ)`` order.  See
1515            :meth:`_HdfInterface.read` for the full description.
1516        scales : bool, optional
1517            If ``True`` (default), also return the corresponding coordinate slice
1518            for each spatial axis.  If ``False``, return only the data array.
1519        **kwargs
1520            Forwarded to :meth:`_HdfInterface.read`; see that method for
1521            ``units`` and ``mesh`` keyword arguments.
1522
1523        Returns
1524        -------
1525        odata : u.Quantity
1526            Data array in the requested unit.
1527        r_scale : u.Quantity
1528            Radial coordinate values in solar radii (only if ``scales=True``).
1529        t_scale : u.Quantity
1530            Co-latitude values in radians (only if ``scales=True``).
1531        p_scale : u.Quantity
1532            Longitude values in radians (only if ``scales=True``).
1533
1534        Examples
1535        --------
1536        Read the full array and coordinate grids:
1537
1538        >>> data, r, t, p = reader.read()                   # doctest: +SKIP
1539
1540        Read a radial sub-range and convert to physical CGS units:
1541
1542        >>> data, r, t, p = reader.read(slice(10, 50), unit='physical')  # doctest: +SKIP
1543
1544        Read data only (no coordinate arrays):
1545
1546        >>> data = reader.read(scales=False)                # doctest: +SKIP
1547        """
1548        odata, sargs, remesh = super().read(*args, **kwargs)
1549        if not scales:
1550            return odata
1551        oscales = (scale._read(sarg, remesh=rmesh) for scale, sarg, rmesh in zip(self.scales, sargs, remesh))
1552        return odata, *oscales
1553
1554    def _read(self,
1555              *sargs,
1556              remesh) -> u.Quantity:
1557        """Internal read using pre-validated slice args."""
1558        return super()._read(*sargs, remesh=remesh)
1559
1560
1561    def vslice(self,
1562               *args,
1563               scales: bool = True,
1564               unit: Optional[str | UnitLike] = None,
1565               mesh: Optional[MeshCodeType] = None,
1566               bounds_error: bool = True,
1567               fill_value: Optional[QuantityLike] = None,
1568               ) -> u.Quantity | tuple[u.Quantity, ...]:
1569        """Read a slice of data by physical coordinate value with optional interpolation.
1570
1571        Each positional argument may be a physical coordinate value
1572        (:class:`~u.Quantity` or bare scalar), in which case the
1573        dataset is reduced to a 2-element window and linearly interpolated to that
1574        value.  Index-space arguments (``slice``, ``int``, ``None``, ``Ellipsis``)
1575        are also accepted and passed through without interpolation, making this
1576        method a superset of :meth:`read`.
1577
1578        Parameters
1579        ----------
1580        *args : u.Quantity, scalar, int, slice, tuple, None, or Ellipsis
1581            One argument per spatial axis in physical ``(r, θ, φ)`` order.
1582            u.Quantity or bare scalar → value-space interpolation.
1583            All other types → index-space slice (see :meth:`read`).
1584        scales : bool, optional
1585            If ``True`` (default), also return the corresponding coordinate value
1586            for each axis.  Value-interpolated axes return the interpolation target
1587            as a length-1 :class:`~astropy.units.Quantity`; index-space axes return the full slice.
1588        unit : str or u.Unit, optional
1589            Output unit; see :meth:`read` for accepted aliases and formats.
1590        mesh : MeshCodeType, optional
1591            Target mesh stagger.  Remeshing is skipped for axes that are being
1592            value-interpolated (interpolation already collapses the half-mesh
1593            window to a single value).
1594        bounds_error : bool, optional
1595            If ``True`` (default), raise :class:`ValueError` when a value argument
1596            lies outside its coordinate scale range.
1597        fill_value : QuantityLike or None, optional
1598            Value substituted when a coordinate value falls outside the 2-element
1599            interpolation window and *bounds_error* is ``False``.  ``None``
1600            silently extrapolates.
1601
1602        Returns
1603        -------
1604        odata : u.Quantity
1605            Sliced and interpolated data in the requested unit.
1606        r_scale, t_scale, p_scale : u.Quantity
1607            Coordinate values for each axis (only if ``scales=True``).
1608            Value-interpolated axes return the interpolation target as a
1609            length-1 array; index-space axes return the full coordinate slice.
1610
1611        Raises
1612        ------
1613        ValueError
1614            If *bounds_error* is ``True`` and any value argument is out of range,
1615            or if a value axis has no corresponding coordinate scale.
1616        """
1617        vslice_args = tuple(_parse_vslice_args(*args, scales=self.scales, bounds_error=bounds_error))
1618        slice_values, sargs = zip(*vslice_args)
1619
1620        if mesh is None:
1621            remesh = repeat(False, self.ndim)
1622        else:
1623            omesh_norm = _normalize_mesh_code(mesh, self.ndim)
1624            remesh = _parse_remesh(self.mesh, omesh_norm, 'C')
1625        remesh = tuple(remesh)
1626        remesh_xand_svalue = tuple(rm and sv is None for rm, sv in zip(remesh, slice_values))
1627
1628        pre_slice_data = _apply_units(self._read(*sargs, remesh=remesh_xand_svalue), unit)
1629
1630        pre_slice_scales = tuple(scale[sarg] * scale.unit if sv is not None else None
1631                             for scale, sarg, sv in zip(self.scales, sargs, slice_values))
1632
1633        sliced_data = _slice_array(pre_slice_data, slice_values, pre_slice_scales, fill_value)
1634        if not scales:
1635            return sliced_data
1636        sliced_scales = tuple(scale._read(sarg, remesh=rmesh) if sv is None else np.atleast_1d(sv)
1637                                 for scale, sarg, sv, rmesh in zip(self.scales, sargs, slice_values, remesh))
1638        return sliced_data, *sliced_scales
1639
1640
1641# =============================================================================
1642# Concrete data classes
1643# =============================================================================
1644
1645class H5Data(_H5DataMixin, _HdfData):
1646    """HDF5-backed MAS model data reader.
1647
1648    Combines :class:`_H5DataMixin` (h5py file I/O) with :class:`_HdfData`
1649    (MAS metadata and :meth:`read` logic).  Use :func:`PsiData` to instantiate
1650    rather than calling this class directly.
1651    """
1652
1653    __slots__ = _DATA_SLOTS
1654
1655
1656class H4Data(_H4DataMixin, _HdfData):
1657    """HDF4-backed MAS model data reader.
1658
1659    Combines :class:`_H4DataMixin` (pyhdf file I/O) with :class:`_HdfData`
1660    (MAS metadata and :meth:`read` logic).  Requires the optional ``pyhdf``
1661    dependency.  Use :func:`PsiData` to instantiate rather than calling this class
1662    directly.
1663    """
1664
1665    __slots__ = _DATA_SLOTS
1666
1667# =============================================================================
1668# Private helpers
1669# =============================================================================
1670
1671_DATA_CLASS_MAP = MappingProxyType({
1672    '.h5':  H5Data,
1673    '.hdf': H4Data,
1674})
1675"""Read-only mapping from HDF file extension to the concrete data reader class.
1676
1677Used by :func:`PsiData` to select between :class:`H5Data` (h5py) and
1678:class:`H4Data` (pyhdf) based on the file's suffix.
1679"""
1680
1681
[docs] 1682def PsiData(ifile: PathLike, /, 1683 model: ModelType = 'mas', 1684 **kwargs): 1685 """Open a PSI MAS or POT3D HDF file and return the appropriate data reader. 1686 1687 Inspects the file extension and *model* argument, selects the correct 1688 concrete reader (HDF5 or HDF4 backend), and returns it. No data are read 1689 from disk at construction time — metadata is resolved from the filename and 1690 HDF file attributes, and data transfer happens only inside :meth:`read` or 1691 :meth:`vslice`. Full-array reads are cached automatically; partial reads 1692 are not. 1693 1694 The returned reader exposes the following attributes: 1695 1696 - ``quantity`` — canonical lower-case quantity identifier (e.g. ``'br'``). 1697 - ``sequence`` — integer time-step sequence number. 1698 - ``unit`` — :class:`~astropy.units.Unit` for code → physical conversion; 1699 normalization constants are defined in :mod:`psi_io._units`. 1700 - ``mesh`` — per-axis Yee-grid stagger as a tuple of 1701 :class:`~psi_io._mesh.Mesh` members; see :mod:`psi_io._mesh`. 1702 - ``props`` — full :class:`~psi_io._models.Props` descriptor (name, 1703 description, unit, mesh code); see :mod:`psi_io._models`. 1704 - ``description`` — human-readable quantity description. 1705 - ``scales`` — ``Scales(r, t, p)`` named tuple of coordinate scale readers, 1706 each supporting the same :meth:`read` interface as the main reader. 1707 - ``shape``, ``ndim``, ``size``, ``nbytes``, ``dtype``, ``attrs`` — array 1708 metadata; shape is in HDF storage order ``(Nφ, Nθ, Nr)``. 1709 - ``is_cached`` — ``True`` after a full-array read has been cached. 1710 1711 Use :meth:`read` to load a slice by index and :meth:`vslice` to slice by 1712 physical coordinate value with linear interpolation. Both return data as 1713 :class:`~astropy.units.Quantity` objects in physical ``(r, θ, φ)`` order. 1714 The object also supports the context-manager protocol. 1715 1716 .. warning:: **POT3D unit convention** 1717 1718 POT3D applies no normalization to its output. The stored values are in 1719 whatever physical unit the input photospheric magnetogram used — most 1720 commonly Gauss, but this is not encoded in the file. The default 1721 ``unit`` for POT3D is ``dimensionless_unscaled`` (scale factor 1), so 1722 ``read(unit='physical')`` will not perform a meaningful conversion 1723 unless the correct unit is supplied at construction: 1724 1725 .. code-block:: python 1726 1727 reader = PsiData('br001.h5', model='pot3d', unit='Gauss') 1728 data, r, t, p = reader.read() 1729 1730 Parameters 1731 ---------- 1732 ifile : PathLike 1733 Path to the HDF4 (``.hdf``) or HDF5 (``.h5``) file. 1734 model : {'mas', 'pot3d'}, optional 1735 PSI model type. Defaults to ``'mas'``. 1736 dataset_id : str, optional 1737 Dataset name within the HDF file. Defaults to the PSI standard 1738 identifier for the given format. 1739 quantity : str, optional 1740 Override the quantity name inferred from the filename or file attributes. 1741 sequence : int, optional 1742 Override the time-step sequence number. 1743 unit : str or u.Unit, optional 1744 Override the code-to-physical unit from the quantity's 1745 :class:`~psi_io._models.Props` entry. Accepts any string parseable by 1746 :class:`~astropy.units.Unit` or a :class:`~astropy.units.Unit` instance. 1747 mesh : MeshCodeType, optional 1748 Override the mesh stagger from the quantity's 1749 :class:`~psi_io._models.Props` entry. 1750 1751 Returns 1752 ------- 1753 out : H5Data or H4Data 1754 Open reader implementing the full ``_HdfInterface`` API. Concrete type 1755 depends on the file extension. 1756 1757 Raises 1758 ------ 1759 ValueError 1760 If the file extension / model combination is unsupported or required 1761 metadata cannot be resolved. 1762 FileNotFoundError 1763 If *ifile* does not exist. 1764 1765 See Also 1766 -------- 1767 astropy.units.Unit : Unit constructor — accepts strings, compound 1768 expressions, and :class:`~astropy.units.Unit` instances. 1769 astropy.units.Quantity.to : Unit conversion used internally when 1770 a ``unit`` string is supplied to :meth:`read`. 1771 1772 Examples 1773 -------- 1774 Read a MAS radial field — full array with coordinate scales, then convert: 1775 1776 >>> from psi_io.mhd_io import PsiData # doctest: +SKIP 1777 >>> reader = PsiData('br001001.h5') 1778 >>> data, r, t, p = reader.read() # code units (MAS_b) 1779 >>> data, r, t, p = reader.read(unit='Gauss') # convert to Gauss 1780 1781 Use as a context manager: 1782 1783 >>> with PsiData('vr001001.h5') as reader: # doctest: +SKIP 1784 ... data, r, t, p = reader.read(unit='km/s') 1785 1786 Inspect metadata without loading data: 1787 1788 >>> reader = PsiData('rho001001.h5') # doctest: +SKIP 1789 >>> reader.quantity # 'rho' 1790 >>> reader.unit # MAS_n 1791 >>> reader.mesh # (Mesh.HALF, Mesh.HALF, Mesh.HALF) 1792 >>> reader.is_cached # False 1793 """ 1794 ifile = Path(ifile) 1795 cls = _DATA_CLASS_MAP.get(ifile.suffix) 1796 if cls is None: 1797 raise ValueError( 1798 f"Unsupported HDF extension '{ifile.suffix}'" 1799 ) from None 1800 return cls(ifile, model.lower(), **kwargs)