1r"""Lazy HDF readers for PSI MAS and POT3D magnetohydrodynamic model output.
2
3This module provides a unified, unit-aware interface for reading three-dimensional
4field variables from Predictive Science Inc.'s MAS and POT3D solvers. Both HDF4
5(``.hdf``) and HDF5 (``.h5``) files are supported through a common API. The sole
6public symbol exported by this module is :func:`PsiData`.
7
8.. rubric:: Entry Point
9
10:func:`PsiData` is a factory function that opens a PSI HDF file and returns a
11lazy reader object. The file extension determines the I/O backend — ``.h5``
12files are read via `h5py <https://www.h5py.org/>`_ and ``.hdf`` files via
13`pyhdf <https://fhs.github.io/pyhdf/>`_. The *model* argument (``'mas'`` or
14``'pot3d'``) selects the quantity-property and mesh-stagger metadata tables
15applied during metadata resolution:
16
17.. code-block:: python
18
19 from psi_io.mhd_io import PsiData
20
21 reader = PsiData('br001001.h5') # MAS HDF5 (default)
22 reader = PsiData('br001001.hdf', model='mas') # MAS HDF4
23 reader = PsiData('br001.h5', model='pot3d') # POT3D HDF5
24
25.. rubric:: Lazy Loading and Caching
26
27No data are read from disk at construction time. Metadata — quantity name,
28sequence number, physical unit, and mesh stagger — are parsed from the filename
29stem and/or HDF file attributes. Data are transferred from disk only when
30:meth:`read` or :meth:`vslice` is called.
31
32When a read or vslice call covers the *entire* dataset with no spatial
33restrictions, the result is cached on the reader object. Subsequent full-array
34calls return the cached copy without a second disk read. Partial reads — any
35call with at least one non-full-axis argument — are never cached. The cache
36state is exposed via the ``is_cached`` property.
37
38.. rubric:: Attributes
39
40The following attributes are available on every object returned by
41:func:`PsiData`:
42
43``quantity`` : :class:`str`
44 Canonical lower-case quantity identifier (e.g. ``'br'``, ``'vr'``,
45 ``'t'``). Resolved from the filename stem, HDF file attributes, or the
46 ``quantity`` keyword argument passed to :func:`PsiData`.
47
48``sequence`` : :class:`int`
49 Time-step sequence number extracted from the filename
50 (e.g. ``br001001.h5`` → ``1001``) or from HDF file attributes.
51
52``unit`` : :class:`~astropy.units.Unit`
53 Conversion factor from one code unit to physical units. For MAS
54 quantities these are the custom normalization units defined in
55 :mod:`psi_io._units` (e.g. :data:`~psi_io._units.MAS_b` ≈ 2.2 G for
56 magnetic field, :data:`~psi_io._units.MAS_v` ≈ 481 km s⁻¹ for velocity).
57 For POT3D the default is dimensionless — see the :func:`PsiData` warning.
58
59``mesh`` : :class:`tuple` of :class:`~psi_io._mesh.Mesh`
60 Yee-grid stagger position for each spatial axis in physical ``(r, θ, φ)``
61 order. Each element is either :attr:`~psi_io._mesh.Mesh.HALF` (offset by
62 half a cell spacing) or :attr:`~psi_io._mesh.Mesh.MAIN` (cell-centred).
63 The integer encoding and per-quantity stagger codes are defined in
64 :mod:`psi_io._mesh`; the canonical default for each quantity is stored in
65 its :class:`~psi_io._models.Props` descriptor (see :mod:`psi_io._models`).
66
67``props`` : :class:`~psi_io._models.Props`
68 Complete property descriptor for the quantity, bundling its name,
69 description, unit, and mesh code. Looked up from the appropriate
70 quantity-properties mapping in :mod:`psi_io._models`.
71
72``description`` : :class:`str`
73 Human-readable description of the physical quantity
74 (e.g. ``'MAS Magnetic Field (Radial Component)'``). Derived from
75 :attr:`~psi_io._models.Props.desc`.
76
77``scales`` : ``Scales(r, t, p)``
78 Named tuple of coordinate scale readers. Each element wraps the
79 one-dimensional coordinate array stored in the HDF file. The radial
80 coordinate uses :data:`~psi_io._units.PSI_rsun` (solar radii); the
81 co-latitude and longitude coordinates use
82 :data:`~psi_io._units.PSI_angle` (radians). Each scale reader exposes
83 the same :meth:`read` interface as the main data reader.
84
85``shape`` : :class:`tuple` of :class:`int`
86 Array dimensions in HDF storage order ``(Nφ, Nθ, Nr)`` — the radial
87 axis is *last* due to Fortran column-major convention. All slicing APIs
88 accept arguments in physical ``(r, θ, φ)`` order and reverse the indexing
89 internally.
90
91``ndim`` : :class:`int`
92 Number of spatial dimensions; always ``3`` for MAS and POT3D field
93 variables.
94
95``size`` : :class:`int`
96 Total element count (``Nφ × Nθ × Nr``).
97
98``nbytes`` : :class:`int`
99 Dataset size in bytes.
100
101``dtype`` : :class:`numpy.dtype`
102 Element type of the stored dataset (typically ``float32``).
103
104``attrs`` : :class:`dict`
105 HDF file-level attributes as a plain Python dictionary.
106
107``is_cached`` : :class:`bool`
108 ``True`` once a full-array read has populated the in-memory cache;
109 ``False`` otherwise.
110
111.. rubric:: Reading Data — ``read``
112
113.. code-block:: python
114
115 odata[, r, t, p] = reader.read(*args, unit=None, mesh=None, scales=True)
116
117Each positional argument restricts one spatial axis in physical ``(r, θ, φ)``
118order:
119
120- Omitted / ``None`` — full axis.
121- ``int`` — single index; axis retained as a length-1 dimension.
122- ``slice`` — standard Python slice.
123- ``(start, stop)`` or ``(start, stop, step)`` — converted to a slice.
124- ``Ellipsis`` — expands to ``None`` for all remaining axes.
125
126**unit**
127 Output unit. String aliases ``'native'`` / ``'code'`` / ``'model'`` /
128 ``'psi'`` return values in MAS code units (the units defined in
129 :mod:`psi_io._units`). Aliases ``'real'`` / ``'phys'`` / ``'physical'``
130 / ``'cgs'`` call :func:`~psi_io._units.decompose_mas_units` to express
131 values in CGS base units. Any other string or
132 :class:`~astropy.units.Unit` instance is forwarded to
133 :meth:`astropy.units.Quantity.to`.
134
135**mesh**
136 Target mesh stagger (:data:`~psi_io._mesh.MeshCodeType`). Axes that are
137 on the half mesh in the stored data but on the main mesh in *mesh* are
138 averaged via :func:`~psi_io._mesh.remesh_array`. Up-sampling
139 (main → half) raises :exc:`ValueError`.
140
141**scales**
142 If ``True`` (default), return the corresponding coordinate slice for each
143 axis as additional :class:`~astropy.units.Quantity` values ``(r, t, p)``
144 after the data array.
145
146.. note::
147 PSI HDF files are written in Fortran column-major order so that numpy
148 reads them with shape ``(Nφ, Nθ, Nr)`` — the radial axis is *last*. All
149 positional arguments to :meth:`read` and the returned coordinate scales
150 are in physical ``(r, θ, φ)`` order regardless of on-disk layout.
151
152.. rubric:: Value-Space Slicing — ``vslice``
153
154.. code-block:: python
155
156 odata[, r, t, p] = reader.vslice(*args, unit=None, mesh=None, scales=True,
157 bounds_error=True, fill_value=None)
158
159``vslice`` is a superset of :meth:`read` that additionally accepts physical
160coordinate values as positional arguments. Passing a
161:class:`~astropy.units.Quantity` or a bare scalar for an axis locates the two
162nearest grid points, loads a 2-element window, and linearly interpolates to the
163target value; the resulting axis has size 1. Index-space arguments (``slice``,
164``int``, ``None``, ``Ellipsis``) are handled identically to :meth:`read`.
165
166**bounds_error**
167 If ``True`` (default), raise :exc:`ValueError` when a value is outside the
168 coordinate range. Set to ``False`` and supply a **fill_value** to replace
169 out-of-bounds points; pass ``fill_value=None`` to extrapolate silently.
170
171When ``scales=True``, value-interpolated axes return the target coordinate as
172a length-1 :class:`~astropy.units.Quantity`; index-space axes return the
173corresponding coordinate slice as usual.
174
175.. rubric:: File Lifecycle
176
177Readers hold an open HDF file handle. Use the context-manager protocol to
178guarantee cleanup:
179
180.. code-block:: python
181
182 with PsiData('br001001.h5') as reader:
183 data, r, t, p = reader.read(unit='Gauss')
184
185Alternatively, call ``reader.close()`` and ``reader.open()`` to manage the
186handle explicitly. The handle is also released automatically when the reader
187is garbage-collected.
188
189.. rubric:: Supported Quantities
190
191**MAS** (19 quantities):
192
193.. list-table::
194 :widths: 12 32 24 32
195 :header-rows: 1
196
197 * - Key(s)
198 - Description
199 - Code unit
200 - Stagger ``(r, θ, φ)``
201 * - ``br``, ``bt``, ``bp``
202 - Magnetic field components
203 - :data:`~psi_io._units.MAS_b` (≈ 2.2 G)
204 - ``HALF/MAIN/MAIN``, ``MAIN/HALF/MAIN``, ``MAIN/MAIN/HALF``
205 * - ``vr``, ``vt``, ``vp``
206 - Velocity components
207 - :data:`~psi_io._units.MAS_v` (≈ 481 km s⁻¹)
208 - ``MAIN/HALF/HALF``, ``HALF/MAIN/HALF``, ``HALF/HALF/MAIN``
209 * - ``jr``, ``jt``, ``jp``
210 - Current density components
211 - :data:`~psi_io._units.MAS_j`
212 - ``MAIN/HALF/HALF``, ``HALF/MAIN/HALF``, ``HALF/HALF/MAIN``
213 * - ``t``, ``te``, ``tp``
214 - Temperature (single / electron / proton)
215 - :data:`~psi_io._units.MAS_t` (≈ 28 MK)
216 - ``HALF/HALF/HALF``
217 * - ``rho``
218 - Mass density
219 - :data:`~psi_io._units.MAS_n` (10⁸ cm⁻³)
220 - ``HALF/HALF/HALF``
221 * - ``p``
222 - Thermal pressure
223 - :data:`~psi_io._units.MAS_p`
224 - ``HALF/HALF/HALF``
225 * - ``ep``, ``em``
226 - Alfvén wave energy density (outward / inward)
227 - :data:`~psi_io._units.MAS_p`
228 - ``HALF/HALF/HALF``
229 * - ``zp``, ``zm``
230 - Elsässer wave amplitudes (outward / inward)
231 - :data:`~psi_io._units.MAS_v`
232 - ``HALF/HALF/HALF``
233 * - ``heat``
234 - Volumetric coronal heating rate
235 - :data:`~psi_io._units.MAS_heat`
236 - ``HALF/HALF/HALF``
237
238**POT3D** (3 quantities): ``br``, ``bt``, ``bp`` — magnetic field components,
239unit :data:`~psi_io._units.POT3D_b` (dimensionless by default; see the
240:func:`PsiData` warning regarding unit declaration).
241
242.. rubric:: Examples
243
244Full MAS radial field read with coordinate scales:
245
246.. code-block:: python
247
248 from psi_io.mhd_io import PsiData
249
250 reader = PsiData('br001001.h5')
251 data, r, t, p = reader.read() # code units (MAS_b)
252 data, r, t, p = reader.read(unit='Gauss') # convert to Gauss on the fly
253
254Partial read — inner radial shell in CGS base units, coordinates suppressed:
255
256.. code-block:: python
257
258 data = reader.read(slice(0, 10), unit='physical', scales=False)
259
260Value-space slice — extract the r = 2.5 R☉ surface:
261
262.. code-block:: python
263
264 data, r, t, p = reader.vslice(2.5, unit='Gauss') # bare scalar → native coord unit
265
266POT3D file — unit must be declared at construction because it is not encoded in
267the file:
268
269.. code-block:: python
270
271 reader = PsiData('br001.hdf', model='pot3d', unit='Gauss')
272 data, r, t, p = reader.read()
273
274Inspect metadata without loading data:
275
276.. code-block:: python
277
278 reader = PsiData('rho001001.h5')
279 reader.quantity # 'rho'
280 reader.description # 'MAS Density'
281 reader.unit # MAS_n
282 reader.mesh # (Mesh.HALF, Mesh.HALF, Mesh.HALF)
283 reader.is_cached # False
284 reader.shape # (Nφ, Nθ, Nr) — numpy storage order
285"""
286
287from __future__ import annotations
288
289__all__ = ['PsiData',]
290
291from abc import abstractmethod, ABC
292from collections import namedtuple
293from collections.abc import Sequence
294from itertools import repeat
295from pathlib import Path
296from types import MappingProxyType
297from typing import TYPE_CHECKING, Optional, Literal, ClassVar
298import numpy as np
299import h5py as h5
300import astropy.units as u
301
302if TYPE_CHECKING:
303 from astropy.units.typing import UnitLike, QuantityLike
304
305try:
306 import pyhdf.SD as h4
307except ImportError:
308 h4 = None
309
310from psi_io._mesh import (Mesh,
311 MeshCodeType,
312 _normalize_mesh_code,
313 _remesh_array,
314 _parse_remesh, ArrayOrdering,
315 )
316from psi_io._models import (Props,
317 PsiScales,
318 ModelType,
319 extract_quantity_from_filepath,
320 extract_sequence_from_filepath, get_model_prop_caller, MODEL_TYPE)
321from psi_io._units import decompose_mas_units
322from psi_io.psi_io import (PathLike,
323 PSI_DATA_ID,
324 SDC_TYPE_CONVERSIONS,
325 _except_no_pyhdf,)
326
327
328HdfVersionType = Literal[4, 5]
329"""Literal type alias for the two supported HDF file format versions.
330
331``4`` — HDF4, accessed via pyhdf (optional dependency).
332``5`` — HDF5, accessed via h5py.
333"""
334
335
336_HDF_EXT_MAPPING = MappingProxyType({'h4': '.hdf', 'h5': '.h5', })
337"""Mapping from HDF version string to file extension.
338
339Used by :class:`_HdfData.__init__` to validate that the supplied file has an
340extension consistent with the concrete class's format mixin.
341
342``'h5'`` → ``'.h5'`` (HDF5 files, read via h5py)
343``'h4'`` → ``'.hdf'`` (HDF4 files, read via pyhdf)
344"""
345
346
347_DATA_SLOTS = ('_model', '_fileref', '_filepath', '_datalabel', '_quantity', '_sequence', '_unit', '_mesh', '_scales', '_values')
348"""Slot names shared by all concrete :class:`_HdfData` subclasses.
349
350Stored as a module-level tuple so it can be referenced in ``__slots__`` declarations
351of both the abstract base and the concrete classes without repetition.
352"""
353
354_SCALE_SLOTS = ('_model', '_dataref', '_datalabel', '_quantity', '_unit', '_mesh', '_values')
355"""Slot names shared by all concrete :class:`_HdfScale` subclasses.
356
357Stored as a module-level tuple so it can be referenced in ``__slots__`` declarations
358of both :class:`H5Scale` and :class:`H4Scale` without repetition.
359"""
360
361
362_CODE_UNIT_ALIASES = {'native', 'code', 'model', 'psi'}
363"""Set of strings that request code-unit output from :meth:`_HdfInterface.read`.
364
365When the ``units`` argument to ``read()`` is one of these strings, the data are
366returned in MAS code units (dimensionless ratios) without any physical conversion.
367"""
368
369
370_REAL_UNIT_ALIASES = {'real', 'phys', 'physical', 'cgs'}
371"""Set of strings that request decomposed CGS output from :meth:`_HdfInterface.read`.
372
373When the ``units`` argument to ``read()`` is one of these strings, the data are
374converted to physical CGS units via :func:`~psi_io._units.decompose_mas_units`.
375"""
376
377
378METADATA_SCHEMA = dict.fromkeys(['quantity', 'sequence', 'unit', 'scalar', 'mesh'])
379"""Template dict defining the five recognised metadata fields for PSI HDF datasets.
380
381Keys
382----
383``'quantity'``
384 Canonical lower-case quantity identifier extracted from the filename or file
385 attributes (e.g. ``'br'``, ``'vr'``).
386``'sequence'``
387 Integer time-step sequence number extracted from the filename or file attributes.
388``'unit'``
389 Code-to-physical unit for this quantity, as an :class:`~astropy.units.Unit`
390 or a string parseable by it.
391``'scalar'``
392 ``True`` if the quantity is a scalar field; ``False`` for vector components.
393``'mesh'``
394 Staggered-grid mesh code (:data:`~psi_io._mesh.MeshCodeType`) describing the
395 Yee-grid position of the quantity on each spatial axis.
396
397Used by :meth:`_HdfData._parse_properties` to filter keyword overrides and HDF file
398attributes against the set of known metadata keys.
399"""
400
401
402Scales = namedtuple("Scales", ['r', 't', 'p'])
403"""Named tuple holding the three coordinate scale readers for a data object.
404
405Fields
406------
407r : H5Scale or H4Scale
408 Radial coordinate scale in solar radii.
409t : H5Scale or H4Scale
410 Co-latitude scale in radians.
411p : H5Scale or H4Scale
412 Longitude scale in radians.
413
414Created by :meth:`_HdfData._set_scales` during instantiation and exposed as
415:attr:`_HdfData.scales`.
416"""
417
418
419def _interpolate_dim(arr: QuantityLike,
420 axis: int,
421 value: QuantityLike,
422 scale: QuantityLike,
423 fill_value: Optional[QuantityLike]
424 ) -> QuantityLike:
425 """Linearly interpolate *arr* along *axis* to *value*, collapsing that axis to size 1.
426
427 Both *value* and *scale* must carry the same unit (or both be dimensionless).
428 The result is a fill array when *value* is outside the 2-element *scale* window
429 and *fill_value* is not ``None``; otherwise a linear interpolation is performed.
430 Extrapolation occurs silently when *fill_value* is ``None``.
431 """
432 if arr.shape[axis] != 2 or len(scale) != 2:
433 raise ValueError("Interpolation is only supported for 2-element arrays and scales.")
434 if (value < scale[0] or value > scale[1]) and fill_value is not None:
435
436 return np.full(arr.shape[:axis] + (1,) + arr.shape[axis + 1:], fill_value, dtype=arr.dtype)
437 t = (value - scale[0]) / (scale[1] - scale[0])
438 slc_lo = [slice(None)] * arr.ndim
439 slc_hi = [slice(None)] * arr.ndim
440 slc_lo[axis] = slice(None, -1)
441 slc_hi[axis] = slice(1, None)
442 return (1.0 - t) * arr[tuple(slc_lo)] + t * arr[tuple(slc_hi)]
443
444
445def _slice_array(data: QuantityLike,
446 values: Sequence[Optional[QuantityLike]],
447 scales: Sequence[Optional[QuantityLike]],
448 fill_value: Optional[QuantityLike],
449 order: ArrayOrdering = 'F') -> QuantityLike:
450 """Interpolate *data* to physical values along each axis that has a non-``None`` entry.
451
452 Iterates over axes in storage order (reversed from physical order when
453 ``order='F'``) and calls :func:`_interpolate_dim` for each axis whose
454 entry in *values* is not ``None``. Axes with ``None`` entries are left
455 unchanged. *scales* must be in the same order as *values* and provide the
456 two-element coordinate window for each interpolated axis.
457 """
458 if order == 'F':
459 values, scales = reversed(values), reversed(scales)
460 for i, (v, s) in enumerate(zip(values, scales)):
461 if v is not None:
462 if s is None:
463 raise ValueError("Cannot interpolate to a value when the corresponding scale is not provided.")
464 data = _interpolate_dim(data, i, v, s, fill_value)
465 return data
466
467
468def _expand_args(*args, ndim: int) -> tuple:
469 """Expand *args* to a length-*ndim* tuple, replacing ``Ellipsis`` and padding with ``None``.
470
471 Parameters
472 ----------
473 *args : object
474 User-supplied dimension arguments. An ``Ellipsis`` is replaced by the
475 appropriate number of ``None`` values so that the total length equals *ndim*.
476 If fewer than *ndim* arguments are provided, trailing ``None``\\ s are appended.
477 ndim : int
478 Target length of the returned tuple.
479
480 Returns
481 -------
482 out : tuple
483 Length-*ndim* tuple of dimension arguments.
484 """
485 if Ellipsis in args:
486 n_missing = ndim - (len(args) - 1)
487 idx = args.index(Ellipsis)
488 args = args[:idx] + (None,) * n_missing + args[idx + 1:]
489 if len(args) < ndim:
490 args += (None,) * (ndim - len(args))
491 return args
492
493
494def _parse_islice_args(*args, shape: tuple[int, ...],):
495 """Normalize index-space slice arguments to a tuple of :class:`slice` objects.
496
497 Accepts a mix of ``None``, ``int``, ``slice``, ``(start, stop[, step])`` tuples,
498 and ``Ellipsis``, and yields one slice per spatial axis.
499
500 Parameters
501 ----------
502 *args : None, int, slice, tuple, or Ellipsis
503 Index arguments in physical ``(r, θ, φ)`` user order. Fewer arguments than
504 dimensions are padded with ``None`` (full-axis slices).
505 shape : tuple[int, ...]
506 Physical ``(r, θ, φ)`` shape (i.e. ``reversed(self.shape)`` from storage).
507
508 Yields
509 ------
510 s : slice
511 Normalized slice for each axis.
512
513 Raises
514 ------
515 ValueError
516 If a slice argument produces an empty dimension (``stop <= start``).
517 TypeError
518 If an argument cannot be converted to a slice (via :func:`_cast_to_slice`).
519 """
520 sargs = _expand_args(*args, ndim=len(shape))
521
522 for arg, dim_size in zip(sargs, shape):
523 slice_ = _cast_to_slice(arg)
524 start, stop, step = slice_.indices(dim_size)
525 if stop <= start:
526 raise ValueError(f"Slice argument {arg!r} yields an empty dimension.")
527 yield slice_
528
529
530def _parse_vslice_args(*args,
531 scales: Scales,
532 bounds_error: bool):
533 """Convert value-space dimension arguments to ``(value, index_slice)`` pairs.
534
535 For each axis, a bare scalar or :class:`~u.Quantity` triggers a
536 value-space lookup: the coordinate scale is searched to find the two bracketing
537 indices, and a 2-element slice is returned for later linear interpolation via
538 :func:`_interpolate_dim`. All other argument types (``None``, ``int``,
539 ``slice``, tuple) are treated as index-space and passed through to
540 :func:`_cast_to_slice` unchanged, with ``None`` returned as the value.
541
542 Parameters
543 ----------
544 *args : scalar, u.Quantity, int, slice, tuple, None, or Ellipsis
545 One argument per spatial axis in physical ``(r, θ, φ)`` order. Fewer
546 arguments than dimensions are padded with ``None`` via :func:`_expand_args`.
547 scales : Scales
548 Coordinate scale readers in physical ``(r, θ, φ)`` order. Used to look
549 up the native unit and perform bounds checking.
550 bounds_error : bool
551 If ``True``, raise :class:`ValueError` when a value lies outside the
552 range of its coordinate scale.
553
554 Yields
555 ------
556 value : u.Quantity or None
557 The physical target value (in the scale's native unit) for value-space
558 axes; ``None`` for index-space axes.
559 s : slice
560 Corresponding index-space slice. For value-space axes this is the
561 2-element bracketing window ``slice(i-1, i+1)``; for index-space axes
562 it is the result of :func:`_cast_to_slice`.
563
564 Raises
565 ------
566 ValueError
567 If *bounds_error* is ``True`` and a value lies outside its scale range.
568 """
569 sargs = _expand_args(*args, ndim=len(scales))
570 for arg, scale in zip(sargs, scales):
571 value = None
572 if np.isscalar(arg):
573 arg = arg * scale.unit
574 if isinstance(arg, u.Quantity):
575 value = arg.to(scale.unit)
576 raw_value = value.value
577 if bounds_error and (raw_value < scale[0] or raw_value > scale[-1]):
578 raise ValueError(f"Value {arg} is out of bounds for scale '{scale.quantity}' "
579 f"with range {scale.data[0]:.3f} to {scale.data[-1]:.3f} {scale.unit}.")
580 index = int(np.clip(np.searchsorted(scale[...], raw_value), 1, scale.size - 1))
581 arg = (index - 1, index + 1)
582 slice_ = _cast_to_slice(arg)
583 start, stop, step = slice_.indices(scale.size)
584 if stop <= start:
585 raise ValueError(f"Slice argument {arg!r} yields an empty dimension.")
586 yield value, slice_
587
588
589def _apply_units(data: u.Quantity, unit: Optional[str | UnitLike]) -> u.Quantity:
590 """Apply a unit conversion to *data*, returning a :class:`~u.Quantity`.
591
592 Parameters
593 ----------
594 data : u.Quantity
595 Data in code units.
596 unit : str, u.Unit, or None
597 Requested output unit. ``None`` is a no-op. Special string aliases:
598 ``'native'`` / ``'code'`` / ``'model'`` / ``'psi'`` — return *data*
599 unchanged; ``'real'`` / ``'phys'`` / ``'physical'`` / ``'cgs'`` —
600 decompose to CGS base unit via
601 :func:`~psi_io._units.decompose_mas_units`. Any other value is
602 forwarded to :meth:`~u.Quantity.to`.
603
604 Returns
605 -------
606 out : u.Quantity
607 *data* in the requested unit.
608 """
609 if unit is None:
610 return data
611 ounit = str(unit).lower()
612 if ounit in _CODE_UNIT_ALIASES:
613 return data
614 if ounit in _REAL_UNIT_ALIASES:
615 return decompose_mas_units(data)
616 return data.to(unit)
617
618
619def _cast_to_slice(input: None | int | slice | Sequence) -> slice:
620 """Convert a dimension index argument to a :class:`slice` object.
621
622 Parameters
623 ----------
624 input : None, int, slice, list, or tuple
625 - ``None`` → ``slice(None)`` (full axis).
626 - :class:`int` → ``slice(input, input + 1)`` (single element, axis retained).
627 - :class:`slice` → returned unchanged.
628 - :class:`list` or :class:`tuple` → ``slice(*input)`` (unpacked as
629 ``(start, stop)`` or ``(start, stop, step)``).
630
631 Returns
632 -------
633 out : slice
634
635 Raises
636 ------
637 TypeError
638 If *input* is not one of the recognized types.
639 """
640 if input is None:
641 return slice(None)
642 elif isinstance(input, int):
643 return slice(input, input + 1)
644 elif isinstance(input, slice):
645 return input
646 elif isinstance(input, (list, tuple)):
647 return slice(*input)
648 else:
649 raise TypeError(f"Invalid slice argument: {input!r}. Expected int, slice, or sequence.")
650
651
652# =============================================================================
653# Abstract interface
654# =============================================================================
655
656class _HdfInterface(ABC):
657 """Abstract base class defining the full public interface for PSI HDF data objects.
658
659 All concrete readers (MAS, POT3D, coordinate scales) and their format mixins
660 satisfy this interface. Consumers should program against this class rather than
661 against concrete subclasses.
662
663 Subclasses must implement all abstract properties and the :meth:`read` and
664 :meth:`_read` methods. The ``read`` method in this base class provides shared
665 unit-conversion and remeshing logic that concrete implementations invoke via
666 ``super().read(*args, **kwargs)``.
667
668 Notes
669 -----
670 The ``__slots__ = ()`` declaration in this class is intentional: it ensures that
671 concrete subclasses using ``__slots__`` do not incur a ``__dict__`` overhead
672 from the abstract base.
673 """
674
675 __slots__ = ()
676
677 _HDFN: ClassVar[HdfVersionType] # provided by format mixin
678
679 def __getitem__(self, args):
680 """Index the underlying HDF dataset with physical ``(r, θ, φ)`` axis ordering.
681
682 PSI HDF files are written in Fortran column-major order so that the radial
683 index ``r`` varies fastest. When read into numpy (row-major), this results
684 in storage shape ``(Nφ, Nθ, Nr)``. This method re-reverses the user-supplied
685 index tuple so that callers can use natural physical ordering ``(r, θ, φ)``:
686
687 .. code-block:: python
688
689 obj[r_slice, t_slice, p_slice] # user order
690 # internally becomes:
691 obj.data[p_slice, t_slice, r_slice] # numpy storage order
692
693 Parameters
694 ----------
695 args : int, slice, or tuple
696 Index argument(s) in physical ``(r, θ, φ)`` order. A single non-tuple
697 argument is treated as indexing along the first physical axis (``r``).
698
699 Returns
700 -------
701 out : numpy.ndarray
702 Slice of the dataset in numpy storage order ``(Nφ, Nθ, Nr)``.
703 """
704 if not isinstance(args, tuple):
705 args = (args,)
706
707 if self._values is not None:
708 return self._values[args[::-1]]
709 else:
710 odata = self.data[args[::-1]]
711 if odata.shape == self.shape:
712 self._values = odata
713 return odata
714
715 @property
716 @abstractmethod
717 def shape(self) -> tuple[int, ...]:
718 """Shape of the underlying numpy array in storage order ``(Nφ, Nθ, Nr)``."""
719 ...
720
721 @property
722 @abstractmethod
723 def dtype(self) -> np.dtype:
724 """NumPy dtype of the stored array."""
725 ...
726
727 @property
728 @abstractmethod
729 def size(self) -> int:
730 """Total number of elements in the array."""
731 ...
732
733 @property
734 @abstractmethod
735 def nbytes(self) -> int:
736 """Total memory occupied by the array in bytes."""
737 ...
738
739 @property
740 @abstractmethod
741 def ndim(self) -> int:
742 """Number of array dimensions (always 3 for data objects, 1 for scales)."""
743 ...
744
745 @property
746 @abstractmethod
747 def attrs(self) -> dict:
748 """HDF attributes attached to this dataset as a plain Python dict."""
749 ...
750
751 @property
752 @abstractmethod
753 def unit(self) -> u.Unit:
754 """Astropy unit that converts one code-unit value to physical unit."""
755 ...
756
757 @property
758 @abstractmethod
759 def mesh(self) -> tuple[Mesh, ...]:
760 """Normalized mesh-stagger tuple, one :class:`~psi_io._mesh.Mesh` per axis."""
761 ...
762
763 @property
764 @abstractmethod
765 def quantity(self) -> str:
766 """Canonical lower-case quantity identifier (e.g. ``'br'``)."""
767 ...
768
769 @property
770 @abstractmethod
771 def data(self) -> np.ndarray:
772 """Raw array view into the HDF dataset (not yet loaded into RAM).
773
774 For HDF5 objects this is an h5py :class:`~h5py.Dataset`; for HDF4 objects
775 it is a pyhdf ``SDS`` object. Actual data transfer occurs only when the
776 returned object is indexed or converted to a numpy array.
777 """
778 ...
779
780 @property
781 def props(self) -> Props:
782 """The :class:`~psi_io._models.Props` descriptor for this quantity.
783
784 Returns the full property bundle (name, description, unit, mesh code) from
785 the appropriate mapping for this reader's model type and quantity.
786
787 Returns
788 -------
789 out : Props
790 Immutable property descriptor for :attr:`quantity`.
791 """
792 prop_getter = get_model_prop_caller(self._model)
793 return prop_getter(self._quantity)
794
795 @property
796 def description(self) -> str:
797 """Human-readable description of the physical quantity.
798
799 Looked up from the appropriate property mapping via :data:`_PROP_GETTER_MAPPING`
800 using :attr:`_MODEL` and the stored :attr:`quantity` name.
801
802 Returns
803 -------
804 out : str
805 Description string (e.g. ``'Magnetic Field (Radial Component)'``).
806 """
807 return self.props.desc
808
809 @property
810 def is_scalar(self) -> bool:
811 """``True`` if the quantity is a scalar field; ``False`` for vector components."""
812 return self.props.scalar
813
814 @property
815 def is_cached(self) -> bool:
816 """Flag indicating whether the data have been loaded into memory."""
817 return self._values is not None
818
819 def load(self):
820 """Read the full dataset into memory and cache it in ``_values``."""
821 self._values = self.data[...]
822
823 @abstractmethod
824 def read(self,
825 *args,
826 unit: Optional[str | UnitLike] = None,
827 mesh: Optional[MeshCodeType] = None,
828 ) -> tuple[u.Quantity, tuple[slice, ...], tuple[bool, ...]]:
829 """Read a slice of data, applying optional unit conversion and remeshing.
830
831 This base implementation handles unit conversion and remesh flag computation.
832 Concrete subclasses call ``super().read(...)`` and then attach coordinate
833 scales or perform additional post-processing.
834
835 Parameters
836 ----------
837 *args : int, slice, tuple, or None
838 Index arguments in physical ``(r, θ, φ)`` axis order. Each positional
839 argument selects elements along one spatial axis in the order
840 ``(r, θ, φ)``. Accepted forms per axis:
841
842 - ``None`` or omitted — full axis (equivalent to ``slice(None)``).
843 - ``int`` — single index (output retains that axis as a length-1
844 dimension).
845 - ``slice`` — standard Python slice.
846 - ``(start, stop)`` or ``(start, stop, step)`` — converted to a slice.
847 - ``Ellipsis`` — expands to ``None`` for all remaining axes.
848
849 unit : str or u.Unit, optional
850 Requested output unit. Special string aliases are accepted:
851
852 - ``'native'`` / ``'code'`` / ``'model'`` — return raw code-unit
853 values (an :class:`~u.Quantity` whose unit is the custom
854 MAS unit, e.g. ``MAS_b``).
855 - ``'real'`` / ``'phys'`` / ``'physical'`` — decompose to CGS base
856 unit via :func:`~psi_io._units.decompose_mas_units`.
857 - Any other string or :class:`~astropy.units.Unit` — passed directly
858 to :meth:`~u.Quantity.to`.
859
860 If ``None``, the data are returned in the native code units without
861 conversion.
862
863 mesh : MeshCodeType, optional
864 Target mesh stagger. If provided, half-mesh axes in the stored data
865 that are on the main mesh in *mesh* are averaged to the main mesh before
866 return (via :func:`~psi_io._mesh.remesh_array`). Attempting to up-sample
867 from main to half mesh raises :class:`ValueError`. If ``None``, no
868 remeshing is performed.
869
870 Returns
871 -------
872 odata : u.Quantity
873 Data array with requested unit and remeshing applied.
874 sargs : tuple[slice, ...]
875 Slice tuple in physical ``(r, θ, φ)`` order, suitable for applying to
876 coordinate scale arrays.
877 remesh : tuple[bool, ...]
878 Boolean flags in physical ``(r, θ, φ)`` order indicating which axes were
879 remeshed from half to main mesh.
880 """
881
882 sargs = _parse_islice_args(*args, shape=self.shape[::-1])
883 if mesh is None:
884 remesh = repeat(False, self.ndim)
885 else:
886 omesh_norm = _normalize_mesh_code(mesh, self.ndim)
887 remesh = _parse_remesh(self.mesh, omesh_norm, 'C')
888
889 sargs = tuple(sargs)
890 remesh = tuple(remesh)
891
892 odata = _apply_units(self._read(*sargs, remesh=remesh), unit)
893 return odata, sargs, remesh
894
895 @abstractmethod
896 def _read(self,
897 *sargs,
898 remesh) -> u.Quantity:
899 """Internal read using pre-validated slice args and remesh flags.
900
901 Applies *sargs* directly to the dataset via :meth:`__getitem__` (which
902 performs the axis reversal) and then remeshes the result. Called by
903 :meth:`_HdfData.read` when reading coordinate scales to avoid re-parsing
904 slice arguments.
905
906 Parameters
907 ----------
908 *sargs : slice
909 Pre-validated slices in physical ``(r, θ, φ)`` order.
910 remesh : tuple[bool, ...]
911 Remesh flags in physical ``(r, θ, φ)`` order.
912
913 Returns
914 -------
915 out : u.Quantity
916 Sliced and optionally remeshed data in code units.
917 """
918
919 return _remesh_array(self[sargs], remesh=remesh, order='F') * self.unit
920
921
922# =============================================================================
923# Scale classes
924# =============================================================================
925
926class _HdfScale(_HdfInterface, ABC):
927 """Abstract base for HDF coordinate scale variables (r, t, p).
928
929 A scale object wraps a one-dimensional coordinate array stored alongside the
930 main data in an HDF file. It exposes the same :class:`_HdfInterface` API as
931 data objects so that coordinate arrays can be sliced and unit-converted with the
932 same :meth:`read` call.
933
934 Subclasses supply the format-specific :attr:`data` property and the array
935 introspection properties (``shape``, ``dtype``, etc.).
936 """
937
938 __slots__ = ()
939
940 def __init__(self,
941 parent,
942 dim_label: str,
943 data_label: str,
944 scale_model: str = 'scale'):
945 """Initialize a scale from a parent data reader and dimension metadata.
946
947 Parameters
948 ----------
949 parent : _HdfData
950 The data reader to which this scale belongs. The scale holds a reference
951 to *parent* so it can share the open file handle.
952 dim_label : str
953 Coordinate axis label — one of ``'r'``, ``'t'``, ``'p'``. Used to look
954 up the physical unit via :func:`get_psi_scale_properties`.
955 data_label : str
956 Dataset or SDS name within the HDF file where the coordinate values are
957 stored.
958
959 Raises
960 ------
961 ValueError
962 If the underlying coordinate dataset is not one-dimensional.
963 """
964 self._dataref = parent
965 self._datalabel: str = data_label
966 self._model: str = scale_model
967 self._values = None
968
969 self._set_properties(dim_label)
970
971 def _set_properties(self, scale: str):
972 """Look up and cache the unit for this coordinate axis."""
973 prop_getter = get_model_prop_caller(self._model)
974 qprops = prop_getter(scale)
975 if self.ndim != qprops.ndim:
976 raise ValueError(f"Expected {qprops.ndim}D coordinate variable for scale '{scale}', "
977 f"found {self.ndim}D dataset with shape {self.shape}.")
978 self._quantity: PsiScales = qprops.name
979 self._unit: u.Unit = qprops.unit
980 self._mesh: Mesh = self._dataref.mesh['rtp'.index(self._quantity)],
981
982 @property
983 def unit(self) -> u.Unit:
984 """Astropy unit for this coordinate axis (``PSI_rsun`` or ``PSI_angle``)."""
985 return self._unit
986
987 @property
988 def quantity(self) -> PsiScales:
989 """Coordinate axis identifier: ``'r'``, ``'t'``, or ``'p'``."""
990 return self._quantity
991
992 @property
993 def mesh(self) -> tuple[Mesh, ...]:
994 """Single-element mesh-stagger tuple for this coordinate axis."""
995 return self._mesh
996
997 def read(self,
998 *args,
999 **kwargs,
1000 ) -> u.Quantity:
1001 """Read the coordinate array, returning an astropy Quantity.
1002
1003 Delegates to :meth:`_HdfInterface.read` and returns only the data
1004 (discarding the ``sargs`` and ``remesh`` bookkeeping values).
1005
1006 Parameters
1007 ----------
1008 *args
1009 Slice arguments; see :meth:`_HdfInterface.read`.
1010 **kwargs
1011 Keyword arguments forwarded to :meth:`_HdfInterface.read`.
1012
1013 Returns
1014 -------
1015 out : u.Quantity
1016 Coordinate values with the appropriate PSI unit attached.
1017 """
1018 odata, *_ = super().read(*args, **kwargs)
1019 return odata
1020
1021 def _read(self,
1022 *sargs,
1023 remesh) -> u.Quantity:
1024 """Internal read using pre-validated slice args."""
1025 return super()._read(*sargs, remesh=remesh)
1026
1027
1028class H5Scale(_HdfScale):
1029 """HDF5 coordinate scale variable backed by an h5py dimension scale dataset."""
1030
1031 __slots__ = _SCALE_SLOTS
1032
1033 @property
1034 def shape(self) -> tuple[int, ...]:
1035 """Shape of the coordinate array (always a length-1 tuple for 1-D scales)."""
1036 return self.data.shape
1037
1038 @property
1039 def dtype(self) -> np.dtype:
1040 """NumPy dtype of the coordinate array."""
1041 return self.data.dtype
1042
1043 @property
1044 def size(self) -> int:
1045 """Total number of coordinate points."""
1046 return self.data.size
1047
1048 @property
1049 def nbytes(self) -> int:
1050 """Total memory occupied by the coordinate array in bytes."""
1051 return self.data.nbytes
1052
1053 @property
1054 def ndim(self) -> int:
1055 """Number of dimensions (always 1 for coordinate scale arrays)."""
1056 return self.data.ndim
1057
1058 @property
1059 def attrs(self) -> dict:
1060 """HDF5 attributes attached to this dimension scale dataset."""
1061 return dict(self.data.attrs)
1062
1063 @property
1064 def data(self) -> np.ndarray:
1065 """h5py Dataset object for this coordinate dimension."""
1066 return self._dataref._fileref[self._datalabel]
1067
1068
1069class H4Scale(_HdfScale):
1070 """HDF4 coordinate scale variable backed by a pyhdf SDS dimension."""
1071
1072 __slots__ = _SCALE_SLOTS
1073
1074 @property
1075 def shape(self) -> tuple[int, ...]:
1076 """Shape of the coordinate array (always a length-1 tuple for 1-D scales)."""
1077 return self.data.info()[2],
1078
1079 @property
1080 def dtype(self) -> np.dtype:
1081 """NumPy dtype of the coordinate array."""
1082 return SDC_TYPE_CONVERSIONS[self.data.info()[3]]
1083
1084 @property
1085 def size(self) -> int:
1086 """Total number of coordinate points."""
1087 return int(np.prod(self.shape))
1088
1089 @property
1090 def nbytes(self) -> int:
1091 """Total memory occupied by the coordinate array in bytes."""
1092 return self.size * self.dtype.itemsize
1093
1094 @property
1095 def ndim(self) -> int:
1096 """Number of dimensions (always 1 for coordinate scale arrays)."""
1097 return self.data.info()[1]
1098
1099 @property
1100 def attrs(self) -> dict:
1101 """HDF4 attributes attached to this SDS dimension."""
1102 return self.data.attributes()
1103
1104 @property
1105 def data(self) -> np.ndarray:
1106 """pyhdf SDS object for this coordinate dimension."""
1107 return self._dataref._fileref.select(self._datalabel)
1108
1109
1110# =============================================================================
1111# Format mixins (HDF5 and HDF4 file I/O + raw array access)
1112# =============================================================================
1113
1114class _H5DataMixin:
1115 """Mixin providing HDF5 file I/O and raw array access via h5py.
1116
1117 Concrete data classes inherit from both this mixin and :class:`_HdfData`.
1118 The mixin supplies the ``_HDFN = 'h5'`` class variable, the
1119 :meth:`read_file` class method, and properties that delegate to the open
1120 :class:`h5py.File` handle.
1121 """
1122
1123 __slots__ = ()
1124 _HDFN = 5
1125
1126 @classmethod
1127 def read_file(cls, ifile: PathLike):
1128 """Open an HDF5 file for reading and return the :class:`h5py.File` handle."""
1129 return h5.File(ifile, 'r')
1130
1131 def open(self):
1132 """Re-open the HDF5 file if it was previously closed. Returns ``self``."""
1133 if not self._fileref:
1134 self._fileref = self.read_file(self._filepath)
1135 return self
1136
1137 def close(self):
1138 """Close the HDF5 file handle. Returns ``self``."""
1139 if self._fileref is not None:
1140 self._fileref.close()
1141 self._fileref = None
1142 return self
1143
1144 def delete(self):
1145 """Close the file handle during garbage collection (called by ``__del__``)."""
1146 fileref = getattr(self, '_fileref', None)
1147 if fileref is not None:
1148 fileref.close()
1149 self._fileref = None
1150
1151 @property
1152 def shape(self) -> tuple[int, ...]:
1153 """Array shape in storage order ``(Nφ, Nθ, Nr)``."""
1154 return self.data.shape
1155
1156 @property
1157 def dtype(self) -> np.dtype:
1158 """NumPy dtype of the stored array."""
1159 return self.data.dtype
1160
1161 @property
1162 def size(self) -> int:
1163 """Total number of elements in the array."""
1164 return self.data.size
1165
1166 @property
1167 def nbytes(self) -> int:
1168 """Total memory occupied by the array in bytes."""
1169 return self.data.nbytes
1170
1171 @property
1172 def ndim(self) -> int:
1173 """Number of array dimensions."""
1174 return self.data.ndim
1175
1176 @property
1177 def attrs(self) -> dict:
1178 """HDF5 attributes attached to this dataset as a plain Python dict."""
1179 return dict(self.data.attrs)
1180
1181 @property
1182 def data(self) -> np.ndarray:
1183 """h5py Dataset object providing lazy access to the array."""
1184 return self._fileref[self._datalabel]
1185
1186 def _set_scales(self):
1187 """Construct :class:`H5Scale` objects from h5py dimension scales."""
1188 self._scales = Scales(*(H5Scale(self, scale, label.label)
1189 for scale, label in zip('rtp', self.data.dims)))
1190
1191
1192class _H4DataMixin:
1193 """Mixin providing HDF4 file I/O and raw array access via pyhdf.
1194
1195 Analogous to :class:`_H5DataMixin` but for HDF4 files. Raises an informative
1196 error at import time if pyhdf is not installed, via :func:`_except_no_pyhdf`.
1197 """
1198
1199 __slots__ = ()
1200 _HDFN = 4
1201
1202 @classmethod
1203 def read_file(cls, ifile: PathLike):
1204 """Open an HDF4 file for reading and return the pyhdf ``SD`` object."""
1205 _except_no_pyhdf()
1206 return h4.SD(str(ifile), h4.SDC.READ)
1207
1208 def open(self):
1209 """Re-open the HDF4 file if it was previously closed. Returns ``self``."""
1210 if not self._fileref:
1211 self._fileref = self.read_file(self._filepath)
1212 return self
1213
1214 def close(self):
1215 """Close the HDF4 file handle via ``end()``."""
1216 if self._fileref is not None:
1217 self._fileref.end()
1218 self._fileref = None
1219
1220 def delete(self):
1221 """Close the HDF4 file handle during garbage collection."""
1222 fileref = getattr(self, '_fileref', None)
1223 if fileref is not None:
1224 fileref.end()
1225 self._fileref = None
1226
1227 @property
1228 def shape(self) -> tuple[int, ...]:
1229 """Array shape in storage order ``(Nφ, Nθ, Nr)``."""
1230 return tuple(self.data.info()[2])
1231
1232 @property
1233 def dtype(self) -> np.dtype:
1234 """NumPy dtype of the stored array."""
1235 return SDC_TYPE_CONVERSIONS[self.data.info()[3]]
1236
1237 @property
1238 def size(self) -> int:
1239 """Total number of elements in the array."""
1240 return int(np.prod(self.shape))
1241
1242 @property
1243 def nbytes(self) -> int:
1244 """Total memory occupied by the array in bytes."""
1245 return self.size * self.dtype.itemsize
1246
1247 @property
1248 def ndim(self) -> int:
1249 """Number of array dimensions."""
1250 return self.data.info()[1]
1251
1252 @property
1253 def attrs(self) -> dict:
1254 """HDF4 attributes attached to this SDS dataset as a plain Python dict."""
1255 return self.data.attributes()
1256
1257 @property
1258 def data(self) -> np.ndarray:
1259 """pyhdf SDS object providing lazy access to the array."""
1260 return self._fileref.select(self._datalabel)
1261
1262 def _set_scales(self):
1263 """Construct :class:`H4Scale` objects from pyhdf SDS dimensions.
1264
1265 HDF4 dimension order is reversed relative to HDF5 (Fortran vs. C order),
1266 so the dimension list is reversed before zipping with ``'rtp'``.
1267 """
1268 sds = self.data
1269 dims = list(reversed(list(sds.dimensions(full=1).items())))
1270 self._scales = Scales(*tuple(H4Scale(self, scale, k_)
1271 for scale, (k_, v_) in zip('rtp', dims)))
1272
1273
1274# =============================================================================
1275# Abstract data base
1276# =============================================================================
1277
1278class _HdfData(_HdfInterface, ABC):
1279 """Abstract base for a single PSI HDF dataset (data fields, not coordinate scales).
1280
1281 Handles file opening, metadata resolution, and scale construction. Concrete
1282 subclasses are produced by combining this class with a format mixin
1283 (:class:`_H5DataMixin` or :class:`_H4DataMixin`) to form :class:`H5Data` and
1284 :class:`H4Data`. Use :func:`PsiData` rather than instantiating these directly.
1285 """
1286
1287 __slots__ = ()
1288
1289 def __init__(self,
1290 ifile: PathLike,
1291 model: ModelType, /,
1292 dataset_id: Optional[str] = None,
1293 **kwargs):
1294 """Open an HDF file and parse metadata for one PSI output quantity.
1295
1296 Parameters
1297 ----------
1298 ifile : PathLike
1299 Path to the HDF4 or HDF5 file. Must exist and have the extension
1300 expected by the format mixin (``'.h5'`` or ``'.hdf'``).
1301 dataset_id : str, optional
1302 Name of the dataset (SDS in HDF4, group key in HDF5) to open. Defaults
1303 to the PSI standard dataset identifier for this format
1304 (:data:`~psi_io.psi_io.PSI_DATA_ID`).
1305 **kwargs
1306 Optional metadata overrides. Accepted keys (from
1307 :data:`METADATA_SCHEMA`): ``'quantity'``, ``'sequence'``, ``'unit'``,
1308 ``'scalar'``, ``'mesh'``. Caller-supplied values take precedence over
1309 both file attributes and filename inference.
1310
1311 Raises
1312 ------
1313 FileNotFoundError
1314 If *ifile* does not exist on disk.
1315 ValueError
1316 If *ifile* has the wrong extension for this format mixin, if the
1317 dataset is not three-dimensional, or if any required metadata field
1318 cannot be resolved.
1319 """
1320 ifile = Path(ifile)
1321 hdfv = f'h{self._HDFN}'
1322 if not ifile.is_file():
1323 raise FileNotFoundError(f"File '{ifile}' does not exist.")
1324 if ifile.suffix != _HDF_EXT_MAPPING[hdfv]:
1325 raise ValueError(f"File '{ifile}' does not have the correct extension for "
1326 f"{self._HDFN} files (expected '{_HDF_EXT_MAPPING[hdfv]}' extension).")
1327 if model not in MODEL_TYPE:
1328 raise ValueError(f"Invalid model type {model!r}. Expected one of: {', '.join(MODEL_TYPE)}.")
1329
1330 self._filepath: Path = ifile
1331 self._datalabel: str = dataset_id or PSI_DATA_ID[hdfv]
1332 self._fileref = self.read_file(ifile)
1333 self._model: str = model
1334 self._values = None
1335
1336 self._set_properties(**self._parse_properties(**kwargs))
1337 self._set_scales()
1338
1339 def __enter__(self):
1340 """Open (or re-open) the file and return ``self`` for use as a context manager."""
1341 self.open()
1342 return self
1343
1344 def __exit__(self, *args):
1345 """Close the file handle when exiting the context manager."""
1346 self.close()
1347
1348 def __del__(self):
1349 """Close the file handle when the object is garbage-collected."""
1350 self.delete()
1351
1352 @classmethod
1353 @abstractmethod
1354 def read_file(cls, ifile: PathLike):
1355 """Open the HDF file at *ifile* and return the format-specific file handle."""
1356 ...
1357
1358 @abstractmethod
1359 def open(self):
1360 """Re-open the file handle if it was previously closed."""
1361 ...
1362
1363 @abstractmethod
1364 def close(self):
1365 """Close the open file handle and set the internal reference to ``None``."""
1366 ...
1367
1368 @abstractmethod
1369 def delete(self):
1370 """Release the file handle during garbage collection (called by ``__del__``)."""
1371 ...
1372
1373 @abstractmethod
1374 def _set_scales(self):
1375 """Construct and cache the :class:`Scales` named tuple of coordinate readers."""
1376 ...
1377
1378 def _parse_properties(self, **kwargs):
1379 """Resolve metadata fields by merging caller kwargs, file attrs, and filename.
1380
1381 Merges three sources (highest to lowest priority):
1382
1383 1. Keyword arguments in *kwargs* that match :data:`METADATA_SCHEMA` keys.
1384 2. HDF file-level attributes that match :data:`METADATA_SCHEMA` keys.
1385 3. Quantity name and sequence number inferred from the filename stem.
1386 4. The canonical :class:`~psi_io._models.Props` defaults for the resolved
1387 quantity (unit and mesh).
1388
1389 Parameters
1390 ----------
1391 **kwargs
1392 Caller-supplied metadata overrides.
1393
1394 Returns
1395 -------
1396 out : dict
1397 Fully populated metadata dict with keys
1398 ``{'quantity', 'sequence', 'unit', 'mesh'}``; the ``'scalar'`` key from
1399 :data:`METADATA_SCHEMA` is resolved internally but not forwarded to
1400 :meth:`_set_properties`.
1401
1402 Raises
1403 ------
1404 ValueError
1405 If any metadata field is still ``None`` after merging all sources.
1406 """
1407 prop_getter = get_model_prop_caller(self._model)
1408
1409 input_attrs = {k: v for k, v in kwargs.items() if k in METADATA_SCHEMA}
1410 file_attrs = {k: v for k, v in self.attrs.items() if k in METADATA_SCHEMA}
1411
1412 quantity = input_attrs.get('quantity',
1413 file_attrs.get('quantity',
1414 extract_quantity_from_filepath(self._filepath, '')))
1415 sequence = input_attrs.get('sequence',
1416 file_attrs.get('sequence',
1417 extract_sequence_from_filepath(self._filepath, 0)))
1418
1419 native_props = prop_getter(quantity)
1420 if self.ndim != native_props.ndim:
1421 raise ValueError(f"Expected {native_props.ndim}D dataset for quantity '{quantity}', "
1422 f"found {self.ndim}D dataset with shape {self.shape}.")
1423
1424 native_attrs = dict(quantity=native_props.name,
1425 sequence=sequence,
1426 unit=native_props.unit,
1427 mesh=native_props.mesh)
1428
1429 attributes = {**native_attrs, **file_attrs, **input_attrs}
1430 if any(v is None for v in attributes.values()):
1431 missing_meta = ', '.join(k for k, v in attributes.items() if v is None)
1432 raise ValueError(f"Malformed metadata: {missing_meta} is missing. "
1433 f"Provide these as keyword arguments or ensure they "
1434 f"are present in the file attributes.")
1435
1436 return attributes
1437
1438 def _set_properties(self,
1439 quantity: str,
1440 sequence: int,
1441 unit: str,
1442 mesh: MeshCodeType):
1443 """Store validated and type-coerced metadata on the instance.
1444
1445 Parameters
1446 ----------
1447 quantity : str
1448 Quantity name; stored as a string.
1449 sequence : int
1450 Sequence number; coerced to int.
1451 unit : str or u.Unit
1452 Unit; converted to :class:`~astropy.units.Unit` via ``u.Unit(str(unit))``.
1453 mesh : MeshCodeType
1454 Mesh stagger; normalized to ``tuple[Mesh, ...]`` via
1455 :func:`~psi_io._mesh._normalize_mesh_code`.
1456
1457 Raises
1458 ------
1459 ValueError
1460 If type coercion of any field fails.
1461 """
1462 try:
1463 self._quantity: str = str(quantity)
1464 self._sequence: int = int(sequence)
1465 self._unit: u.Unit = u.Unit(str(unit))
1466 self._mesh: tuple[Mesh, ...] = _normalize_mesh_code(mesh, self.ndim)
1467 except (ValueError, TypeError) as e:
1468 raise ValueError(f"Metadata type coercion failed: {e}") from e
1469
1470 @property
1471 def unit(self) -> u.Unit:
1472 """Astropy unit for converting code-unit values to physical unit."""
1473 return self._unit
1474
1475 @property
1476 def mesh(self) -> tuple[Mesh, ...]:
1477 """Normalized mesh-stagger tuple, one :class:`~psi_io._mesh.Mesh` per axis."""
1478 return self._mesh
1479
1480 @property
1481 def quantity(self) -> str:
1482 """Canonical lower-case quantity identifier (e.g. ``'br'``)."""
1483 return self._quantity
1484
1485 @property
1486 def sequence(self) -> int:
1487 """Integer time-step sequence number from the filename or file attributes."""
1488 return self._sequence
1489
1490 @property
1491 def scales(self) -> Scales:
1492 """Named tuple of coordinate scale readers ``(r, t, p)``.
1493
1494 Each element is an :class:`H5Scale` or :class:`H4Scale` that exposes the
1495 same :meth:`read` interface as the main data object, so coordinate arrays
1496 can be sliced in sync with the data.
1497 """
1498 return self._scales
1499
1500 def read(self,
1501 *args,
1502 scales: bool = True,
1503 **kwargs
1504 ) -> u.Quantity | tuple[u.Quantity, ...]:
1505 """Read a slice of data and optionally the corresponding coordinate arrays.
1506
1507 Delegates slice parsing, remeshing, and unit conversion to
1508 :meth:`_HdfInterface.read`, then optionally reads the matching slice of
1509 each coordinate scale.
1510
1511 Parameters
1512 ----------
1513 *args : int, slice, tuple, None, or Ellipsis
1514 Index arguments in physical ``(r, θ, φ)`` order. See
1515 :meth:`_HdfInterface.read` for the full description.
1516 scales : bool, optional
1517 If ``True`` (default), also return the corresponding coordinate slice
1518 for each spatial axis. If ``False``, return only the data array.
1519 **kwargs
1520 Forwarded to :meth:`_HdfInterface.read`; see that method for
1521 ``units`` and ``mesh`` keyword arguments.
1522
1523 Returns
1524 -------
1525 odata : u.Quantity
1526 Data array in the requested unit.
1527 r_scale : u.Quantity
1528 Radial coordinate values in solar radii (only if ``scales=True``).
1529 t_scale : u.Quantity
1530 Co-latitude values in radians (only if ``scales=True``).
1531 p_scale : u.Quantity
1532 Longitude values in radians (only if ``scales=True``).
1533
1534 Examples
1535 --------
1536 Read the full array and coordinate grids:
1537
1538 >>> data, r, t, p = reader.read() # doctest: +SKIP
1539
1540 Read a radial sub-range and convert to physical CGS units:
1541
1542 >>> data, r, t, p = reader.read(slice(10, 50), unit='physical') # doctest: +SKIP
1543
1544 Read data only (no coordinate arrays):
1545
1546 >>> data = reader.read(scales=False) # doctest: +SKIP
1547 """
1548 odata, sargs, remesh = super().read(*args, **kwargs)
1549 if not scales:
1550 return odata
1551 oscales = (scale._read(sarg, remesh=rmesh) for scale, sarg, rmesh in zip(self.scales, sargs, remesh))
1552 return odata, *oscales
1553
1554 def _read(self,
1555 *sargs,
1556 remesh) -> u.Quantity:
1557 """Internal read using pre-validated slice args."""
1558 return super()._read(*sargs, remesh=remesh)
1559
1560
1561 def vslice(self,
1562 *args,
1563 scales: bool = True,
1564 unit: Optional[str | UnitLike] = None,
1565 mesh: Optional[MeshCodeType] = None,
1566 bounds_error: bool = True,
1567 fill_value: Optional[QuantityLike] = None,
1568 ) -> u.Quantity | tuple[u.Quantity, ...]:
1569 """Read a slice of data by physical coordinate value with optional interpolation.
1570
1571 Each positional argument may be a physical coordinate value
1572 (:class:`~u.Quantity` or bare scalar), in which case the
1573 dataset is reduced to a 2-element window and linearly interpolated to that
1574 value. Index-space arguments (``slice``, ``int``, ``None``, ``Ellipsis``)
1575 are also accepted and passed through without interpolation, making this
1576 method a superset of :meth:`read`.
1577
1578 Parameters
1579 ----------
1580 *args : u.Quantity, scalar, int, slice, tuple, None, or Ellipsis
1581 One argument per spatial axis in physical ``(r, θ, φ)`` order.
1582 u.Quantity or bare scalar → value-space interpolation.
1583 All other types → index-space slice (see :meth:`read`).
1584 scales : bool, optional
1585 If ``True`` (default), also return the corresponding coordinate value
1586 for each axis. Value-interpolated axes return the interpolation target
1587 as a length-1 :class:`~astropy.units.Quantity`; index-space axes return the full slice.
1588 unit : str or u.Unit, optional
1589 Output unit; see :meth:`read` for accepted aliases and formats.
1590 mesh : MeshCodeType, optional
1591 Target mesh stagger. Remeshing is skipped for axes that are being
1592 value-interpolated (interpolation already collapses the half-mesh
1593 window to a single value).
1594 bounds_error : bool, optional
1595 If ``True`` (default), raise :class:`ValueError` when a value argument
1596 lies outside its coordinate scale range.
1597 fill_value : QuantityLike or None, optional
1598 Value substituted when a coordinate value falls outside the 2-element
1599 interpolation window and *bounds_error* is ``False``. ``None``
1600 silently extrapolates.
1601
1602 Returns
1603 -------
1604 odata : u.Quantity
1605 Sliced and interpolated data in the requested unit.
1606 r_scale, t_scale, p_scale : u.Quantity
1607 Coordinate values for each axis (only if ``scales=True``).
1608 Value-interpolated axes return the interpolation target as a
1609 length-1 array; index-space axes return the full coordinate slice.
1610
1611 Raises
1612 ------
1613 ValueError
1614 If *bounds_error* is ``True`` and any value argument is out of range,
1615 or if a value axis has no corresponding coordinate scale.
1616 """
1617 vslice_args = tuple(_parse_vslice_args(*args, scales=self.scales, bounds_error=bounds_error))
1618 slice_values, sargs = zip(*vslice_args)
1619
1620 if mesh is None:
1621 remesh = repeat(False, self.ndim)
1622 else:
1623 omesh_norm = _normalize_mesh_code(mesh, self.ndim)
1624 remesh = _parse_remesh(self.mesh, omesh_norm, 'C')
1625 remesh = tuple(remesh)
1626 remesh_xand_svalue = tuple(rm and sv is None for rm, sv in zip(remesh, slice_values))
1627
1628 pre_slice_data = _apply_units(self._read(*sargs, remesh=remesh_xand_svalue), unit)
1629
1630 pre_slice_scales = tuple(scale[sarg] * scale.unit if sv is not None else None
1631 for scale, sarg, sv in zip(self.scales, sargs, slice_values))
1632
1633 sliced_data = _slice_array(pre_slice_data, slice_values, pre_slice_scales, fill_value)
1634 if not scales:
1635 return sliced_data
1636 sliced_scales = tuple(scale._read(sarg, remesh=rmesh) if sv is None else np.atleast_1d(sv)
1637 for scale, sarg, sv, rmesh in zip(self.scales, sargs, slice_values, remesh))
1638 return sliced_data, *sliced_scales
1639
1640
1641# =============================================================================
1642# Concrete data classes
1643# =============================================================================
1644
1645class H5Data(_H5DataMixin, _HdfData):
1646 """HDF5-backed MAS model data reader.
1647
1648 Combines :class:`_H5DataMixin` (h5py file I/O) with :class:`_HdfData`
1649 (MAS metadata and :meth:`read` logic). Use :func:`PsiData` to instantiate
1650 rather than calling this class directly.
1651 """
1652
1653 __slots__ = _DATA_SLOTS
1654
1655
1656class H4Data(_H4DataMixin, _HdfData):
1657 """HDF4-backed MAS model data reader.
1658
1659 Combines :class:`_H4DataMixin` (pyhdf file I/O) with :class:`_HdfData`
1660 (MAS metadata and :meth:`read` logic). Requires the optional ``pyhdf``
1661 dependency. Use :func:`PsiData` to instantiate rather than calling this class
1662 directly.
1663 """
1664
1665 __slots__ = _DATA_SLOTS
1666
1667# =============================================================================
1668# Private helpers
1669# =============================================================================
1670
1671_DATA_CLASS_MAP = MappingProxyType({
1672 '.h5': H5Data,
1673 '.hdf': H4Data,
1674})
1675"""Read-only mapping from HDF file extension to the concrete data reader class.
1676
1677Used by :func:`PsiData` to select between :class:`H5Data` (h5py) and
1678:class:`H4Data` (pyhdf) based on the file's suffix.
1679"""
1680
1681
[docs]
1682def PsiData(ifile: PathLike, /,
1683 model: ModelType = 'mas',
1684 **kwargs):
1685 """Open a PSI MAS or POT3D HDF file and return the appropriate data reader.
1686
1687 Inspects the file extension and *model* argument, selects the correct
1688 concrete reader (HDF5 or HDF4 backend), and returns it. No data are read
1689 from disk at construction time — metadata is resolved from the filename and
1690 HDF file attributes, and data transfer happens only inside :meth:`read` or
1691 :meth:`vslice`. Full-array reads are cached automatically; partial reads
1692 are not.
1693
1694 The returned reader exposes the following attributes:
1695
1696 - ``quantity`` — canonical lower-case quantity identifier (e.g. ``'br'``).
1697 - ``sequence`` — integer time-step sequence number.
1698 - ``unit`` — :class:`~astropy.units.Unit` for code → physical conversion;
1699 normalization constants are defined in :mod:`psi_io._units`.
1700 - ``mesh`` — per-axis Yee-grid stagger as a tuple of
1701 :class:`~psi_io._mesh.Mesh` members; see :mod:`psi_io._mesh`.
1702 - ``props`` — full :class:`~psi_io._models.Props` descriptor (name,
1703 description, unit, mesh code); see :mod:`psi_io._models`.
1704 - ``description`` — human-readable quantity description.
1705 - ``scales`` — ``Scales(r, t, p)`` named tuple of coordinate scale readers,
1706 each supporting the same :meth:`read` interface as the main reader.
1707 - ``shape``, ``ndim``, ``size``, ``nbytes``, ``dtype``, ``attrs`` — array
1708 metadata; shape is in HDF storage order ``(Nφ, Nθ, Nr)``.
1709 - ``is_cached`` — ``True`` after a full-array read has been cached.
1710
1711 Use :meth:`read` to load a slice by index and :meth:`vslice` to slice by
1712 physical coordinate value with linear interpolation. Both return data as
1713 :class:`~astropy.units.Quantity` objects in physical ``(r, θ, φ)`` order.
1714 The object also supports the context-manager protocol.
1715
1716 .. warning:: **POT3D unit convention**
1717
1718 POT3D applies no normalization to its output. The stored values are in
1719 whatever physical unit the input photospheric magnetogram used — most
1720 commonly Gauss, but this is not encoded in the file. The default
1721 ``unit`` for POT3D is ``dimensionless_unscaled`` (scale factor 1), so
1722 ``read(unit='physical')`` will not perform a meaningful conversion
1723 unless the correct unit is supplied at construction:
1724
1725 .. code-block:: python
1726
1727 reader = PsiData('br001.h5', model='pot3d', unit='Gauss')
1728 data, r, t, p = reader.read()
1729
1730 Parameters
1731 ----------
1732 ifile : PathLike
1733 Path to the HDF4 (``.hdf``) or HDF5 (``.h5``) file.
1734 model : {'mas', 'pot3d'}, optional
1735 PSI model type. Defaults to ``'mas'``.
1736 dataset_id : str, optional
1737 Dataset name within the HDF file. Defaults to the PSI standard
1738 identifier for the given format.
1739 quantity : str, optional
1740 Override the quantity name inferred from the filename or file attributes.
1741 sequence : int, optional
1742 Override the time-step sequence number.
1743 unit : str or u.Unit, optional
1744 Override the code-to-physical unit from the quantity's
1745 :class:`~psi_io._models.Props` entry. Accepts any string parseable by
1746 :class:`~astropy.units.Unit` or a :class:`~astropy.units.Unit` instance.
1747 mesh : MeshCodeType, optional
1748 Override the mesh stagger from the quantity's
1749 :class:`~psi_io._models.Props` entry.
1750
1751 Returns
1752 -------
1753 out : H5Data or H4Data
1754 Open reader implementing the full ``_HdfInterface`` API. Concrete type
1755 depends on the file extension.
1756
1757 Raises
1758 ------
1759 ValueError
1760 If the file extension / model combination is unsupported or required
1761 metadata cannot be resolved.
1762 FileNotFoundError
1763 If *ifile* does not exist.
1764
1765 See Also
1766 --------
1767 astropy.units.Unit : Unit constructor — accepts strings, compound
1768 expressions, and :class:`~astropy.units.Unit` instances.
1769 astropy.units.Quantity.to : Unit conversion used internally when
1770 a ``unit`` string is supplied to :meth:`read`.
1771
1772 Examples
1773 --------
1774 Read a MAS radial field — full array with coordinate scales, then convert:
1775
1776 >>> from psi_io.mhd_io import PsiData # doctest: +SKIP
1777 >>> reader = PsiData('br001001.h5')
1778 >>> data, r, t, p = reader.read() # code units (MAS_b)
1779 >>> data, r, t, p = reader.read(unit='Gauss') # convert to Gauss
1780
1781 Use as a context manager:
1782
1783 >>> with PsiData('vr001001.h5') as reader: # doctest: +SKIP
1784 ... data, r, t, p = reader.read(unit='km/s')
1785
1786 Inspect metadata without loading data:
1787
1788 >>> reader = PsiData('rho001001.h5') # doctest: +SKIP
1789 >>> reader.quantity # 'rho'
1790 >>> reader.unit # MAS_n
1791 >>> reader.mesh # (Mesh.HALF, Mesh.HALF, Mesh.HALF)
1792 >>> reader.is_cached # False
1793 """
1794 ifile = Path(ifile)
1795 cls = _DATA_CLASS_MAP.get(ifile.suffix)
1796 if cls is None:
1797 raise ValueError(
1798 f"Unsupported HDF extension '{ifile.suffix}'"
1799 ) from None
1800 return cls(ifile, model.lower(), **kwargs)