Post-processing and Analysis Tools

This page summarizes the SPIDER post-processing utilities for reading samples, building event summaries, running diagnostics, and plotting.

1) Load samples

Primary reader:

  • spider.io.samples.read_all_samples(...)

Example:

from spider.io.samples import read_all_samples

# Read the final sample file as NumPy arrays, thinning by a factor of 5
samples = read_all_samples(
    {"samples_outfile": "SPIDER_samples_final.h5"},
    backend="numpy",
    thin=5,
)

Returned core arrays:

  • event_ids

  • longitude, latitude, depth

  • X, Y, Z, delta_t

Useful metadata keys for chain-aware analysis:

  • _batch_names, _batch_boundaries, _batch_slices

  • _sample_chain_idx

  • _chain_segments, _chain_slices, _chain_indices
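
Chain-aware analysis usually starts by grouping samples by chain. The sketch below does this with NumPy on a toy dict. It assumes that each coordinate array has shape (n_samples, n_events) and that _sample_chain_idx stores one integer chain label per sample; neither is confirmed on this page. split_by_chain is a hypothetical helper, not part of SPIDER.

```python
import numpy as np

# Toy stand-in for the dict returned by read_all_samples.
# ASSUMPTION: X is (n_samples, n_events); _sample_chain_idx has one label per sample.
samples = {
    "X": np.arange(12, dtype=float).reshape(6, 2),
    "_sample_chain_idx": np.array([0, 0, 1, 1, 1, 2]),
}

def split_by_chain(arr, chain_idx):
    """Return {chain_label: rows of arr that belong to that chain}."""
    chain_idx = np.asarray(chain_idx)
    return {int(c): arr[chain_idx == c] for c in np.unique(chain_idx)}

per_chain_x = split_by_chain(samples["X"], samples["_sample_chain_idx"])

# Number of samples contributed by each chain
chain_lengths = {c: a.shape[0] for c, a in per_chain_x.items()}
print(chain_lengths)
```

Per-chain arrays like these can then be compared against each other, for example to check that chains agree on an event's mean position.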

Map-only behavior:

  • If no batch_* groups exist but root map_* datasets exist, read_all_samples returns a single-sample map-only structure.
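
The map-only rule above can be written as a small predicate over the names found at the root of the HDF5 file. This only illustrates the stated rule and is not SPIDER's reader code; is_map_only and the example key names are hypothetical.

```python
def is_map_only(root_keys):
    """True when the root holds map_* datasets but no batch_* groups."""
    keys = list(root_keys)
    has_batches = any(k.startswith("batch_") for k in keys)
    has_map = any(k.startswith("map_") for k in keys)
    return has_map and not has_batches

# Hypothetical root-level names for three files
map_file = is_map_only(["map_X", "map_Y", "map_Z"])       # only map_* datasets
batch_file = is_map_only(["batch_0000", "batch_0001"])    # regular sample file
mixed_file = is_map_only(["batch_0000", "map_X"])         # batches present, so not map-only
print(map_file, batch_file, mixed_file)
```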

Related I/O helpers:

  • merge_samples_hdf5(...) (merge multi-chain sample files)

  • read_growclust_bootstrap(...) (convert GrowClust bootstrap outputs into SPIDER-like sample dict)

2) Build event summaries

High-level entrypoint:

  • spider.analysis.compute_cat_dd_and_xyz(...)

This returns an EventSamplesSummary containing centered sample arrays and, optionally, a catalog summary table.

Example:

from spider.analysis import compute_cat_dd_and_xyz

# `samples` is the dict returned by read_all_samples in step 1
summary = compute_cat_dd_and_xyz(
    samples,
    burn_in=1000,
    include=["X", "Y", "Z", "T", "cat_dd"],
    uncertainty_metrics=["sigma", "std", "mad", "qhw_0.95"],
)
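
To make the uncertainty metric names more concrete, here is a minimal NumPy sketch of three spread measures for one coordinate of one event. The readings are assumptions based on the names alone: "std" as the sample standard deviation, "mad" as the unscaled median absolute deviation, and "qhw_0.95" as the half-width of the central 95% quantile interval. SPIDER's definitions may differ, and "sigma" is not reproduced because its meaning is not stated here. spread_metrics is a hypothetical helper.

```python
import numpy as np

def spread_metrics(x, level=0.95):
    """Spread summaries for a 1-D array of samples (one coordinate, one event)."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.quantile(x, [(1 - level) / 2, (1 + level) / 2])
    return {
        "std": float(np.std(x, ddof=1)),                    # sample standard deviation
        "mad": float(np.median(np.abs(x - np.median(x)))),  # median absolute deviation (unscaled)
        "qhw": float((hi - lo) / 2),                        # half-width of the central interval
    }

# Synthetic Gaussian samples with a true standard deviation of 2.0
rng = np.random.default_rng(0)
metrics = spread_metrics(rng.normal(0.0, 2.0, size=20_000))
print(metrics)
```

For Gaussian samples the three values differ by known factors (the MAD is about 0.67 times the standard deviation, the 95% half-width about 1.96 times), so large departures from those ratios hint at heavy tails or multimodality.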

Key options:

  • include: choose outputs (X, Y, Z, T, lats, lons, deps, cat_dd)

  • burn_in, thin: burn-in removal and thinning applied to the samples

  • compute_map / map_bins for histogram mode estimates

  • add_wasserstein to append per-event prior-vs-posterior Wasserstein diagnostics
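
As a rough illustration of what burn-in, thinning, and a histogram mode estimate do to a single chain, consider the sketch below. It assumes burn-in is removed before thinning and takes the mode as the center of the fullest histogram bin; both are assumptions about behavior not documented here. histogram_mode is a hypothetical helper, and its bins argument only plays a role similar to what map_bins suggests.

```python
import numpy as np

def histogram_mode(x, bins=50):
    """Mode estimate: center of the histogram bin with the highest count."""
    counts, edges = np.histogram(x, bins=bins)
    k = int(np.argmax(counts))
    return 0.5 * (edges[k] + edges[k + 1])

# Synthetic chain for one coordinate of one event, centered on 3.0
rng = np.random.default_rng(1)
chain = rng.normal(3.0, 0.5, size=50_000)

# ASSUMPTION: drop the first 1000 samples, then keep every 5th one
kept = chain[1000::5]
mode = histogram_mode(kept, bins=40)
print(kept.size, mode)
```

A histogram mode is only resolved to about one bin width, so the bin count trades resolution against noise in the per-bin counts.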

3) ESS diagnostics

ESS utilities in spider.analysis.results:

  • compute_effective_sample_size(summary, ...)

  • compute_ess_summary(summary, ...)

compute_ess_summary returns per-event and aggregate ESS metrics, including:

  • ess_per_event_x, ess_per_event_y, ess_per_event_z, ess_per_event_t

  • ess_per_event_xyzt_min, a conservative per-event value (by its name, the minimum across X, Y, Z, and T)

These are useful for identifying under-mixed events and uneven exploration.
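
For intuition about what an ESS value measures, the following sketch implements a generic autocorrelation-based estimator for a single chain. It is not SPIDER's implementation: the truncation rule (stop at the first non-positive autocorrelation) is one common choice among several, and effective_sample_size is a hypothetical name.

```python
import numpy as np

def effective_sample_size(x):
    """ESS of a 1-D chain: n / (1 + 2 * sum of leading positive autocorrelations)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    # Biased autocovariance estimate at lags 0..n-1
    acov = np.correlate(xc, xc, mode="full")[n - 1:] / n
    if acov[0] == 0.0:
        return float("nan")  # constant chain: ESS is undefined
    rho = acov / acov[0]
    tau = 1.0
    for k in range(1, n):
        if rho[k] <= 0.0:  # truncate at the first non-positive autocorrelation
            break
        tau += 2.0 * rho[k]
    return n / tau

rng = np.random.default_rng(2)
n = 4000
noise = rng.normal(size=n)

# Strongly autocorrelated AR(1) chain built from the same noise
ar = np.empty(n)
ar[0] = noise[0]
for t in range(1, n):
    ar[t] = 0.9 * ar[t - 1] + noise[t]

ess_iid = effective_sample_size(noise)  # close to n for independent draws
ess_ar = effective_sample_size(ar)      # far below n for a slowly mixing chain

# Taking the minimum across coordinates gives a conservative per-event value
ess_min = min(ess_iid, ess_ar)
print(ess_iid, ess_ar, ess_min)
```

An event whose smallest per-coordinate ESS is a small fraction of the sample count behaves like the AR(1) chain here and likely needs a longer run or better mixing.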

4) Wasserstein prior-vs-posterior diagnostics

Module:

  • spider.analysis.prior_posterior_wasserstein

Main routines:

  • compute_event_wasserstein(...) (from HDF5 samples + config)

  • compute_event_wasserstein_from_samples(...) (from in-memory sample dict)

CLI-style usage:

python -m spider.analysis.prior_posterior_wasserstein \
  --samples SPIDER_samples_final.h5 \
  --config SPIDER.json \
  --burn 0.2 \
  --thin 5 \
  --dims 0,1,2
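
To show what a prior-vs-posterior Wasserstein value expresses, here is a one-dimensional sketch. For two samples of equal size, the Wasserstein-1 distance equals the mean absolute difference between the sorted values. The prior and posterior draws are synthetic and wasserstein_1d is a hypothetical helper; the SPIDER routines may use a different definition, for example over several dimensions at once.

```python
import numpy as np

def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.size != b.size:
        raise ValueError("this sketch requires equal sample counts")
    return float(np.mean(np.abs(a - b)))

rng = np.random.default_rng(3)
prior = rng.normal(0.0, 5.0, size=5000)      # broad synthetic prior draws
posterior = rng.normal(1.0, 0.5, size=5000)  # narrow synthetic posterior draws

w = wasserstein_1d(prior, posterior)         # large when the data reshape the prior
shift = wasserstein_1d(prior, prior + 2.0)   # a pure shift: distance equals the shift
same = wasserstein_1d(prior, prior)          # identical samples: distance is zero
print(w, shift, same)
```

A distance near zero for an event suggests that the posterior is barely distinguishable from the prior, meaning the data constrain that event weakly.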

5) Plotting tools

Main plotting API (spider.plotting):

  • plot_event_distributions(...)

  • plot_event_chains(...)

  • plot_uncertainty_histograms(...)

  • plot_event_marginal_hist2d(...)

  • plot_noise_scale_posterior_vs_prior(...)

Additional 2D KDE marginal helper (currently defined in spider.plotting.events):

  • plot_event_marginal_kde2d(...)

Example:

from spider.plotting import (
    plot_event_distributions,
    plot_event_chains,
    plot_uncertainty_histograms,
)
from spider.plotting.events import plot_event_marginal_kde2d

# `summary` comes from compute_cat_dd_and_xyz (step 2); `samples` from read_all_samples (step 1)
fig, ax = plot_event_distributions(summary, coords=("X", "Y", "Z"))
fig, ax = plot_event_chains(summary, coords=("X", "Y", "Z", "T"))
fig, ax = plot_uncertainty_histograms(summary, coords=("X", "Y", "Z", "T"))
fig, axes = plot_event_marginal_kde2d(samples, event_index=0, coords=("X", "Y", "Z"))
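
When the numbers behind a 2D marginal are needed rather than a figure, the binned counts can be computed directly with NumPy. The sketch below does so for the X-Y marginal of one event. It uses a synthetic dict and assumes coordinate arrays of shape (n_samples, n_events); it does not reproduce how plot_event_marginal_hist2d builds its bins.

```python
import numpy as np

# Synthetic stand-in for a sample dict.
# ASSUMPTION: each coordinate array is (n_samples, n_events).
rng = np.random.default_rng(4)
n_samples, n_events = 2000, 3
toy_samples = {c: rng.normal(size=(n_samples, n_events)) for c in ("X", "Y", "Z")}

event_index = 0
x = toy_samples["X"][:, event_index]
y = toy_samples["Y"][:, event_index]

# Joint X-Y counts on a 20 x 20 grid spanning the data range
counts, x_edges, y_edges = np.histogram2d(x, y, bins=20)
print(counts.shape, counts.sum())
```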

6) Typical post-processing workflow

  1. Read samples with thinning (read_all_samples).

  2. Build summary (compute_cat_dd_and_xyz) including X/Y/Z/T and cat_dd.

  3. Compute the ESS summary and inspect events in the low-ESS tail.

  4. Generate chain and marginal plots.

  5. (Optional) run calibration and Wasserstein diagnostics for deeper quality checks.
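
Step 3 of the workflow above can be sketched as a simple filter over per-event ESS values. The array below is made up for illustration, standing in for something like ess_per_event_xyzt_min, and the threshold of 100 is an arbitrary cutoff rather than a SPIDER default.

```python
import numpy as np

# Hypothetical per-event ESS values (e.g. a conservative minimum over X, Y, Z, T)
ess_min = np.array([850.0, 40.0, 920.0, 610.0, 15.0, 700.0, 980.0, 330.0])
event_ids = np.arange(ess_min.size)

threshold = 100.0  # illustrative cutoff, not a SPIDER default

# List flagged events from worst to best
order = np.argsort(ess_min)
low_tail = [
    (int(event_ids[i]), float(ess_min[i]))
    for i in order
    if ess_min[i] < threshold
]
print(low_tail)
```

The flagged events are natural candidates for the chain and marginal plots in step 4.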