recovar.output¶
Result I/O, visualization, and reconstruction quality metrics.
output¶
Volume I/O, result saving, and MRC file writing utilities.
recovar.output.output
¶
PipelineOutput(result_path)
¶
has_embedding_entry(entry)
¶
Check whether a top-level embedding entry exists.
get_embedding_component(entry, key)
¶
Return one embedding array in dataset-local order.
The stored embeddings are NaN-padded in original-file space.
This method selects only computed entries (identified by
particles_halfsets) and returns them in sorted original-index
order, which matches the unified dataset's local ordering.
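The selection described above can be illustrated on a toy array. This is only a sketch with NumPy: `select_computed`, `padded`, and `halfsets` are hypothetical names chosen for the example, not part of the recovar API, and the real method may differ in detail.

```python
import numpy as np

def select_computed(padded, halfset_indices):
    """Return rows of `padded` for computed particles, in sorted original-index order.

    Hypothetical helper: `halfset_indices` is assumed to be a list of integer
    arrays holding original-file indices, one array per half-set.
    """
    computed = np.sort(np.concatenate(halfset_indices))
    return padded[computed]

# Toy example: 6 particles in the original file, only 4 of them computed.
padded = np.full((6, 2), np.nan)           # NaN-padded, original-file space
halfsets = [np.array([4, 0]), np.array([3, 1])]
for i in np.concatenate(halfsets):
    padded[i] = [i, 10 * i]                # fill the computed rows

local = select_computed(padded, halfsets)  # rows for original indices 0, 1, 3, 4
```

Rows 2 and 5 stay NaN in `padded` and never reach `local`, which is why the result lines up with a dataset that holds only the computed particles.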
get_unsorted_embedding_component(entry, key)
¶
Return raw embedding values in original dataset order, without halfset reindexing.
load_embedding()
¶
Load all embedding arrays into dataset-local order.
Selects only computed entries (no NaN) and sorts by original index, matching the unified dataset's local ordering.
plot_over_density(density, trajectories=None, latent_space_bounds=None, subsampled=None, colors=None, plot_folder=None, cmap='inferno', same_st_end=True, zs=None, cov_zs=None, points=None, projection_function=None, annotate=False, slice_point=None)
¶
Plot 2-D density projections with optional trajectories and cluster centers.
For each pair of latent dimensions, creates a density heatmap and overlays trajectory paths, subsampled volume positions, and/or cluster center markers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| density | | N-D density array on a regular grid (or None to compute on the fly). | required |
| trajectories | | List of trajectory arrays, each of shape (n_pts, n_dims). | None |
| latent_space_bounds | | Array of shape (n_dims, 2) giving [min, max] per axis. | None |
| subsampled | | List of subsampled trajectory arrays for volume markers. | None |
| colors | | Color list for trajectories. | None |
| plot_folder | | Directory to save PNG files. If None, plots are not saved. | None |
| cmap | | Matplotlib colormap name for the density. | 'inferno' |
| same_st_end | | If True, only draw start/end markers for the first trajectory. | True |
| zs | | Latent coordinates, shape (n_particles, n_dims). | None |
| cov_zs | | Per-particle covariance matrices for on-the-fly density computation. | None |
| points | | Extra points (e.g. cluster centers) to scatter on top. | None |
| projection_function | | Callable to project the density onto 2 axes. | None |
| annotate | | Whether to annotate points with integer labels. | False |
| slice_point | | Slice coordinate for the projection function. | None |
build_params_dict(*, volume_shape, voxel_size, s_rescaled, noise_var_from_hf, noise_var_from_het_residual, noise_var_used, noise_result, ub_noise_var_by_var_est, variance_est, variance_fsc, noise_p_variance_est, covariance_options, column_fscs, picked_frequencies, input_args)
¶
Build the params dict saved as model/params.pkl.
This is the authoritative schema for the params dict (v0.7).
Schema¶
version : str
Format version (currently '0.7').
volume_shape : tuple of int
3-D grid dimensions, e.g. (128, 128, 128).
voxel_size : float
Angstroms per voxel.
s : ndarray, shape (n_pcs,)
Rescaled eigenvalues from PCA of the covariance.
noise_var_from_hf : ndarray
Noise variance estimated from half-map differences.
noise_var_from_het_residual : ndarray or None
Noise variance estimated from heterogeneity residuals (None for tilt series).
noise_var_used : ndarray
The noise variance actually used during estimation.
radial_noise_var_outside_mask : ndarray
Radial noise profile estimated from outside the solvent mask.
radial_ub_noise_var : ndarray
Upper-bound radial noise variance from inside the mask.
white_noise_var_outside_mask : float
Scalar white noise variance (median of radial profile).
ub_noise_var_by_var_est : ndarray
Upper-bound noise variance from signal+noise variance estimation.
image_PS : ndarray
Radial image power spectrum.
masked_image_PS : ndarray
Radial power spectrum of masked images.
variance_est : dict
Per-halfset variance estimates.
variance_fsc : ndarray
FSC of variance half-maps.
noise_p_variance_est : ndarray
Noise-plus-variance estimate.
covariance_options : dict
Options used for covariance estimation.
column_fscs : ndarray
Per-column FSC values.
picked_frequencies : ndarray
Frequency indices selected for covariance columns.
input_args : Namespace
The full command-line arguments used to run the pipeline.
build_embedding_dict(latent_coords, latent_coords_noreg, latent_precision, latent_precision_noreg, contrasts, contrasts_noreg)
¶
Build the embedding dict saved as model/embeddings.pkl.
All six sub-dicts are keyed by zdim (int).
Schema¶
latent_coords : dict[int, ndarray]
Regularized latent coordinates.
latent_coords_noreg : dict[int, ndarray]
Unregularized latent coordinates.
latent_precision : dict[int, ndarray]
Posterior covariance of regularized latent coordinates.
latent_precision_noreg : dict[int, ndarray]
Posterior covariance of unregularized latent coordinates.
contrasts : dict[int, ndarray]
Per-image contrast estimates (regularized).
contrasts_noreg : dict[int, ndarray]
Per-image contrast estimates (unregularized).
save_pipeline_results(paths, result, embedding_dict, covariance_cols, particles_ind_split, ind_split, zs_full=None)
¶
Save all pipeline results to disk.
Parameters¶
paths : ResultPaths
Centralized output paths.
result : dict
The params dict built by :func:build_params_dict.
embedding_dict : dict
The embedding dict built by :func:build_embedding_dict.
covariance_cols : ndarray or None
Covariance columns (None if --keep-intermediate is off).
particles_ind_split : list of ndarray
Per-particle halfset indices.
ind_split : list of ndarray
Per-image halfset indices.
zs_full : dict or None
Full latent coordinates before complement-mask trimming (if applicable).
write_metadata_json(paths, result)
¶
Write a human-readable JSON manifest alongside the pickle files.
This file is not loaded by the pipeline -- it exists for users to quickly inspect run parameters without unpickling.
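One way such a manifest could be produced is sketched below using only the standard library. The helper names (`to_jsonable`, `write_manifest`) and the file name `metadata.json` are assumptions made for illustration; the actual function may serialize different keys or handle arrays differently.

```python
import json
import os
import tempfile

def to_jsonable(value):
    """Convert a value into something json.dump accepts, falling back to repr()."""
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    if isinstance(value, dict):
        return {str(k): to_jsonable(v) for k, v in value.items()}
    return repr(value)  # arrays, Namespace objects, and other non-JSON types

def write_manifest(path, params):
    """Write a sorted, indented JSON file that is easy to read by eye."""
    with open(path, "w") as f:
        json.dump(to_jsonable(params), f, indent=2, sort_keys=True)

# Toy params dict using a few keys from the schema above.
params = {"version": "0.7", "volume_shape": (128, 128, 128), "voxel_size": 1.5}
out_path = os.path.join(tempfile.mkdtemp(), "metadata.json")  # hypothetical file name
write_manifest(out_path, params)

with open(out_path) as f:
    loaded = json.load(f)
```

Note that tuples come back as lists after a JSON round trip, one reason a manifest like this is for inspection rather than for reloading pipeline state.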
kmeans_analysis(output_folder, zs, n_clusters=20)
¶
Run k-means clustering on latent coordinates and save scatter plots.
Generates annotated and unannotated scatter plots for all pairwise combinations of the first few latent dimensions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output_folder | | Directory to save PCA scatter PNGs and center data. | required |
| zs | | Latent coordinates, shape (n_particles, n_dims). | required |
| n_clusters | | Number of k-means clusters. | 20 |
Returns:
| Type | Description |
|---|---|
| | Tuple of (labels, centers) from k-means clustering. |
plot_umap(output_folder, zs, centers)
¶
Generate UMAP embedding plots with cluster center overlay.
Creates scatter and hexbin UMAP projections saved as PNGs in
output_folder/umap/.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output_folder | | Parent directory; a | required |
| zs | | Latent coordinates, shape (n_particles, n_dims). | required |
| centers | | Cluster centers, shape (n_clusters, n_dims). | required |
compute_and_save_reweighted(dataset, path_subsampled, zs, cov_zs, output_folder, B_factor, n_bins=30, n_min_particles=100, embedding_option='cov_dist', save_all_estimates=False, maskrad_fraction=20, apply_global_filtering=False, fsc_mask=None, fsc_mask_radius=None, fsc_mask_edgewidth=None, vol_prefix='state')
¶
Compute reweighted volume estimates and save with standardized organization.
Primary volumes (filtered, half-maps) are placed directly in
output_folder. Diagnostics (params, local resolution, etc.) go
into output_folder/diagnostics/{prefix}{idx:03d}/.
Parameters¶
dataset : CryoEMDataset
Dataset with halfset_indices set. Halfset datasets are
obtained lazily via dataset.get_halfset(k).
make_trajectory_plots_from_results(pipeline_output, basis_size, output_folder, cryos=None, z_st=None, z_end=None, gt_volumes=None, n_vols_along_path=6, plot_llh=False, input_density=None, latent_space_bounds=None)
¶
Compute minimum-energy trajectories and generate volume/density plots.
Finds optimal paths between start and end latent coordinates (or between ground-truth volume endpoints), generates volumes along the path, and saves density overlay plots.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pipeline_output | | Pipeline output object with embeddings and covariance. | required |
| basis_size | | Number of PCA dimensions for trajectory computation. | required |
| output_folder | | Directory to write trajectory volumes and plots. | required |
| cryos | | CryoEMDataset (optional; loaded from pipeline_output if None). | None |
| z_st | | Start point in latent space, shape (n_dims,). | None |
| z_end | | End point in latent space, shape (n_dims,). | None |
| gt_volumes | | Ground-truth volumes for automatic endpoint selection. | None |
| n_vols_along_path | | Number of volumes to generate along the trajectory. | 6 |
| plot_llh | | Whether to generate per-volume likelihood scatter plots. | False |
| input_density | | Pre-computed density array (or None to compute). | None |
| latent_space_bounds | | Bounds for the latent space grid. | None |
standard_pipeline_plots(po, zdim_key, output_folder)
¶
Generate standard pipeline output plots.
Produces individual plots (eigenvolumes, contrast histogram, eigenvalues,
FSC, PC scatter) plus a consolidated pipeline_summary.png.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| po | | Pipeline output object. | required |
| zdim_key | | Latent dimension key for embeddings/contrasts. | required |
| output_folder | | Directory to save plots into. | required |
scatter_annotate(x, y, centers=None, centers_ind=None, annotate=True, labels=None, alpha=0.6, s=1, colors=None)
¶
Scatter plot with optional cluster-center markers and annotations.
get_nearest_point(data, query, chunk_size=None)
¶
For each row of query, return the closest row of data and its index.
The computation is chunked over query rows to avoid materializing a full
(n_query, n_data, n_dim) distance tensor for larger inputs.
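The chunking idea can be sketched with NumPy as follows. This is an illustrative reimplementation under assumptions (Euclidean distance, a default `chunk_size` of 1024, the name `nearest_point_chunked`), not the library's code.

```python
import numpy as np

def nearest_point_chunked(data, query, chunk_size=1024):
    """For each row of `query`, return the closest row of `data` and its index."""
    data = np.asarray(data, dtype=float)
    query = np.asarray(query, dtype=float)
    indices = np.empty(len(query), dtype=int)
    for start in range(0, len(query), chunk_size):
        chunk = query[start:start + chunk_size]
        # Only a (chunk, n_data, n_dim) block exists at any one time.
        diff = chunk[:, None, :] - data[None, :, :]
        sq_dist = np.einsum("qnd,qnd->qn", diff, diff)
        indices[start:start + chunk_size] = np.argmin(sq_dist, axis=1)
    return data[indices], indices

data = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
query = np.array([[0.9, 1.2], [4.0, 4.0], [-1.0, 0.0]])
nearest, idx = nearest_point_chunked(data, query, chunk_size=2)
```

Peak memory scales with `chunk_size * n_data * n_dim` instead of `n_query * n_data * n_dim`, at the cost of a Python-level loop over chunks.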
cluster_kmeans(z, K, on_data=True, reorder=True)
¶
K-means clustering of z into K clusters.
Returns (labels, centers). If reorder=True, clusters are sorted by agglomerative linkage of their centers.
output_paths¶
Structured path management for pipeline results directories.
recovar.output.output_paths
¶
Centralized output path definitions for RECOVAR pipeline results.
All output file paths are defined here as the single source of truth. Both the saving side (pipeline.py, analyze.py) and the loading side (PipelineOutput) should use these definitions to avoid path mismatches.
ResultPaths(root_dir)
¶
Single source of truth for all output file paths.
Usage::
paths = ResultPaths("/path/to/outdir")
paths.ensure_dirs()
utils.pickle_dump(result, paths.params)
vol = utils.load_mrc(paths.mean_volume)
embeddings
property
¶
Legacy path for monolithic embeddings.pkl (backward compat).
embedding_zdim_dir(zdim)
¶
Per-zdim embedding directory, e.g. model/zdim_4/.
ensure_dirs()
¶
Create all standard output directories.
ensure_model_dir()
¶
Create the model directory only.
ensure_volumes_dir()
¶
Create the volumes directory only.
AnalysisPaths(analysis_dir)
¶
Path helpers for downstream analysis outputs (kmeans, trajectories).
Follows RELION-inspired conventions:
- 1-indexed, zero-padded volume names (center001.mrc, state001.mrc)
- Primary volumes flat in the output directory
- Half-maps alongside: center001_half1_unfil.mrc
- Diagnostics in subdirectories: diagnostics/center001/
traj_dir(index)
¶
Return trajectory directory (1-indexed, zero-padded).
vol_stem(prefix, index)
staticmethod
¶
Volume stem without extension, e.g. 'center000'.
vol_filename(prefix, index)
staticmethod
¶
Primary volume filename, e.g. 'center000.mrc'.
halfmap_filename(prefix, index, half)
staticmethod
¶
Half-map filename, e.g. 'center000_half1_unfil.mrc'.
diagnostics_subdir(prefix, index)
staticmethod
¶
Diagnostics subdirectory, e.g. 'diagnostics/center000'.
VolumeOutputPaths(output_dir, prefix, index)
¶
Path abstraction for a single reconstructed volume's output files.
Primary outputs (filtered volume, half-maps) are placed directly in
output_dir. Diagnostics (params, local resolution, split choices)
go into output_dir/diagnostics/{prefix}{index:03d}/.
This replaces the ad-hoc string concatenation previously used in
heterogeneity_volume.make_volumes_kernel_estimate_local.
Usage::
vp = VolumeOutputPaths("/path/to/kmeans", "center", 0)
vp.ensure_dirs()
write_mrc(vp.filtered, volume)
write_mrc(vp.half1_unfil, half1)
pickle_dump(params, vp.params)
Parameters¶
output_dir : str
Parent output directory (e.g. the kmeans/ or traj000/ folder).
prefix : str
Volume name prefix (e.g. "center", "state").
index : int
Zero-based volume index.
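The layout described above can be mimicked with plain string formatting. The sketch below is not the class itself: `vol_stem` and `volume_paths` are stand-in functions written for this example, and only the `{prefix}{index:03d}` stem, the `.mrc` suffixes, and the `diagnostics/` subdirectory are taken from this page.

```python
import os

def vol_stem(prefix, index):
    """Zero-padded stem, e.g. ('center', 0) -> 'center000'."""
    return f"{prefix}{index:03d}"

def volume_paths(output_dir, prefix, index):
    """Primary files sit in output_dir; diagnostics get their own subdirectory."""
    stem = vol_stem(prefix, index)
    return {
        "filtered": os.path.join(output_dir, f"{stem}.mrc"),
        "half1_unfil": os.path.join(output_dir, f"{stem}_half1_unfil.mrc"),
        "half2_unfil": os.path.join(output_dir, f"{stem}_half2_unfil.mrc"),
        "unfil": os.path.join(output_dir, f"{stem}_unfil.mrc"),
        "diag_dir": os.path.join(output_dir, "diagnostics", stem),
    }

paths = volume_paths("kmeans", "center", 0)
```

The sketch formats whatever index it is given. Whether callers pass zero-based or one-based indices is decided elsewhere, and this page documents both conventions in different places.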
stem
property
¶
Volume stem without extension, e.g. 'center000'.
diag_dir
property
¶
Diagnostics subdirectory for this volume.
filtered
property
¶
Filtered volume: {stem}.mrc.
half1_unfil
property
¶
Unfiltered half-map 1: {stem}_half1_unfil.mrc.
half2_unfil
property
¶
Unfiltered half-map 2: {stem}_half2_unfil.mrc.
unfil
property
¶
Unfiltered combined volume: {stem}_unfil.mrc.
locres
property
¶
Local resolution map.
sampling
property
¶
Sampling volume (diagnostic).
params
property
¶
Heterogeneity parameters pickle.
split_choice
property
¶
Per-shell bin selection pickle.
choice
property
¶
Per-voxel bin selection MRC (locmost_likely mode).
choice_smooth
property
¶
Smoothed per-voxel bin selection MRC.
heterogeneity_distances
property
¶
Per-image heterogeneity distances text file.
latent_coords
property
¶
Latent coordinates text file for this volume.
filtered_smooth
property
¶
Smoothed filtered volume (debug).
locres_smooth
property
¶
Smoothed local resolution map (debug).
filtered_before
property
¶
Filter-before-choose volume (debug).
filtered_before_smooth
property
¶
Smoothed filter-before-choose volume (debug).
cv_half1_unfil
property
¶
Cross-validation estimate, half 1 (debug).
cv_half2_unfil
property
¶
Cross-validation estimate, half 2 (debug).
cv_noise_half1
property
¶
Cross-validation noise, half 1 (debug).
cv_noise_half2
property
¶
Cross-validation noise, half 2 (debug).
estimates_dir(half, filtered=False)
¶
Directory for all kernel regression estimates (debug only).
Parameters¶
half : int
Half-set index (1 or 2).
filtered : bool
If True, return the filtered estimates directory.
ensure_dirs()
¶
Create output_dir and diagnostics subdirectory.
eigenvector_filename(index)
¶
Return filename for eigenvector volume at given index.
variance_filename(n_eigs)
¶
Return filename for variance volume computed from n_eigs eigenvectors.
resolve_volume_diag_path(vol_folder, filename, prefix=None, index=None)
¶
Resolve a diagnostic file path with backward compatibility.
Checks the new diagnostics/{stem}/ layout first, then falls back
to the old flat layout where files lived directly in vol_folder.
Parameters¶
vol_folder : str
The volume output directory (e.g. kmeans/ or a flat diag dir).
filename : str
The file to find (e.g. "params.pkl").
prefix : str, optional
Volume prefix for new layout lookup.
index : int, optional
Volume index for new layout lookup.
Returns¶
str
Resolved path (may not exist if file is missing in both locations).
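The new-layout-first lookup with a flat fallback might look roughly like this. It is a sketch under assumptions: `resolve_diag_path` is a hypothetical stand-in, and the real function may build the stem or check existence differently.

```python
import os
import tempfile

def resolve_diag_path(vol_folder, filename, prefix=None, index=None):
    """Prefer diagnostics/{prefix}{index:03d}/filename, else vol_folder/filename."""
    if prefix is not None and index is not None:
        stem = f"{prefix}{index:03d}"
        new_path = os.path.join(vol_folder, "diagnostics", stem, filename)
        if os.path.exists(new_path):
            return new_path
    # Old flat layout; returned even if the file is missing there too.
    return os.path.join(vol_folder, filename)

# Toy directory in which only volume 0 uses the new layout.
root = tempfile.mkdtemp()
new_dir = os.path.join(root, "diagnostics", "center000")
os.makedirs(new_dir)
open(os.path.join(new_dir, "params.pkl"), "wb").close()

found_new = resolve_diag_path(root, "params.pkl", prefix="center", index=0)
found_old = resolve_diag_path(root, "params.pkl", prefix="center", index=1)
```

Because the fallback path is returned without an existence check, callers still need to handle a missing file themselves.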
plot_utils¶
Visualization helpers: FSC plots, volume slices, embedding scatter.
recovar.output.plot_utils
¶
plot_noise_profile(pipeline_output, yscale='linear', ax=None)
¶
Plot noise power spectrum profiles from pipeline output.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pipeline_output | | Pipeline output object with noise variance data. | required |
| yscale | | Y-axis scale ('linear' or 'log'). | 'linear' |
| ax | | Optional matplotlib Axes to draw into. If None, creates a new figure. | None |
Returns:
| Type | Description |
|---|---|
| | Tuple of (fig, ax) matplotlib Figure and Axes objects. |
plot_summary_t(pipeline_output, n_eigs=3, filename=None)
¶
Plot mean, mask, variance, and top principal component volumes.
Creates a grid of volume projections and central slices: 3 rows for mean/mask/variance plus one row per eigenvolume.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pipeline_output | | Pipeline output object with 'mean', 'volume_mask', 'variance', and eigenvolume data. | required |
| n_eigs | | Number of eigenvolumes (principal components) to show. | 3 |
| filename | | Path to save the figure. If None, figure is not saved. | None |
plot_cov_results(u, s, max_eig=40, savefile=None)
¶
Plot eigenvalue spectra and subspace angle comparison.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| u | | Dict of eigenvector arrays keyed by method name. | required |
| s | | Dict of eigenvalue arrays keyed by method name. | required |
| max_eig | | Maximum number of eigenvalues to display. | 40 |
| savefile | | If provided, saves eigenvalue plot to | None |
Returns:
| Type | Description |
|---|---|
| | Dict mapping method names to their subspace angle arrays (empty if no ground truth key |
plot_mean_fsc(pipeline_output, cryos)
¶
Plot FSC curves for the mean reconstruction (masked and unmasked).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pipeline_output | | Pipeline output object with 'mean_halfmaps', 'volume_shape', 'voxel_size', and 'volume_mask'. | required |
| cryos | | Unused (kept for backward compatibility). | required |
Returns:
| Type | Description |
|---|---|
| | Matplotlib Axes with the FSC curves. |
plot_fsc(cryo, vol1, vol2, mask=None, threshold=1 / 7, ax=None, voxel_size=None, volume_shape=None, name='unmasked', fmat='', filename=None)
¶
Plot FSC between two volumes using cryo dataset metadata.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cryo | | CryoEMDataset providing voxel_size and volume_shape. | required |
| vol1 | | First half-map (flattened Fourier volume). | required |
| vol2 | | Second half-map (flattened Fourier volume). | required |
| mask | | Optional real-space mask to apply before FSC. | None |
| threshold | | FSC resolution threshold (default 1/7). | 1 / 7 |
| ax | | Optional Axes to draw into. | None |
| voxel_size | | Override voxel size from cryo. | None |
| volume_shape | | Override volume shape from cryo. | None |
| name | | Label for the curve in the legend. | 'unmasked' |
| fmat | | Matplotlib format string for the line. | '' |
| filename | | Path to save the figure. | None |
Returns:
| Type | Description |
|---|---|
| | Matplotlib Axes with the FSC curve. |
plot_fsc_new(image1, image2, volume_shape=None, voxel_size=1, curve=None, ax=None, threshold=1 / 7, filename=None, volume_mask=None, name='', fmat='')
¶
Plot Fourier Shell Correlation between two half-maps.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image1 | | First half-map (flattened Fourier or real-space volume). | required |
| image2 | | Second half-map. | required |
| volume_shape | | 3-tuple giving the volume dimensions. | None |
| voxel_size | | Voxel size in Angstroms. | 1 |
| curve | | Pre-computed FSC curve. If None, computed from the inputs. | None |
| ax | | Optional Axes to draw into. If None, creates a new figure. | None |
| threshold | | FSC threshold for resolution estimation (default 1/7). | 1 / 7 |
| filename | | Path to save the figure. | None |
| volume_mask | | Optional real-space mask applied before FSC. | None |
| name | | Label prefix for the resolution annotation. | '' |
| fmat | | Matplotlib format string for the line. | '' |
Returns:
| Type | Description |
|---|---|
| | Tuple of (ax, score) where score is the frequency at which FSC crosses the threshold. |
FSC(image1, image2, r_dict=None)
¶
Compute Fourier Shell Correlation between two 3-D volumes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image1 | | First volume as a 3-D numpy array. | required |
| image2 | | Second volume as a 3-D numpy array. | required |
| r_dict | | Unused (kept for backward compatibility). | None |
Returns:
| Type | Description |
|---|---|
| | 1-D array of FSC values per frequency shell. |
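For readers unfamiliar with the quantity, a textbook Fourier Shell Correlation can be sketched in NumPy as below. This is not recovar's implementation: the function name `fsc_sketch`, the restriction to cubic real-space volumes, and the shell binning (integer-rounded radius, `n // 2` shells) are all assumptions made for the example.

```python
import numpy as np

def fsc_sketch(vol1, vol2):
    """FSC per shell: Re(sum F1 * conj(F2)) / sqrt(sum |F1|^2 * sum |F2|^2)."""
    n = vol1.shape[0]  # assumes a cubic (n, n, n) real-space volume
    f1 = np.fft.fftshift(np.fft.fftn(vol1))
    f2 = np.fft.fftshift(np.fft.fftn(vol2))

    # Shell index = rounded distance from the zero-frequency voxel.
    freqs = np.arange(n) - n // 2
    kx, ky, kz = np.meshgrid(freqs, freqs, freqs, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int).ravel()
    n_shells = n // 2
    keep = shell < n_shells  # drop the corners beyond Nyquist

    def shell_sum(values):
        return np.bincount(shell[keep], weights=values.ravel()[keep],
                           minlength=n_shells)

    cross = shell_sum(np.real(f1 * np.conj(f2)))
    power1 = shell_sum(np.abs(f1) ** 2)
    power2 = shell_sum(np.abs(f2) ** 2)
    return cross / np.sqrt(power1 * power2)

rng = np.random.default_rng(0)
vol = rng.standard_normal((16, 16, 16))
curve_same = fsc_sketch(vol, vol)  # a volume correlates perfectly with itself
```

Two independent half-maps would instead give a curve that starts near 1 at low frequency and decays toward 0 where noise dominates.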
fsc_score(fsc_curve, grid_size, voxel_size, threshold=0.5)
¶
Find the frequency at which FSC crosses a threshold.
Uses linear interpolation between the last shell above threshold and the first shell below.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fsc_curve | | 1-D array of FSC values per shell. | required |
| grid_size | | Number of voxels along one side of the volume. | required |
| voxel_size | | Voxel size in Angstroms. | required |
| threshold | | FSC threshold (default 0.5). | 0.5 |
Returns:
| Type | Description |
|---|---|
| | Frequency value (in 1/Angstrom) at the threshold crossing. |
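The interpolation step can be illustrated in plain Python. The sketch assumes shell `k` sits at frequency `k / (grid_size * voxel_size)` and that the first downward crossing is the one reported; `fsc_crossing` is a hypothetical name, and the library's handling of edge cases (for example, a curve that never crosses) may differ.

```python
def fsc_crossing(fsc_curve, grid_size, voxel_size, threshold=0.5):
    """Frequency (1/Angstrom) where the curve first drops below `threshold`."""
    for k in range(1, len(fsc_curve)):
        above, below = fsc_curve[k - 1], fsc_curve[k]
        if above >= threshold > below:
            # Fractional shell where the segment (k-1, above)-(k, below) hits threshold.
            frac = (above - threshold) / (above - below)
            return (k - 1 + frac) / (grid_size * voxel_size)
    # No crossing found: fall back to the highest sampled shell (an assumption).
    return (len(fsc_curve) - 1) / (grid_size * voxel_size)

curve = [1.0, 0.9, 0.6, 0.4, 0.1]
freq = fsc_crossing(curve, grid_size=10, voxel_size=2.0, threshold=0.5)
resolution = 1.0 / freq  # in Angstroms
```

Here the curve passes 0.5 halfway between shells 2 and 3, so the crossing is at shell 2.5, i.e. 2.5 / (10 * 2.0) = 0.125 per Angstrom, or a resolution of about 8 Angstroms.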
plot_latent_space_scatter(z, axes=None, centers=None, labels=None, title='Latent Space Analysis', figsize=(18, 12), save_path=None, show_plot=True)
¶
Create scatter plots for latent space visualization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| z | | Latent coordinates of shape (n_particles, n_dimensions). | required |
| axes | | List of (i, j) tuples specifying which dimensions to plot. If None, plots all pairwise combinations of first 4 dimensions. | None |
| centers | | Cluster centers to overlay, shape (n_clusters, n_dimensions). | None |
| labels | | Labels for cluster centers. | None |
| title | | Main title for the plot. | 'Latent Space Analysis' |
| figsize | | Figure size as (width, height). | (18, 12) |
| save_path | | Path to save the plot. If None, figure is not saved. | None |
| show_plot | | Whether to display the plot interactively. | True |
Returns:
| Type | Description |
|---|---|
| | Tuple of (fig, axes_plt) matplotlib Figure and array of Axes. |
plot_eigenvalues(eigenvalues, ax=None, n_eigs=40)
¶
Plot eigenvalue spectrum on a semilogy scale.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| eigenvalues | | 1-D array of eigenvalues. | required |
| ax | | Optional Axes to draw into. If None, creates a new figure. | None |
| n_eigs | | Number of eigenvalues to display. | 40 |
Returns:
| Type | Description |
|---|---|
| | Matplotlib Axes with the eigenvalue plot. |
plot_contrast_histogram(contrasts, ax=None, zdim_key=None)
¶
Plot histogram of per-particle contrast values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| contrasts | | 1-D array of contrast values. | required |
| ax | | Optional Axes to draw into. If None, creates a new figure. | None |
| zdim_key | | Latent dimension key for the title annotation. | None |
Returns:
| Type | Description |
|---|---|
| | Matplotlib Axes with the histogram. |
plot_pipeline_summary(po, zdim_key, output_folder)
¶
Create a single consolidated summary figure for the pipeline.
Generates a 3x3 grid showing mean volume, FSC, eigenvalues, variance, PC scatter plots, noise profile, and contrast histogram.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| po | | Pipeline output object. | required |
| zdim_key | | Latent dimension key for accessing embeddings/contrasts. | required |
| output_folder | | Directory to save the summary PNG. | required |
pipeline_summary(po, save_path)
¶
Create a 2x3 overview figure summarizing key pipeline results.
Panels
(0,0) Mean volume: central XY slice of the mean reconstruction.
(0,1) Eigenvalue spectrum: bar chart of top eigenvalues.
(0,2) Mean FSC: FSC curve with 1/7 threshold line.
(1,0) Contrast histogram: distribution of per-particle contrasts.
(1,1) Variance volume: central XY slice of the variance map.
(1,2) Mask: central XY slice of the reconstruction mask.
Each panel is wrapped in try/except so that a single missing quantity does not prevent the rest of the summary from being generated.
Parameters¶
po : PipelineOutput
Pipeline output object.
save_path : str
Path where the PNG is written.
analyze_summary(zs, centers, labels, save_path, density=None)
¶
Create a 2x2 overview figure summarizing k-means analysis results.
Panels
(0,0) Latent space: hexbin density of particles with cluster centers.
(0,1) Cluster sizes: bar chart of particles per cluster.
(1,0) Cluster distances: heatmap of pairwise center distances.
(1,1) Latent variance: histogram of per-particle distances to nearest cluster center.
Parameters¶
zs : ndarray, shape (n_particles, n_dims)
Latent coordinates.
centers : ndarray, shape (n_clusters, n_dims)
K-means cluster centers.
labels : ndarray, shape (n_particles,)
Per-particle cluster assignments.
save_path : str
Path where the PNG is written.
density : ndarray, optional
Per-particle density values for background coloring (unused if None).
junk_detection_summary(results_dict, save_path)
¶
Create a 2x2 overview figure summarizing junk particle detection.
Panels
(0,0) Cluster quality: histogram of per-cluster FSC AUC values.
(0,1) Particle distribution: bar chart colored by good/junk status.
(1,0) Quality summary: text panel with key detection statistics.
(1,1) FSC curves: top-3 and bottom-3 cluster FSC curves if available.
Parameters¶
results_dict : dict
Must contain at minimum:
- 'fsc_aucs': per-cluster FSC AUC values (array)
- 'n_particles_per_cluster': particles in each cluster (array)
- 'junk_threshold': threshold used to classify junk (float)
- 'n_junk': number of junk particles (int)
- 'n_good': number of good particles (int)
Optionally:
- 'fsc_curves': dict or list of per-cluster FSC curves
save_path : str
Path where the PNG is written.
metrics¶
Reconstruction quality metrics: FSC, subspace angles, per-voxel error.
recovar.output.metrics
¶
captured_variance(test_v, U, s)
¶
Compute cumulative captured variance of test vectors in a subspace.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| test_v | | Test vectors, shape (n_voxels, n_test). | required |
| U | | Eigenvector matrix, shape (n_voxels, n_pcs). | required |
| s | | Eigenvalue array, shape (n_pcs,). | required |
Returns:
| Type | Description |
|---|---|
| | Cumulative captured variance array, shape (n_test,). |
subspace_angles(u, v, max_rank=None, check_orthogonalize=False)
¶
Compute principal angles between two subspaces of increasing rank.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| u | | First set of basis vectors, shape (n_voxels, n_pcs). | required |
| v | | Second set of basis vectors, shape (n_voxels, n_pcs). | required |
| max_rank | | Maximum subspace rank to evaluate. | None |
| check_orthogonalize | | If True, QR-orthogonalize u and v first. | False |
Returns:
| Type | Description |
|---|---|
| | Array of sine of principal angles, shape (max_rank,). |
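One plausible reading of "subspaces of increasing rank" is sketched below: for each rank k, compare the spans of the first k columns of each basis and report the sine of the largest principal angle. That interpretation, the name `sin_largest_principal_angles`, and the unconditional QR step are assumptions for illustration; the library function may define the per-rank quantity differently.

```python
import numpy as np

def sin_largest_principal_angles(u, v, max_rank=None):
    """Sine of the largest principal angle between span(u[:, :k]) and span(v[:, :k])."""
    if max_rank is None:
        max_rank = min(u.shape[1], v.shape[1])
    # QR keeps span(q[:, :k]) == span(input[:, :k]) for full-rank inputs.
    qu, _ = np.linalg.qr(u)
    qv, _ = np.linalg.qr(v)
    sines = np.empty(max_rank)
    for k in range(1, max_rank + 1):
        # Singular values of Qu_k^T Qv_k are the cosines of the principal angles.
        cosines = np.linalg.svd(qu[:, :k].T @ qv[:, :k], compute_uv=False)
        smallest_cos = np.clip(cosines.min(), 0.0, 1.0)
        sines[k - 1] = np.sqrt(1.0 - smallest_cos**2)
    return sines

eye = np.eye(5)
u = eye[:, [0, 1, 2]]
v = eye[:, [0, 1, 3]]  # agrees with u for ranks 1 and 2, orthogonal third vector
sines = sin_largest_principal_angles(u, v)
```

A value near 0 means the two rank-k subspaces coincide, while a value of 1 means at least one direction in one subspace is orthogonal to the other.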
local_fsc_metric(map1, map2, voxel_size, mask, fsc_threshold=1 / 7, locres_sampling=25)
¶
Compute local resolution and local AUC metrics within a mask.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| map1 | | First half-map (3-D real-space array). | required |
| map2 | | Second half-map (3-D real-space array). | required |
| voxel_size | | Voxel size in Angstroms. | required |
| mask | | Boolean mask selecting voxels to evaluate. | required |
| fsc_threshold | | FSC threshold for resolution (default 1/7). | 1 / 7 |
| locres_sampling | | Sampling factor for local resolution windows. | 25 |
Returns:
| Type | Description |
|---|---|
| | Tuple of (median_locres, ninety_pc_locres, median_auc, ten_pc_auc). |
make_union_gt_mask_from_hvd(gt_thing, volume_shape)
¶
Build a union mask from all GT volumes in a HeterogeneousReconstruction.
Converts each Fourier-space volume to real space, then delegates to
mask.make_union_gt_mask.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gt_thing | | A | required |
| volume_shape | | 3-D grid dimensions tuple. | required |
Returns:
| Type | Description |
|---|---|
| | Tuple |
make_moving_gt_mask_from_hvd(gt_thing, volume_shape)
¶
Build a moving-piece mask from GT volumes in a HeterogeneousReconstruction.
Isolates the region that differs across GT states. Uses
mask.make_moving_gt_mask on real-space GT volumes.
variance_of_zs(z, gt_image_assignment)
¶
Estimate per-label variances and overall variance of z.
Parameters¶
z : np.ndarray
Array of shape (n_samples, features).
gt_image_assignment : np.ndarray
Array of shape (n_samples,) with integer labels for each sample.
Returns¶
label_variances : np.ndarray
Variance computed for each label (flattened across features).
weighted_avg_variance : float
Overall variance computed as the weighted average of per-label variances.
overall_variance : float
Variance computed over the entire z data.
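The three return values can be reproduced on a small example. This sketch assumes that "flattened across features" means a single `np.var` over all entries belonging to a label, and that the weights are per-label sample counts; `variance_by_label` is a hypothetical name and the library may normalize differently.

```python
import numpy as np

def variance_by_label(z, labels):
    """Per-label variance, count-weighted average of those, and overall variance."""
    unique, counts = np.unique(labels, return_counts=True)
    # One variance per label, computed over all entries of that label's rows.
    label_variances = np.array([np.var(z[labels == lab]) for lab in unique])
    weighted_avg = np.sum(label_variances * counts) / counts.sum()
    overall = np.var(z)
    return label_variances, weighted_avg, overall

z = np.array([[0.0], [2.0], [10.0], [14.0]])
labels = np.array([0, 0, 1, 1])
label_var, weighted_avg, overall = variance_by_label(z, labels)
```

In this toy case the per-label variances are 1 and 4, their weighted average is 2.5, and the overall variance is 32.75. The gap between the last two numbers reflects how much of the spread in `z` is explained by the labels.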
fro_norm_diff_low_rank(U, s, V, d)
¶
Compute the Frobenius norm of (A - B) where A = U * diag(s) * U^T and B = V * diag(d) * V^T, using only their low-rank representations.
Parameters¶
U : jnp.ndarray
An n x r matrix (orthonormal columns).
s : jnp.ndarray
A length-r vector for A's eigenvalues.
V : jnp.ndarray
An n x r matrix (orthonormal columns).
d : jnp.ndarray
A length-r vector for B's eigenvalues.
Returns¶
jnp.ndarray
The Frobenius norm of A - B.
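A standard way to avoid forming the n x n matrices is the identity ||A - B||_F^2 = sum(s^2) + sum(d^2) - 2 * sum_ij s_i d_j (U^T V)_ij^2, which holds when U and V have orthonormal columns. The sketch below uses NumPy rather than JAX and checks the result against the dense computation; whether the library uses exactly this identity is an assumption, and `fro_norm_diff_low_rank_sketch` is a name made up for the example.

```python
import numpy as np

def fro_norm_diff_low_rank_sketch(U, s, V, d):
    """||U diag(s) U^T - V diag(d) V^T||_F using only r x r quantities."""
    M = U.T @ V                      # (r, r) overlap between the two bases
    cross = s @ (M**2) @ d           # trace(A B) = sum_ij s_i d_j M_ij^2
    sq = np.sum(s**2) + np.sum(d**2) - 2.0 * cross
    return np.sqrt(max(sq, 0.0))     # guard against tiny negative round-off

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((50, 3)))  # orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((50, 3)))
s = np.array([3.0, 2.0, 1.0])
d = np.array([2.5, 1.5, 0.5])

low_rank = fro_norm_diff_low_rank_sketch(U, s, V, d)
dense = np.linalg.norm(U @ np.diag(s) @ U.T - V @ np.diag(d) @ V.T)
```

The low-rank route costs O(n r^2) time and O(r^2) extra memory, which matters when n is the number of voxels in a volume.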