Basis

Routines and Class definitions for constructing basis sets using the diffusion maps algorithm.

@author: Erik

class pyedgar.basis.DiffusionAtlas(dmap_object=None)

The diffusion atlas is a factory object for constructing diffusion map bases with various boundary conditions.

extend_FK_soln(soln, Y, b, in_domain)

Extends the values of the Feynman-Kac solution onto new points. In the DGA framework, this is intended to be used to extend guess functions onto new datapoints.

Parameters:
  • soln (Dataset of same type as the data.) – Solution to the Feynman-Kac problem on the original data.
  • Y (2D array-like OR list of trajectories OR flat data format) – Data for which to perform the out-of-sample extension.
  • b (1D array-like, OR list of such arrays, OR flat data format.) – Values of the right hand-side for the OOS points.
  • in_domain (1D array-like, OR list of such arrays, OR flat data format.) – Dataset of the same shape as the input datapoints, where each element is 1 or True if that datapoint is inside the domain, and 0 or False if it is outside the domain.
Returns:

extended_soln (Dataset of same type as the data.) – Solution to the Feynman-Kac problem.

extend_dirichlet_basis(Y, in_domain, basis, evals)

Performs out-of-sample extension on a Dirichlet basis set.

Parameters:
  • Y (2D array-like OR list of trajectories OR flat data format) – Data for which to perform the out-of-sample extension.
  • in_domain (1D array-like, OR list of such arrays, OR flat data format) – Dataset of the same shape as the input datapoints, where each element is 1 or True if that datapoint is inside the domain, and 0 or False if it is outside the domain.
  • basis (2D array-like OR list of trajectories OR Flat data format) – The basis functions.
  • evals (1D numpy array) – The eigenvalues corresponding to each basis vector.
Returns:

basis_extended (Dataset of same type as the data) – Values of the basis functions extended onto the out-of-sample points.

fit(data)

Constructs the diffusion map on the dataset.

Parameters:
  • data (2D array-like OR list of trajectories OR Flat data format) – Dataset on which to construct the diffusion map.

classmethod from_kernel(kernel_object, alpha=0.5, weight_fxn=None, density_fxn=None, bandwidth_normalize=False, oos='nystroem')

Builds the Diffusion Atlas using a pyDiffMap kernel. See the pyDiffMap.DiffusionMap constructor for a description of arguments.

classmethod from_sklearn(alpha=0.5, k=64, kernel_type='gaussian', epsilon='bgh', neighbor_params=None, metric='euclidean', metric_params=None, weight_fxn=None, density_fxn=None, bandwidth_type=None, bandwidth_normalize=False, oos='nystroem')

Builds the Diffusion Atlas using the standard pyDiffMap kernel. See pyDiffMap.DiffusionMap.from_sklearn for a description of arguments.

make_FK_soln(b, in_domain)

Solves a Feynman-Kac problem on the data. Specifically, solves Lx = b on the domain, with x = b outside the domain. In the DGA framework, this is intended to be used to solve for guess functions.

Parameters:
  • b (1D array-like, OR list of such arrays, OR flat data format.) – Dataset of the same shape as the input datapoints. Right hand side of the Feynman-Kac equation.
  • in_domain (1D array-like, OR list of such arrays, OR flat data format.) – Dataset of the same shape as the input datapoints, where each element is 1 or True if that datapoint is inside the domain, and 0 or False if it is outside the domain.
Returns:

soln (Dataset of same type as the data.) – Solution to the Feynman-Kac problem.
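
To make the boundary-value problem above concrete, here is a minimal NumPy sketch of the linear system "Lx = b on the domain, x = b outside the domain". It is not pyedgar's implementation: it assumes a dense generator matrix L is already available, and the helper name solve_fk and the five-state random-walk generator (L = P - I) are hypothetical choices made for illustration.

```python
import numpy as np


def solve_fk(L, b, in_domain):
    """Solve (L x)_i = b_i for i inside the domain, x_i = b_i elsewhere.

    Hypothetical helper for illustration; not part of pyedgar.
    """
    L = np.asarray(L, dtype=float)
    b = np.asarray(b, dtype=float)
    inside = np.asarray(in_domain, dtype=bool)

    x = b.copy()  # values off the domain are pinned to b
    # Keep only the domain rows and move the known off-domain values
    # to the right-hand side.
    L_dd = L[np.ix_(inside, inside)]
    L_dc = L[np.ix_(inside, ~inside)]
    rhs = b[inside] - L_dc @ b[~inside]
    x[inside] = np.linalg.solve(L_dd, rhs)
    return x


# Toy generator (an assumption): symmetric random walk on 5 states.
P = np.zeros((5, 5))
for i in range(1, 4):
    P[i, i - 1] = P[i, i + 1] = 0.5
P[0, 0] = P[4, 4] = 1.0
L = P - np.eye(5)

in_domain = np.array([False, True, True, True, False])
# Zero source term inside the domain; boundary values 0 and 1 outside.
b = np.array([0.0, 0.0, 0.0, 0.0, 1.0])

x = solve_fk(L, b, in_domain)
print(x)
```

With a zero right-hand side inside the domain and boundary values 0 and 1, the solution on this toy chain is the linear ramp 0, 0.25, 0.5, 0.75, 1, which is the kind of guess function the DGA framework would use for a committor-type problem.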

make_dirichlet_basis(k, in_domain=None, return_evals=False)

Creates a diffusion map basis set that obeys the homogeneous Dirichlet boundary conditions on the domain. This is done by taking the eigenfunctions of the diffusion map submatrix on the domain.

Parameters:
  • k (int) – Number of basis functions to create.
  • in_domain (1D array-like, OR list of such arrays, OR flat data format, optional) – Array of the same shape as the data, where each element is 1 or True if that datapoint is inside the domain, and 0 or False if it is outside the domain. This must have the same length as the current dataset. If None (default), all points are assumed to be in the domain.
  • return_evals (Boolean, optional) – Whether or not to return the eigenvalues as well. These are useful for out of sample extension.
Returns:

  • basis (Dataset of same type as the data) – The basis functions evaluated on each datapoint. Of the same type as the input data.
  • evals (1D numpy array, optional) – The eigenvalues corresponding to each basis vector. Only returned if return_evals is True.
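
The construction described above (eigenfunctions of the submatrix restricted to the domain) can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than pyedgar's code: the function name dirichlet_basis_sketch is hypothetical, the transition matrix P is a toy five-state random walk instead of a diffusion map, and the spectrum of the submatrix is assumed to be real.

```python
import numpy as np


def dirichlet_basis_sketch(P, k, in_domain):
    """Top-k eigenvectors of P restricted to the domain, zero elsewhere.

    Hypothetical helper for illustration; not part of pyedgar.
    """
    P = np.asarray(P, dtype=float)
    inside = np.asarray(in_domain, dtype=bool)

    # Submatrix coupling domain points to domain points only.
    P_dd = P[np.ix_(inside, inside)]
    evals, evecs = np.linalg.eig(P_dd)

    # Assumption: eigenvalues are real, so sorting by real part is safe.
    order = np.argsort(-evals.real)[:k]

    # Homogeneous Dirichlet condition: basis functions vanish off the domain.
    basis = np.zeros((P.shape[0], k))
    basis[inside] = evecs[:, order].real
    return basis, evals[order].real


# Toy transition matrix (an assumption): random walk on 5 states.
P = np.zeros((5, 5))
for i in range(1, 4):
    P[i, i - 1] = P[i, i + 1] = 0.5
P[0, 0] = P[4, 4] = 1.0

in_domain = np.array([False, True, True, True, False])
basis, evals = dirichlet_basis_sketch(P, k=2, in_domain=in_domain)
print(basis.shape, evals)
```

Each column of the result is zero on the two off-domain states, which is the homogeneous Dirichlet condition. The eigenvalues are returned alongside the basis because, as noted for return_evals, they are needed for out-of-sample extension.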

pyedgar.basis.nystroem_oos(dmap_object, Y, evecs, evals)

Performs Nystroem out-of-sample extension to calculate the values of the diffusion coordinates at each given point.

Parameters:
  • dmap_object (DiffusionMap object) – Diffusion map upon which to perform the out-of-sample extension.
  • Y (array-like, shape (n_query, n_features)) – Data for which to perform the out-of-sample extension.
Returns:

phi (numpy array, shape (n_query, n_eigenvectors)) – Values of the diffusion coordinates at each query point.
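
For readers unfamiliar with the Nystroem idea, the following NumPy sketch shows the general formula phi(y) = (1/lambda) * sum_j p(y, x_j) phi(x_j), where p(y, .) is a row-normalized kernel between a query point and the training data. It does not reproduce what nystroem_oos does internally: the Gaussian kernel, its bandwidth epsilon, and the helper names gaussian_transition and nystroem_sketch are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(Y, X, epsilon):
    """Gaussian kernel matrix between query points Y and data X."""
    sq_dists = ((Y[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (4.0 * epsilon))


def gaussian_transition(Y, X, epsilon):
    """Row-stochastic version of the kernel (each row sums to 1)."""
    K = gaussian_kernel(Y, X, epsilon)
    return K / K.sum(axis=1, keepdims=True)


def nystroem_sketch(X, Y, evecs, evals, epsilon):
    """Extend eigenvectors from data X to query points Y.

    Hypothetical helper for illustration; not part of pyedgar.
    """
    P_yx = gaussian_transition(Y, X, epsilon)
    return (P_yx @ evecs) / evals


rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))  # toy training data
epsilon = 0.5                 # hypothetical bandwidth

# Eigenvectors of P = D^{-1} K via the symmetric matrix D^{-1/2} K D^{-1/2}.
K = gaussian_kernel(X, X, epsilon)
d = K.sum(axis=1)
S = K / np.sqrt(np.outer(d, d))
w, v = np.linalg.eigh(S)                   # ascending eigenvalues
evals = w[::-1][:3]                        # three largest
evecs = v[:, ::-1][:, :3] / np.sqrt(d)[:, None]

Y = rng.normal(size=(4, 2))                # new query points
phi_new = nystroem_sketch(X, Y, evecs, evals, epsilon)
phi_train = nystroem_sketch(X, X, evecs, evals, epsilon)
print(phi_new.shape)
```

A useful sanity check on any Nystroem-style extension is that evaluating it at the training points returns the original eigenvectors, since P applied to an eigenvector is just that eigenvector scaled by its eigenvalue.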

pyedgar.basis.power_oos(dmap_object, Y, evecs, evals)

Performs out-of-sample extension to calculate the values of the diffusion coordinates at each given point using the power-like method.

Parameters:
  • dmap_object (DiffusionMap object) – Diffusion map upon which to perform the out-of-sample extension.
  • Y (array-like, shape (n_query, n_features)) – Data for which to perform the out-of-sample extension.
Returns:

phi (numpy array, shape (n_query, n_eigenvectors)) – Values of the diffusion coordinates at each query point.