Data Manipulation
A collection of useful functions for manipulating trajectory data and dynamical basis set objects.
@author: Erik
-
pyedgar.data_manipulation.
delay_embed
(traj_data, n_embed, lag=1, verbosity=0) Performs delay embedding on the trajectory data. Accepts trajectory data in any of the supported formats and returns the delay-embedded data in the same format.
Parameters: - traj_data (list of arrays OR tuple of two arrays OR single numpy array) – Dynamical data on which to perform the delay embedding. The type determines how the data is interpreted: a list of arrays is a list of trajectories, a tuple of two arrays is the internal flattened format, and a single array is a single trajectory.
- n_embed (int) – The number of delay embeddings to perform.
- lag (int, optional) – The number of timesteps to look back in time for each delay. Default is 1.
- verbosity (int, optional) – The level of status messages that are output. Default is 0 (no messages).
Returns: embedded_data (list of arrays OR tuple of two arrays OR single numpy array) – Dynamical data with delay embedding performed, of the same type as the trajectory data.
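To make the idea of delay embedding concrete, here is a minimal NumPy sketch for a single trajectory. It is not pyedgar's implementation: the helper name `delay_embed_single`, the column ordering (current point first, then progressively older points), and the choice to drop the first `n_embed * lag` points are all assumptions made for illustration.

```python
import numpy as np


def delay_embed_single(traj, n_embed, lag=1):
    """Delay-embed one trajectory of shape (N, d) (hypothetical helper).

    Each output row concatenates a point with its n_embed predecessors,
    spaced `lag` steps apart, so the output has shape
    (N - n_embed * lag, d * (n_embed + 1)).
    """
    traj = np.asarray(traj)
    if traj.ndim == 1:
        # Treat a 1D input as a trajectory with a single coordinate.
        traj = traj[:, np.newaxis]
    n_points = traj.shape[0] - n_embed * lag
    if n_points <= 0:
        raise ValueError("trajectory is too short for the requested embedding")
    # Block k holds the trajectory shifted back in time by k * lag steps.
    blocks = [
        traj[(n_embed - k) * lag:(n_embed - k) * lag + n_points]
        for k in range(n_embed + 1)
    ]
    return np.hstack(blocks)


traj = np.arange(6, dtype=float).reshape(-1, 1)
embedded = delay_embed_single(traj, n_embed=2, lag=1)
print(embedded.shape)
```

Under these assumptions, a trajectory of 6 one-dimensional points with `n_embed=2` and `lag=1` yields 4 embedded points with 3 coordinates each.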
-
pyedgar.data_manipulation.
flat_to_tlist
(traj_2d, traj_edges) Takes a flattened trajectory with start and stop points and reformats it into a list of separate trajectories.
Parameters: - traj_2d (2D numpy array) – Numpy array containing the flattened trajectory information.
- traj_edges (1D numpy array) – Numpy array whose elements mark the start of each trajectory: the n’th trajectory runs from traj_edges[n] to traj_edges[n+1].
Returns: trajs (list of array-likes) – List where each element n is an array-like object of shape N_n x d, where N_n is the number of data points in that trajectory and d is the number of coordinates for each datapoint.
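The edge convention described above can be sketched with plain NumPy slicing. This is an illustration rather than pyedgar's own code: `split_flat` is a hypothetical name, and the sketch assumes `traj_edges` has one more entry than there are trajectories, with the last entry equal to the total number of points.

```python
import numpy as np


def split_flat(traj_2d, traj_edges):
    """Split a flat (N, d) array into per-trajectory slices (hypothetical helper)."""
    traj_2d = np.asarray(traj_2d)
    # The n-th trajectory runs from traj_edges[n] to traj_edges[n + 1].
    return [
        traj_2d[start:stop]
        for start, stop in zip(traj_edges[:-1], traj_edges[1:])
    ]


flat = np.arange(10).reshape(5, 2)
edges = np.array([0, 2, 5])
pieces = split_flat(flat, edges)
print([p.shape for p in pieces])
```

Here the first trajectory takes rows 0 to 1 and the second takes rows 2 to 4.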
-
pyedgar.data_manipulation.
get_initial_final_split
(traj_edges, lag=1) Returns the indices, in the flat trajectory, of the initial and final sample points. In this context, initial means the first N-lag points of each trajectory, and final means the last N-lag points.
Parameters: lag (int, optional) – Number of timepoints to look into the future for the transfer operator. Default is 1.
Returns: - t_0_indices (1D numpy array) – Indices in the flattened trajectory data of all the points at the initial times.
- t_0_indices (1D numpy array) – Indices in the flattened trajectory data of all the points at the final times.
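A short sketch may help show how the initial and final index sets relate. It is an illustration only: the function name `initial_final_indices`, the return names `idx_0` and `idx_lag`, and the choice to skip trajectories that are not longer than `lag` are assumptions, not pyedgar's documented behavior.

```python
import numpy as np


def initial_final_indices(traj_edges, lag=1):
    """Pair each point with the point `lag` steps later (hypothetical helper)."""
    initial, final = [], []
    for start, stop in zip(traj_edges[:-1], traj_edges[1:]):
        if stop - start <= lag:
            # Assumption: trajectories too short to form a pair are skipped.
            continue
        initial.append(np.arange(start, stop - lag))  # first N - lag points
        final.append(np.arange(start + lag, stop))    # last N - lag points
    if not initial:
        empty = np.array([], dtype=int)
        return empty, empty.copy()
    return np.concatenate(initial), np.concatenate(final)


edges = np.array([0, 3, 7])
idx_0, idx_lag = initial_final_indices(edges, lag=1)
print(idx_0, idx_lag)
```

The two arrays have equal length, and no pair crosses a trajectory boundary: index 2 (the end of the first trajectory) never appears as an initial point, and index 3 (the start of the second) never appears as a final point.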
-
pyedgar.data_manipulation.
lift_function
(function, n_embed, lag=1) Lift a function into the delay-embedded space.
-
pyedgar.data_manipulation.
tlist_to_flat
(trajs) Flattens a list of two-dimensional trajectories into a single two-dimensional data structure, and returns it along with an array of edges giving the location of each trajectory.
Parameters: trajs (list of array-likes) – List where each element n is an array-like object of shape N_n x d, where N_n is the number of data points in that trajectory and d is the number of coordinates for each datapoint.
Returns: - traj2D (2D numpy array) – Numpy array containing the flattened trajectory information.
- traj_edges (1D numpy array) – Numpy array whose elements mark the start of each trajectory: the n’th trajectory runs from traj_edges[n] to traj_edges[n+1].
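The flattening step can be sketched with a cumulative sum over trajectory lengths. This is a minimal illustration, not pyedgar's implementation: `flatten_trajs` is a hypothetical name, and the sketch assumes the edge array starts at 0 and ends at the total number of points.

```python
import numpy as np


def flatten_trajs(trajs):
    """Stack trajectories of shape (N_n, d) into one array plus edges (hypothetical helper)."""
    arrays = [np.asarray(t) for t in trajs]
    lengths = [len(a) for a in arrays]
    # Edges are the running total of trajectory lengths, starting at 0.
    traj_edges = np.concatenate(([0], np.cumsum(lengths)))
    traj_2d = np.concatenate(arrays, axis=0)
    return traj_2d, traj_edges


trajs = [np.zeros((2, 3)), np.ones((4, 3))]
flat, edges = flatten_trajs(trajs)
print(flat.shape, edges)
```

With trajectories of length 2 and 4, the flat array has 6 rows and the edges are 0, 2, and 6, so slicing `flat[edges[n]:edges[n + 1]]` recovers the n’th trajectory.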