Embeddings
Functions to create low-dimensional embeddings.
- ecgan.utils.embeddings.calculate_tsne(data, perplexity=30, early_exaggeration=12.0, n_components=2)[source]
Calculate t-SNE.
This is a wrapper function for the corresponding sklearn implementation. It can be applied to both univariate and multivariate series, and the result can be visualized with ecgan.visualization.plotter.ScatterPlotter. Keep in mind that rerunning t-SNE will not return the same embedding on different runs, because its cost function is not convex and different initializations can end in different minima. t-SNE is slow compared to, e.g., UMAP. To speed up fitting the reducer, one might want to run it on the GPU; a CUDA implementation can be found on GitHub (CannyLab).
- Parameters
data (ndarray) -- Data whose dimensionality shall be reduced. Either (batch, seq_len) or (batch, seq_len, channel) format.
perplexity (float) -- t-SNE perplexity (more information e.g. here).
early_exaggeration (float) -- Controls how tightly the embedded points are packed.
n_components (int) -- Dimension of the embedded space.
References
van der Maaten and Hinton, 2008
- Return type
Tuple[ndarray, BaseEstimator]
- Returns
The resulting low-dim embedding with shape (dims, samples) and the trained reducer.
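The behaviour described above can be illustrated with a minimal sketch that calls sklearn directly. This is not the ecgan source: the name tsne_sketch is hypothetical, and both the flattening of (batch, seq_len, channel) input and the final transpose are assumptions made to match the documented input formats and the documented (dims, samples) return shape.

```python
import numpy as np
from sklearn.manifold import TSNE


def tsne_sketch(data, perplexity=30.0, early_exaggeration=12.0, n_components=2):
    """Hypothetical stand-in for calculate_tsne, not the ecgan implementation."""
    data = np.asarray(data)
    # Assumption: multivariate input (batch, seq_len, channel) is flattened
    # to (batch, seq_len * channel) before it is handed to sklearn.
    flat = data.reshape(data.shape[0], -1)
    reducer = TSNE(
        n_components=n_components,
        perplexity=perplexity,
        early_exaggeration=early_exaggeration,
    )
    embedding = reducer.fit_transform(flat)  # sklearn returns (samples, dims)
    # Assumption: transpose to match the documented (dims, samples) shape.
    return embedding.T, reducer


rng = np.random.default_rng(0)
series = rng.normal(size=(100, 50, 2))  # 100 two-channel series of length 50
embedding, reducer = tsne_sketch(series)
```

Note that perplexity must be smaller than the number of samples. sklearn's TSNE also accepts a random_state argument for repeatable runs, although the signature documented above does not expose it.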
- ecgan.utils.embeddings.calculate_umap(data, target, n_neighbors=25, supervised_umap=True, n_components=2, rnd_seed=None, low_memory=True)[source]
UMAP embeddings according to McInnes et al. 2018.
Using the public UMAP implementation for 2D visualizations.
- Parameters
data (ndarray) -- Univariate or multivariate series as a numpy array.
target (Union[List[int], object]) -- List of the target classes encoded as integers.
n_neighbors (int) -- Number of UMAP neighbors used to construct the graph in the high-dimensional space.
supervised_umap (bool) -- Flag indicating whether to use supervised UMAP, which utilizes the target information.
n_components (int) -- Dimensionality of the low-dimensional embedding.
rnd_seed (Optional[int]) -- Set a random seed if you want to reproduce the embedding. Warning: slows down performance!
low_memory (bool) -- Enables or disables the low memory mode. Should be True if you run into memory problems during NNDescent. Computation takes more time if enabled.
- Return type
Tuple[ndarray, BaseEstimator]
- Returns
The resulting low-dim UMAP embedding of shape (dim, samples).
- ecgan.utils.embeddings.calculate_pca(data, n_components=2)[source]
PCA embeddings using the sklearn library.
- Parameters
data (ndarray) -- Univariate or multivariate series as a numpy array.
n_components (int) -- Dimensionality of the low-dimensional embedding.
- Return type
Tuple[ndarray, BaseEstimator]
- Returns
The resulting low-dim PCA embedding of shape (dim, samples) and the trained reducer.
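As a rough illustration of the PCA wrapper described above, the following sketch uses sklearn directly. The name pca_sketch is hypothetical and this is not the ecgan source; flattening multivariate input and transposing the result to (dim, samples) are assumptions based on the documented shapes.

```python
import numpy as np
from sklearn.decomposition import PCA


def pca_sketch(data, n_components=2):
    """Hypothetical stand-in for calculate_pca, not the ecgan implementation."""
    data = np.asarray(data)
    # Assumption: any trailing axes (seq_len, channel) are flattened per sample.
    flat = data.reshape(data.shape[0], -1)
    reducer = PCA(n_components=n_components)
    embedding = reducer.fit_transform(flat)  # sklearn returns (samples, dims)
    # Assumption: transpose to match the documented (dim, samples) shape.
    return embedding.T, reducer


rng = np.random.default_rng(0)
series = rng.normal(size=(40, 30))  # 40 univariate series of length 30
embedding, reducer = pca_sketch(series)

# The fitted reducer can project unseen series into the same space.
new_points = reducer.transform(rng.normal(size=(5, 30)))
```

Returning the fitted reducer alongside the embedding is what makes the last step possible: unlike sklearn's t-SNE, a fitted PCA can transform new data without refitting.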