D-Iteration
This module transforms queries into relevance vectors that can be used to rank and organize documents and features.
- class gismo.diteration.DIteration(n, m)
This class is in charge of performing the DIteration algorithm.
- gismo.diteration.jit_diffusion(x_pointers, x_indices, x_data, y_pointers, y_indices, y_data, z_indices, z_data, x_relevance, y_relevance, alpha, n_iter, offset: float, x_fluid, y_fluid)
Core diffusion engine written to be compatible with Numba. This is where the DIteration algorithm is applied inline.
- Parameters:
x_pointers (ndarray) – Pointers of the csr_matrix embedding of documents.
x_indices (ndarray) – Indices of the csr_matrix embedding of documents.
x_data (ndarray) – Data of the csr_matrix embedding of documents.
y_pointers (ndarray) – Pointers of the csr_matrix embedding of features.
y_indices (ndarray) – Indices of the csr_matrix embedding of features.
y_data (ndarray) – Data of the csr_matrix embedding of features.
z_indices (ndarray) – Indices of the csr_matrix embedding of the query projection.
z_data (ndarray) – Data of the csr_matrix embedding of the query projection.
x_relevance (ndarray) – Placeholder for relevance of documents.
y_relevance (ndarray) – Placeholder for relevance of features.
alpha (float in range [0.0, 1.0]) – Damping factor. Controls the trade-off between closeness and centrality.
n_iter (int) – Number of round-trip diffusions to perform. Higher value means better precision but longer execution time.
offset (float in range [0.0, 1.0]) – Controls how much of the initial fluid should be deduced from the relevance.
x_fluid (ndarray) – Placeholder for fluid on the side of documents.
y_fluid (ndarray) – Placeholder for fluid on the side of features.
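To make the roles of the parameters above more concrete, here is a minimal pure-Python sketch. It is not Gismo's implementation: `to_csr`, `push` and `toy_diffusion` are hypothetical helpers, the update rule is a simplified damped round-trip diffusion chosen only for illustration, plain lists stand in for the ndarray arguments, and `offset` is left out. It shows how a sparse matrix splits into pointers, indices and data, and how fluid travels between documents and features while relevance accumulates.

```python
def to_csr(dense):
    """Split a dense row-major matrix into CSR (pointers, indices, data)."""
    pointers, indices, data = [0], [], []
    for row in dense:
        for j, value in enumerate(row):
            if value != 0:
                indices.append(j)
                data.append(value)
        pointers.append(len(indices))  # end of the current row
    return pointers, indices, data


def push(pointers, indices, data, source_fluid, target_fluid, alpha):
    """Move alpha * fluid from every source row to its neighbours."""
    for i, fluid in enumerate(source_fluid):
        if fluid == 0:
            continue
        for k in range(pointers[i], pointers[i + 1]):
            target_fluid[indices[k]] += alpha * fluid * data[k]
        source_fluid[i] = 0.0  # the fluid has left this node


def toy_diffusion(x, y, z_indices, z_data, alpha, n_iter):
    """Toy round-trip diffusion (assumption: not the library's exact rule).

    x maps documents to features, y maps features to documents.
    """
    x_pointers, x_indices, x_data = to_csr(x)
    y_pointers, y_indices, y_data = to_csr(y)
    x_relevance, y_relevance = [0.0] * len(x), [0.0] * len(y)
    x_fluid, y_fluid = [0.0] * len(x), [0.0] * len(y)
    for i, value in zip(z_indices, z_data):
        x_fluid[i] = value  # the query projection seeds the document side
    for _ in range(n_iter):
        for i, fluid in enumerate(x_fluid):
            x_relevance[i] += fluid  # documents absorb what they hold
        push(x_pointers, x_indices, x_data, x_fluid, y_fluid, alpha)
        for j, fluid in enumerate(y_fluid):
            y_relevance[j] += fluid  # features absorb what they received
        push(y_pointers, y_indices, y_data, y_fluid, x_fluid, alpha)
    return x_relevance, y_relevance


# Two documents, two features; rows sum to 1 so no fluid is created.
x = [[1.0, 0.0], [0.5, 0.5]]  # document -> features
y = [[0.5, 0.5], [0.0, 1.0]]  # feature -> documents
x_relevance, y_relevance = toy_diffusion(x, y, [0], [1.0], alpha=0.5, n_iter=2)
print(x_relevance, y_relevance)  # [1.125, 0.125] [0.59375, 0.03125]
```

In this toy setting, a smaller `alpha` keeps relevance concentrated near the query (closeness), while a larger one lets fluid travel further before fading. Each extra round trip refines the result at the cost of another pass over both matrices, which mirrors the `n_iter` trade-off described above.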