deeppavlov.models.ranking
Ranking classes.
class deeppavlov.models.ranking.bilstm_siamese_network.BiLSTMSiameseNetwork(len_vocab: int, seed: int = None, shared_weights: bool = True, embedding_dim: int = 300, reccurent: str = 'bilstm', hidden_dim: int = 300, max_pooling: bool = True, triplet_loss: bool = True, margin: float = 0.1, hard_triplets: bool = False, *args, **kwargs)

The class implementing a siamese neural network with a BiLSTM and max pooling. Either a binary cross-entropy loss or a triplet loss with random or hard negative sampling can be used.
Parameters:
- len_vocab – Size of the vocabulary used to build the embedding layer.
- seed – Random seed.
- shared_weights – Whether to use shared weights in the model to encode contexts and responses.
- embedding_dim – Dimensionality of token (word) embeddings.
- reccurent – Type of the RNN cell. Possible values are lstm and bilstm.
- hidden_dim – Dimensionality of the hidden state of the RNN cell. If reccurent equals bilstm, hidden_dim should be doubled to get the actual dimensionality.
- max_pooling – Whether to use a max-pooling operation to get the context (response) vector representation. If False, the last hidden state of the RNN is used.
- triplet_loss – Whether to use a model with triplet loss. If False, a model with cross-entropy loss is used.
- margin – Margin parameter for the triplet loss. Only required if triplet_loss is set to True.
- hard_triplets – Whether to use hard triplet sampling to train the model, i.e. to choose negative samples close to positive ones. If set to False, random sampling is used. Only required if triplet_loss is set to True.
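To make the margin and hard_triplets options concrete, here is a minimal NumPy sketch (illustrative only, not the library's implementation) of a triplet margin loss with hard vs. random negative selection; the function names and distance choice are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge-style triplet loss on Euclidean distances (illustrative)."""
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

def pick_negative(anchor, candidates, hard=False, rng=rng):
    """Choose a negative sample: the candidate closest to the anchor
    (hard triplets) or a uniformly random one (random sampling)."""
    if hard:
        dists = np.linalg.norm(candidates - anchor, axis=-1)
        return candidates[np.argmin(dists)]
    return candidates[rng.integers(len(candidates))]

anchor = np.array([1.0, 0.0])       # encoded context
positive = np.array([0.9, 0.1])     # encoded correct response
negatives = np.array([[0.8, 0.2], [-1.0, 0.0]])

hard_neg = pick_negative(anchor, negatives, hard=True)
loss = triplet_loss(anchor, positive, hard_neg, margin=0.1)
```

A larger margin pushes negatives further from the anchor before the loss reaches zero, which is why margin only matters when triplet_loss is enabled.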
class deeppavlov.models.ranking.bilstm_gru_siamese_network.BiLSTMGRUSiameseNetwork(len_vocab: int, seed: int = None, shared_weights: bool = True, embedding_dim: int = 300, reccurent: str = 'bilstm', hidden_dim: int = 300, max_pooling: bool = True, triplet_loss: bool = True, margin: float = 0.1, hard_triplets: bool = False, *args, **kwargs)

The class implementing a siamese neural network with BiLSTM, GRU and max pooling. The GRU is used to take into account a multi-turn dialogue context.

Parameters:
- len_vocab – Size of the vocabulary used to build the embedding layer.
- seed – Random seed.
- shared_weights – Whether to use shared weights in the model to encode contexts and responses.
- embedding_dim – Dimensionality of token (word) embeddings.
- reccurent – Type of the RNN cell. Possible values are lstm and bilstm.
- hidden_dim – Dimensionality of the hidden state of the RNN cell. If reccurent equals bilstm, hidden_dim should be doubled to get the actual dimensionality.
- max_pooling – Whether to use a max-pooling operation to get the context (response) vector representation. If False, the last hidden state of the RNN is used.
- triplet_loss – Whether to use a model with triplet loss. If False, a model with cross-entropy loss is used.
- margin – Margin parameter for the triplet loss. Only required if triplet_loss is set to True.
- hard_triplets – Whether to use hard triplet sampling to train the model, i.e. to choose negative samples close to positive ones. If set to False, random sampling is used. Only required if triplet_loss is set to True.
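The multi-turn idea is that each turn is encoded separately by a shared encoder, and a GRU then rolls the per-turn vectors into a single context vector. A toy NumPy sketch of that flow (a mean-pooled stand-in encoder and a hand-written GRU cell; all names and dimensions are illustrative, not the network's actual parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode_turn(token_embeddings):
    """Stand-in turn encoder: mean over token embeddings
    (the real model uses a BiLSTM with max pooling)."""
    return token_embeddings.mean(axis=0)

# Toy GRU weights: one input and one recurrent matrix per gate.
W = {g: rng.normal(scale=0.1, size=(dim, dim)) for g in ("z", "r", "h")}
U = {g: rng.normal(scale=0.1, size=(dim, dim)) for g in ("z", "r", "h")}

def gru_step(h, x):
    z = sigmoid(x @ W["z"] + h @ U["z"])        # update gate
    r = sigmoid(x @ W["r"] + h @ U["r"])        # reset gate
    h_new = np.tanh(x @ W["h"] + (r * h) @ U["h"])
    return (1.0 - z) * h + z * h_new

# Three dialogue turns, each a sequence of token embeddings.
turns = [rng.normal(size=(n_tokens, dim)) for n_tokens in (5, 3, 7)]

h = np.zeros(dim)
for turn in turns:
    h = gru_step(h, encode_turn(turn))

context_vector = h  # one vector summarizing the whole dialogue context
```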
class deeppavlov.models.ranking.keras_siamese_model.KerasSiameseModel(learning_rate: float = 0.001, use_matrix: bool = True, emb_matrix: numpy.ndarray = None, max_sequence_length: int = None, dynamic_batch: bool = False, attention: bool = False, *args, **kwargs)

The class implementing base functionality for siamese neural networks in Keras.

Parameters:
- learning_rate – Learning rate.
- use_matrix – Whether to use a trainable matrix with token (word) embeddings.
- emb_matrix – An embeddings matrix used to initialize the embeddings layer of the model. Only used if use_matrix is set to True.
- max_sequence_length – Maximum length of text sequences in tokens. Longer sequences are truncated and shorter ones are padded.
- dynamic_batch – Whether to use dynamic batching. If True, the maximum sequence length for a batch equals the length of the longest sequence in that batch, but no more than max_sequence_length.
- attention – Whether any attention mechanism is used in the siamese network.
- *args – Other parameters.
- **kwargs – Other parameters.
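The dynamic_batch and max_sequence_length behavior can be sketched with a small padding helper (illustrative only; the function name and pad id are assumptions, not the library's API):

```python
def pad_batch(sequences, max_sequence_length=None, pad_id=0):
    """Pad a batch of token-id sequences to the longest sequence in the
    batch (dynamic batching), capped at max_sequence_length; longer
    sequences are truncated."""
    batch_max = max(len(s) for s in sequences)
    if max_sequence_length is not None:
        batch_max = min(batch_max, max_sequence_length)
    return [list(s[:batch_max]) + [pad_id] * (batch_max - len(s[:batch_max]))
            for s in sequences]

batch = [[4, 8, 15], [16, 23], [42]]
padded = pad_batch(batch, max_sequence_length=10)
# Each batch is padded only to its own longest sequence (3 here),
# so short batches waste no computation on padding up to 10.
```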
class deeppavlov.models.ranking.mpm_siamese_network.MPMSiameseNetwork(dense_dim: int = 50, perspective_num: int = 20, aggregation_dim: int = 200, recdrop_val: float = 0.0, inpdrop_val: float = 0.0, ldrop_val: float = 0.0, dropout_val: float = 0.0, *args, **kwargs)

The class implementing a siamese neural network with bilateral multi-perspective matching. The network architecture is based on https://arxiv.org/abs/1702.03814.

Parameters:
- dense_dim – Dimensionality of the dense layer.
- perspective_num – Number of perspectives in the multi-perspective matching layers.
- aggregation_dim – Dimensionality of the hidden state in the second BiLSTM layer.
- recdrop_val – Float between 0 and 1. Dropout value for the linear transformation of the recurrent state.
- inpdrop_val – Float between 0 and 1. Dropout value for the linear transformation of the inputs.
- ldrop_val – Dropout value of the dropout layer before the second BiLSTM layer.
- dropout_val – Dropout value of the dropout layer after the second BiLSTM layer.
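The core of multi-perspective matching is computing one similarity per learned "perspective", where each perspective re-weights the vector dimensions before a cosine similarity. A simplified NumPy sketch of this idea (full-matching variant only; weights are random here, whereas in the model they are learned):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, perspective_num = 6, 20

def cosine(a, b, eps=1e-8):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

# Each perspective is a weight vector that re-scales dimensions
# before the cosine similarity is taken.
perspectives = rng.normal(size=(perspective_num, dim))

def multi_perspective_match(v1, v2, perspectives):
    """Return one cosine similarity per perspective."""
    return np.array([cosine(w * v1, w * v2) for w in perspectives])

v1, v2 = rng.normal(size=dim), rng.normal(size=dim)
scores = multi_perspective_match(v1, v2, perspectives)  # perspective_num scores
```

With perspective_num = 20, each pair of positions yields a 20-dimensional matching vector rather than a single scalar similarity.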
class deeppavlov.models.ranking.siamese_model.SiameseModel(batch_size: int, num_context_turns: int = 1, *args, **kwargs)

The class implementing base functionality for siamese neural networks.

Parameters:
- batch_size – Size of a batch.
- num_context_turns – Number of context turns in data samples.
- *args – Other parameters.
- **kwargs – Other parameters.
class deeppavlov.models.ranking.siamese_predictor.SiamesePredictor(model: deeppavlov.models.ranking.siamese_model.SiameseModel, batch_size: int, num_context_turns: int = 1, ranking: bool = True, attention: bool = False, responses: deeppavlov.core.data.simple_vocab.SimpleVocabulary = None, preproc_func: Callable = None, interact_pred_num: int = 3, *args, **kwargs)

The class for ranking or paraphrase identification using a trained siamese network in the interact mode.

Parameters:
- batch_size – Size of a batch.
- num_context_turns – Number of context turns in data samples.
- ranking – Whether to perform ranking. If set to False, paraphrase identification is performed.
- attention – Whether any attention mechanism is used in the siamese network. If False, response vectors calculated in advance are used to obtain a similarity score for the input context; otherwise, the whole siamese architecture is used to obtain a similarity score for the input context and each particular response. Only used if ranking is set to True.
- responses – An instance of SimpleVocabulary with all possible responses to rank. Only used if ranking is set to True.
- preproc_func – The __call__ function of a SiamesePreprocessor.
- interact_pred_num – The number of the most relevant responses to return. Only used if ranking is set to True.
- **kwargs – Other parameters.
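When attention is False, ranking reduces to scoring the encoded context against a bank of precomputed response vectors and returning the interact_pred_num best matches. A minimal NumPy sketch of that retrieval step (function name, scoring by dot product, and data are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Precomputed response vectors: each response is encoded once,
# so only the context needs encoding at query time.
response_vectors = rng.normal(size=(100, 8))
response_texts = [f"response {i}" for i in range(100)]

def rank_responses(context_vector, response_vectors, interact_pred_num=3):
    """Return indices and scores of the top-scoring responses."""
    scores = response_vectors @ context_vector
    top = np.argsort(scores)[::-1][:interact_pred_num]
    return top, scores[top]

context_vector = rng.normal(size=8)
top_idx, top_scores = rank_responses(context_vector, response_vectors)
best = [response_texts[i] for i in top_idx]  # the 3 most relevant responses
```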