deeppavlov.models.ranking
Ranking classes.
class deeppavlov.models.ranking.bilstm_siamese_network.BiLSTMSiameseNetwork(len_vocab: int, seed: int = None, shared_weights: bool = True, embedding_dim: int = 300, reccurent: str = 'bilstm', hidden_dim: int = 300, max_pooling: bool = True, triplet_loss: bool = True, margin: float = 0.1, hard_triplets: bool = False, *args, **kwargs)

The class implementing a siamese neural network with a BiLSTM and max pooling. Either a binary cross-entropy loss or a triplet loss with random or hard negative sampling can be used.
Parameters:
- len_vocab – Size of the vocabulary used to build the embedding layer.
- seed – Random seed.
- shared_weights – Whether to use shared weights in the model to encode contexts and responses.
- embedding_dim – Dimensionality of token (word) embeddings.
- reccurent – Type of the RNN cell. Possible values are lstm and bilstm.
- hidden_dim – Dimensionality of the hidden state of the RNN cell. If reccurent equals bilstm, hidden_dim should be doubled to get the actual dimensionality.
- max_pooling – Whether to use a max-pooling operation to get the context (response) vector representation. If False, the last hidden state of the RNN is used.
- triplet_loss – Whether to use a model with triplet loss. If False, a model with cross-entropy loss is used.
- margin – Margin parameter for the triplet loss. Only required if triplet_loss is set to True.
- hard_triplets – Whether to use hard triplet sampling to train the model, i.e. to choose negative samples close to positive ones. If set to False, random sampling is used. Only required if triplet_loss is set to True.
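To make the margin and hard_triplets options concrete, here is a minimal NumPy sketch (illustrative only, not the library's implementation) of a triplet margin loss with hard vs. random negative selection; the function names and distance choice are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge-style triplet loss on Euclidean distances (illustrative)."""
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

def pick_negative(anchor, candidates, hard=False, rng=rng):
    """Choose a negative sample: the candidate closest to the anchor
    (hard triplets) or a uniformly random one (random sampling)."""
    if hard:
        dists = np.linalg.norm(candidates - anchor, axis=-1)
        return candidates[np.argmin(dists)]
    return candidates[rng.integers(len(candidates))]

anchor = np.array([1.0, 0.0])       # encoded context
positive = np.array([0.9, 0.1])     # encoded correct response
negatives = np.array([[0.8, 0.2], [-1.0, 0.0]])

hard_neg = pick_negative(anchor, negatives, hard=True)
loss = triplet_loss(anchor, positive, hard_neg, margin=0.1)
```

A larger margin pushes negatives further from the anchor before the loss reaches zero, which is why margin only matters when triplet_loss is enabled.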
class deeppavlov.models.ranking.bilstm_gru_siamese_network.BiLSTMGRUSiameseNetwork(len_vocab: int, seed: int = None, shared_weights: bool = True, embedding_dim: int = 300, reccurent: str = 'bilstm', hidden_dim: int = 300, max_pooling: bool = True, triplet_loss: bool = True, margin: float = 0.1, hard_triplets: bool = False, *args, **kwargs)

The class implementing a siamese neural network with BiLSTM, GRU and max pooling. The GRU is used to take into account a multi-turn dialogue context.

Parameters:
- len_vocab – Size of the vocabulary used to build the embedding layer.
- seed – Random seed.
- shared_weights – Whether to use shared weights in the model to encode contexts and responses.
- embedding_dim – Dimensionality of token (word) embeddings.
- reccurent – Type of the RNN cell. Possible values are lstm and bilstm.
- hidden_dim – Dimensionality of the hidden state of the RNN cell. If reccurent equals bilstm, hidden_dim should be doubled to get the actual dimensionality.
- max_pooling – Whether to use a max-pooling operation to get the context (response) vector representation. If False, the last hidden state of the RNN is used.
- triplet_loss – Whether to use a model with triplet loss. If False, a model with cross-entropy loss is used.
- margin – Margin parameter for the triplet loss. Only required if triplet_loss is set to True.
- hard_triplets – Whether to use hard triplet sampling to train the model, i.e. to choose negative samples close to positive ones. If set to False, random sampling is used. Only required if triplet_loss is set to True.
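The multi-turn idea is that each turn is encoded separately by a shared encoder, and a GRU then rolls the per-turn vectors into a single context vector. A toy NumPy sketch of that flow (a mean-pooled stand-in encoder and a hand-written GRU cell; all names and dimensions are illustrative, not the network's actual parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode_turn(token_embeddings):
    """Stand-in turn encoder: mean over token embeddings
    (the real model uses a BiLSTM with max pooling)."""
    return token_embeddings.mean(axis=0)

# Toy GRU weights: one input and one recurrent matrix per gate.
W = {g: rng.normal(scale=0.1, size=(dim, dim)) for g in ("z", "r", "h")}
U = {g: rng.normal(scale=0.1, size=(dim, dim)) for g in ("z", "r", "h")}

def gru_step(h, x):
    z = sigmoid(x @ W["z"] + h @ U["z"])        # update gate
    r = sigmoid(x @ W["r"] + h @ U["r"])        # reset gate
    h_new = np.tanh(x @ W["h"] + (r * h) @ U["h"])
    return (1.0 - z) * h + z * h_new

# Three dialogue turns, each a sequence of token embeddings.
turns = [rng.normal(size=(n_tokens, dim)) for n_tokens in (5, 3, 7)]

h = np.zeros(dim)
for turn in turns:
    h = gru_step(h, encode_turn(turn))

context_vector = h  # one vector summarizing the whole dialogue context
```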
class deeppavlov.models.ranking.keras_siamese_model.KerasSiameseModel(learning_rate: float = 0.001, use_matrix: bool = True, emb_matrix: numpy.ndarray = None, max_sequence_length: int = None, dynamic_batch: bool = False, attention: bool = False, *args, **kwargs)

The class implementing base functionality for siamese neural networks in Keras.

Parameters:
- learning_rate – Learning rate.
- use_matrix – Whether to use a trainable matrix with token (word) embeddings.
- emb_matrix – An embeddings matrix used to initialize the embeddings layer of the model. Only used if use_matrix is set to True.
- max_sequence_length – Maximum length of text sequences in tokens. Longer sequences are truncated and shorter ones are padded.
- dynamic_batch – Whether to use dynamic batching. If True, the maximum sequence length for a batch equals the length of the longest sequence in that batch, but no more than max_sequence_length.
- attention – Whether any attention mechanism is used in the siamese network.
- *args – Other parameters.
- **kwargs – Other parameters.
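The dynamic_batch and max_sequence_length behavior can be sketched with a small padding helper (illustrative only; the function name and pad id are assumptions, not the library's API):

```python
def pad_batch(sequences, max_sequence_length=None, pad_id=0):
    """Pad a batch of token-id sequences to the longest sequence in the
    batch (dynamic batching), capped at max_sequence_length; longer
    sequences are truncated."""
    batch_max = max(len(s) for s in sequences)
    if max_sequence_length is not None:
        batch_max = min(batch_max, max_sequence_length)
    return [list(s[:batch_max]) + [pad_id] * (batch_max - len(s[:batch_max]))
            for s in sequences]

batch = [[4, 8, 15], [16, 23], [42]]
padded = pad_batch(batch, max_sequence_length=10)
# Each batch is padded only to its own longest sequence (3 here),
# so short batches waste no computation on padding up to 10.
```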
class deeppavlov.models.ranking.mpm_siamese_network.MPMSiameseNetwork(dense_dim: int = 50, perspective_num: int = 20, aggregation_dim: int = 200, recdrop_val: float = 0.0, inpdrop_val: float = 0.0, ldrop_val: float = 0.0, dropout_val: float = 0.0, *args, **kwargs)

The class implementing a siamese neural network with bilateral multi-perspective matching. The network architecture is based on https://arxiv.org/abs/1702.03814.

Parameters:
- dense_dim – Dimensionality of the dense layer.
- perspective_num – Number of perspectives in the multi-perspective matching layers.
- aggregation_dim – Dimensionality of the hidden state in the second BiLSTM layer.
- recdrop_val – Float between 0 and 1. Dropout value for the linear transformation of the recurrent state.
- inpdrop_val – Float between 0 and 1. Dropout value for the linear transformation of the inputs.
- ldrop_val – Dropout value of the dropout layer before the second BiLSTM layer.
- dropout_val – Dropout value of the dropout layer after the second BiLSTM layer.
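The core of multi-perspective matching is computing one similarity per learned "perspective", where each perspective re-weights the vector dimensions before a cosine similarity. A simplified NumPy sketch of this idea (full-matching variant only; weights are random here, whereas in the model they are learned):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, perspective_num = 6, 20

def cosine(a, b, eps=1e-8):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

# Each perspective is a weight vector that re-scales dimensions
# before the cosine similarity is taken.
perspectives = rng.normal(size=(perspective_num, dim))

def multi_perspective_match(v1, v2, perspectives):
    """Return one cosine similarity per perspective."""
    return np.array([cosine(w * v1, w * v2) for w in perspectives])

v1, v2 = rng.normal(size=dim), rng.normal(size=dim)
scores = multi_perspective_match(v1, v2, perspectives)  # perspective_num scores
```

With perspective_num = 20, each pair of positions yields a 20-dimensional matching vector rather than a single scalar similarity.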
class deeppavlov.models.ranking.siamese_model.SiameseModel(batch_size: int, num_context_turns: int = 1, *args, **kwargs)

The class implementing base functionality for siamese neural networks.

Parameters:
- batch_size – Size of a batch.
- num_context_turns – Number of context turns in data samples.
- *args – Other parameters.
- **kwargs – Other parameters.
class deeppavlov.models.ranking.siamese_predictor.SiamesePredictor(model: deeppavlov.models.ranking.siamese_model.SiameseModel, batch_size: int, num_context_turns: int = 1, ranking: bool = True, attention: bool = False, responses: deeppavlov.core.data.simple_vocab.SimpleVocabulary = None, preproc_func: Callable = None, interact_pred_num: int = 3, *args, **kwargs)

The class for ranking or paraphrase identification using a trained siamese network in the interact mode.

Parameters:
- batch_size – Size of a batch.
- num_context_turns – Number of context turns in data samples.
- ranking – Whether to perform ranking. If set to False, paraphrase identification is performed.
- attention – Whether any attention mechanism is used in the siamese network. If False, response vectors calculated in advance are used to obtain a similarity score for the input context; otherwise, the whole siamese architecture is used to obtain a similarity score for the input context and each particular response. Only used if ranking is set to True.
- responses – An instance of SimpleVocabulary with all possible responses to rank. Only used if ranking is set to True.
- preproc_func – The __call__ function of a SiamesePreprocessor.
- interact_pred_num – The number of the most relevant responses to return. Only used if ranking is set to True.
- **kwargs – Other parameters.
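When attention is False, ranking reduces to scoring the encoded context against a bank of precomputed response vectors and returning the interact_pred_num best matches. A minimal NumPy sketch of that retrieval step (function name, scoring by dot product, and data are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Precomputed response vectors: each response is encoded once,
# so only the context needs encoding at query time.
response_vectors = rng.normal(size=(100, 8))
response_texts = [f"response {i}" for i in range(100)]

def rank_responses(context_vector, response_vectors, interact_pred_num=3):
    """Return indices and scores of the top-scoring responses."""
    scores = response_vectors @ context_vector
    top = np.argsort(scores)[::-1][:interact_pred_num]
    return top, scores[top]

context_vector = rng.normal(size=8)
top_idx, top_scores = rank_responses(context_vector, response_vectors)
best = [response_texts[i] for i in top_idx]  # the 3 most relevant responses
```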