deeppavlov.models.classifiers
class deeppavlov.models.classifiers.torch_classification_model.TorchTextClassificationModel(n_classes: int, model_name: str, embedding_size: Optional[int] = None, multilabel: bool = False, criterion: str = 'CrossEntropyLoss', optimizer: str = 'AdamW', optimizer_parameters: dict = {'lr': 0.1}, lr_scheduler: Optional[str] = None, lr_scheduler_parameters: dict = {}, embedded_tokens: bool = True, vocab_size: Optional[int] = None, lr_decay_every_n_epochs: Optional[int] = None, learning_rate_drop_patience: Optional[int] = None, learning_rate_drop_div: Optional[float] = None, return_probas: bool = True, **kwargs)

Class implements a torch model for text classification. Input can be either embedded tokenized texts or indices of words in the vocabulary. The number of tokens is not fixed, but samples within a batch should be padded to the same (e.g. the longest) length.
- Parameters
n_classes – number of classes
model_name – name of the TorchTextClassificationModel method that initializes the model architecture
embedding_size – size of vector representation of words
multilabel – whether this is multi-label classification (if so, sigmoid activation is used; otherwise, softmax)
criterion – criterion name from torch.nn
optimizer – optimizer name from torch.optim
optimizer_parameters – dictionary with optimizer’s parameters, e.g. {‘lr’: 0.1, ‘weight_decay’: 0.001, ‘momentum’: 0.9}
lr_scheduler – string name of scheduler class from torch.optim.lr_scheduler
lr_scheduler_parameters – parameters for scheduler
embedded_tokens – True if input contains embedded tokenized texts; False if input contains indices of words in the vocabulary
vocab_size – vocabulary size in case of embedded_tokens=False, when embedding is a layer in the network
lr_decay_every_n_epochs – how often (in epochs) to decay the learning rate
learning_rate_drop_patience – how many validations with no improvement to wait before dropping the learning rate
learning_rate_drop_div – the divider of the learning rate after learning_rate_drop_patience unsuccessful validations
return_probas – whether to return probabilities or class indices (indices only for multilabel=False)
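The batching requirement noted above (samples padded to the same, e.g. longest, length) can be sketched with plain NumPy. `pad_to_longest` is a hypothetical helper for illustration, not part of DeepPavlov:

```python
import numpy as np

def pad_to_longest(batch, embedding_size):
    """Zero-pad a batch of embedded texts (each of shape
    (token_count, embedding_size)) to the length of the longest sample."""
    max_len = max(sample.shape[0] for sample in batch)
    padded = np.zeros((len(batch), max_len, embedding_size), dtype=np.float32)
    for i, sample in enumerate(batch):
        padded[i, :sample.shape[0], :] = sample  # post-pad with zero vectors
    return padded

batch = [np.ones((3, 4)), np.ones((5, 4))]  # two samples, embedding_size=4
padded = pad_to_longest(batch, embedding_size=4)
# padded has shape (2, 5, 4); the shorter sample is zero-padded at the end
```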
opt – dictionary with all model parameters
n_classes – number of considered classes
model – the torch model itself
epochs_done – number of epochs completed
optimizer – torch optimizer instance
criterion – torch criterion instance
__call__(texts: List[numpy.ndarray], *args) → Union[List[List[float]], List[int]]

Infer on the given data.
- Parameters
texts – list of tokenized text samples
*args – additional arguments
- Returns
for each sentence, a vector of probabilities of belonging to each class, or the index of the label the sentence belongs to
cnn_model(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, dropout_rate: float = 0.0, **kwargs) → torch.nn.Module

Build an uncompiled model of a shallow-and-wide CNN.
- Parameters
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – number of filters for convolutions.
dense_size – number of units for the dense layer.
dropout_rate – dropout rate used after convolutions and between dense layers.
kwargs – other parameters
- Returns
instance of torch.nn.Module
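For intuition, the shallow-and-wide pattern (parallel convolutions with different kernel sizes, each ReLU-activated and globally max-pooled, with the pooled outputs concatenated) can be sketched in NumPy. The random weights and the helper name are illustrative stand-ins, not the actual layer implementation:

```python
import numpy as np

def shallow_wide_cnn_features(x, kernel_sizes, n_filters, rng):
    """x: (seq_len, embedding_size). Returns a concatenated feature
    vector of size len(kernel_sizes) * n_filters."""
    seq_len, emb = x.shape
    feats = []
    for k in kernel_sizes:
        # random weights stand in for learned convolution kernels
        w = rng.standard_normal((n_filters, k, emb))
        # "valid" 1-D convolution over the token axis
        conv = np.stack([
            np.tensordot(x[t:t + k], w, axes=([0, 1], [1, 2]))
            for t in range(seq_len - k + 1)
        ])                                    # (seq_len - k + 1, n_filters)
        conv = np.maximum(conv, 0.0)          # ReLU
        feats.append(conv.max(axis=0))        # global max pooling
    return np.concatenate(feats)

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 8))              # 10 tokens, embedding_size=8
feats = shallow_wide_cnn_features(x, kernel_sizes=[1, 2, 3], n_filters=4, rng=rng)
# feats has shape (12,): 3 kernel sizes x 4 filters
```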
class deeppavlov.models.classifiers.keras_classification_model.KerasClassificationModel(*args, **kwargs)

Class implements a Keras model for the classification task on multi-class, multi-labeled data.
- Parameters
embedding_size – embedding_size from embedder in pipeline
n_classes – number of considered classes
model_name – particular method of this class to initialize model configuration
optimizer – function name from keras.optimizers
loss – function name from keras.losses.
last_layer_activation – activation function applied after the classification layer. For multi-label classification use sigmoid, otherwise softmax.
restore_lr – when loading a pre-trained model, whether to initialize the learning rate with the final learning rate value from the saved opt
classes – list or generator of considered classes
text_size – maximal length of text in tokens (words), longer texts are cut, shorter ones are padded with zeros (pre-padding)
padding – pre or post padding to use
opt – dictionary with all model parameters
n_classes – number of considered classes
model – the Keras model itself
epochs_done – number of epochs completed
batches_seen – number of batches processed
train_examples_seen – number of training samples seen
sess – tf session
optimizer – keras.optimizers instance
classes – list of considered classes
padding – pre or post padding to use
__call__(data: List[List[numpy.ndarray]]) → List[List[float]]

Infer on the given data.
- Parameters
data – list of tokenized text samples
- Returns
for each sentence, a vector of probabilities of belonging to each class
bigru_model(units_gru: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled BiGRU model.
- Parameters
units_gru – number of units for GRU.
dense_size – number of units for the dense layer.
coef_reg_lstm – l2-regularization coefficient for GRU. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiGRU and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for GRU. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bigru_with_max_aver_pool_model(units_gru: int, dense_size: int, coef_reg_gru: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled bidirectional GRU model with concatenation of max and average pooling after the BiGRU.
- Parameters
units_gru – number of units for GRU.
dense_size – number of units for the dense layer.
coef_reg_gru – l2-regularization coefficient for GRU. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiGRU and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for GRU. Default: 0.0.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bilstm_bilstm_model(units_lstm_1: int, units_lstm_2: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled two-layer BiLSTM model.
- Parameters
units_lstm_1 – number of units for the first LSTM layer.
units_lstm_2 – number of units for the second LSTM layer.
dense_size – number of units for the dense layer.
coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bilstm_cnn_model(units_lstm: int, kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled BiLSTM-CNN model.
- Parameters
units_lstm – number of units for LSTM.
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – number of filters for convolutions.
dense_size – number of units for the dense layer.
coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bilstm_model(units_lstm: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled BiLSTM model.
- Parameters
units_lstm (int) – number of units for LSTM.
dense_size (int) – number of units for the dense layer.
coef_reg_lstm (float) – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_den (float) – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate (float) – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate (float) – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bilstm_self_add_attention_model(units_lstm: int, dense_size: int, self_att_hid: int, self_att_out: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled BiLSTM model with self additive attention.
- Parameters
units_lstm – number of units for LSTM.
self_att_hid – number of hidden units in self-attention
self_att_out – number of output units in self-attention
dense_size – number of units for the dense layer.
coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bilstm_self_mult_attention_model(units_lstm: int, dense_size: int, self_att_hid: int, self_att_out: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled BiLSTM model with self multiplicative attention.
- Parameters
units_lstm – number of units for LSTM.
self_att_hid – number of hidden units in self-attention
self_att_out – number of output units in self-attention
dense_size – number of units for the dense layer.
coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
check_input(texts: List[List[numpy.ndarray]]) → numpy.ndarray

Check and convert input to an array of tokenized embedded samples.
- Parameters
texts – list of tokenized embedded text samples
- Returns
array of tokenized embedded text samples that are cut and padded
cnn_bilstm_model(kernel_sizes_cnn: List[int], filters_cnn: int, units_lstm: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled CNN-BiLSTM model.
- Parameters
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – number of filters for convolutions.
units_lstm – number of units for LSTM.
dense_size – number of units for the dense layer.
coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.
coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
cnn_model(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled model of a shallow-and-wide CNN.
- Parameters
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – number of filters for convolutions.
dense_size – number of units for the dense layer.
coef_reg_cnn – l2-regularization coefficient for convolutions.
coef_reg_den – l2-regularization coefficient for dense layers.
dropout_rate – dropout rate used after convolutions and between dense layers.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
cnn_model_max_and_aver_pool(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled model of a shallow-and-wide CNN in which average pooling after convolutions is replaced with concatenation of average and max pooling.
- Parameters
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – number of filters for convolutions.
dense_size – number of units for the dense layer.
coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after convolutions and between dense layers. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
compile(model: tensorflow.keras.models.Model, optimizer_name: str, loss_name: str, learning_rate: Optional[Union[float, List[float]]], learning_rate_decay: Optional[Union[float, str]]) → tensorflow.keras.models.Model

Compile the model with the given optimizer and loss.
- Parameters
model – uncompiled Keras model
optimizer_name – name of optimizer from keras.optimizers
loss_name – loss function name (from keras.losses)
learning_rate – learning rate.
learning_rate_decay – learning rate decay.
- Returns
compiled Keras model
dcnn_model(kernel_sizes_cnn: List[int], filters_cnn: List[int], dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled model of a deep CNN.
- Parameters
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – list of numbers of filters for convolutions.
dense_size – number of units for the dense layer.
coef_reg_cnn – l2-regularization coefficient for convolutions.
coef_reg_den – l2-regularization coefficient for dense layers.
dropout_rate – dropout rate used after convolutions and between dense layers.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
init_model_from_scratch(model_name: str) → tensorflow.keras.models.Model

Initialize an uncompiled model from scratch with the given parameters.
- Parameters
model_name – name of a model function described as a method of this class
- Returns
compiled model with the given network and learning parameters
pad_texts(sentences: List[List[numpy.ndarray]]) → Union[numpy.ndarray, Tuple[numpy.ndarray, numpy.ndarray]]

Cut and pad tokenized texts to self.opt["text_size"] tokens.
- Parameters
sentences – list of lists of tokens
- Returns
array of embedded texts
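The cut-and-pad behaviour (truncate to a fixed text_size, pre-pad shorter samples with zeros, as described for the text_size and padding parameters) can be sketched as follows. `cut_and_pad` is a hypothetical stand-in for illustration, not the actual method:

```python
import numpy as np

def cut_and_pad(sentences, text_size, embedding_size):
    """Truncate each embedded sample to text_size tokens and pre-pad
    shorter ones with zero vectors, mirroring padding='pre'."""
    out = np.zeros((len(sentences), text_size, embedding_size), dtype=np.float32)
    for i, sent in enumerate(sentences):
        sent = np.asarray(sent)[:text_size]        # cut longer texts
        out[i, text_size - len(sent):, :] = sent   # pre-pad with zeros
    return out

sents = [np.ones((2, 3)), np.ones((6, 3))]         # embedding_size=3
batch = cut_and_pad(sents, text_size=4, embedding_size=3)
# batch has shape (2, 4, 3); the first sample keeps its tokens at the end
```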
class deeppavlov.models.classifiers.cos_sim_classifier.CosineSimilarityClassifier(top_n: int = 1, save_path: Optional[str] = None, load_path: Optional[str] = None, **kwargs)

Classifier based on cosine similarity between vectorized sentences.
- Parameters
save_path – path to save the model
load_path – path to load the model
__call__(q_vects: Union[scipy.sparse.csr.csr_matrix, List]) → Tuple[List[str], List[int]]

Find the most similar answer for the input vectorized questions.
- Parameters
q_vects – vectorized questions
- Returns
tuple of answers and scores
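The underlying idea (compare a query vector against stored answer vectors and return the closest by cosine similarity) can be sketched with dense NumPy vectors. This is an illustrative re-implementation under those assumptions, not the classifier's actual code:

```python
import numpy as np

def most_similar(q_vec, answer_vecs, answers):
    """Return the answer whose vector has the highest cosine similarity
    to the query vector, along with that similarity score."""
    q = q_vec / np.linalg.norm(q_vec)
    a = answer_vecs / np.linalg.norm(answer_vecs, axis=1, keepdims=True)
    scores = a @ q                     # cosine similarities to each answer
    best = int(np.argmax(scores))
    return answers[best], float(scores[best])

answer_vecs = np.array([[1.0, 0.0], [0.0, 1.0]])
answer, score = most_similar(np.array([0.9, 0.1]),
                             answer_vecs, ["yes", "no"])
# the query is much closer to the first answer vector
```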
class deeppavlov.models.classifiers.proba2labels.Proba2Labels(max_proba: Optional[bool] = None, confident_threshold: Optional[float] = None, top_n: Optional[int] = None, **kwargs)

Class implements conversion of probabilities to labels in the following ways: choosing the one or top_n indices with maximal probability, or choosing any number of indices whose probabilities are higher than the given confidence threshold.
- Parameters
max_proba – whether to choose the label with maximal probability
confident_threshold – boundary probability value for a sample to belong to the class (best used for multi-label)
top_n – how many top labels with the highest probabilities to return
max_proba – whether to choose the label with maximal probability
confident_threshold – boundary probability value for a sample to belong to the class (best used for multi-label)
top_n – how many top labels with the highest probabilities to return
__call__(data: Union[numpy.ndarray, List[List[float]], List[List[int]]], *args, **kwargs) → Union[List[List[int]], List[int]]

Process probabilities to labels.
- Parameters
data – list of vectors with probability distributions
- Returns
list of labels (single-label classification) or list of lists of labels (multi-label classification)
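The three conversion modes described above (argmax, confidence thresholding, top-n) can be sketched for a single probability vector in NumPy. The function below is an illustrative re-implementation, not the actual Proba2Labels code:

```python
import numpy as np

def proba2labels(probas, max_proba=None, confident_threshold=None, top_n=None):
    """Convert one probability vector to labels using one of the modes:
    argmax, thresholding (multi-label), or the top_n most probable."""
    probas = np.asarray(probas)
    if max_proba:
        return [int(np.argmax(probas))]                       # single best label
    if confident_threshold is not None:
        return [int(i) for i in np.where(probas > confident_threshold)[0]]
    if top_n is not None:
        return [int(i) for i in np.argsort(probas)[::-1][:top_n]]
    raise ValueError("one of max_proba, confident_threshold, top_n must be set")

p = [0.1, 0.6, 0.3]
# proba2labels(p, max_proba=True)          → [1]
# proba2labels(p, confident_threshold=0.25) → [1, 2]
# proba2labels(p, top_n=2)                 → [1, 2]
```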