deeppavlov.models.classifiers
class deeppavlov.models.classifiers.torch_classification_model.TorchTextClassificationModel(n_classes: int, model_name: str, embedding_size: Optional[int] = None, multilabel: bool = False, criterion: str = 'CrossEntropyLoss', optimizer: str = 'AdamW', optimizer_parameters: dict = {'lr': 0.1}, lr_scheduler: Optional[str] = None, lr_scheduler_parameters: dict = {}, embedded_tokens: bool = True, vocab_size: Optional[int] = None, lr_decay_every_n_epochs: Optional[int] = None, learning_rate_drop_patience: Optional[int] = None, learning_rate_drop_div: Optional[float] = None, return_probas: bool = True, **kwargs)

Class implements a torch model for text classification. Input can be either embedded tokenized texts or indices of words in the vocabulary. The number of tokens is not fixed, but samples within a batch should be padded to the same (e.g. the longest) length.
- Parameters
n_classes – number of classes
model_name – name of the TorchTextClassificationModel method that initializes the model architecture
embedding_size – size of vector representation of words
multilabel – whether this is multi-label classification (if so, sigmoid activation is used; otherwise, softmax)
criterion – criterion name from torch.nn
optimizer – optimizer name from torch.optim
optimizer_parameters – dictionary with optimizer’s parameters, e.g. {‘lr’: 0.1, ‘weight_decay’: 0.001, ‘momentum’: 0.9}
lr_scheduler – string name of scheduler class from torch.optim.lr_scheduler
lr_scheduler_parameters – parameters for scheduler
embedded_tokens – True if input contains embedded tokenized texts; False if input contains indices of words in the vocabulary
vocab_size – vocabulary size in case of embedded_tokens=False, when embedding is a layer in the network
lr_decay_every_n_epochs – how often (in epochs) to decay the learning rate
learning_rate_drop_patience – how many validations with no improvement to wait before dropping the learning rate
learning_rate_drop_div – the divider of the learning rate after learning_rate_drop_patience unsuccessful validations
return_probas – whether to return probabilities or class indices (indices only for multilabel=False)
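The batching requirement noted above (samples padded to the same, e.g. longest, length) can be sketched with plain NumPy. `pad_to_longest` is a hypothetical helper for illustration, not part of DeepPavlov:

```python
import numpy as np

def pad_to_longest(batch, embedding_size):
    """Zero-pad a batch of embedded texts (each of shape
    (token_count, embedding_size)) to the length of the longest sample."""
    max_len = max(sample.shape[0] for sample in batch)
    padded = np.zeros((len(batch), max_len, embedding_size), dtype=np.float32)
    for i, sample in enumerate(batch):
        padded[i, :sample.shape[0], :] = sample  # post-pad with zero vectors
    return padded

batch = [np.ones((3, 4)), np.ones((5, 4))]  # two samples, embedding_size=4
padded = pad_to_longest(batch, embedding_size=4)
# padded has shape (2, 5, 4); the shorter sample is zero-padded at the end
```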
opt – dictionary with all model parameters
n_classes – number of considered classes
model – the torch model itself
epochs_done – number of epochs completed
optimizer – torch optimizer instance
criterion – torch criterion instance
__call__(texts: List[numpy.ndarray], *args) → Union[List[List[float]], List[int]]

Infer on the given data.
- Parameters
texts – list of tokenized text samples
*args – additional arguments
- Returns
for each sentence, a vector of probabilities of belonging to each class, or the index of the label the sentence belongs to
cnn_model(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, dropout_rate: float = 0.0, **kwargs) → torch.nn.Module

Build an uncompiled model of a shallow-and-wide CNN.
- Parameters
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – number of filters for convolutions.
dense_size – number of units for the dense layer.
dropout_rate – dropout rate used after convolutions and between dense layers.
kwargs – other parameters
- Returns
instance of torch.nn.Module
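For intuition, the shallow-and-wide pattern (parallel convolutions with different kernel sizes, each ReLU-activated and globally max-pooled, with the pooled outputs concatenated) can be sketched in NumPy. The random weights and the helper name are illustrative stand-ins, not the actual layer implementation:

```python
import numpy as np

def shallow_wide_cnn_features(x, kernel_sizes, n_filters, rng):
    """x: (seq_len, embedding_size). Returns a concatenated feature
    vector of size len(kernel_sizes) * n_filters."""
    seq_len, emb = x.shape
    feats = []
    for k in kernel_sizes:
        # random weights stand in for learned convolution kernels
        w = rng.standard_normal((n_filters, k, emb))
        # "valid" 1-D convolution over the token axis
        conv = np.stack([
            np.tensordot(x[t:t + k], w, axes=([0, 1], [1, 2]))
            for t in range(seq_len - k + 1)
        ])                                    # (seq_len - k + 1, n_filters)
        conv = np.maximum(conv, 0.0)          # ReLU
        feats.append(conv.max(axis=0))        # global max pooling
    return np.concatenate(feats)

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 8))              # 10 tokens, embedding_size=8
feats = shallow_wide_cnn_features(x, kernel_sizes=[1, 2, 3], n_filters=4, rng=rng)
# feats has shape (12,): 3 kernel sizes x 4 filters
```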
class deeppavlov.models.classifiers.keras_classification_model.KerasClassificationModel(*args, **kwargs)

Class implements a Keras model for the classification task on multi-class, multi-labeled data.
- Parameters
embedding_size – embedding_size from embedder in pipeline
n_classes – number of considered classes
model_name – particular method of this class to initialize model configuration
optimizer – function name from keras.optimizers
loss – function name from keras.losses.
last_layer_activation – activation function applied after the classification layer. For multi-label classification use sigmoid, otherwise softmax.
restore_lr – when loading a pre-trained model, whether to initialize the learning rate with the final learning rate value from the saved opt
classes – list or generator of considered classes
text_size – maximal length of text in tokens (words), longer texts are cut, shorter ones are padded with zeros (pre-padding)
padding – pre or post padding to use
opt – dictionary with all model parameters
n_classes – number of considered classes
model – the Keras model itself
epochs_done – number of epochs completed
batches_seen – number of batches processed
train_examples_seen – number of training samples seen
sess – tf session
optimizer – keras.optimizers instance
classes – list of considered classes
padding – pre or post padding to use
__call__(data: List[List[numpy.ndarray]]) → List[List[float]]

Infer on the given data.
- Parameters
data – list of tokenized text samples
- Returns
for each sentence, a vector of probabilities of belonging to each class
bigru_model(units_gru: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled BiGRU model.
- Parameters
units_gru – number of units for GRU.
dense_size – number of units for the dense layer.
coef_reg_lstm – l2-regularization coefficient for GRU. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiGRU and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for GRU. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bigru_with_max_aver_pool_model(units_gru: int, dense_size: int, coef_reg_gru: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled bidirectional GRU model with concatenation of max and average pooling after the BiGRU.
- Parameters
units_gru – number of units for GRU.
dense_size – number of units for the dense layer.
coef_reg_gru – l2-regularization coefficient for GRU. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiGRU and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for GRU. Default: 0.0.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bilstm_bilstm_model(units_lstm_1: int, units_lstm_2: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled two-layer BiLSTM model.
- Parameters
units_lstm_1 – number of units for the first LSTM layer.
units_lstm_2 – number of units for the second LSTM layer.
dense_size – number of units for the dense layer.
coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bilstm_cnn_model(units_lstm: int, kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled BiLSTM-CNN model.
- Parameters
units_lstm – number of units for LSTM.
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – number of filters for convolutions.
dense_size – number of units for the dense layer.
coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bilstm_model(units_lstm: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled BiLSTM model.
- Parameters
units_lstm (int) – number of units for LSTM.
dense_size (int) – number of units for the dense layer.
coef_reg_lstm (float) – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_den (float) – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate (float) – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate (float) – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bilstm_self_add_attention_model(units_lstm: int, dense_size: int, self_att_hid: int, self_att_out: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled BiLSTM model with self additive attention.
- Parameters
units_lstm – number of units for LSTM.
self_att_hid – number of hidden units in self-attention
self_att_out – number of output units in self-attention
dense_size – number of units for the dense layer.
coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
bilstm_self_mult_attention_model(units_lstm: int, dense_size: int, self_att_hid: int, self_att_out: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled BiLSTM model with self multiplicative attention.
- Parameters
units_lstm – number of units for LSTM.
self_att_hid – number of hidden units in self-attention
self_att_out – number of output units in self-attention
dense_size – number of units for the dense layer.
coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
check_input(texts: List[List[numpy.ndarray]]) → numpy.ndarray

Check and convert input to an array of tokenized embedded samples.
- Parameters
texts – list of tokenized embedded text samples
- Returns
array of tokenized embedded text samples that are cut and padded
cnn_bilstm_model(kernel_sizes_cnn: List[int], filters_cnn: int, units_lstm: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled CNN-BiLSTM model.
- Parameters
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – number of filters for convolutions.
units_lstm – number of units for LSTM.
dense_size – number of units for the dense layer.
coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.
coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after BiLSTM and between dense layers. Default: 0.0.
rec_dropout_rate – recurrent dropout rate for LSTM. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
cnn_model(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled model of a shallow-and-wide CNN.
- Parameters
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – number of filters for convolutions.
dense_size – number of units for the dense layer.
coef_reg_cnn – l2-regularization coefficient for convolutions.
coef_reg_den – l2-regularization coefficient for dense layers.
dropout_rate – dropout rate used after convolutions and between dense layers.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
cnn_model_max_and_aver_pool(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled model of a shallow-and-wide CNN in which average pooling after convolutions is replaced with concatenation of average and max pooling.
- Parameters
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – number of filters for convolutions.
dense_size – number of units for the dense layer.
coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.
coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.
dropout_rate – dropout rate used after convolutions and between dense layers. Default: 0.0.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
compile(model: tensorflow.keras.models.Model, optimizer_name: str, loss_name: str, learning_rate: Optional[Union[float, List[float]]], learning_rate_decay: Optional[Union[float, str]]) → tensorflow.keras.models.Model

Compile the model with the given optimizer and loss.
- Parameters
model – uncompiled Keras model
optimizer_name – name of optimizer from keras.optimizers
loss_name – loss function name (from keras.losses)
learning_rate – learning rate.
learning_rate_decay – learning rate decay.
- Returns
compiled Keras model
dcnn_model(kernel_sizes_cnn: List[int], filters_cnn: List[int], dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model

Build an uncompiled model of a deep CNN.
- Parameters
kernel_sizes_cnn – list of kernel sizes of convolutions.
filters_cnn – list of numbers of filters for convolutions.
dense_size – number of units for the dense layer.
coef_reg_cnn – l2-regularization coefficient for convolutions.
coef_reg_den – l2-regularization coefficient for dense layers.
dropout_rate – dropout rate used after convolutions and between dense layers.
input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.
kwargs – other unused parameters
- Returns
uncompiled instance of Keras Model
- Return type
keras.models.Model
init_model_from_scratch(model_name: str) → tensorflow.keras.models.Model

Initialize an uncompiled model from scratch with the given parameters.
- Parameters
model_name – name of a model function described as a method of this class
- Returns
compiled model with the given network and learning parameters
pad_texts(sentences: List[List[numpy.ndarray]]) → Union[numpy.ndarray, Tuple[numpy.ndarray, numpy.ndarray]]

Cut and pad tokenized texts to self.opt["text_size"] tokens.
- Parameters
sentences – list of lists of tokens
- Returns
array of embedded texts
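The cut-and-pad behaviour (truncate to a fixed text_size, pre-pad shorter samples with zeros, as described for the text_size and padding parameters) can be sketched as follows. `cut_and_pad` is a hypothetical stand-in for illustration, not the actual method:

```python
import numpy as np

def cut_and_pad(sentences, text_size, embedding_size):
    """Truncate each embedded sample to text_size tokens and pre-pad
    shorter ones with zero vectors, mirroring padding='pre'."""
    out = np.zeros((len(sentences), text_size, embedding_size), dtype=np.float32)
    for i, sent in enumerate(sentences):
        sent = np.asarray(sent)[:text_size]        # cut longer texts
        out[i, text_size - len(sent):, :] = sent   # pre-pad with zeros
    return out

sents = [np.ones((2, 3)), np.ones((6, 3))]         # embedding_size=3
batch = cut_and_pad(sents, text_size=4, embedding_size=3)
# batch has shape (2, 4, 3); the first sample keeps its tokens at the end
```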
class deeppavlov.models.classifiers.cos_sim_classifier.CosineSimilarityClassifier(top_n: int = 1, save_path: Optional[str] = None, load_path: Optional[str] = None, **kwargs)

Classifier based on cosine similarity between vectorized sentences.
- Parameters
save_path – path to save the model
load_path – path to load the model
__call__(q_vects: Union[scipy.sparse.csr.csr_matrix, List]) → Tuple[List[str], List[int]]

Find the most similar answer for the input vectorized questions.
- Parameters
q_vects – vectorized questions
- Returns
tuple of answers and scores
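The underlying idea (compare a query vector against stored answer vectors and return the closest by cosine similarity) can be sketched with dense NumPy vectors. This is an illustrative re-implementation under those assumptions, not the classifier's actual code:

```python
import numpy as np

def most_similar(q_vec, answer_vecs, answers):
    """Return the answer whose vector has the highest cosine similarity
    to the query vector, along with that similarity score."""
    q = q_vec / np.linalg.norm(q_vec)
    a = answer_vecs / np.linalg.norm(answer_vecs, axis=1, keepdims=True)
    scores = a @ q                     # cosine similarities to each answer
    best = int(np.argmax(scores))
    return answers[best], float(scores[best])

answer_vecs = np.array([[1.0, 0.0], [0.0, 1.0]])
answer, score = most_similar(np.array([0.9, 0.1]),
                             answer_vecs, ["yes", "no"])
# the query is much closer to the first answer vector
```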
class deeppavlov.models.classifiers.proba2labels.Proba2Labels(max_proba: Optional[bool] = None, confident_threshold: Optional[float] = None, top_n: Optional[int] = None, **kwargs)

Class implements conversion of probabilities to labels in the following ways: choosing the one or top_n indices with maximal probability, or choosing any number of indices whose probabilities are higher than the given confidence threshold.
- Parameters
max_proba – whether to choose the label with maximal probability
confident_threshold – boundary probability value for a sample to belong to the class (best used for multi-label)
top_n – how many top labels with the highest probabilities to return
max_proba – whether to choose the label with maximal probability
confident_threshold – boundary probability value for a sample to belong to the class (best used for multi-label)
top_n – how many top labels with the highest probabilities to return
__call__(data: Union[numpy.ndarray, List[List[float]], List[List[int]]], *args, **kwargs) → Union[List[List[int]], List[int]]

Process probabilities to labels.
- Parameters
data – list of vectors with probability distributions
- Returns
list of labels (single-label classification) or list of lists of labels (multi-label classification)
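The three conversion modes described above (argmax, confidence thresholding, top-n) can be sketched for a single probability vector in NumPy. The function below is an illustrative re-implementation, not the actual Proba2Labels code:

```python
import numpy as np

def proba2labels(probas, max_proba=None, confident_threshold=None, top_n=None):
    """Convert one probability vector to labels using one of the modes:
    argmax, thresholding (multi-label), or the top_n most probable."""
    probas = np.asarray(probas)
    if max_proba:
        return [int(np.argmax(probas))]                       # single best label
    if confident_threshold is not None:
        return [int(i) for i in np.where(probas > confident_threshold)[0]]
    if top_n is not None:
        return [int(i) for i in np.argsort(probas)[::-1][:top_n]]
    raise ValueError("one of max_proba, confident_threshold, top_n must be set")

p = [0.1, 0.6, 0.3]
# proba2labels(p, max_proba=True)          → [1]
# proba2labels(p, confident_threshold=0.25) → [1, 2]
# proba2labels(p, top_n=2)                 → [1, 2]
```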