deeppavlov.models.squad
class deeppavlov.models.squad.squad.SquadModel(word_emb: numpy.ndarray, char_emb: numpy.ndarray, context_limit: int = 450, question_limit: int = 150, char_limit: int = 16, train_char_emb: bool = True, char_hidden_size: int = 100, encoder_hidden_size: int = 75, attention_hidden_size: int = 75, keep_prob: float = 0.7, min_learning_rate: float = 0.001, noans_token: bool = False, **kwargs)[source]

SquadModel predicts the start and end positions of the answer in a given context for a given question.
High-level architecture: Word embeddings -> Contextual embeddings -> Question-Context Attention -> Self-attention -> Pointer Network
If the noans_token flag is True, a special noans_token is appended to the output of the self-attention layer; the Pointer Network can select noans_token when there is no answer in the given context.
Parameters: - word_emb – pretrained word embeddings
- char_emb – pretrained char embeddings
- context_limit – max context length in tokens
- question_limit – max question length in tokens
- char_limit – max number of characters in token
- train_char_emb – whether to train character embeddings
- char_hidden_size – hidden size of charRNN
- encoder_hidden_size – hidden size of encoder RNN
- attention_hidden_size – size of projection layer in attention
- keep_prob – dropout keep probability
- min_learning_rate – minimal learning rate used in learning rate decay
- noans_token – whether to use a special no-answer token so the model is able to decline to answer a question
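The model consumes fixed-size integer arrays shaped by context_limit (or question_limit) and char_limit. In DeepPavlov these arrays are normally built by the pipeline's preprocessors; the following is only a minimal numpy sketch of the padding step (the function name `pad_batch` and the use of 0 as the padding id are assumptions, not part of this API):

```python
import numpy as np

def pad_batch(batch_token_ids, batch_char_ids, token_limit=450, char_limit=16):
    """Pad token-id and char-id sequences to fixed shapes.

    batch_token_ids: list of token-id lists, one per example.
    batch_char_ids: list of per-token char-id lists, one per example.
    Positions beyond the real sequence stay 0 (assumed padding id).
    """
    n = len(batch_token_ids)
    tokens = np.zeros((n, token_limit), dtype=np.int32)
    chars = np.zeros((n, token_limit, char_limit), dtype=np.int32)
    for i, (tok_ids, char_ids) in enumerate(zip(batch_token_ids, batch_char_ids)):
        for j, t in enumerate(tok_ids[:token_limit]):
            tokens[i, j] = t
            for k, c in enumerate(char_ids[j][:char_limit]):
                chars[i, j, k] = c
    return tokens, chars

# hypothetical batch of one context: three token ids with their char ids
c_tokens, c_chars = pad_batch([[5, 7, 9]], [[[1, 2], [3], [4, 5, 6]]],
                              token_limit=450, char_limit=16)
```

Sequences longer than the limits are truncated, matching the role of context_limit, question_limit, and char_limit above.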
__call__(c_tokens: numpy.ndarray, c_chars: numpy.ndarray, q_tokens: numpy.ndarray, q_chars: numpy.ndarray, *args, **kwargs) → Tuple[numpy.ndarray, numpy.ndarray, List[float]][source]

Predicts answer start and end positions for the given contexts and questions.
Parameters: - c_tokens – batch of tokenized contexts
- c_chars – batch of tokenized contexts, each token split on chars
- q_tokens – batch of tokenized questions
- q_chars – batch of tokenized questions, each token split on chars
Returns: answer_start positions, answer_end positions, and answer logits representing the model's confidence
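The returned start and end values are token indices into the context. A hedged sketch of turning one prediction back into an answer string (the helper `extract_answer` is hypothetical; DeepPavlov pipelines typically do this in a postprocessing component):

```python
def extract_answer(context_tokens, answer_start, answer_end):
    """Join the context tokens between the predicted start and end
    positions, inclusive of both endpoints."""
    return ' '.join(context_tokens[answer_start:answer_end + 1])

tokens = ['The', 'Eiffel', 'Tower', 'is', 'in', 'Paris']
extract_answer(tokens, 5, 5)   # -> 'Paris'
extract_answer(tokens, 1, 2)   # -> 'Eiffel Tower'
```

With noans_token enabled, a prediction pointing at the no-answer position would instead signal that no span should be extracted.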
train_on_batch(c_tokens: numpy.ndarray, c_chars: numpy.ndarray, q_tokens: numpy.ndarray, q_chars: numpy.ndarray, y1s: Tuple[List[int], ...], y2s: Tuple[List[int], ...]) → float[source]

This method is called by the trainer to make one training step on one batch.
Parameters: - c_tokens – batch of tokenized contexts
- c_chars – batch of tokenized contexts, each token split on chars
- q_tokens – batch of tokenized questions
- q_chars – batch of tokenized questions, each token split on chars
- y1s – batch of ground truth answer start positions
- y2s – batch of ground truth answer end positions
Returns: value of the loss function on the batch
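The ground-truth positions y1s and y2s are token indices, while SQuAD-style datasets give answers as character offsets. A sketch of that mapping, assuming per-token character spans are available from the tokenizer (the function name and span format are assumptions for illustration):

```python
def char_span_to_token_span(token_spans, answer_char_start, answer_char_end):
    """Map a character-level answer span to token-level (start, end) indices.

    token_spans: list of (char_start, char_end) pairs, one per context token.
    Returns the first and last token indices overlapping the answer span.
    """
    start = end = None
    for idx, (s, e) in enumerate(token_spans):
        if start is None and e > answer_char_start:
            start = idx
        if s < answer_char_end:
            end = idx
    return start, end

# context "The Eiffel Tower": token spans The(0,3) Eiffel(4,10) Tower(11,16)
spans = [(0, 3), (4, 10), (11, 16)]
char_span_to_token_span(spans, 4, 16)   # answer "Eiffel Tower" -> (1, 2)
```

The resulting start indices go into y1s and end indices into y2s, one list per example (lists, because a question may have several annotated answers).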