Question Answering Model for SQuAD dataset
Task definition
Question answering on the SQuAD dataset is the task of finding the answer to a question in a given context (e.g., a paragraph from Wikipedia), where the answer to each question is a segment of the context:
Context:
In meteorology, precipitation is any product of the condensation of atmospheric water vapor that falls under gravity. The main forms of precipitation include drizzle, rain, sleet, snow, graupel and hail… Precipitation forms as smaller droplets coalesce via collision with other rain drops or ice crystals within a cloud. Short, intense periods of rain in scattered locations are called “showers”.
Question:
Where do water droplets collide with ice crystals to form precipitation?
Answer:
within a cloud
Datasets that follow this task format:
- Stanford Question Answering Dataset (SQuAD) (EN)
- SDSJ Task B (RU)
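Because every answer is a segment of the context, an annotated example can be stored as the answer text plus the character offset where it starts. The snippet below is a minimal sketch of that idea using the example above. The field names `text` and `answer_start` follow the SQuAD v1.1 JSON files, but the flat `example` dictionary is a simplification made for illustration, not the exact nesting used by the dataset.

```python
# A shortened version of the context from the example above.
context = (
    "Precipitation forms as smaller droplets coalesce via collision with "
    "other rain drops or ice crystals within a cloud."
)
question = "Where do water droplets collide with ice crystals to form precipitation?"
answer_text = "within a cloud"

# The answer is located by the character offset of its first character.
answer_start = context.find(answer_text)
if answer_start == -1:
    raise ValueError("the answer must be a segment of the context")
answer_end = answer_start + len(answer_text)

# Simplified, flattened record (illustrative layout, not the full SQuAD nesting).
example = {
    "context": context,
    "question": question,
    "answers": [{"text": answer_text, "answer_start": answer_start}],
}

# The answer can always be recovered by slicing the context.
recovered = context[answer_start:answer_end]
print(recovered)
```

A model for this task therefore only has to predict two positions in the context, the start and the end of the answer, rather than generate free text.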
Model
The question answering model is based on R-Net, proposed by Microsoft Research Asia (“R-NET: Machine Reading Comprehension with Self-matching Networks”), and on its implementation by Wenxuan Zhou.
Model documentation: SquadModel
Configuration
The default config can be found at deeppavlov/configs/squad/squad.json
Model usage
Training
Warning: training with the default config requires about 9 GB of GPU memory. Run the following command to train the model:
python -m deeppavlov train deeppavlov/configs/squad/squad.json
Interact mode
Interact mode provides a command line interface to an already trained model.
To run the model in interact mode, use the following command:
python -m deeppavlov interact deeppavlov/configs/squad/squad.json
The model will ask you to type in a context and a question.
Pretrained models
SQuAD
A pretrained model is available and can be downloaded with:
python -m deeppavlov download deeppavlov/configs/squad/squad.json
It achieves an F-1 score of ~80 and an EM (exact match) score of ~71 on the SQuAD-v1.1 dev set.
The following table compares it with published results. Results of the most recent competitive solutions can be found on the SQuAD Leaderboard.
Model (single model) | EM (dev) | F-1 (dev) |
DeepPavlov | 71.41 | 80.26 |
BiDAF + Self Attention + ELMo | – | 85.6 |
QANet | 75.1 | 83.8 |
FusionNet | 75.3 | 83.6 |
R-Net | 71.1 | 79.5 |
BiDAF | 67.7 | 77.3 |
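The EM and F-1 columns above can be made concrete with a short sketch of how the two metrics are computed for a single prediction. The code follows the normalization used by the SQuAD v1.1 evaluation script (lowercasing, removing punctuation and English articles, collapsing whitespace); the function names here are chosen for illustration and are not part of DeepPavlov.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, ground_truth):
    """1.0 if the normalized strings are identical, otherwise 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(ground_truth))


def f1_score(prediction, ground_truth):
    """Token-level F-1 between the normalized prediction and ground truth."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(ground_truth).split()
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    num_same = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


# A partially correct prediction gets no EM credit but some F-1 credit.
em_partial = exact_match("a cloud", "within a cloud")
f1_partial = f1_score("a cloud", "within a cloud")
# Case, punctuation and articles do not affect the comparison.
em_full = exact_match("Within a cloud.", "within a cloud")
print(em_partial, round(f1_partial, 3), em_full)
```

In the full evaluation, each prediction is scored against every reference answer for its question, the maximum is kept, and the scores are averaged over the dataset and reported as percentages, which is how figures such as 71.41 EM and 80.26 F-1 arise.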
SDSJ Task B
A pretrained model is available and can be downloaded with:
python -m deeppavlov download deeppavlov/configs/squad/squad_ru.json
Model config | EM (dev) | F-1 (dev) |
DeepPavlov | 60.58 | 80.22 |