This paper introduces a systematic, large-scale study of the Vietnamese question generation task. Unlike prior work, which investigates the task on only a small number (1-2) of datasets, the study reports the performance of question generation models on a wide range of Vietnamese machine reading comprehension corpora in different settings.

Question Generation (QG): An Experimental Study for Vietnamese Text

Directory

Prepare a folder to store the data, laid out as shown below:

├── datasets/
  ├── ViNewsQA/
    ├── train.json
    ├── dev.json
    ├── test.json
  ├── ViQuAD/
├── parser_data/
├── seq2seq/
├── cli.py
└── main.py
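
Before training, it can help to confirm that the data folder matches the layout above. The following is a minimal sketch, not part of the repository: the helper name `missing_split_files` is hypothetical, and it assumes every dataset folder holds the same three split files shown for ViNewsQA (`train.json`, `dev.json`, `test.json`).

```python
from pathlib import Path

# Split files expected per dataset, taken from the directory tree above.
# Assumption: every dataset folder follows the ViNewsQA layout.
EXPECTED_SPLITS = ("train.json", "dev.json", "test.json")


def missing_split_files(root, dataset):
    """Return the expected split files absent from <root>/datasets/<dataset>/."""
    dataset_dir = Path(root) / "datasets" / dataset
    return [name for name in EXPECTED_SPLITS if not (dataset_dir / name).is_file()]


if __name__ == "__main__":
    # Run from the repository root; the dataset names come from the tree above.
    for name in ("ViNewsQA", "ViQuAD"):
        missing = missing_split_files(".", name)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{name}: {status}")
```

An empty list means all three split files were found for that dataset.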

Data

The available datasets for this source code include: ViNewsQA, ViQuAD, ViCoQA, ViMMRC1.0, and ViMMRC2.0.

*To train a model on your own dataset, first convert it to the format of one of the five datasets listed above.

Usage

Install

git clone https://github.com/Shaun-le/ViQG.git
cd ViQG

Prerequisite

To install dependencies, run:

pip install -r requirements.txt

CLI

To train a model, run the following commands:

  • ViT5 and BARTpho
python cli.py _evaluate --model_name 'ViT5' --dataset 'ViNewsQA' --answer 'y'

Note

--dataset: name of the dataset.

--answer: whether to include the answer; 'y' for yes, 'n' for no (default: 'y').

python cli.py _evaluate --model_name 'ViT5' --dataset 'ViNewsQA' --lr 1e-5 --batch_size 16 --epochs_num 10
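
To run the same command over several datasets, the invocation can be assembled in a small script. This is a rough sketch rather than a repository feature: `build_command` is a hypothetical helper, it uses only the flags shown above, and it assumes that `--answer` can be combined with the hyperparameter flags in a single call.

```python
import shlex
import sys


def build_command(model_name, dataset, answer="y", **hyperparams):
    """Build the argument list for `python cli.py _evaluate ...`.

    Extra keyword arguments are passed through as `--name value` pairs,
    for example lr="1e-5" becomes `--lr 1e-5`.
    """
    if answer not in ("y", "n"):
        raise ValueError("answer must be 'y' or 'n'")
    cmd = [
        sys.executable, "cli.py", "_evaluate",
        "--model_name", model_name,
        "--dataset", dataset,
        "--answer", answer,
    ]
    for flag, value in hyperparams.items():
        cmd += [f"--{flag}", str(value)]
    return cmd


if __name__ == "__main__":
    for dataset in ("ViNewsQA", "ViQuAD"):
        cmd = build_command("ViT5", dataset, lr="1e-5", batch_size=16, epochs_num=10)
        # Print the command for inspection; to launch it, pass `cmd` to
        # subprocess.run(cmd, check=True) from the repository root.
        print(shlex.join(cmd))
```

Building the command as a list avoids shell-quoting issues and makes it easy to log exactly what was run.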

System

Coming soon!

Citation

@inproceedings{inproceedings,
author = {Quoc-Hung, Pham and Le, Huu-Loi and Minh, Dang and Tran, Khang and Vu, Huy-The and Nguyen, Minh-Tien and Phan, Xuan-Hieu},
year = {2023},
month = {12},
pages = {324-329},
title = {Question Generation: An Experimental Study for Vietnamese Text},
doi = {10.1109/RIVF60135.2023.10471875}
}
