What to do:
tokenisation
- auto tokeniser matched to each model checkpoint (sketch below)
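
A minimal sketch of the auto tokeniser step, assuming the Hugging Face `transformers` library and a dataset with a `text` column (the column name is an assumption):

```python
from transformers import AutoTokenizer

# AutoTokenizer loads the tokenizer that matches the checkpoint being tuned.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")

def tokenise(batch):
    # "text" is an assumed column name; truncate/pad to the model's max length.
    return tokenizer(batch["text"], truncation=True, padding="max_length")

# With a datasets.Dataset: tokenised = dataset.map(tokenise, batched=True)
```
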
data augmentation / preprocessing
- synonym replacement (WordNet) (Azhara), sketch below
- remove all caps
- add text to samples
- back translation (Ella)
- feature space synonym replacement (Emily)
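
A sketch of the WordNet synonym replacement idea using NLTK directly; the `replace_fraction` parameter, function name, and whitespace tokenisation are assumptions, not the agreed implementation:

```python
import random

import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

def synonym_replace(sentence: str, replace_fraction: float = 0.2, seed: int = 0) -> str:
    """Swap a fraction of the words for a random WordNet synonym."""
    rng = random.Random(seed)
    words = sentence.split()
    n_to_replace = max(1, int(len(words) * replace_fraction))
    for i in rng.sample(range(len(words)), min(n_to_replace, len(words))):
        lemmas = {l.name().replace("_", " ")
                  for syn in wordnet.synsets(words[i]) for l in syn.lemmas()}
        lemmas.discard(words[i])
        if lemmas:
            words[i] = rng.choice(sorted(lemmas))
    return " ".join(words)

# synonym_replace("the movie was a complete disaster", replace_fraction=0.3)
```
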
hyperparameter tuning on roberta-base (Trainer sketch below)
- Azhara
  - learning rate [0.0001, 0.0002, 0.0005, 0.001, 0.002, 0.005, 0.01]
  - optimizer [AdamW, Adafactor]
- Ella
  - early stopping (only required for longer epoch runs)
  - num_train_epochs [1, 5, 10, 15, 20]
- Emily
  - train_batch_size [8, 16, 32, 64, 128]
  - scheduler ["linear_schedule_with_warmup", "polynomial_decay_schedule_with_warmup", "constant_schedule_with_warmup"]
- cased or uncased
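
A hedged sketch of one point in the hyperparameter grid using `TrainingArguments`/`Trainer` from `transformers`; the dataset variables, output directory, and `num_labels=2` are placeholders, and the scheduler names above map onto `lr_scheduler_type` values (`linear`, `polynomial`, `constant_with_warmup`):

```python
from transformers import (AutoModelForSequenceClassification, EarlyStoppingCallback,
                          Trainer, TrainingArguments)

model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

args = TrainingArguments(
    output_dir="runs/roberta-base",          # placeholder path
    learning_rate=2e-4,                      # sweep over the learning rates above
    optim="adamw_torch",                     # or "adafactor"
    num_train_epochs=10,                     # sweep over [1, 5, 10, 15, 20]
    per_device_train_batch_size=32,          # sweep over [8, 16, 32, 64, 128]
    lr_scheduler_type="linear",              # "polynomial" / "constant_with_warmup"
    warmup_ratio=0.06,                       # assumed warmup fraction
    eval_strategy="epoch",                   # named evaluation_strategy on older versions
    save_strategy="epoch",
    load_best_model_at_end=True,             # needed for early stopping
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,                  # assumed tokenised splits
    eval_dataset=val_ds,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```
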
augmentation parameter tuning
- feature space synonym replacement (Emily): percentage of word embeddings replaced in BERT
  - what percentage of all sentences to augment
- synonym replacement (Azhara)
  - percentage of words replaced
- back translation (Ella), sketch below
  - which languages, and how many languages
  - other
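
For the back-translation parameters, a sketch of how the pivot language becomes a tunable choice, using Helsinki-NLP opus-mt checkpoints through the `transformers` translation pipeline (the language set and function name are assumptions):

```python
from transformers import pipeline

# Assumed pivot set; each entry is (en -> pivot, pivot -> en).
PIVOTS = {
    "de": ("Helsinki-NLP/opus-mt-en-de", "Helsinki-NLP/opus-mt-de-en"),
    "fr": ("Helsinki-NLP/opus-mt-en-fr", "Helsinki-NLP/opus-mt-fr-en"),
}

def back_translate(texts, lang="de"):
    """en -> pivot -> en; the round trip produces paraphrased training samples."""
    to_pivot = pipeline("translation", model=PIVOTS[lang][0])
    to_en = pipeline("translation", model=PIVOTS[lang][1])
    pivot_texts = [out["translation_text"] for out in to_pivot(texts)]
    return [out["translation_text"] for out in to_en(pivot_texts)]

# Sweep both the pivot language(s) and the fraction of the training set passed in.
# back_translate(["the plot made no sense at all"], lang="fr")
```
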
model ["facebook/bart-large-cnn", "distilroberta-base", "bert-base-cased"]
- do a larger/smaller version for each model above
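
A sketch of looping over the listed checkpoints plus larger/smaller siblings; the sibling names and `num_labels=2` are assumptions about which variants the comparison would use:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Listed checkpoints plus assumed larger/smaller siblings for the size comparison.
CANDIDATES = [
    "facebook/bart-large-cnn", "facebook/bart-base",
    "distilroberta-base", "roberta-base", "roberta-large",
    "bert-base-cased", "bert-large-cased",
]

for name in CANDIDATES:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
    # ...re-run the tokenise / augment / Trainer steps above for each checkpoint...
```
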
evaluate
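
For the evaluation step, a minimal metrics hook for the Trainer; accuracy and macro-F1 are assumed metrics, and scikit-learn is assumed available:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    """Metrics hook for Trainer; reports accuracy and macro-F1 on the eval split."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1_macro": f1_score(labels, preds, average="macro"),
    }

# Pass Trainer(..., compute_metrics=compute_metrics) and compare trainer.evaluate()
# across the tokenisation, augmentation, hyperparameter, and model settings above.
```
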