## What to do:

# tokenisation
    - auto tokeniser
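
A minimal sketch of the auto tokeniser step, assuming a datasets-style column named "text"; distilroberta-base is one of the candidate checkpoints listed further down.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")

def tokenize(batch):
    # Truncate/pad so every example fits the model's maximum input length.
    return tokenizer(batch["text"], truncation=True, padding="max_length")

# Usage with a datasets.Dataset: dataset.map(tokenize, batched=True)
```
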
# data augmentation / preprocessing
    - synonym replacement (WordNet) (azhara; sketch after this list)
    - remove all caps (i.e. lowercase fully-capitalised text)
    - add text to samples
    - back translation (ella; sketch in the augmentation section below)
    - feature-space synonym replacement (emily; sketch in the augmentation section below)
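
A rough sketch of the WordNet synonym replacement above, assuming nltk with the wordnet corpus downloaded; `replace_frac` is the "percentage of words to replace" knob from the augmentation section further down.

```python
import random

import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)

def synonym_replace(sentence, replace_frac=0.1, seed=None):
    rng = random.Random(seed)
    words = sentence.split()
    n_swap = max(1, int(len(words) * replace_frac))
    # Swap randomly chosen positions for a random WordNet synonym, if any exists.
    for i in rng.sample(range(len(words)), k=min(n_swap, len(words))):
        lemmas = {l.name().replace("_", " ")
                  for syn in wordnet.synsets(words[i]) for l in syn.lemmas()}
        lemmas.discard(words[i])
        if lemmas:
            words[i] = rng.choice(sorted(lemmas))
    return " ".join(words)

# e.g. synonym_replace("the quick brown fox jumps", replace_frac=0.2)
```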

# hyperparameter search
azhara
- learning rate [0.0001, 0.0002, 0.0005, 0.001, 0.002, 0.005, 0.01] 
- optimizer [AdamW, Adafactor] (grid sketch below)
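
A sketch of azhara's grid, assuming `model_init`, `train_ds`, and `eval_ds` already exist (tokenised as above); recent transformers releases let TrainingArguments pick the optimizer by name via `optim`.

```python
from transformers import Trainer, TrainingArguments

for lr in [0.0001, 0.0002, 0.0005, 0.001, 0.002, 0.005, 0.01]:
    for opt in ["adamw_torch", "adafactor"]:     # AdamW vs Adafactor
        args = TrainingArguments(
            output_dir=f"runs/lr{lr}-{opt}",     # placeholder path
            learning_rate=lr,
            optim=opt,
        )
        trainer = Trainer(model_init=model_init, args=args,
                          train_dataset=train_ds, eval_dataset=eval_ds)
        trainer.train()
        print(lr, opt, trainer.evaluate())
```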

ella
- early stopping (only required for the longer epoch runs; sketch below)
- num_train_epochs [1, 5, 10, 15, 20]
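
A sketch of ella's two items using transformers' built-in EarlyStoppingCallback; per-epoch evaluation and checkpointing are needed so the callback has a metric to watch (`model_init`, `train_ds`, `eval_ds` assumed as above).

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="runs/early-stop",          # placeholder path
    num_train_epochs=20,                   # upper end of the epoch grid
    evaluation_strategy="epoch",           # named eval_strategy in newest releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
)
trainer = Trainer(
    model_init=model_init, args=args,
    train_dataset=train_ds, eval_dataset=eval_ds,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # patience assumed
)
trainer.train()
```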

emily
- train_batch_size [8, 16, 32, 64, 128]
- scheduler ["linear_schedule_with_warmup", "polynomial_decay_schedule_with_warmup", "constant_schedule_with_warmup"]
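
A sketch of emily's grid; the three names above correspond to transformers' get_*_schedule_with_warmup helpers, which map onto these `lr_scheduler_type` codes. `warmup_ratio=0.1` is an assumed warmup fraction, not a decided value.

```python
from transformers import Trainer, TrainingArguments

schedulers = {
    "linear_schedule_with_warmup": "linear",
    "polynomial_decay_schedule_with_warmup": "polynomial",
    "constant_schedule_with_warmup": "constant_with_warmup",
}

for bs in [8, 16, 32, 64, 128]:
    for name, code in schedulers.items():
        args = TrainingArguments(
            output_dir=f"runs/bs{bs}-{code}",    # placeholder path
            per_device_train_batch_size=bs,
            lr_scheduler_type=code,
            warmup_ratio=0.1,
        )
        Trainer(model_init=model_init, args=args,
                train_dataset=train_ds, eval_dataset=eval_ds).train()
```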

# creative stuff
- model ["facebook/bart-large-cnn", "distilroberta-base", "bert-base-cased"]
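
A sketch of putting each candidate behind the same sequence-classification head (`num_labels=2` is an assumption about the task). Note that facebook/bart-large-cnn ships as a summarisation model, so its classification head starts untrained.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

candidates = ["facebook/bart-large-cnn", "distilroberta-base", "bert-base-cased"]

for name in candidates:
    tok = AutoTokenizer.from_pretrained(name)
    # Heads missing from the checkpoint are freshly initialised (expect a warning).
    model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
```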


# for augmentation
- percentage of word embeddings replaced in BERT (em; sketch below)
    - what percentage of all sentences to augment
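
A hedged sketch of one reading of em's item: swap a fraction of token ids for their nearest neighbour in BERT's input-embedding matrix (a rough stand-in for a synonym, not a guaranteed one). `embedding_replace` and `token_frac` are hypothetical names; the sentence-level percentage would be applied by only calling this on a sampled fraction of the dataset.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-cased")
# Input-embedding matrix, L2-normalised so a dot product is cosine similarity.
emb = AutoModel.from_pretrained("bert-base-cased").get_input_embeddings().weight.detach()
emb_norm = torch.nn.functional.normalize(emb, dim=1)

def embedding_replace(sentence, token_frac=0.15):
    """Replace ~token_frac of tokens with their nearest embedding neighbour."""
    ids = tok(sentence, add_special_tokens=False)["input_ids"]
    n_swap = max(1, int(len(ids) * token_frac))
    for i in torch.randperm(len(ids))[:n_swap].tolist():
        sims = emb_norm @ emb_norm[ids[i]]   # similarity to every vocab entry
        sims[ids[i]] = -1.0                  # never pick the original token
        ids[i] = int(sims.argmax())
    return tok.decode(ids)
```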

- synonym (azhara)
    - percentage of words to replace

- back translation (ella)
    - which languages, and how many (sketch below)
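
A sketch of back translation through MarianMT (English to a pivot language and back, which paraphrases the input). The Helsinki-NLP checkpoints named here exist for common pairs like de and fr; which pivots and how many is exactly the open question above.

```python
from transformers import MarianMTModel, MarianTokenizer

def translate(texts, model_name):
    tok = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tok(texts, return_tensors="pt", padding=True, truncation=True)
    return [tok.decode(t, skip_special_tokens=True) for t in model.generate(**batch)]

def back_translate(texts, pivot="de"):
    # English -> pivot -> English.
    there = translate(texts, f"Helsinki-NLP/opus-mt-en-{pivot}")
    return translate(there, f"Helsinki-NLP/opus-mt-{pivot}-en")

# e.g. back_translate(["the film was surprisingly good"], pivot="fr")
```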


- evaluate every run with the same metrics (sketch below)
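
A sketch of a shared metric function so all runs above report comparable numbers, assuming the `evaluate` package; accuracy plus macro-F1 is an assumed choice of metrics, not something decided yet.

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    # Trainer hands us (logits, labels); take the argmax class per example.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        **accuracy.compute(predictions=preds, references=labels),
        **f1.compute(predictions=preds, references=labels, average="macro"),
    }

# Wire it in with Trainer(..., compute_metrics=compute_metrics).
```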


# one other model newer than RoBERTa


# sam