Eddie's Learning Record 25

1. Duration

Monday, November 21st, 2022 - Saturday, November 26th, 2022


2. Learning Record

2.1 Fine-tuned Models

When a dataset has only two classes, i.e. the labels are "0" and "1", the `label_mode` of `tf.keras.utils.image_dataset_from_directory` should be `binary`, and the loss function should be `BinaryCrossentropy` instead of `SparseCategoricalCrossentropy`. Otherwise, the accuracy jitters around 50%, which means the model learns nothing.
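A minimal sketch of that setup, assuming a toy directory layout and model (the path, image size, and layers are placeholders, not my actual code):

```python
import tensorflow as tf

# Hypothetical dataset path; directory has one subfolder per class.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train",
    label_mode="binary",        # labels become float 0.0 / 1.0
    image_size=(224, 224),
    batch_size=32,
)

# Toy model: a single logit output matches BinaryCrossentropy.
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1),
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```

The key pairing is `label_mode="binary"` with `BinaryCrossentropy` and a single output unit; mixing binary labels with `SparseCategoricalCrossentropy` is what causes the 50% plateau.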

2.2 Learned Swin Transformer

I read the paper [1] and went through the code.

It took me a few days to understand window-based self-attention (W-MSA), shifted window-based self-attention (SW-MSA), and the Swin Transformer block.
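To make those two ideas concrete, here is a minimal sketch of my own (not the official implementation) of the operations behind them: partitioning a feature map into non-overlapping windows for W-MSA, and the cyclic shift applied before partitioning in SW-MSA:

```python
import tensorflow as tf

def window_partition(x, window_size):
    """Split a feature map (B, H, W, C) into non-overlapping windows
    of shape (num_windows * B, window_size, window_size, C).
    Self-attention is then computed within each window."""
    _, H, W, C = x.shape
    x = tf.reshape(x, (-1, H // window_size, window_size,
                       W // window_size, window_size, C))
    x = tf.transpose(x, (0, 1, 3, 2, 4, 5))
    return tf.reshape(x, (-1, window_size, window_size, C))

def cyclic_shift(x, shift_size):
    """Roll the feature map so the next window partition yields
    shifted windows, letting information flow across window borders."""
    return tf.roll(x, shift=(-shift_size, -shift_size), axis=(1, 2))

# Toy usage: an 8x8 feature map split into four 4x4 windows.
x = tf.random.normal((1, 8, 8, 96))
windows = window_partition(cyclic_shift(x, shift_size=2), window_size=4)
print(windows.shape)  # (4, 4, 4, 96)
```

Alternating W-MSA and SW-MSA blocks is what gives the Swin Transformer cross-window connections while keeping attention cost linear in image size.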

2.3 Refactored the Code

I moved all the dataset-loading functions into a separate Python file. As a result, the code works on a new dataset just by changing its `base_dir` (sketched below).
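The file looked roughly like the following sketch; the module name, function name, and directory layout here are made up for illustration:

```python
# datasets.py -- hypothetical sketch of the loader module described above
import tensorflow as tf

def load_datasets(base_dir, image_size=(224, 224), batch_size=32,
                  label_mode="binary"):
    """Build train/validation datasets from <base_dir>/train and
    <base_dir>/val, so switching datasets only means changing base_dir."""
    train_ds = tf.keras.utils.image_dataset_from_directory(
        f"{base_dir}/train", label_mode=label_mode,
        image_size=image_size, batch_size=batch_size)
    val_ds = tf.keras.utils.image_dataset_from_directory(
        f"{base_dir}/val", label_mode=label_mode,
        image_size=image_size, batch_size=batch_size)
    return train_ds, val_ds
```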


3. Feeling

3.1 Glad

I was glad that the models ran well.

3.2 Perplexed

The models produced relatively good results on some small datasets, but they didn't work nearly as well on the micro-expression dataset.

Besides ViT, SL-ViT, and the Swin Transformer, I found that there are many other transformer variants. It seems impossible to learn all of them.
