Eddie's Learning Record 25

1. Duration

Monday, November 21st, 2022 - Saturday, November 26th, 2022


2. Learning Record

2.1 Fine-tuned Models

When a dataset has only two classes, i.e. the labels are "0" and "1", the `label_mode` of `tf.keras.utils.image_dataset_from_directory` should be `"binary"` and the loss function should be `BinaryCrossentropy` instead of `SparseCategoricalCrossentropy`. Otherwise, the accuracy jitters around 50%, which means the model learns nothing.
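A minimal sketch of what that setup looks like (the directory path, image size, and tiny model here are placeholders for illustration, not the actual ones from my experiments):

```python
import tensorflow as tf

# Hypothetical dataset path; with label_mode="binary",
# labels come out as floats 0.0 / 1.0.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train",
    label_mode="binary",
    image_size=(224, 224),
    batch_size=32,
)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1),  # a single logit for the binary case
])

model.compile(
    optimizer="adam",
    # from_logits=True because the final Dense layer has no sigmoid
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```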

2.2 Learned Swin Transformer

I read the paper [1] and studied the code.

It took me a few days to understand window-based self-attention (W-MSA), shifted window-based self-attention (SW-MSA), and the Swin Transformer block; the sketch below captures the core idea.
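This isn't the official implementation, just a toy sketch of the window partition plus the cyclic shift that turns W-MSA into SW-MSA (the tensor sizes are made up for illustration):

```python
import tensorflow as tf

def window_partition(x, window_size):
    # Split a (B, H, W, C) feature map into non-overlapping windows
    # of shape (num_windows * B, window_size, window_size, C);
    # self-attention is then computed inside each window.
    b, h, w, c = x.shape
    x = tf.reshape(x, (b, h // window_size, window_size,
                       w // window_size, window_size, c))
    x = tf.transpose(x, (0, 1, 3, 2, 4, 5))
    return tf.reshape(x, (-1, window_size, window_size, c))

# Shifted windows: cyclically roll the feature map by half a window
# before partitioning, so attention can cross the borders of the
# previous layer's windows.
x = tf.random.normal((1, 8, 8, 96))                # toy feature map
shifted = tf.roll(x, shift=(-2, -2), axis=(1, 2))  # window_size // 2 = 2
windows = window_partition(shifted, window_size=4)
print(windows.shape)  # (4, 4, 4, 96)
```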

2.3 Refactored the Code

I moved all of the dataset-loading functions into a single .py file. Consequently, the code works on a new dataset just by changing the `base_dir` of that dataset.
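Roughly, the file looks like this (the module name, directory layout, and defaults are illustrative, not the actual ones):

```python
# datasets.py -- hypothetical module name
import tensorflow as tf

def load_datasets(base_dir, image_size=(224, 224), batch_size=32):
    """Build train/validation datasets from a directory tree.

    Only `base_dir` needs to change when switching datasets, assuming
    each dataset keeps the same class-per-subdirectory layout.
    """
    train_ds = tf.keras.utils.image_dataset_from_directory(
        f"{base_dir}/train",
        label_mode="binary",  # two-class case from section 2.1
        image_size=image_size,
        batch_size=batch_size,
    )
    val_ds = tf.keras.utils.image_dataset_from_directory(
        f"{base_dir}/val",
        label_mode="binary",
        image_size=image_size,
        batch_size=batch_size,
    )
    return train_ds, val_ds
```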


3. Feeling

3.1 Glad

I was glad that the models ran well.

3.2 Perplexed

The models seemed to produce relatively good results on some small datasets, but they didn't work as well on the micro-expression dataset.

Besides ViT, SL-ViT, and the Swin Transformer, I also found that there are many other transformer variants. It seems impossible to learn all of them.
