1. Duration
Saturday, October 8th, 2022 - Friday, October 14th, 2022
2. Learning Record
2.1 Code Refactoring
I finished the refactoring on Sunday, October 9th, 2022, I suppose, and ran the first experiment on the server, which felt unbelievable. It was really quite late for a postgraduate to be starting the project.
Sadly, I encountered a problem, and not because of my code. The CUDA version on the server was v11.1, which is much too old. I contacted the administrator, and version 11.6 was successfully installed on Wednesday, October 12th. But then another problem occurred: the zlibwapi.dll library was missing. Does this suggest that no one else uses CUDA on the server? Otherwise, they would surely have noticed this problem.
Anyway, the code ran well, and I got the result the authors reported in the paper. Now it's time to learn the attention mechanism.
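For checking what CUDA toolkit a server actually has before running anything, a small helper like the sketch below can save a round trip to the administrator. This is my own hypothetical snippet (the function name and parsing logic are assumptions, not part of any library); it simply asks nvcc, which ships with the toolkit, and returns None when no toolkit is on the PATH.

```python
import shutil
import subprocess

def cuda_toolkit_version():
    """Return the CUDA toolkit version reported by nvcc, or None if nvcc is absent."""
    nvcc = shutil.which("nvcc")  # locate the CUDA compiler on the PATH
    if nvcc is None:
        return None
    out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
    # nvcc prints a line like: "Cuda compilation tools, release 11.6, V11.6.124"
    for line in out.splitlines():
        if "release" in line:
            return line.split("release")[-1].split(",")[0].strip()
    return None

print(cuda_toolkit_version())  # e.g. "11.6" on the upgraded server, None without CUDA
```

A quick check like this would have shown the v11.1 problem immediately, without waiting for a failed experiment.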
2.2 Learning the Attention Mechanism
I understood what the videos said, but the actual code looks very different from what the tutorials showed. It is far more difficult than CNNs and RNNs.
Besides, I found that d2l is not an ordinary package but a collection of .py files with amazing functions. They are the most amazing Python files I have ever seen.
3. Feelings
3.1 Relaxed
At least, I have done a successful experiment.
3.2 Worried
The idea of applying the attention mechanism is great, and it is supposed to work well. But I worry that I may not succeed, as ViT is quite new.