Hyper-parameters | Settings |
---|---|
\(D_{d1}, \ D_{d2}\) | 64, 128 |
\(D_{t1}, \ D_{t2}, \ D_{t3}\) | 64, 6, 15 |
Learning rate | [1e-6, 5e-6, 8e-6, 1e-5] (Davis), 2e-5 (KIBA) |
Batch size | [32, 64, 128, 256] (Davis), 128 (KIBA) |
Interaction Block Head Number | [2, 4, 8, 16] |
Cross Attention Head Number | [2, 4, 8, 16] |
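
The bracketed entries in the table read as candidate values for a search. As a minimal sketch, assuming the Davis candidates are combined as a full Cartesian grid (the table does not state the search strategy), the configurations could be enumerated as follows. The dictionary keys and the `iter_configs` helper are hypothetical names chosen for illustration.

```python
from itertools import product

# Candidate values copied from the table (Davis settings).
# Key names are illustrative, not taken from the original code base.
davis_grid = {
    "learning_rate": [1e-6, 5e-6, 8e-6, 1e-5],
    "batch_size": [32, 64, 128, 256],
    "interaction_heads": [2, 4, 8, 16],
    "cross_attention_heads": [2, 4, 8, 16],
}


def iter_configs(grid):
    """Yield one dict per combination of candidate values (assumed full grid)."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(davis_grid))
print(len(configs))  # 4 * 4 * 4 * 4 = 256 combinations
```

Under this assumption the Davis search covers 256 configurations; a random or staged search over the same candidates would visit fewer.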