Negative Logits
Token           Logit
insensitive     0.46
εἰ              0.45
contribute      0.42
本当に          0.41
ignore          0.40
忍              0.40
guess           0.40
contributes     0.40
guessing        0.39
góp             0.39

Positive Logits
Token           Logit
supervision     1.14
Supervision     1.05
supervised      1.04
supervise       0.96
supervis        0.94
supervised      0.92
supervising     0.90
Supervis        0.85
监督            0.77
chaperone       0.76
Activations Density: 0.040%