INDEX
Explanations
references to national identities and notable figures
Negative Logits
arsi      -0.16
rfl       -0.15
ito       -0.15
#ad       -0.15
Ø         -0.15
GiỼi      -0.14
vet       -0.14
AD        -0.14
Gate      -0.14
attend    -0.14
Positive Logits
arts      0.23
lid       0.23
leider    0.21
historic  0.20
advise    0.20
advoc     0.19
assist    0.19
dirig     0.18
lid       0.18
autoc     0.17
Activations Density: 0.029%