Explanations
references to specific universities and academic institutions
NEGATIVE LOGITS
acho     -0.17
Rua      -0.15
CLS      -0.14
è¹       -0.14
elsing   -0.14
vard     -0.14
Craft    -0.14
craft    -0.14
Pessoa   -0.14
ismo     -0.13
POSITIVE LOGITS
Bad      0.27
Burg     0.27
Pam      0.25
Ext      0.25
Cad      0.24
Mur      0.24
Zar      0.23
Cast     0.23
Vill     0.23
Tud      0.23
Activations Density 0.023%