Explanations
No Explanations Found
Negative Logits
ripp      -0.17
рю        -0.16
kud       -0.15
erdem     -0.15
Goth      -0.14
bias      -0.14
ibia      -0.14
compos    -0.14
rub       -0.14
hal       -0.14
Positive Logits
Sofia        0.17
Telerik      0.17
Blvd         0.16
Bulgarian    0.15
иму          0.15
Dimit        0.15
soft         0.14
Bulgaria     0.14
Î            0.14
utherland    0.14
Activations Density 0.000%
No Known Activations
This feature has no known activations.