Explanations
instances of the word "other" and its variations
NEGATIVE LOGITS
-equiv    -0.16
OfYear    -0.14
bÃŃr    -0.14
æĪIJ人    -0.14
ÑĢина    -0.14
zcze    -0.14
Seks    -0.13
ENCIL    -0.13
ëįĶëĭĪ    -0.13
atic    -0.13
POSITIVE LOGITS
ones    0.23
ones    0.19
maal    0.19
Ones    0.15
ONES    0.15
chter    0.15
others    0.14
ôm    0.14
jen    0.14
åij¢    0.14
Activations Density: 0.045%