INDEX
Explanations
repeated pronouns in sentences
Negative Logits
oon      -0.16
icket    -0.15
eyim     -0.15
oor      -0.15
олом     -0.15
jspx     -0.14
بوابة    -0.14
azon     -0.14
emand    -0.14
_lite    -0.14
Positive Logits
Bir                0.18
Bea                0.15
bir                0.15
clist              0.15
ensa               0.14
none               0.14
416                0.14
her                0.13
VI                 0.13
_classification    0.13
Activations Density: 0.089%