INDEX
Explanations
pronouns and possessive adjectives that indicate a personal connection or reference
Negative Logits
477           -0.17
amage         -0.15
prototypes    -0.15
EXTERN        -0.15
-basket       -0.14
grade         -0.14
iname         -0.14
éĤ¦           -0.14
azer          -0.13
ibold         -0.13
Positive Logits
ocz           0.17
uo            0.16
hardt         0.15
ourg          0.14
oust          0.14
éªĮ           0.14
.setSelected  0.14
upal          0.14
BUG           0.14
æĦı           0.14
Activations Density 0.003%
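
Some tokens in the logit lists (éĤ¦, éªĮ, æĦı) look like raw byte-level BPE strings rather than readable text. The sketch below shows one way to turn them back into characters, assuming the underlying model uses a GPT-2 style byte-to-unicode table; the page itself does not state which tokenizer is used, and the helper names `bytes_to_unicode` and `decode_token` are illustrative only.

```python
def bytes_to_unicode():
    """Build the GPT-2 style map from byte values (0-255) to printable characters.

    Assumption: the dashboard's tokenizer follows this byte-level BPE convention.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map each displayed character back to its byte, then decode as UTF-8."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    # Partial multi-byte sequences are replaced instead of raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["éĤ¦", "éªĮ", "æĦı", "amage"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, the three non-ASCII entries decode to single CJK characters, while plain ASCII tokens such as amage pass through unchanged.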