Explanations
occurrences of personal pronouns, particularly "me" and "us."
Negative Logits
Token     Logit
ogui      -0.17
ught      -0.16
lant      -0.16
HashCode  -0.15
olet      -0.15
æĮĤ       -0.14
ifice     -0.14
apur      -0.14
iset      -0.14
âŁ        -0.14
Positive Logits
Token    Logit
ROC      0.15
ãĤ·ãĥ¼   0.15
/us      0.15
761      0.14
Tiffany  0.13
220      0.13
Morr     0.13
nÃło     0.13
752      0.13
unsus    0.13
Activations Density 0.023%
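
Several tokens in the logit lists (for example `ãĤ·ãĥ¼`, `nÃło`, `æĮĤ`) look like raw byte-level BPE vocabulary strings rather than readable text. The sketch below shows one way to map such strings back to text, assuming the model uses the GPT-2 style byte-to-unicode table. The page does not name the tokenizer, so that is an assumption, and the helper names `byte_decoder` and `decode_token` are hypothetical.

```python
def byte_decoder():
    """Build the inverse of the GPT-2 style byte-to-unicode table.

    Assumption: the dashboard's tokenizer follows this convention.
    """
    # Bytes that the table maps to themselves (printable Latin-1 ranges).
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    mapping = {chr(b): b for b in printable}
    # Every remaining byte is shifted to code point 256 + k, in byte order.
    offset = 0
    for b in range(256):
        if b not in printable:
            mapping[chr(256 + offset)] = b
            offset += 1
    return mapping


_DECODER = byte_decoder()


def decode_token(token: str) -> str:
    """Convert a byte-level vocabulary string to UTF-8 text.

    Incomplete byte sequences become the Unicode replacement character.
    """
    raw = bytes(_DECODER[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ·ãĥ¼", "nÃło", "æĮĤ", "HashCode"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, plain ASCII tokens such as `HashCode` pass through unchanged, while a token like `âŁ` holds only part of a multi-byte character and cannot be decoded on its own.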