INDEX
Explanations
occurrences of the pronouns "I" and "I'm" in various forms
Negative Logits
ouce                -0.15
anness              -0.14
roperties           -0.14
ieren               -0.14
lege                -0.14
úng                 -0.13
longleftrightarrow  -0.13
isure               -0.13
TORT                -0.13
_png                -0.13
Positive Logits
face                0.20
recently            0.20
have                0.18
18                  0.18
facing              0.18
hava                0.17
face                0.17
éĿ¢                 0.17
faced               0.17
develop             0.16
Activations Density: 0.048%