INDEX
Explanations
personal pronouns or possessive determiners combined with past tense verbs
the first-person singular pronoun
Negative Logits
ICO          -0.78
acan         -0.73
Lat          -0.71
VD           -0.71
Highlands    -0.71
UCK          -0.71
INO          -0.69
AZ           -0.69
SUP          -0.68
UTE          -0.68
Positive Logits
bum          0.74
phasis       0.74
ayer         0.73
disson       0.68
actionGroup  0.68
andum        0.66
alys         0.66
holiest      0.65
osph         0.64
etting       0.64
Activations Density 0.000%