INDEX
Explanations
first-person singular pronouns followed by modal verbs
first-person pronouns and expressions of personal thoughts or intentions
Negative Logits
Token        Logit
adra         -0.84
ante         -0.69
iment        -0.69
mentioned    -0.66
holding      -0.64
Footnote     -0.64
Same         -0.63
catentry     -0.62
reminders    -0.62
edition      -0.60
Positive Logits
Token        Logit
osaurs       0.79
ovie         0.78
iuses        0.69
uchi         0.65
uin          0.64
usha         0.64
Bout         0.62
ould         0.61
RAND         0.61
provoking    0.61
Activations Density: 0.196%