INDEX
Explanations
personal pronouns and verbs related to personal actions, especially in a narrative context
references to the subject "he" in various contexts
Negative Logits
Token       Logit
earch       -0.77
arsity      -0.69
ormal       -0.63
berra       -0.62
ienne       -0.62
atlantic    -0.62
rame        -0.61
ãĥ©ãĥ³      -0.61
Prem        -0.61
ullivan     -0.59
Positive Logits
Token       Logit
'll         1.10
'd          1.08
've         0.94
knew        0.90
're         0.89
didnt       0.86
drank       0.85
wandered    0.84
encount     0.83
wand        0.83
Activations Density: 0.700%