Explanations
the phrase "has" indicating possession or completion in various contexts
references to people's actions or accomplishments
Negative Logits
Interested    -0.80
OOL           -0.73
seless        -0.71
typ           -0.65
winner        -0.64
Apart         -0.63
eem           -0.62
—-            -0.61
selves        -0.60
Recap         -0.59
Positive Logits
been          1.46
been          1.22
undergone     1.16
risen         1.11
gotten        1.09
vowed         1.08
become        1.07
amassed       1.06
gone          1.05
spoken        1.03
Activations Density: 0.216%