INDEX
Explanations
proper nouns or specific entities
instances of the verb "have" in various grammatical contexts
Negative Logits
Token       Logit
arter       -0.71
apology     -0.62
ocol        -0.60
Apart       -0.60
Recap       -0.59
Anyway      -0.59
³³          -0.58
********    -0.57
dispatch    -0.57
Saying      -0.57
Positive Logits
Token       Logit
been        1.31
been        1.30
undergone   1.06
Been        0.98
endured     0.96
arisen      0.95
become      0.94
kell        0.90
existed     0.89
worked      0.88
Activations Density 0.181%