Explanations
verbs denoting a change or transition
instances of the word "became" in various contexts
Negative Logits
oning      -0.79
inately    -0.79
inarily    -0.77
ramid      -0.76
enger      -0.72
aging      -0.65
orneys     -0.64
alian      -0.64
hov        -0.63
atching    -0.63
Positive Logits
accustomed     0.95
extinct        0.93
entangled      0.89
embroiled      0.89
disillusion    0.83
acquainted     0.80
oslav          0.80
aware          0.79
undone         0.78
obsessed       0.77
Activations Density: 0.051%