INDEX
Explanations
phrases related to surprise or realization
the phrase "Was" in the context of narratives or reflections
Negative Logits
Token        Logit
ogle         -0.66
aim          -0.61
afford       -0.59
disruptive   -0.58
come         -0.58
passively    -0.58
ICA          -0.57
acci         -0.57
taking       -0.56
investing    -0.56
Positive Logits
Token        Logit
Was          3.42
Was          2.91
Were         1.93
Did          1.88
was          1.73
Didn         1.65
WAS          1.64
Had          1.60
Is           1.59
Would        1.57
Activations Density: 0.009%