Explanations
the word "Instead" within a sentence
instances of the word "Instead."
Negative Logits
Token       Logit
ental       -0.61
SF          -0.60
vation      -0.57
emate       -0.56
Condition   -0.53
stad        -0.53
"},"        -0.52
AG          -0.52
efe         -0.52
ggles       -0.52
Positive Logits
Token       Logit
of          0.72
we          0.71
opting      0.71
,           0.70
thereof     0.70
terness     0.68
preferring  0.66
ilon        0.66
,.          0.66
they        0.65
Activations Density 0.030%