INDEX
Explanations
hypothetical questions using the phrase "What if."
conditional phrases that prompt hypothetical scenarios
Negative Logits
omi        -0.72
nect       -0.69
================================================================  -0.69
arm        -0.66
depths     -0.66
bird       -0.65
WAYS       -0.65
avor       -0.64
horse      -0.64
psey       -0.63
Positive Logits
Gutenberg  0.74
thou       0.72
you        0.72
they       0.72
yip        0.70
fy         0.69
hypot      0.69
landlords  0.64
Melania    0.63
we         0.62
Activations Density: 0.031%