INDEX
Explanations
sentences discussing hypothetical scenarios or consequences
conditional and future tense verbs related to potential outcomes
Negative Logits
Token     Logit
jam       -0.64
Trap      -0.64
          -0.63
ament     -0.62
DAQ       -0.62
weed      -0.61
monog     -0.60
reality   -0.60
Times     -0.58
ZI        -0.58
Positive Logits
Token         Logit
be            0.98
arrive        0.90
also          0.86
likewise      0.85
doubtless     0.84
undoubtedly   0.83
derive        0.82
furthermore   0.81
become        0.80
begin         0.79
Activations Density 0.340%