INDEX
Explanations
instances of sentences ending with the word "when"
instances of "when" indicating conditional or temporal contexts
Negative Logits
awar       -0.72
hack       -0.70
tip        -0.68
vantage    -0.67
ilan       -0.67
yp         -0.67
vari       -0.66
oire       -0.66
ve         -0.66
fighter    -0.65
Positive Logits
they              0.79
nobody            0.75
compared          0.72
everyone          0.72
there             0.71
everybody         0.69
THEY              0.69
we                0.69
simultaneously    0.67
expecting         0.66
Activations Density 0.118%