INDEX
Explanations
expressions related to anticipation or expectation
Negative Logits
Token      Logit
look       -0.23
looking    -0.22
Look       -0.21
Looking    -0.21
looking    -0.21
look       -0.19
looks      -0.19
Looking    -0.19
Look       -0.18
LOOK       -0.18
Positive Logits
Token        Logit
forward      0.30
forward      0.24
FORWARD      0.23
-forward     0.21
forwarding   0.21
Forward      0.20
Forward      0.20
forwarded    0.19
fwd          0.19
.forward     0.18
Activations Density 0.015%
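
As a rough illustration of the density figure above, here is a minimal sketch that assumes "activation density" means the fraction of token positions where the feature's activation is above zero. The page does not state its definition, so this reading is an assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to show the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 20,000 token positions.
sample = [0.0] * 19_997 + [1.2, 0.4, 2.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.015% would correspond to the feature firing on about 3 of every 20,000 tokens.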