INDEX
Explanations
imperative verbs related to encouragement or action
Negative Logits
ìĬµ       -0.08
è·¡       -0.07
gate      -0.07
byss      -0.07
ì§ij      -0.06
tingham   -0.06
came      -0.06
yster     -0.06
gard      -0.06
ưỡng      -0.06
Positive Logits
ahead     0.14
Ahead     0.11
ahead     0.11
figure    0.10
Ahead     0.09
Go        0.09
Go        0.09
go        0.08
-go       0.08
åIJ§       0.08
Activations Density: 0.008%