Explanations
phrases related to progress or advancement
Negative Logits
pmwiki     -0.69
uded       -0.68
iciency    -0.67
ilies      -0.65
bryce      -0.63
iuses      -0.61
atur       -0.60
krit       -0.59
ateur      -0.59
unts       -0.58
Positive Logits
forward    1.41
forwards   1.20
toward     1.16
towards    1.16
onward     1.11
forward    1.10
ahead      1.03
Forward    1.02
onwards    0.95
backwards  0.94
Activations Density: 0.143%