INDEX
Explanations
expressions related to progress and momentum
Negative Logits
roat  -0.15
Norm  -0.14
eren  -0.14
indsight  -0.14
Å¥  -0.14
é¨ĵ  -0.14
sons  -0.14
utherford  -0.14
reen  -0.13
sth  -0.13
Positive Logits
momentum  0.20
iglia  0.18
heading  0.16
rrha  0.15
oku  0.15
ertino  0.15
иÑİ  0.15
Momentum  0.15
ê±°ëŀĺ  0.14
toward  0.14
Activations Density 0.056%