INDEX
Explanations
dynamic verbs that convey movement or action
Negative Logits
rips            -0.15
oh              -0.15
udd             -0.14
redi            -0.14
going           -0.14
Instructions    -0.14
reds            -0.14
pas             -0.13
Libert          -0.13
statistics      -0.13
Positive Logits
into            0.26
onto            0.18
away            0.17
into            0.17
oggle           0.16
Into            0.16
tle             0.16
.yy             0.15
akedown         0.15
Gon             0.15
Activations Density 0.220%