INDEX
Explanations
verbs related to taking action or providing guidance
Negative Logits
itself        -0.16
/from         -0.15
certain       -0.14
ovaný         -0.14
themselves    -0.14
eview         -0.14
ле            -0.13
Certain       -0.13
stood         -0.13
zano          -0.13
Positive Logits
yourself      0.33
your          0.24
吧            0.24
yourselves    0.21
lah           0.21
你的          0.21
Yourself      0.19
your          0.18
ye            0.17
一下          0.15
Activations Density 0.380%