INDEX
Explanations
instances of the verb "do" and its various conjugations in different contexts
Negative Logits
urette      -0.17
egin        -0.17
orro        -0.16
paring      -0.15
paragraph   -0.15
rema        -0.14
avra        -0.14
люб         -0.14
urge        -0.14
rror        -0.14
Positive Logits
unto        0.28
battle      0.24
right       0.24
damage      0.23
everything  0.22
favors      0.22
what        0.22
things      0.22
whatever    0.22
battle      0.22
Activations Density 0.080%
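
As a rough illustration of how a density figure like this could be obtained, the sketch below assumes that activation density means the fraction of token positions at which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: density = (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 8 of 10,000 token positions.
sample = [0.0] * 9992 + [1.5] * 8
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.080%
```

Under this assumed definition, a density of 0.080% corresponds to the feature firing on roughly 8 of every 10,000 tokens.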