Explanations
actions related to physical confrontations or violence
Negative Logits
&o          -0.14
tÃŃch       -0.14
Arrow       -0.14
nipples     -0.14
çŃĭ         -0.13
tik         -0.13
Forever     -0.13
tuk         -0.13
_PROTO      -0.13
bullet      -0.13
Positive Logits
struggle    0.22
struggling  0.20
struggles   0.19
physical    0.18
physically  0.17
struggled   0.17
aspers      0.16
onica       0.16
compress    0.15
wrestling   0.15
Activations Density 0.090%