Explanations
instances of violent actions and their descriptions
punching and attacking
Negative Logits
Token          Logit
FetchType      -0.32
ikbaar         -0.29
dientemente    -0.27
需要           -0.27
tabilité       -0.26
ibilidade      -0.26
</tfoot>       -0.26
hörigen        -0.24
ność           -0.24
productId      -0.24
Positive Logits
Token          Logit
punches        0.71
punch          0.68
attack         0.66
punching       0.64
hitting        0.63
attack         0.62
Punch          0.61
attacking      0.60
enfans         0.59
shots          0.59
Activations Density 0.201%