INDEX
Explanations
references to violence or physical confrontation
NEGATIVE LOGITS
солю             -0.45
ConstraintMaker  -0.44
AnchorStyles     -0.33
+#+#             -0.32
muualla          -0.32
TryDecodeAsNil   -0.32
ETTE             -0.31
↖                -0.31
EnableWeb        -0.31
Lip              -0.31
POSITIVE LOGITS
hammer     0.93
hammers    0.79
hammer     0.73
mallet     0.72
hammering  0.66
Hammer     0.65
Hammer     0.61
hitting    0.60
banging    0.57
hammered   0.57
Activations Density 0.288%