INDEX
Explanations
references to sharp objects and weapons
Negative Logits
Bullock   -0.67
bli       -0.53
Wre       -0.51
Volu      -0.49
nsyn      -0.49
MathML    -0.48
ballast   -0.48
hul       -0.48
coaster   -0.48
hup       -0.46
Positive Logits
knife     1.34
knives    1.30
sword     1.20
blades    1.19
blade     1.17
Knives    1.16
swords    1.15
Knife     1.15
Knife     1.13
Blades    1.11
Activations Density: 0.392%