INDEX
Explanations
references to knives and stabbing actions
knife, knives
Negative Logits
Token       Logit
Greet       -0.53
sqcup       -0.49
Beal        -0.49
Urqu        -0.48
Popp        -0.48
Eas         -0.47
Leighton    -0.47
Beatty      -0.46
Prosper     -0.46
Erb         -0.46
Positive Logits
Token       Logit
Knife       1.33
knife       1.32
Knife       1.28
knife       1.26
knives      1.13
Knives      1.11
couteau     1.03
cuchillo    0.92
Kni         0.91
kni         0.73
Activations Density: 0.004%