Explanations
references to sharp objects, particularly knives
references to knives in various contexts
Negative Logits
Token        Logit
mberg        -0.89
ICLE         -0.78
rian         -0.77
leep         -0.76
tainment     -0.76
Gutenberg    -0.75
ysical       -0.74
rians        -0.73
rix          -0.71
alse         -0.71
Positive Logits
Token        Logit
blades       1.02
blade        1.01
knife        0.98
knives       0.98
scissors     0.97
knife        0.93
slicing      0.89
cutter       0.88
wielding     0.85
powder       0.84
Activations Density 0.026%
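
The density figure above can be read as the share of token positions on which the feature is active. The following is a minimal sketch of that calculation, assuming density means the fraction of positions with activation above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature
    fires at all. The dashboard's exact definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, feature active on 3 of them.
sample = [0.0] * 10_000
sample[10] = 4.2
sample[500] = 1.7
sample[9_999] = 0.3

density = activation_density(sample)
print(f"{density:.3%}")
```

A value as low as 0.026% would correspond to roughly one active position in every 3,800 tokens, which is in keeping with a feature that responds to a narrow topic such as knives and blades.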