INDEX
Explanations
instances of the word "pepper" in the text
references to pepper spray
Negative Logits
ADRA       -0.83
hematic    -0.75
ģĸ         -0.71
ĸļ         -0.71
UE         -0.69
Objective  -0.67
urai       -0.66
Cannot     -0.63
ür         -0.63
Ĵ          -0.62
Positive Logits
mint       1.58
flakes     0.95
spray      0.91
cone       0.89
vine       0.88
sprayed    0.87
oni        0.87
onis       0.86
pepper     0.84
cake       0.83
Activations Density: 0.035%