Explanations
references to spicy food and seasoning
Negative Logits
Token        Logit
adow         -0.15
croll        -0.15
ç¯           -0.15
yclic        -0.15
esson        -0.14
strav        -0.14
dis          -0.14
ajor         -0.14
discrepan    -0.14
ÏĢλα         -0.14
Positive Logits
Token        Logit
peppers      0.40
chili        0.39
chill        0.36
pepper       0.34
Caps         0.34
Chili        0.32
spicy        0.30
Pepper       0.30
spice        0.29
hotter       0.29
Activations Density 0.097%