Explanations
This neuron activates on occurrences of the word “Hot” (in headings or as an adjective).
Negative Logits
Token     Logit
358       -0.07
sieve     -0.07
stance    -0.07
PACE      -0.07
Rape      -0.07
Raven     -0.07
-case     -0.06
aida      -0.06
learn     -0.06
Lace      -0.06
Positive Logits
Token     Logit
hot       0.18
Hot       0.17
Hot       0.14
HOT       0.12
hotspot   0.11
-hot      0.11
hot       0.11
hotline   0.10
hotter    0.10
hott      0.09
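Logit lists like the ones above are commonly produced by projecting a neuron's output direction through the model's unembedding matrix and keeping the most positive and most negative entries. The page does not say how its values were computed, so the following is only a minimal sketch under that assumption. The function name `top_logit_effects`, the toy vocabulary, and all weights are hypothetical and chosen for illustration; they are not taken from the model behind this page.

```python
import numpy as np

def top_logit_effects(w_out, W_U, vocab, k=3):
    """Return the k most boosted and k most suppressed tokens.

    Assumption: the per-token effect is the neuron's output direction
    (w_out, shape [d_model]) multiplied by the unembedding matrix
    (W_U, shape [d_model, vocab_size]).
    """
    effects = w_out @ W_U                 # one value per vocabulary token
    order = np.argsort(effects)           # ascending: most negative first
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return positive, negative

# Toy data (hypothetical): a 2-dimensional model with a 4-token vocabulary.
vocab = ["hot", "Hot", "cold", "sieve"]
W_U = np.array([[1.0, 0.9, 0.0, -0.5],
                [0.0, 0.0, 1.0, 0.0]])
w_out = np.array([0.2, 0.0])

positive, negative = top_logit_effects(w_out, W_U, vocab, k=2)
for token, value in positive:
    print(f"{token:8s} {value:+.2f}")
for token, value in negative:
    print(f"{token:8s} {value:+.2f}")
```

In this toy setup the "hot" variants come out on top and "sieve" at the bottom, mirroring the shape of the lists above. The small magnitudes of the negative values on this page (all around -0.07) suggest the neuron mainly boosts "hot"-like tokens rather than strongly suppressing any particular token.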
Activations Density: 0.011%
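If the density figure is read as the fraction of sampled tokens on which the neuron's activation is non-zero (an assumption; the page does not define it), then 0.011% corresponds to roughly 1.1 active tokens per 10,000. A minimal sketch of that calculation, with the hypothetical helper `activation_density` and made-up sample data:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold.

    Assumption: "density" means the share of sampled tokens on which
    the neuron is active at all.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)

# Hypothetical sample: the neuron is active on 1 token out of 10,000.
sample = [0.0] * 9999 + [1.3]
density = activation_density(sample)
print(f"{density:.4%}")
```

A density this low is consistent with a neuron that responds to one specific word rather than to a broad category of text.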