Explanations
The neuron specifically detects the word “colon” and closely related terms (colonoscopy, colorectal, etc.).
Negative Logits
Token    Value
unh      -0.07
Brent    -0.07
mrt      -0.07
ROLE     -0.07
ffa      -0.07
Іван     -0.07
ITTLE    -0.07
闪       -0.07
میک      -0.06
Jade     -0.06
Positive Logits
Token           Value
colon           0.14
Colon           0.09
Clinton         0.08
τον             0.08
alon            0.08
Clinton         0.07
colonization    0.07
Colon           0.07
ON              0.07
ör              0.07
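One plausible reading of the two logit lists is that each value is the neuron's direct effect on a token's output logit, obtained by projecting the neuron's output direction onto that token's unembedding vector. The page does not confirm this, so the sketch below is an assumption-laden illustration only: the names `logit_effects` and `top_and_bottom`, the two-dimensional vectors, and the token "the" are hypothetical, and the numbers are toy values rather than the ones listed here.

```python
def logit_effects(neuron_out, unembed):
    """Dot the neuron's output direction with each token's unembedding vector.

    Assumption: a token's listed value is this dot product.
    """
    return {
        token: sum(w * u for w, u in zip(neuron_out, vec))
        for token, vec in unembed.items()
    }


def top_and_bottom(effects, k):
    """Return the k most positive and the k most negative tokens."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[-k:][::-1]  # most negative first
    return positive, negative


# Hypothetical toy model: a 2-dimensional residual stream, 3 tokens.
neuron_out = [1.0, 0.0]
unembed = {
    "colon": [0.5, 0.1],
    "Brent": [-0.25, 0.3],
    "the": [0.0, 1.0],
}

effects = logit_effects(neuron_out, unembed)
positive, negative = top_and_bottom(effects, k=1)
print(positive)  # [('colon', 0.5)]
print(negative)  # [('Brent', -0.25)]
```

Under this reading, small values like those listed (all within about 0.15 of zero) would indicate that the neuron only nudges the output distribution slightly toward or away from each token.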
Activations Density 0.004%
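As a rough illustration of what a density figure like this could mean, the sketch below treats activation density as the percentage of token positions on which the neuron's activation is above zero. That definition, the function name `activation_density`, and the sample data are assumptions for illustration; the page does not state how the figure is computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the neuron fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 4 active positions out of 100,000 tokens.
sample = [0.0] * 99_996 + [1.2, 0.7, 3.1, 0.4]
density = activation_density(sample)
print(f"{density:.3f}%")  # 0.004%
```

A density this low would be consistent with a neuron that fires on a narrow set of inputs, such as a single word and its close variants.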