INDEX
Explanations
This neuron activates on the adjective “unique” (and close variants) where it describes something as distinctive.
Negative Logits
Sharper      -0.07
Pages        -0.07
Fi           -0.07
.gif         -0.06
carousel     -0.06
发布         -0.06
Hard         -0.06
Thumbnail    -0.06
dense        -0.06
态           -0.06
Positive Logits
diese           0.06
venture         0.06
liberalism      0.06
elim            0.06
argin           0.06
ennessee        0.06
_ORIENTATION    0.06
heels           0.06
izont           0.06
awaited         0.06
Activations Density 0.035%
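The density figure above can be read as the share of sampled tokens on which the neuron fires. The page does not state its exact definition, so the sketch below assumes "fires" means an activation strictly greater than zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (tokens with activation > threshold) / (all tokens).
    The dashboard itself does not document how it computes this value.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the neuron fires on 7 of 20,000 tokens.
sample = [0.0] * 19_993 + [1.2] * 7
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.035%
```

Under this reading, a density of 0.035% corresponds to roughly 3 to 4 firing tokens per 10,000, which fits a neuron tied to a single word.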