Explanations
This neuron is sensitive to occurrences of the word “feel” (and its immediate context), flagging descriptions of how something feels.
Negative Logits
uke           -0.06
Sub           -0.06
出す          -0.06
rectangles    -0.06
saint         -0.06
взаєм         -0.06
Artem         -0.06
_two          -0.06
acker         -0.06
-mails        -0.06
Positive Logits
feel          0.07
-original     0.07
gist          0.07
vibe          0.07
Characters    0.07
eur           0.07
countryside   0.07
conclus       0.06
endPoint      0.06
_CNTL         0.06
Activations Density: 0.005%
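
To make the density figure concrete, here is a minimal sketch that assumes activation density means the fraction of sampled tokens on which the neuron's activation is above zero. The page does not state this definition, so treat it as an assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is firing tokens divided by all sampled tokens.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the neuron fires on 1 token out of 20,000.
sample = [0.0] * 19_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.005% corresponds to roughly one firing token per 20,000 sampled tokens, which fits a neuron tied to a single word such as “feel”.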