INDEX
Explanations
Requests for suggestions or recommendations.
This neuron responds to negation phrases that express someone lacking or not having something (e.g. “don’t have one yet”).
Negative Logits
-icons    -0.07
_train    -0.07
old       -0.07
Collider  -0.06
Problem   -0.06
snap      -0.06
thed      -0.06
fit       -0.06
Olymp     -0.06
(Http     -0.06
Positive Logits
enefit    0.07
iPad      0.06
strugg    0.06
WORD      0.06
Agents    0.06
ención    0.06
يلا       0.06
ريف       0.06
edi       0.06
울        0.06
Activations Density 0.215%