INDEX
Explanations
This neuron detects mentions of women in the context of their social roles, status, or rights.
Negative Logits
яти
-0.07
struction
-0.07
imals
-0.06
Vect
-0.06
rms
-0.06
एल
-0.06
Server
-0.06
skepticism
-0.06
�
-0.06
表示
-0.06
Positive Logits
(tr
0.08
spy
0.07
=_('
0.07
"}
0.07
:\\
0.06
(ball
0.06
(mc
0.06
jav
0.06
etmiş
0.06
Concurrent
0.06
Activations Density 0.031%