INDEX
Explanations
graphic sexual content and themes of eroticism.
The neuron activates on second-person pronouns (e.g., “you,” “your”).
Negative Logits
arising    -0.08
以    -0.07
finds    -0.07
thy    -0.07
itated    -0.06
PLL    -0.06
zb    -0.06
cries    -0.06
.every    -0.06
Khi    -0.06
Positive Logits
.AddField    0.07
Essentials    0.06
getType    0.06
store    0.06
bailout    0.06
Authenticated    0.06
valueType    0.06
λαν    0.06
@_;↵    0.06
روم    0.06
Activations Density: 0.001%