INDEX
Explanations
sensitive topics
The neuron activates on tokens involved in asking about personal financial information—especially salary or income inquiries.
content that is sexually explicit, private, or otherwise policy-sensitive (requests about genitalia, sexual acts, personal/medical details) and triggers moderation/refusal language.
mentions of sexual or anatomically intimate topics and taboo personal questions, especially references to genitals, sexual activity, or stigmatized subjects.
Negative Logits
工業          -0.06
Graphics      -0.06
підпис        -0.06
863           -0.06
}",           -0.06
63            -0.06
��            -0.06
forensic      -0.06
Produk        -0.06
Grid          -0.06
Positive Logits
->{'                    0.08
ILT                     0.07
\Migrations             0.07
setError                0.07
ABS                     0.06
.sin                    0.06
.symmetric              0.06
CDDL                    0.06
UIViewController        0.06
.AutoScaleDimensions    0.06
Activations Density 0.128%