INDEX
Explanations
articles/prepositions
This neuron detects normative and policy-oriented terms (e.g. responsible, ethical, transparent, monitored) in text about how AI should be governed or used.
Negative Logits
<Image            -0.07
imageSize         -0.07
ficken            -0.06
Adoles            -0.06
paced             -0.06
SIGN              -0.06
ê                 -0.06
(IConfiguration   -0.06
deniz             -0.06
Sweat             -0.06
Positive Logits
belly             0.07
(limit            0.07
=q                0.07
-upper            0.06
inidad            0.06
Okay              0.06
/sl               0.06
/small            0.06
totals            0.06
هر                0.06
Activations Density 0.028%