Explanations
This neuron detects system or user instructions that define or assign an AI model’s role or capabilities (e.g. “You are about to immerse yourself into the role of another AI model known as…”).
Negative Logits
Polar     -0.07
eties     -0.06
Měst      -0.06
polar     -0.06
Side      -0.06
crets     -0.06
Agency    -0.06
Day       -0.06
frame     -0.06
해        -0.06
Positive Logits
التع          0.07
.ribbon       0.07
ImageUrl      0.07
dashed        0.06
Arithmetic    0.06
.enumer       0.06
((&___        0.06
Serialized    0.06
subcontract   0.06
групи         0.06
Activations Density: 0.004%