INDEX
Explanations
Conversational exchanges that involve humor and sarcasm.
This neuron activates on the phrase “AI assistant,” flagging references to the system or assistant role.
Negative Logits
reusable      -0.07
Zones         -0.07
Suns          -0.06
Operations    -0.06
Chr           -0.06
-button       -0.06
芬            -0.06
monday        -0.06
(Float        -0.06
mothers       -0.06
Positive Logits
oultry        0.08
wallpapers    0.08
MenuBar       0.06
garg          0.06
zdravot       0.06
*time         0.06
TreeSet       0.06
�             0.06
jorn          0.06
frontend      0.06
Activations Density 0.002%