INDEX
Explanations
The neuron activates strongly on conditional “If…” disclaimers (e.g., “If a question…,” “If I don’t know…”) that signal uncertainty or caveats.
Negative Logits
所以    -0.06
Περι    -0.06
        -0.06
قرار    -0.06
پار    -0.06
デ    -0.06
泰    -0.06
statement    -0.06
застосування    -0.06
спад    -0.06
Positive Logits
Jackets    0.07
fim    0.06
ΗΣ    0.06
Mansion    0.06
Courts    0.06
Deals    0.06
DebugEnabled    0.06
comprehensive    0.06
_instruction    0.06
σσα    0.06
Activations Density 0.009%