INDEX
Explanations
hypothetical scenarios
The neuron detects past‐conditional or counterfactual constructions (e.g. “would have,” “had been”).
Negative Logits
Token       Logit
نیست        -0.06
-cons       -0.06
诊          -0.06
uğ          -0.06
horrific    -0.06
rifle       -0.06
े↵          -0.06
.age        -0.06
entered     -0.06
momentos    -0.06
Positive Logits
Token       Logit
DEF         0.07
zą          0.07
acute       0.06
อห          0.06
(ball       0.06
_ARROW      0.06
_PED        0.06
bridal      0.06
dereg       0.06
actice      0.06
Activations Density: 0.039%
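For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of token positions on which the neuron's activation is above zero. This definition is an assumption rather than something the dashboard states, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the neuron fires at all.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 token positions, 4 of which activate the neuron.
sample = [0.0] * 9996 + [0.8, 1.2, 0.3, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.039% would mean the neuron is active on roughly 4 of every 10,000 tokens, which fits a detector for a comparatively rare construction.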