Explanations
The neuron activates on occurrences of the word “reason,” i.e. expressions of justification or cause.
Negative Logits
anche            -0.07
criticised       -0.07
giatan           -0.06
metres           -0.06
terrorists       -0.06
image            -0.06
热               -0.06
furnishings      -0.06
_codec           -0.06
.experimental    -0.06
Positive Logits
اگر              0.08
uestra           0.07
++]=             0.06
/column          0.06
oub              0.06
原因             0.06
rv               0.06
%@",             0.06
าจ               0.06
출장             0.06
Activations Density: 0.011%
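
To put the density figure in perspective, here is a minimal sketch that assumes activation density means the fraction of sampled token positions on which the neuron's activation is above zero. That definition, the function name activation_density, and the sample data are illustrative assumptions, not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the neuron fires at all.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 11 of which activate the neuron.
acts = [0.0] * 100_000
for i in range(11):
    acts[i * 9_000] = 1.5  # arbitrary positive activation value

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.011%
```

Under that assumption, a density of 0.011% corresponds to the neuron firing on roughly one token in every 9,000.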