INDEX
Explanations
The neuron detects words that denote outcomes or consequences (e.g. “effect,” “side-effect,” “product”).
Negative Logits
araştırma        -0.09
ToLocal          -0.07
oldemort         -0.06
�                -0.06
مبار             -0.06
ویرایش           -0.06
Garr             -0.06
istringstream    -0.06
ptive            -0.06
์↵↵              -0.06
Positive Logits
CRM              0.07
TT               0.07
consequences     0.07
otos             0.07
Percentage       0.07
two              0.07
;amp             0.06
/c               0.06
consequence      0.06
surtout          0.06
Activations Density 0.025%
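
The page does not define the density figure. A minimal sketch, assuming it is the share of sampled tokens on which the neuron's activation is above zero, expressed as a percentage. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and used only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of sampled tokens on which
    the neuron fires; the source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the neuron fires on 1 of 4000 tokens.
sample = [0.0] * 3999 + [1.7]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.025% would correspond to the neuron firing on roughly one token in every 4000.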