Explanations
This neuron detects negation or contrast cues in the text (e.g. “but,” “without,” “not”).
Negative Logits
              -0.06
asa           -0.06
_bw           -0.06
Robotics      -0.06
touchdowns    -0.06
=&            -0.06
_LOW          -0.06
Logged        -0.06
dumped        -0.06
inflicted     -0.06
Positive Logits
surf          0.07
mysql         0.06
nevertheless  0.06
معد           0.06
클            0.06
_interest     0.06
Nevertheless  0.06
太阳城        0.06
attern        0.06
şk            0.06
Activations Density 0.037%