INDEX
Explanations
The neuron selectively activates on the hedge word “certain,” flagging qualifying or tentative statements.
Negative Logits
playoffs    -0.08
launch      -0.07
ripple      -0.07
ัวอย        -0.07
只是        -0.07
leaps       -0.07
746         -0.07
ピ          -0.06
персп       -0.06
пе          -0.06
Positive Logits
certain     0.14
Certain     0.12
Certain     0.12
ein         0.08
某          0.08
an          0.08
Rat         0.07
Kenneth     0.07
.den        0.07
Cath        0.07
Activations Density: 0.019%