Explanations
uncertain
This neuron activates on words expressing uncertainty or unpredictability (e.g., “uncertain,” “uncertainty,” “unpredictable”).
Negative Logits
Token       Logit
445         -0.07
Loy         -0.07
vou         -0.07
вор         -0.07
"display    -0.07
296         -0.06
Loves       -0.06
بإ          -0.06
Shown       -0.06
appName     -0.06
Positive Logits
Token          Logit
uncertainty    0.13
uncertain      0.11
uncertainties  0.10
cy             0.08
tomorrow       0.07
ypical         0.07
uncert         0.07
urray          0.07
doubt          0.07
erty           0.07
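The page does not say how these logit lists are produced. A common convention in dashboards of this kind, assumed here, is to take the dot product of the neuron's output direction with each token's unembedding vector and then list the largest and smallest results. The sketch below illustrates that idea on made-up two-dimensional vectors; `logit_effects`, `top_k`, and every number in it are hypothetical and not taken from the model behind this page.

```python
def logit_effects(direction, unembed):
    """Dot the (assumed) neuron output direction with each token's unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(direction, vector))
        for token, vector in unembed.items()
    }


def top_k(effects, k, positive=True):
    """Return the k tokens with the largest (or, if positive=False, smallest) effect."""
    ranked = sorted(effects.items(), key=lambda item: item[1], reverse=positive)
    return ranked[:k]


# Toy values chosen only for illustration; real vectors have thousands of dimensions.
direction = [1.0, 0.0]
unembed = {
    "uncertain": [0.5, 0.1],
    "doubt": [0.25, 0.9],
    "Shown": [-0.25, 0.3],
}

effects = logit_effects(direction, unembed)
print("positive:", top_k(effects, 2))
print("negative:", top_k(effects, 1, positive=False))
```

Under this assumption, a token appears in the positive list when the neuron firing raises that token's output logit, and in the negative list when it lowers it.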
Activations Density 0.009%
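For scale, a density of 0.009% corresponds to roughly 9 active tokens per 100,000, if density is read as the fraction of sampled tokens on which the neuron's activation exceeds zero. That reading is an assumption, since the page does not define the term. A minimal sketch with a hypothetical `activation_density` helper and synthetic activations:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold (assumed definition)."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic sample: 9 active tokens out of 100,000, not real data from this neuron.
sample = [0.0] * 99_991 + [1.2] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```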