INDEX
Explanations
The neuron is primarily activated by the word “black” (including compounds that start with “black”).
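One rough way to read this explanation is as a prefix test on a token. The sketch below is only an illustration: `matches_explanation` is a hypothetical helper, not part of any tool, and treating the match as case-insensitive and tolerant of surrounding whitespace is an assumption.

```python
def matches_explanation(token: str) -> bool:
    """Return True if the token is "black" or a compound starting with it.

    Assumption: matching ignores case and surrounding whitespace.
    """
    return token.strip().lower().startswith("black")


# Sample tokens taken from the logit lists on this page.
for token in ["Black", "BLACK", "blacklist", "blackout", "Willow", "fmt"]:
    print(token, matches_explanation(token))
```

Under this reading, a token such as `/black` would not match, because the prefix test is applied to the token as written.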
Negative Logits
fmt            -0.08
_trim          -0.07
fstream        -0.07
interpolate    -0.07
提             -0.07
WebDriverWait  -0.07
Trem           -0.07
namoro         -0.07
uyum           -0.07
Willow         -0.07
Positive Logits
Black      0.17
Black      0.14
black      0.14
BLACK      0.13
black      0.13
blacklist  0.11
BLACK      0.10
blackout   0.09
Blacks     0.08
/black     0.08
Activations Density: 0.020%
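To put the density figure in concrete terms, the sketch below assumes that activation density means the fraction of sampled token positions on which the neuron has a nonzero activation. That definition, the `activation_density` helper, and the sample data are all assumptions for illustration; the page itself does not state how the figure is computed.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 2 of them active.
sample = [0.0] * 9998 + [1.3, 0.7]
print(f"{activation_density(sample):.3%}")
```

Under that assumption, a density of 0.020% corresponds to roughly 2 active positions per 10,000 tokens.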