INDEX
Explanations
important
This neuron activates on occurrences of the adjective “important.”
Negative Logits
Token       Logit
32          -0.07
266         -0.07
scan        -0.07
wake        -0.07
Ethernet    -0.07
34          -0.07
agation     -0.07
392         -0.07
sea         -0.07
340         -0.07
Positive Logits
Token       Logit
important   0.10
Important   0.09
importance  0.09
IMPORTANT   0.08
Important   0.07
Importance  0.07
vriend      0.07
_REQUIRE    0.07
gehört      0.07
Hancock     0.07
Activations Density 0.049%
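
A density figure like the one above can be read as the share of tokens on which the neuron fires. The sketch below shows one way such a number might be computed. It assumes that "density" means the fraction of tokens whose activation is strictly above a threshold of zero; the page does not state its definition. The function name `activation_density` and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold. The dashboard's own definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, of which 5 have a nonzero activation.
sample = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

A low density of this kind would be consistent with a neuron that responds to one specific word rather than to broad context.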