Explanations
The neuron primarily responds to action words, i.e., lexical verbs denoting processes or operations.
Negative Logits
Token     Logit
,total    -0.07
added     -0.07
three     -0.06
Boh       -0.06
couldn    -0.06
Nr        -0.06
dried     -0.06
ł         -0.06
Ca        -0.06
šť        -0.06
Positive Logits
Token     Logit
Connect   0.09
Assign    0.08
Convert   0.08
Upload    0.07
Kill      0.07
Persist   0.07
喝        0.07
          0.07
weave     0.07
(connect  0.07
Activations Density 0.347%