Explanations
The neuron fires on occurrences of the modal verb “can” (often in conditional/ability constructions).
Negative Logits
-model      -0.07
Fetch       -0.06
SYSTEM      -0.06
skin        -0.06
Mode        -0.06
_dialog     -0.06
afety       -0.06
Bedford     -0.06
Span        -0.06
Benny       -0.06
Positive Logits
Identifier  0.06
ाहत         0.06
ERGE        0.06
abile       0.06
_NONNULL    0.06
由          0.06
Mehmet      0.06
जम          0.06
ーブ        0.05
olve        0.05
Activations Density 0.018%
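
As a rough illustration of how a density figure like the one above could arise, the sketch below treats activation density as the fraction of token positions on which the neuron is active. This definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; the page itself does not state how the value was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition for illustration only; the dashboard does not
    document its exact formula.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 18 active positions out of 100,000 tokens.
sample = [1.0] * 18 + [0.0] * 99_982
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.018%
```

Under this reading, a density of 0.018% would mean the neuron is active on roughly 18 of every 100,000 tokens, which fits a feature tied to a single word such as "can".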