Explanations
qualities or characteristics
This neuron detects nominalizations ending in “-ment.”
Negative Logits
antibiotics   -0.07
chill         -0.06
lx            -0.06
hips          -0.06
네            -0.06
newSize       -0.06
ा.            -0.06
шила          -0.06
paypal        -0.06
ন             -0.06
Positive Logits
GUID          0.07
"D            0.07
Muon          0.06
fakt          0.06
Optionally    0.06
.Business     0.06
idunt         0.06
JR            0.06
Bond          0.06
;;↵↵          0.06
Activations Density: 0.065%