Explanations
This neuron strongly activates on tokens that occur inside quoted speech.
Negative Logits
ozy           -0.06
該            -0.06
ainty         -0.06
scaler        -0.06
invert        -0.06
Goat          -0.06
PostExecute   -0.06
ulus          -0.06
“My           -0.06
944           -0.06
Positive Logits
čá            0.06
Thunder       0.06
lodging       0.06
herb          0.06
cél           0.06
chef          0.06
_SECTION      0.06
_WITH         0.06
Shopping      0.06
ippines       0.06
Activations Density 0.070%
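
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled tokens on which the neuron's activation is above zero. The sketch below shows that calculation on synthetic activations; the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    neuron fires at all. The dashboard's exact definition is not given.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Synthetic data: 10,000 token positions, 7 of which have a nonzero activation.
sample = [0.0] * 9993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.070% corresponds to roughly 7 firing tokens per 10,000 sampled.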