Explanations
The neuron primarily activates on the Unicode replacement character “�” (U+FFFD), i.e. on unrecognized or garbled tokens.
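As a rough illustration of where such tokens can come from, the sketch below decodes a multi-byte UTF-8 character that has been cut off mid-sequence. The truncated input is a made-up example, not data from this neuron, and the link to tokenization is an assumption: with byte-level tokenizers, a single token may hold only part of a character and so render as “�” when shown on its own.

```python
# Illustrative sketch only: how U+FFFD can appear when bytes fail to decode.
REPLACEMENT = "\ufffd"  # the Unicode replacement character

# Hypothetical input: the first two bytes of a three-byte UTF-8 character.
truncated = "돌".encode("utf-8")[:2]

# errors="replace" substitutes U+FFFD for the undecodable bytes
# instead of raising UnicodeDecodeError.
decoded = truncated.decode("utf-8", errors="replace")
print(repr(decoded))
```

This may also be why multi-byte scripts such as Thai, Arabic, and Korean show up among the top logits, though the listing itself does not confirm that.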
Negative Logits
qml       -0.07
crushing  -0.07
-cluster  -0.06
rai       -0.06
utron     -0.06
ści       -0.06
THEIR     -0.06
Making    -0.06
CLEAN     -0.06
Craft     -0.06
Positive Logits
โ         0.07
واح       0.07
로서      0.07
บ         0.06
mund      0.06
베        0.06
Oilers    0.06
dö        0.06
돌        0.06
ipline    0.06
Activations Density: 0.006%
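To give a sense of scale for the density figure, here is a small arithmetic sketch. It assumes the density is the fraction of tokens on which the neuron fires at all, which this page does not state, and the corpus size is a hypothetical round number.

```python
# Sketch under assumptions: density = fraction of tokens with nonzero activation.
density_percent = 0.006                 # value reported on this page
density_fraction = density_percent / 100

tokens = 1_000_000                      # hypothetical corpus size
expected_active = density_fraction * tokens

print(f"about {expected_active:.0f} active tokens per {tokens:,} tokens")
```

Under that reading, the neuron would fire on roughly 60 tokens per million, so it is very sparse.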