INDEX
Explanations
Explanation of neuron 4 behavior: the neuron mainly activates on personal and possessive pronouns (e.g. “his,” “own,” “our”).
Legal terminology related to theft and punishment.
Negative Logits
وأن    -0.06
fazla    -0.06
作品    -0.06
Stad    -0.06
Baton    -0.06
etat    -0.06
přiz    -0.06
Drv    -0.06
励    -0.06
Goat    -0.06
Positive Logits
yc    0.08
foreground    0.07
eração    0.06
υμ    0.06
Bluetooth    0.06
reprint    0.06
disciplined    0.06
guarantees    0.06
uniforms    0.06
$↵    0.06
Activations Density: 0.000%