Explanations
Comparisons and Qualifications
The neuron activates on mentions of “divergence” (and closely related discussion terms, e.g. types of divergence, hidden divergence, regular divergence).
Negative Logits
Browser       -0.07
>).           -0.06
aget          -0.06
usband        -0.06
ipment        -0.06
unemployment  -0.06
Knight        -0.06
Henderson     -0.06
وزارت         -0.06
unseren       -0.06
Positive Logits
/as         0.07
progn       0.07
-available  0.06
길          0.06
Traits      0.06
ret         0.06
nond        0.06
莎          0.06
εγκα        0.06
μου         0.06
Activations Density: 0.128%