INDEX
Explanations
technical classifications/descriptions
This neuron activates on section-heading phrases that introduce classification criteria, especially headings beginning with "By …" (e.g. "By affected component").
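As a rough illustration of the kind of text described above, the pattern can be approximated with a simple heuristic. This is only a sketch under stated assumptions: the regular expression and the function name `looks_like_by_heading` are hypothetical, not part of the explanation or of the model, and the neuron's actual behavior is not defined by any such rule.

```python
import re

# Hypothetical heuristic (an assumption, not the neuron's real mechanism):
# a short line that starts with "By " and contains only words, spaces,
# slashes, or hyphens, with no sentence punctuation.
BY_HEADING = re.compile(r"^By\s+\w[\w\s/-]{0,60}$")


def looks_like_by_heading(line: str) -> bool:
    """Return True if the line resembles a 'By <criterion>' section heading."""
    return bool(BY_HEADING.match(line.strip()))


if __name__ == "__main__":
    for text in ["By affected component", "By the time we arrived, it was dark."]:
        print(repr(text), looks_like_by_heading(text))
```

The punctuation restriction is a deliberate simplification: it separates heading-like lines from ordinary sentences that happen to begin with "By", at the cost of missing headings that contain commas or colons.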
Negative Logits
Revision     -0.07
prognosis    -0.07
bottle       -0.07
значения     -0.07
History      -0.06
Rings        -0.06
Parking      -0.06
[--          -0.06
sociedad     -0.06
libertine    -0.06
Positive Logits
beim         0.08
فق           0.07
_VERIFY      0.07
clears       0.06
taşın        0.06
답           0.06
.poly        0.06
ionale       0.06
kolem        0.06
ственное     0.06
Activations Density 0.150%