INDEX
Explanations
diverse topics
This neuron activates on substantive nouns—especially longer, content-carrying terms—highlighting key topic words in the text.
Negative Logits
OTS        -0.07
سم         -0.07
득         -0.07
mnoha      -0.06
azi        -0.06
Greatest   -0.06
Scope      -0.06
hots       -0.06
irical     -0.06
стен       -0.06
Positive Logits
nitel      0.07
�          0.06
953        0.06
gonna      0.06
(property  0.06
�          0.06
.ToArray   0.06
dividing   0.06
>>         0.06
paycheck   0.06
Activations Density 0.109%