Explanations
Ratings/Numbers
The neuron primarily detects numeric expressions, especially ratings, percentages, and viewer-count figures.
Negative Logits
cou        -0.08
/ion       -0.07
�          -0.06
bp         -0.06
BP         -0.06
amort      -0.06
combines   -0.06
Soviets    -0.06
xấu        -0.06
담         -0.06
Positive Logits
(Syntax    0.07
MIC        0.07
widow      0.07
Purple     0.06
βε         0.06
-Bar       0.06
_NUM       0.06
doc        0.06
عة         0.06
(Locale    0.06
Activations Density 0.006%