Explanations
This neuron activates selectively on machine-learning jargon, especially training-related terms such as “model,” “pre-trained,” and “fine-tuning.”
Negative Logits
أنا    -0.08
SZ    -0.07
_REPORT    -0.07
similarities    -0.06
mushrooms    -0.06
їна    -0.06
Rating    -0.06
given    -0.06
Amb    -0.06
veget    -0.06
Positive Logits
ับร    0.06
Nano    0.06
herit    0.06
串    0.06
828    0.06
závě    0.06
subscriptions    0.06
full    0.06
.Global    0.06
位    0.06
Activations Density: 0.016%