INDEX
Explanations
phone components
The neuron activates on structural/metadata tokens (e.g. header markers) and URL/domain-name fragments in the text.
Negative Logits
�
-0.07
avant
-0.07
نسخه
-0.06
_Delay
-0.06
zheimer
-0.06
Logo
-0.06
thé
-0.06
reklam
-0.06
posure
-0.06
Collect
-0.06
Positive Logits
selves
0.06
Billing
0.06
Dangerous
0.06
appeals
0.06
Pablo
0.06
{!!
0.06
ipy
0.06
convincing
0.06
demol
0.06
будь
0.06
Activations Density 0.026%