INDEX
Explanations
stop words
This neuron detects special control/formatting tokens (e.g. header or end‐of‐text markers) rather than natural language words.
Negative Logits
Token        Logit
idia         -0.07
никами       -0.07
Mais         -0.07
alunos       -0.07
(Collision   -0.07
UDIO         -0.07
(pm          -0.07
_below       -0.06
shepherd     -0.06
(My          -0.06
Positive Logits
Token        Logit
aspberry     0.06
acionales    0.06
chrono       0.06
Raspberry    0.06
оброб        0.06
raspberry    0.06
pakistan     0.06
wrapper      0.06
']!='        0.06
المدينة      0.06
Activations Density: 0.071%