INDEX
Explanations
This neuron almost never fires: its activations are effectively all zero (activation density 0.002%), so it is not detecting any identifiable token or pattern.
Negative Logits
Token        Logit
voice        -0.07
_json        -0.07
center       -0.07
records      -0.07
But          -0.07
字幕         -0.07
waves        -0.06
-ф           -0.06
Blood        -0.06
.handlers    -0.06
Positive Logits
Token        Logit
伸           0.07
ρη           0.06
fos          0.06
حذف          0.06
resa         0.06
meses        0.06
.firstChild  0.06
vido         0.06
FormsModule  0.06
การส         0.06
Activation Density: 0.002%
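
As a rough illustration of what a density figure like this measures, the sketch below computes the fraction of token positions on which a neuron's activation is above zero. The function name `activation_density`, the zero threshold, and the sample data are assumptions made for this example; the dashboard's actual computation is not described in the source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    neuron fires at all; the real dashboard may define it differently.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 100,000 token positions, only 2 of which fire.
sample = [0.0] * 99_998 + [0.4, 1.1]
density = activation_density(sample)

# Format as a percentage with three decimals, matching the dashboard style.
density_label = f"{density * 100:.3f}%"
print(density_label)
```

With a density this low, nearly every sampled activation is zero, which is consistent with an explanation that finds no pattern to describe.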