INDEX
Explanations
This neuron is activated by properly spelled words containing "ion".
The presence of specific characters or symbols typically associated with formatting or encoding issues in text.
Negative Logits
Token             Logit
Engel             -0.67
Chop              -0.65
åį                -0.64
Ridley            -0.64
Dempsey           -0.62
Raf               -0.61
DragonMagazine    -0.60
Ãľ                -0.60
Rez               -0.59
srfAttach         -0.59
Positive Logits
Token             Logit
nesota            0.79
ith               0.75
lated             0.73
gress             0.72
der               0.72
duct              0.72
stru              0.71
ject              0.71
duc               0.71
cus               0.70
Activations Density 0.020%