Explanations
Names and forum posts.
References to games and gaming experiences.
This neuron activates on personal names (proper nouns) in the text.
Negative Logits
灯              -0.07
shaded          -0.07
gerekmektedir   -0.07
.putString      -0.06
drying          -0.06
component       -0.06
chten           -0.06
ま              -0.06
adam            -0.06
edom            -0.06
Positive Logits
」              0.07
queues          0.07
項目            0.07
Discrim         0.07
crawled         0.06
problém         0.06
Staten          0.06
Fix             0.06
_GF             0.06
stumbled        0.06
Activations Density 0.002%
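
As a rough illustration of the density figure, the sketch below assumes it is the fraction of sampled tokens on which the neuron's activation is above zero. That reading, the `activation_density` helper, the `threshold` parameter, and the sample values are all hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the neuron fires.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the neuron fires on 2 tokens out of 100,000.
sample = [0.0] * 99_998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.002%
```

Under this reading, a density of 0.002% corresponds to roughly 2 firing tokens per 100,000, which is consistent with a sparse neuron that responds to a narrow class of inputs.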