Explanations
This neuron responds to informal, self-referential language: casual first-person pronouns and colloquial words (e.g. “I,” “just,” “another,” “guy”) that signal a conversational tone.
Negative Logits
BTTag            -0.07
Iran             -0.07
offenses         -0.07
onChange         -0.07
.ArgumentParser  -0.07
workplaces       -0.06
Pages            -0.06
repeated         -0.06
очист            -0.06
letion           -0.06
Positive Logits
FXML             0.07
PEAR             0.07
,以及             0.06
zens             0.06
gaz              0.06
CKET             0.06
rámci            0.06
Caucas           0.06
(credentials     0.06
�                0.06
Activations Density 0.075%