INDEX
Explanations
This neuron detects mentions of “our team” and similarly phrased references to the speaker’s own collaborative group.
Negative Logits
Sick     -0.07
ارهای    -0.06
shipped  -0.06
Shapes   -0.06
oking    -0.06
horr     -0.06
arius    -0.06
blow     -0.06
chew     -0.06
apore    -0.06
Positive Logits
team     0.12
Team     0.11
-team    0.07
TEAM     0.07
Teams    0.07
teams    0.07
ce       0.07
Loader   0.07
suite    0.07
team     0.07
Activations Density 0.030%
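
To give a sense of scale for the density figure, here is a minimal sketch of how such a number could be computed. It assumes that "density" means the fraction of sampled tokens on which the neuron's activation is above zero; the page does not state its exact definition. The function name `activation_density` and the sample data are hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is the share of tokens on which the neuron fires
    at all. The source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the neuron fires on 3 of 10,000 tokens.
sample = [0.0] * 9997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.030%
```

Under that assumption, a density of 0.030% corresponds to roughly 3 active tokens in every 10,000.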