Explanations
The neuron is looking for words related to political figures and societal issues
Connective phrases and conjunctions that indicate relationships or transitions between ideas
Negative Logits
Token      Logit
ãĥ¢        -0.74
orthy      -0.65
taker      -0.64
lein       -0.63
uria       -0.62
omal       -0.62
imester    -0.61
pee        -0.61
rison      -0.60
myra       -0.60
Positive Logits
Token         Logit
etc           1.55
whatever      1.25
blah          1.06
whatever      1.05
etc           0.94
assorted      0.93
everything    0.91
whichever     0.89
respectively  0.85
anything      0.84
Activations Density: 0.195%
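To make the density figure concrete, here is a minimal sketch of how such a value could be computed. It assumes that density means the fraction of sampled tokens on which the neuron's activation exceeds a threshold (zero here); the page does not state the exact definition. The function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens that activate the neuron
    at all. The dashboard itself does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2,000 tokens, 4 of which activate the neuron.
sample = [0.0] * 1996 + [0.3, 1.2, 0.7, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.195% would mean the neuron fires on roughly 195 of every 100,000 tokens.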