INDEX
Explanations
making the world better
The neuron activates strongly on optimistic calls to “make the world a better place,” i.e. uplifting, world-improvement slogans.
Negative Logits
قر          -0.07
Military    -0.07
declar      -0.07
ERROR       -0.06
_message    -0.06
][:         -0.06
849         -0.06
_aux        -0.06
.Render     -0.06
cred        -0.06
Positive Logits
Lok         0.06
ToProps     0.06
IDEOS       0.06
うん        0.06
[contains   0.06
ülk         0.06
ừng         0.06
원의        0.06
ève         0.06
有些        0.05
Activations Density: 0.021%