Explanations
non-English
The neuron activates on institution or affiliation names in academic paper metadata.
Negative Logits
.charset    -0.07
:{          -0.06
contrary    -0.06
world       -0.06
Multimedia  -0.06
_movies     -0.06
371         -0.06
світ        -0.06
cast        -0.06
اید         -0.06
Positive Logits
velik        0.07
hood         0.06
Vibr         0.06
Дж           0.06
PLEASE       0.06
uppe         0.06
Fortunately  0.06
Igor         0.06
ripe         0.06
mutation     0.06
Activations Density 0.009%