Explanations
medical studies
The neuron fires on phrases that state the study’s key findings or mechanistic conclusions.
Negative Logits
sex         -0.07
callbacks   -0.07
حقوق        -0.07
bicycle     -0.06
.state      -0.06
olate       -0.06
Sig         -0.06
survey      -0.06
.images     -0.06
mixture     -0.06

Positive Logits
addChild    0.06
edeyse      0.06
birlikte    0.06
长          0.06
мають       0.06
신          0.06
:"↵         0.06
TextStyle   0.06
Wiley       0.06
κατά        0.06
Activations Density 0.085%
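
The density figure can be read as the share of sampled tokens on which the neuron is active. The sketch below assumes that definition (activation strictly above a threshold of 0); the function name, the threshold, and the sample data are hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    neuron is active at all. The threshold default is illustrative.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 17 active tokens out of 20,000 in total.
sample = [0.0] * 19_983 + [1.5] * 17
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.085% means the neuron is silent on the overwhelming majority of tokens, which fits an explanation as narrow as statements of a study's key findings.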