INDEX
Explanations
The main thing this neuron does is find sentences related to health topics and medical conditions.
Negative Logits
=#               -0.58
"]=>             -0.55
ļéĨĴ             -0.55
cott             -0.54
Respons          -0.52
weeds            -0.51
Camel            -0.50
num              -0.49
illion           -0.49
robe             -0.49
Positive Logits
albeit           1.14
moreover         0.88
however          0.87
alas             0.86
albeit           0.80
somew            0.79
therefore        0.75
perhaps          0.74
though           0.74
notwithstanding  0.72
Activations Density: 0.303%