Explanations
references to biological concepts or distinctions between biological entities
references to biological concepts and classifications related to gender
Negative Logits
ership    -0.84
naire     -0.82
esty      -0.77
eeper     -0.77
eca       -0.71
ebook     -0.70
rising    -0.69
adra      -0.69
essee     -0.67
urat      -0.65
Positive Logits
plaus           1.05
sciences        0.95
biologist       0.89
biologists      0.87
anthropology    0.84
biology         0.83
immortality     0.82
organisms       0.80
organism        0.80
evolution       0.80
Activations Density: 0.021%