Explanations
mentions of the word "Nature"
references to the scientific journal "Nature."
Negative Logits
acters    -0.81
oning     -0.76
onest     -0.75
rano      -0.74
rapnel    -0.71
rican     -0.71
aged      -0.70
itone     -0.70
rons      -0.70
rative    -0.67
Positive Logits
Conserv         0.89
Nature          0.83
Tree            0.79
Conservation    0.79
preserves       0.79
nature          0.77
Plants          0.73
conservation    0.73
uces            0.73
Nature          0.70
Activations Density 0.019%
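
As a rough illustration of what a density figure like the one above usually means, the sketch below computes the fraction of token positions on which a feature is active. It assumes that density is the share of tokens with an activation above zero. The page itself does not state its exact definition, and the function name, threshold, and toy data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    The threshold of 0.0 is illustrative, not taken from the page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, with the feature firing on 2.
toy_activations = [0.0] * 9998 + [1.7, 0.4]
density = activation_density(toy_activations)
print(f"{density:.3%}")  # 0.020%
```

A density this low would indicate a sparse feature that fires only on a small, specific set of tokens, which is consistent with the narrow explanations listed above.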