INDEX
Explanations
mentions of elephants and related terms, with a focus specifically on elephant ivory
references to elephants and related terminology
Negative Logits
pring     -0.88
ndra      -0.84
nd        -0.83
nder      -0.83
nda       -0.80
lly       -0.79
nces      -0.79
vous      -0.78
nergy     -0.75
lying     -0.74
Positive Logits
elephant   1.49
elephants  1.42
iasis      1.22
Elephant   1.20
poaching   0.97
herds      0.96
ivory      0.92
calf       0.90
monary     0.88
penis      0.88
Activations Density 0.015%