INDEX
Explanations
mentions of animal-related entities or topics
references to animals and animal-related organizations
Negative Logits
chool      -0.87
creen      -0.85
Compton    -0.82
hips       -0.81
paces      -0.76
pring      -0.75
gaard      -0.72
ession     -0.71
lain       -0.71
jong       -0.71
Positive Logits
carc       0.90
welfare    0.85
aclysm     0.83
kingdom    0.83
Animal     0.82
cruelty    0.81
mammal     0.79
animal     0.79
alogue     0.79
species    0.79
Activations Density 0.029%