INDEX
Explanations
mentions of animals or references related to animals
references to animals and animal welfare topics
Negative Logits
jong       -0.74
orship     -0.66
Compton    -0.66
unda       -0.65
nance      -0.64
inen       -0.64
sugg       -0.63
lining     -0.63
heit       -0.62
pai        -0.61
Positive Logits
animal     1.20
animals    1.06
animal     1.02
carc       1.02
mammals    1.01
mammal     1.01
Animal     0.97
Animal     0.89
species    0.86
Anim       0.85
Activations Density: 0.012%