Explanations
animal-related terms and organizations
references to animal welfare and rights
Negative Logits
heit        -0.79
unda        -0.76
creen       -0.76
hips        -0.70
lain        -0.69
outer       -0.69
gren        -0.68
uden        -0.68
Sutherland  -0.65
chool       -0.65
Positive Logits
cruelty     1.13
welf        1.07
kingdom     1.04
welfare     1.02
carc        1.02
Welfare     0.98
Cruel       0.97
domest      0.96
arium       0.96
species     0.91
Activations Density: 0.057%