INDEX
Explanations
terms related to dogs
terms related to animal welfare and treatment
Negative Logits
ricular        -0.66
trustworthy    -0.65
FUL            -0.64
ILY            -0.64
href           -0.64
nonexistent    -0.61
Sah            -0.59
indebted       -0.58
obligatory     -0.58
mobility       -0.58
Positive Logits
Turtles        0.76
Rats           0.74
chickens       0.68
goats          0.67
Bees           0.67
orld           0.66
turtles        0.65
Cosponsors     0.65
zees           0.64
chicks         0.64
Activations Density: 0.358%