INDEX
Explanations
phrases related to the reproductive and nurturing behaviors of animals
Negative Logits
Women        -0.16
Women        -0.16
women        -0.15
Married      -0.15
vrouwen      -0.15
married      -0.15
horses       -0.14
zbo          -0.14
Dogs         -0.14
grandfather  -0.14
Positive Logits
pup          0.29
pups         0.28
hatch        0.26
suck         0.26
cub          0.26
lings        0.26
baby         0.25
ling         0.25
nurs         0.24
仔           0.23
Activations Density 0.093%