INDEX
Explanations
pictures or mentions of cute animals, especially kittens and puppies, as well as related terms such as adoption and care for pets
terms related to pets, particularly cats and dogs, along with their adoption and care
Negative Logits
unda      -0.85
ijn       -0.75
Sachs     -0.71
idan      -0.71
nce       -0.69
states    -0.69
Shap      -0.68
ulkan     -0.68
iating    -0.67
ially     -0.66
Positive Logits
puppies         1.13
euth            1.11
pets            1.09
kittens         1.08
paws            1.04
puppy           1.04
pup             1.03
zoo             0.97
veterinarian    0.97
kitten          0.97
Activations Density 0.137%
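
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed. It assumes that density means the fraction of sampled token positions on which the feature's activation is above zero. The dashboard does not state its exact definition, and the function name, the threshold parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The dashboard's own definition may differ.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 10,000 token positions, 14 of which activate the feature.
toy_activations = [0.0] * 9986 + [1.5] * 14
density = activation_density(toy_activations)
print(f"Activation density: {density:.3%}")
```

Under this reading, a density of 0.137% would mean the feature is active on roughly 1 in 730 tokens, which fits a feature tied to a narrow topic such as pets.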