INDEX
Explanations
mentions or descriptions of puppies
mentions of puppies and related terms
Negative Logits
NetMessage   -0.82
reens        -0.76
Sins         -0.75
inav         -0.73
lor          -0.71
ORGE         -0.71
arta         -0.71
76561        -0.69
dos          -0.69
auntlets     -0.68
Positive Logits
puppy        1.29
pup          1.12
puppies      1.11
kitten       1.01
Pupp         0.89
riages       0.77
etsk         0.73
stakes       0.73
dog          0.71
kittens      0.71
Activations Density: 0.012%