INDEX
Explanations
mentions of pets and related terms
references to pets or animals
Negative Logits
xual         -0.81
éĹĺ          -0.79
kefeller     -0.76
IDER         -0.74
hower        -0.68
mith         -0.67
Methodist    -0.67
doub         -0.65
ider         -0.65
seeded       -0.64
Positive Logits
abyte        1.08
ertodd       1.04
abytes       1.00
pet          0.99
pee          0.98
rified       0.91
lyak         0.85
ri           0.81
unia         0.80
riages       0.79
Activations Density: 0.014%