Explanations
references to domestic animals, particularly pets
references to pets
Negative Logits
xual        -0.80
éĹĺ         -0.79
IDER        -0.76
kefeller    -0.75
Methodist   -0.74
hower       -0.71
doub        -0.68
seeded      -0.67
ider        -0.67
mith        -0.67
Positive Logits
ertodd      1.06
pet         1.02
abyte       1.01
pee         0.97
abytes      0.93
lyak        0.87
rified      0.82
pets        0.78
apixel      0.78
WOR         0.78
Activations Density: 0.011%