INDEX
Explanations
words related to postal mail or delivery
references to mail and postal services
Negative Logits
Haram    -0.69
Vulkan   -0.66
abama    -0.64
Argon    -0.64
Sioux    -0.63
LORD     -0.62
oulos    -0.61
Vas      -0.61
Guth     -0.61
ivia     -0.60
Positive Logits
boxes        1.55
bag          1.35
box          1.23
bags         1.13
(not shown)  1.07
letter       0.98
(not shown)  0.98
mailbox      0.97
boxing       0.93
trap         0.92
Activations Density: 0.041%