Explanations
mentions of headwear items, particularly hats
the word "hat" and its variations in different contexts
Negative Logits
hyde        -0.75
Vaugh       -0.65
ULTS        -0.64
lect        -0.64
FORMATION   -0.62
idge        -0.62
naire       -0.62
taining     -0.62
á½          -0.62
ADE         -0.61
Positive Logits
chet        1.07
chery       1.04
ia          0.87
ullah       0.80
tha         0.77
ibu         0.77
ched        0.76
ewater      0.76
ulhu        0.76
nikov       0.74
Activations Density 0.013%
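
The density figure above is not defined on this page. A minimal sketch of one common reading follows, assuming "density" means the percentage of token positions at which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to reproduce a value of 0.013%; they are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: density = (active positions / total positions) * 100.
    """
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in values if a > threshold)
    return 100.0 * active / len(values)


# Hypothetical sample: 100,000 token positions, 13 of which fire.
sample = [0.0] * 99_987 + [1.5] * 13
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.013% would mean the feature fires on roughly 13 of every 100,000 tokens, which fits a feature tied to a narrow topic such as hats.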