Explanations
references to various types of clothing, with a strong preference for the word 'coat'
references to coats and jackets
Negative Logits
KNOWN     -0.69
FER       -0.69
ENE       -0.68
quist     -0.68
icts      -0.67
erent     -0.67
iates     -0.66
iating    -0.65
chuk      -0.64
Refer     -0.61
Positive Logits
coat      0.96
tails     0.96
sleeves   0.93
jacket    0.92
sleeve    0.88
pins      0.87
worn      0.84
wearer    0.83
hair      0.81
maker     0.81
Activations Density 0.043%