Explanations
words related to clothing items
instances of the word "coat."
NEGATIVE LOGITS
nir       -0.71
Zucker    -0.67
Prob      -0.66
Ambro     -0.66
die       -0.66
sett      -0.65
icult     -0.64
raltar    -0.64
KNOWN     -0.63
Kut       -0.62
POSITIVE LOGITS
coat      1.00
coats     0.95
Coat      0.91
pins      0.88
anguage   0.80
coated    0.79
creen     0.79
coating   0.77
idon      0.75
rust      0.73
Activations Density: 0.014%