INDEX
Explanations
words related to clothing or accessories
words related to wearing
Negative Logits
Token                  Logit
(undecodable token)    -0.87
UI                     -0.70
Luc                    -0.64
TA                     -0.64
MAS                    -0.63
dem                    -0.62
PD                     -0.62
Niet                   -0.61
Delta                  -0.60
Tok                    -0.59
Positive Logits
Token       Logit
earing      1.12
onite       0.92
eared       0.82
pless       0.82
earance     0.80
antly       0.78
ily         0.77
unci        0.77
iston       0.77
anguage     0.75
Activations Density: 0.008%