Explanations
words related to style or materials, particularly in the context of fashion or design
Negative Logits
WARN                  -0.76
DERR                  -0.75
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³    -0.64
circulation           -0.63
Participant           -0.62
ajor                  -0.62
Mutual                -0.62
女                    -0.62
THR                   -0.61
agher                 -0.59
Positive Logits
led                   1.21
les                   1.12
sty                   1.06
gian                  1.06
rette                 1.04
lish                  1.03
ling                  1.03
rene                  1.00
rian                  0.90
rend                  0.87
Activations Density: 0.003%