Explanations
references to luxury and products associated with high status
Negative Logits
aru         -0.16
omor        -0.15
sted        -0.15
ıza         -0.15
memberOf    -0.14
oble        -0.14
EDIA        -0.14
iced        -0.14
rame        -0.14
ream        -0.13
Positive Logits
icana       0.19
ville       0.16
eration     0.16
ãĥĭãĥ¼      0.15
meth        0.15
erosis      0.15
uml         0.15
ifix        0.14
iverz       0.14
odus        0.14
Activations Density 0.007%