Explanations
references to luxury and high-end goods
Negative Logits
rale      -0.15
oves      -0.15
isan      -0.15
abouts    -0.14
tery      -0.14
isters    -0.14
ike       -0.14
-called   -0.14
athon     -0.14
æģ        -0.14
Positive Logits
urious    0.17
ARIANT    0.17
/gpl      0.17
kovi      0.16
ritt      0.15
SPA       0.15
kova      0.15
uries     0.15
Yao       0.14
chten     0.14
Activations Density: 0.015%