INDEX
Explanations
references to cigars or related luxury items
Negative Logits
Token         Logit
oil           -0.23
u             -0.22
o             -0.22
arf           -0.20
r             -0.20
L             -0.20
ент           -0.20
ultureInfo    -0.20
am            -0.19
ar            -0.19
Positive Logits
Token         Logit
ouncill       0.19
lique         0.19
ogn           0.19
heet          0.19
ycles         0.18
ANNOT         0.18
innamon       0.18
ramer         0.18
rossover      0.18
urr           0.18
Activations Density: 0.055%