INDEX
Explanations
mentions of smoking-related items such as cigars, and cigarette-related terms
references to cigars and related smoking items or contexts
Negative Logits
tics            -0.80
Ü               -0.75
Livingston      -0.73
yll             -0.69
MN              -0.67
サーティワン    -0.67
PASS            -0.66
Bicycle         -0.66
Cry             -0.66
earch           -0.65
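Non-ASCII tokens in listings like this one are often shown in a byte-level form, so the katakana token サーティワン can appear as the string ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³. The sketch below shows one way to map such a string back to readable text. It assumes the vocabulary uses the GPT-2 style byte-to-character table, which this page does not confirm; the function names are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the GPT-2 style byte -> printable-character table (assumed here)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a byte-level token string back into text."""
    char_to_byte = {c: b for b, c in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[c] for c in token)
    # A token may hold only part of a UTF-8 sequence, so decode leniently.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³"))
```

Plain ASCII tokens such as cigars pass through unchanged. A token like Ü corresponds to a single byte that is not valid UTF-8 on its own, so it decodes to a replacement character rather than a letter.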
Positive Logits
cigars          1.05
cigar           0.95
illo            0.91
ensis           0.89
wrapper         0.88
Wra             0.84
ango            0.84
Maduro          0.83
inals           0.80
rito            0.80
Activations Density: 0.021%