Explanations
words related to cigarettes and smoking
references to cigarettes and smoking
Negative Logits
Token         Logit
ede           -0.76
Ü             -0.71
herty         -0.69
MacDonald     -0.68
Jean          -0.68
ansson        -0.66
Ake           -0.66
mathematic    -0.65
pmwiki        -0.65
assian        -0.62
Positive Logits
Token         Logit
arette        1.36
arettes       1.25
cigarettes    1.14
cig           1.10
smoked        1.09
smoking       1.09
smoker        1.07
cigarette     1.07
smokers       1.05
smoke         1.03
Activations Density: 0.040%