INDEX
Explanations
mentions of smoking or related activities
references to smoking and its effects
Negative Logits
assian        -0.86
Vector        -0.78
ngth          -0.69
Defenders     -0.68
ensional      -0.68
translation   -0.68
HCR           -0.67
ousse         -0.66
Ake           -0.65
Nou           -0.65
Positive Logits
cessation     1.36
smoking       1.19
cigarettes    1.12
smoker        1.11
smoked        1.07
smoke         1.02
cigars        0.99
smokers       0.95
habits        0.94
cig           0.91
Activations Density: 0.013%