INDEX
Explanations
mentions of smoking
words related to smoking and its effects
Negative Logits
Token           Logit
heit            -0.63
rants           -0.62
Letter          -0.60
Lama            -0.60
DragonMagazine  -0.60
Tud             -0.60
Vanguard        -0.60
Highlander      -0.59
Cerberus        -0.59
quo             -0.58
Positive Logits
Token           Logit
ooth            1.38
iley            1.34
oky             1.32
ugg             1.30
oked            1.28
oke             1.27
okers           1.26
oking           1.25
okes            1.25
oker            1.23
Activations Density 0.010%