INDEX
Explanations
mentions of specific medical conditions, especially alcoholism
themes related to addiction and political positions
Negative Logits
ty        -0.76
Bey       -0.71
Pont      -0.69
kus       -0.65
nipples   -0.63
ne        -0.61
necks     -0.61
pee       -0.61
abet      -0.60
har       -0.59
Positive Logits
ously       0.89
ousing      0.74
incarn      0.74
ilial       0.73
":""},{"    0.70
gdala       0.69
lectic      0.69
revolving   0.68
arises      0.68
collided    0.67
Activations Density: 0.237%