INDEX
Explanations
mentions of substance abuse
references to substance abuse
Negative Logits
Token      Logit
Blazers    -0.80
Slate      -0.75
rium       -0.69
DIR        -0.68
aldi       -0.67
Pratt      -0.65
doms       -0.64
ersen      -0.64
OY         -0.64
estamp     -0.64
Positive Logits
Token         Logit
abuse         1.06
abusers       0.96
Abuse         0.95
abuser        0.90
abuse         0.88
fulness       0.88
substances    0.88
amphetamine   0.84
itute         0.83
uality        0.82
Activations Density: 0.024%