INDEX
Explanations
references to substance abuse and addiction
Negative Logits
šov       -0.16
-prepend  -0.15
agal      -0.15
_PF       -0.15
iad       -0.14
ancell    -0.14
.ribbon   -0.14
zl        -0.14
ipel      -0.14
ento      -0.14
Positive Logits
heroine     0.32
crack       0.29
drugs       0.29
stim        0.28
substances  0.28
pot         0.27
mari        0.26
illegal     0.26
dr          0.25
opi         0.25
Activations Density: 0.137%