INDEX
Explanations
references to substances or substance abuse
Negative Logits
aits     -0.15
aval     -0.14
стати    -0.14
zsche    -0.14
chner    -0.14
тож      -0.14
λογή     -0.14
uan      -0.14
ов       -0.14
ollo     -0.13
Positive Logits
ively       0.19
sts         0.15
ignet       0.14
icro        0.14
Sierra      0.14
controvers  0.13
mí          0.13
іття        0.13
aire        0.13
ντ          0.13
Activations Density 0.014%