INDEX
Explanations
references to suicide and mental health issues
Negative Logits
Token     Logit
UGIN      -0.17
ucch      -0.15
.yahoo    -0.15
riba      -0.15
-mf       -0.15
illis     -0.14
illage    -0.14
GMT       -0.14
virtual   -0.14
íĮĮ       -0.14
Positive Logits
Token     Logit
suicide   0.41
Suicide   0.39
suicidal  0.36
suicides  0.35
suic      0.33
Su        0.30
Su        0.29
su        0.28
_su       0.24
SU        0.23
Activations Density: 0.108%