INDEX
Explanations
words related to suicide
words related to suicide and self-harm
Negative Logits
directions    -0.81
Turks         -0.73
Fathers       -0.67
tightening    -0.66
gasoline      -0.63
Lyme          -0.63
Gutenberg     -0.63
torque        -0.62
heads         -0.62
earnest       -0.61
Positive Logits
zanne         1.28
arez          1.28
icide         1.17
pper          1.08
ply           1.04
su            1.03
icides        1.02
itable        1.02
itton         1.02
pped          1.01
Activations Density 0.012%