INDEX
Explanations
phrases related to risks to lives or health
phrases and words related to risk and life-threatening situations
Negative Logits
Smooth        -0.67
*/(           -0.66
ugal          -0.65
olid          -0.64
stride        -0.63
ALSE          -0.63
enium         -0.62
Deadline      -0.62
motions       -0.62
soType        -0.61
Positive Logits
endangered    1.12
forfeit       1.11
jeopard       1.07
jeopardy      1.06
angering      1.05
endanger      1.03
sacrificed    1.02
harmed        1.00
hostage       1.00
peril         0.99
Activations Density 0.262%