Explanations
phrases related to potential risks or threats
phrases that indicate the risk or fear of certain actions or consequences
Negative Logits
isSpecialOrderable  -0.79
optimism            -0.70
onement             -0.68
preparation         -0.65
Begin               -0.64
respecting          -0.63
openness            -0.63
preparations        -0.61
improving           -0.60
practise            -0.60
Positive Logits
attacked    1.18
overrun     1.16
robbed      1.10
bitten      1.09
subjected   1.08
sucked      1.07
swept       1.06
ridiculed   1.05
victimized  1.05
harmed      1.05
Activations Density: 0.174%