INDEX
Explanations
discussions about critical issues and potential threats, including those related to public health, economic challenges, and environmental impact
phrases related to public health and safety concerns
Negative Logits
Nope      -0.90
nicer     -0.80
prett     -0.77
Style     -0.76
Funny     -0.75
Pretty    -0.72
Pretty    -0.72
cheerful  -0.71
Nice      -0.71
classy    -0.70
Positive Logits
jeopard      1.20
impacting    1.19
adversely    1.19
destabil     1.13
impacts      1.08
exacerbated  1.06
exacerbate   1.02
disruption   1.01
endanger     1.00
disrupting   1.00
Activations Density 0.987%