Explanations
phrases related to public health and safety
Negative Logits
Zen              -0.71
confess          -0.65
pen              -0.60
married          -0.60
ellipt           -0.60
approving        -0.59
udicrous         -0.59
sort             -0.58
ANN              -0.58
homosexuality    -0.58
Positive Logits
wellbeing        0.93
rity             0.81
.</              0.77
uality           0.75
prospects        0.75
itals            0.74
livelihood       0.74
jriwal           0.73
deterior         0.72
giene            0.72
Activations Density: 0.225%