INDEX
Explanations
instances of negative outcomes or critiques in relation to health or social interventions
Negative Logits
Token            Logit
.                -0.45
_                -0.44
adec             -0.41
perature         -0.37
getAdapter       -0.35
setSize          -0.35
seiten           -0.35
cej              -0.34
मि               -0.34
ourselves        -0.34
Positive Logits
Token            Logit
saraba           0.79
FTFY             0.75
PMailer          0.74
ویکیپدیای        0.73
LabelTagHelper   0.72
鄄               0.71
”-               0.71
]-[              0.70
“-               0.70
يكب              0.69
Activations Density 0.162%