Explanations
statements relating to health interventions and their effectiveness
Negative Logits
gle       -0.16
flush     -0.15
iber      -0.15
กต        -0.15
cuales    -0.15
quals     -0.15
560       -0.15
relent    -0.15
621       -0.15
680       -0.15
Positive Logits
wanting   0.18
want      0.17
Pane      0.16
Upper     0.16
ison      0.15
alike     0.15
duo       0.15
oi        0.15
ープ      0.14
ласти     0.14
Activations Density 0.232%