Explanations
phrases related to health interventions and their implications
Negative Logits
Token     Logit
all       -0.54
triple    -0.53
triple    -0.48
Triple    -0.45
etc       -0.43
Triple    -0.42
.         -0.41
❹         -0.41
set       -0.40
tre       -0.40
Positive Logits
Token            Logit
secondly         1.09
findpost         1.08
AndEndTag        1.03
estekak          0.99
مرئيه            0.95
RenderAtEndOf    0.93
Geplaatst        0.89
referenties      0.89
المعيارى         0.86
FunctionFlags    0.86
Activations Density: 0.450%