Explanations
phrases related to improving and increasing effectiveness in treatment strategies
Negative Logits
Token       Logit
.           -0.51
latter      -0.51
l           -0.46
later       -0.45
<eos>       -0.44
</strong>   -0.43
-           -0.43
ok          -0.42
pourtant    -0.42
so          -0.42
Positive Logits
Token       Logit
كومونز      0.94
Increase    0.94
Increase    0.91
increase    0.90
Increases   0.86
ICAGO       0.86
increases   0.86
increase    0.84
Increases   0.83
increases   0.83
Activations Density: 0.324%