Explanations
like and as for comparisons
Negative Logits
surtout      0.23
Added        0.22
、           0.22
Announces    0.22
(            0.22
incluindo    0.22
↵↵           0.21
、           0.21
。           0.21
incluyendo   0.20
Positive Logits
we           0.36
आपण          0.34
you          0.30
было         0.30
they         0.29
любят        0.29
ocurre       0.29
it           0.29
happened     0.28
bạn          0.28
Activations Density: 0.045%